Provably Efficient Offline Reinforcement Learning in Regular Decision Processes

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Documents

  • Fulltext: Final published version, 645 KB, PDF document

This paper deals with offline (or batch) Reinforcement Learning (RL) in episodic Regular Decision Processes (RDPs). RDPs are the subclass of Non-Markov Decision Processes in which the dependency on the history of past events can be captured by a finite-state automaton. We consider a setting where the automaton underlying the RDP is unknown, and a learner strives to learn a near-optimal policy using pre-collected data, in the form of non-Markov sequences of observations, without further exploration. We present RegORL, an algorithm that suitably combines automata learning techniques with state-of-the-art algorithms for offline RL in MDPs. RegORL has a modular design that allows one to use any off-the-shelf offline RL algorithm for MDPs. We report a non-asymptotic high-probability sample complexity bound for RegORL to yield an ε-optimal policy, which involves a notion of concentrability relevant for RDPs. Furthermore, we present a sample complexity lower bound for offline RL in RDPs. To the best of our knowledge, this is the first work presenting a provably efficient algorithm for offline learning in RDPs.
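The abstract describes RegORL only at a high level; as a rough illustration of the modular two-stage design it mentions (automata learning followed by an off-the-shelf offline RL algorithm on the induced MDP), here is a minimal Python sketch. All names and components here (TrivialAutomaton, learn_automaton, relabel, offline_q_iteration) are hypothetical stand-ins, not the paper's actual algorithm or API: in particular, the automaton learner and the offline RL solver that give RegORL its guarantees are replaced by trivial substitutes.

```python
from collections import defaultdict
from typing import Hashable, List, Optional, Tuple

# Hypothetical RegORL-style pipeline (illustrative names, not the paper's API).
# Stage 1: learn a finite-state automaton from non-Markov episodes.
# Stage 2: relabel the batch with automaton states and hand the induced
# MDP dataset to any offline RL algorithm.

Step = Tuple[Hashable, Hashable, float]   # (observation, action, reward)
Episode = List[Step]


class TrivialAutomaton:
    """Placeholder: the 'state' is just the most recent observation.
    A real RDP learner would instead merge histories that are
    equivalent with respect to future dynamics and rewards."""

    def state(self, history: Tuple) -> Hashable:
        return history[-1] if history else None


def learn_automaton(episodes: List[Episode]) -> TrivialAutomaton:
    # Stand-in for the automata-learning stage.
    return TrivialAutomaton()


def relabel(episodes: List[Episode], automaton: TrivialAutomaton):
    """Convert episodes into (state, action, reward, next_state) tuples,
    where states are produced by the learned automaton."""
    transitions = []
    for ep in episodes:
        history: Tuple = ()
        pending: Optional[Tuple] = None  # (state, action, reward) awaiting successor
        for obs, act, rew in ep:
            history += (obs,)
            s = automaton.state(history)
            if pending is not None:
                ps, pa, pr = pending
                transitions.append((ps, pa, pr, s))
            pending = (s, act, rew)
            history += (act,)
        if pending is not None:
            ps, pa, pr = pending
            transitions.append((ps, pa, pr, None))  # episode end
    return transitions


def offline_q_iteration(transitions, gamma: float = 0.9, iters: int = 100):
    """Stand-in offline solver: tabular Q-iteration on the empirical MDP
    (no pessimism, hence none of the guarantees of the offline RL
    algorithms RegORL is designed to plug in)."""
    q = defaultdict(float)
    seen_actions = defaultdict(set)
    for s, a, _, _ in transitions:
        seen_actions[s].add(a)
    for _ in range(iters):
        targets = defaultdict(list)
        for s, a, r, s_next in transitions:
            v_next = 0.0 if s_next is None else max(
                (q[(s_next, b)] for b in seen_actions.get(s_next, ())), default=0.0)
            targets[(s, a)].append(r + gamma * v_next)
        for sa, ts in targets.items():  # averaging = empirical expectation
            q[sa] = sum(ts) / len(ts)
    return q


def greedy_policy(q):
    """Extract the greedy policy on automaton states."""
    best = {}
    for (s, a), _ in q.items():
        if s not in best or q[(s, a)] > q[(s, best[s])]:
            best[s] = a
    return best


# Tiny synthetic batch; in a genuine RDP the optimal action may depend on history.
data = [
    [("a", 0, 0.0), ("b", 1, 1.0)],
    [("b", 0, 0.0), ("a", 1, 0.0)],
]
automaton = learn_automaton(data)                      # Stage 1
dataset = relabel(data, automaton)                     # induced-MDP dataset
policy = greedy_policy(offline_q_iteration(dataset))   # Stage 2
print(policy)
```

The point of the real Stage 1 is that, once a correct automaton is learned, the relabelled data is Markov, so any offline RL method with suitable concentrability guarantees can be applied unchanged to the induced MDP; that is what the abstract's modularity claim refers to.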
Original language: English
Title of host publication: Advances in Neural Information Processing Systems 36 (NeurIPS 2023)
Number of pages: 34
Publisher: NeurIPS Proceedings
Publication date: 2023
Publication status: Published - 2023
Event: 37th Conference on Neural Information Processing Systems - NeurIPS 2023 - New Orleans, United States
Duration: 10 Dec 2023 – 16 Dec 2023

Conference

Conference: 37th Conference on Neural Information Processing Systems - NeurIPS 2023
Country: United States
City: New Orleans
Period: 10/12/2023 – 16/12/2023
Series: Advances in Neural Information Processing Systems
Volume: 36
ISSN: 1049-5258
