Saxena, N and Khastagir, S and Kolathaya, S and Bhatnagar, S (2023) Off-Policy Average Reward Actor-Critic with Deterministic Policy Search. In: Proceedings of Machine Learning Research, 23 - 29 July 2023, Honolulu, pp. 30130-30203.
Abstract
The average reward criterion is relatively less studied as most existing works in the Reinforcement Learning literature consider the discounted reward criterion. There are few recent works that present on-policy average reward actor-critic algorithms, but average reward off-policy actor-critic is relatively less explored. In this work, we present both on-policy and off-policy deterministic policy gradient theorems for the average reward performance criterion. Using these theorems, we also present an Average Reward Off-Policy Deep Deterministic Policy Gradient (ARO-DDPG) Algorithm. We first show asymptotic convergence analysis using the ODE-based method. Subsequently, we provide a finite time analysis of the resulting stochastic approximation scheme with linear function approximator and obtain an ϵ-optimal stationary policy with a sample complexity of Ω(ϵ^{-2.5}). We compare the average reward performance of our proposed ARO-DDPG algorithm and observe better empirical performance compared to state-of-the-art on-policy average reward actor-critic algorithms over MuJoCo-based environments. © 2023 Proceedings of Machine Learning Research. All rights reserved.
| Item Type: | Conference Paper |
|---|---|
| Publication: | Proceedings of Machine Learning Research |
| Publisher: | ML Research Press |
| Additional Information: | The copyright for this article belongs to the ML Research Press |
| Keywords: | Approximation theory; Reinforcement learning, Actor critic; Actor-critic algorithm; Average reward; Average reward criteria; Deterministics; Discounted reward; Gradient algorithm; Policy gradient; Policy search; Reinforcement learnings, Stochastic systems |
| Department/Centre: | Division of Electrical Sciences > Computer Science & Automation; Division of Interdisciplinary Sciences > Robert Bosch Centre for Cyber Physical Systems |
| Date Deposited: | 17 Dec 2023 10:03 |
| Last Modified: | 17 Dec 2023 10:03 |
| URI: | https://eprints.iisc.ac.in/id/eprint/83465 |