Schedule Based Temporal Difference Algorithms

Deb, R and Gandhi, M and Bhatnagar, S (2022) Schedule Based Temporal Difference Algorithms. In: 58th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2022, 27 - 30 September 2022, Monticello.

ALLERTON_2022.pdf - Published Version
Official URL: https://doi.org/10.1109/Allerton49937.2022.9929388

Abstract

Learning the value function of a given policy from data samples is an important problem in Reinforcement Learning. TD(λ) is a popular class of algorithms to solve this problem. However, the weights assigned to different n-step returns in TD(λ), controlled by the parameter λ, decrease exponentially with increasing n. In this paper, we present a λ-schedule procedure that generalizes the TD(λ) algorithm to the case when the parameter λ could vary with time-step. This allows flexibility in weight assignment, i.e., the user can specify the weights assigned to different n-step returns by choosing a sequence {λ_t}, t ≥ 1. Based on this procedure, we propose an on-policy algorithm, TD(λ)-schedule, and two off-policy algorithms, GTD(λ)-schedule and TDC(λ)-schedule. We provide proofs of almost sure convergence for all three algorithms under a general Markov noise framework. © 2022 IEEE.
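
The exponential decay mentioned in the abstract can be illustrated with a minimal Python sketch. The first function uses the standard textbook λ-return weighting, in which the n-step return receives weight (1 - λ)λ^(n-1). The second function is an assumption made only for illustration: it shows one natural way a per-step sequence could reshape those weights, and it is not taken from the paper, whose exact construction is not given on this page. The function names are hypothetical.

```python
def lambda_return_weights(lam, n_max):
    """Standard lambda-return weights (1 - lam) * lam**(n - 1) for n = 1..n_max."""
    return [(1.0 - lam) * lam ** (n - 1) for n in range(1, n_max + 1)]


def schedule_weights(lams):
    """Illustrative assumption, not the paper's definition.

    Gives the n-step return the weight (1 - lams[n-1]) * prod(lams[:n-1]),
    which reduces to the standard weights when every entry of lams is equal.
    """
    weights = []
    carry = 1.0  # running product of the earlier schedule entries
    for lam_n in lams:
        weights.append(carry * (1.0 - lam_n))
        carry *= lam_n
    return weights


if __name__ == "__main__":
    # Constant lambda: each weight is lam times the previous one.
    print(lambda_return_weights(0.5, 4))
    # A varying sequence lets the weights deviate from that geometric pattern.
    print(schedule_weights([0.9, 0.5, 0.0]))
```

With a constant parameter the ratio between consecutive weights is always λ, which is the restriction the abstract points to. Under the illustrative scheme, a final schedule entry of 0 places all remaining weight on the last return, so the weights sum to 1.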

Item Type: Conference Paper
Publication: 2022 58th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Additional Information: The copyright for this article belongs to the Author(s).
Keywords: Parameter estimation, Almost sure convergence; Data sample; Lambda's; Markov noise; Reinforcement learnings; Temporal-difference algorithm; Time step; Value functions; Weight assignment, Reinforcement learning
Department/Centre: Division of Electrical Sciences > Computer Science & Automation
Date Deposited: 09 Jan 2023 09:08
Last Modified: 09 Jan 2023 09:08
URI: https://eprints.iisc.ac.in/id/eprint/78937
