
New algorithms of the Q-learning type

Bhatnagar, Shalabh and Babu, Mohan K (2008) New algorithms of the Q-learning type. In: Automatica, 44 (4). pp. 1111-1119.

Official URL: http://www.sciencedirect.com/science?_ob=ArticleUR...


We propose two Q-learning algorithms that use the two-timescale stochastic approximation methodology. The first updates the Q-values of all feasible state–action pairs at each instant, while the second updates the Q-values of states with actions chosen according to the ‘current’ randomized policy updates. We prove convergence of both algorithms. Finally, we present numerical experiments that apply the proposed algorithms to routing in communication networks in a few different settings.
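
For orientation, the idea of updating the Q-values of all feasible state–action pairs at each instant can be illustrated with classical tabular Q-learning run synchronously. The sketch below is not the two-timescale algorithm proposed in the paper: it is the standard single-timescale update, and the function name, the MDP encoding (`P`, `R`), and the default step-size schedule are all assumptions made for illustration only.

```python
import random


def synchronous_q_learning(P, R, gamma, num_iters, step_size=None, seed=0):
    """Classical tabular Q-learning, updating every (state, action) pair per iteration.

    Hypothetical encoding (not from the paper):
      P[s][a] -- list of next-state probabilities for action a in state s
      R[s][a] -- scalar reward for action a in state s
    """
    rng = random.Random(seed)
    if step_size is None:
        # Assumed diminishing schedule a_n = 1 / (n + 1).
        step_size = lambda n: 1.0 / (n + 1)

    n_states = len(P)
    Q = [[0.0 for _ in P[s]] for s in range(n_states)]

    for n in range(num_iters):
        a_n = step_size(n)
        old = [row[:] for row in Q]  # all targets use the previous iterate
        for s in range(n_states):
            for a in range(len(P[s])):
                # Simulate one transition from (s, a).
                s_next = rng.choices(range(n_states), weights=P[s][a])[0]
                target = R[s][a] + gamma * max(old[s_next])
                Q[s][a] = old[s][a] + a_n * (target - old[s][a])
    return Q


if __name__ == "__main__":
    # A made-up two-state, two-action MDP used only to exercise the function.
    P = [[[0.9, 0.1], [0.2, 0.8]],
         [[0.7, 0.3], [0.1, 0.9]]]
    R = [[0.0, 1.0],
         [0.0, 2.0]]
    for row in synchronous_q_learning(P, R, gamma=0.5, num_iters=5000):
        print(row)
```

With deterministic transitions and a constant step size of 1, this update reduces to value iteration on the Q-function, which gives a simple way to sanity-check the code against hand-computed values.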

Item Type: Journal Article
Publication: Automatica
Publisher: Elsevier Science
Additional Information: Copyright of this article belongs to Elsevier Science.
Keywords: Q-learning; Reinforcement learning; Markov decision processes; Two-timescale stochastic approximation; SPSA
Department/Centre: Division of Electrical Sciences > Computer Science & Automation
Date Deposited: 25 Mar 2010 11:19
Last Modified: 21 Feb 2019 11:30
URI: http://eprints.iisc.ac.in/id/eprint/26525
