Borkar, V. S. and Meyn, S. P. (2000) The ODE method for convergence of stochastic approximation and reinforcement learning. SIAM Journal on Control and Optimization, 38(2), pp. 447-469.
Abstract
It is shown here that stability of the stochastic approximation algorithm is implied by the asymptotic stability of the origin for an associated ODE. This in turn implies convergence of the algorithm. Several specific classes of algorithms are considered as applications. It is found that the results provide (i) a simpler derivation of known results for reinforcement learning algorithms; (ii) a proof for the first time that a class of asynchronous stochastic approximation algorithms is convergent without any a priori assumption of stability; (iii) a proof for the first time that asynchronous adaptive critic and Q-learning algorithms are convergent for the average cost optimal control problem.
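The core idea of the ODE method can be illustrated with a minimal sketch (a hypothetical example, not taken from the paper): iterates of the form theta_{n+1} = theta_n + a_n (h(theta_n) + noise), with step sizes a_n satisfying the usual Robbins-Monro conditions, track the associated ODE d(theta)/dt = h(theta); when the origin of that ODE is asymptotically stable, the iterates converge to it. Here h(theta) = -theta is chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def h(theta):
    # Mean vector field; the associated ODE is d(theta)/dt = -theta,
    # whose origin is globally asymptotically stable.
    return -theta

theta = np.array([5.0, -3.0])
for n in range(1, 50001):
    a_n = 1.0 / n                          # sum a_n = inf, sum a_n^2 < inf
    noise = rng.normal(0.0, 1.0, size=2)   # martingale-difference noise
    theta = theta + a_n * (h(theta) + noise)

# The iterates settle near the ODE's stable equilibrium (the origin).
print(np.linalg.norm(theta))
```

With this step-size schedule the noise averages out, and the distance of the final iterate from the origin is small; the paper's contribution is establishing stability (boundedness) of such iterates from the ODE alone, without assuming it a priori.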
Item Type: | Journal Article
---|---
Publication: | SIAM Journal on Control and Optimization
Publisher: | Society for Industrial and Applied Mathematics
Additional Information: | Copyright of this article belongs to the Society for Industrial and Applied Mathematics.
Keywords: | stochastic approximation; ODE method; stability; asynchronous algorithms; reinforcement learning
Department/Centre: | Division of Electrical Sciences > Electrical Communication Engineering
Date Deposited: | 11 Oct 2004
Last Modified: | 19 Sep 2010 04:15
URI: | http://eprints.iisc.ac.in/id/eprint/1737