Actor-critic algorithms for hierarchical Markov decision processes

Bhatnagar, Shalabh and Panigrahi, Ranjan J (2006) Actor-critic algorithms for hierarchical Markov decision processes. In: Automatica, 42 (4). pp. 637-644.


Abstract

We consider the problem of controlling hierarchical Markov decision processes and develop a simulation-based two-timescale actor-critic algorithm in a general framework. We also develop two approximation algorithms that require less computation and satisfy a performance bound. One is a three-timescale actor-critic algorithm; the other is a two-timescale algorithm that operates in two separate stages. All our algorithms recursively update randomized policies using the simultaneous perturbation stochastic approximation (SPSA) methodology. We briefly present the convergence analysis of our algorithms. We then present numerical experiments on a production planning problem in semiconductor fabs, in which we compare the performance of all the algorithms with that of policy iteration. Algorithms that use deterministic perturbations based on certain Hadamard matrices are found to give the best results.
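
To make the SPSA idea in the abstract concrete, the following is a minimal sketch of a two-sided SPSA gradient estimate whose perturbation vectors are taken deterministically from a Hadamard matrix. It is not the paper's actor-critic algorithm: the function names (`hadamard`, `hadamard_perturbations`, `spsa_gradient`, `minimize`), the step sizes `a` and `c`, the Sylvester construction, and the choice of which matrix columns to use are all assumptions made for illustration, and the objective is a plain function of a parameter vector rather than the cost of a randomized policy.

```python
import random


def hadamard(order):
    """Sylvester construction of a Hadamard matrix; order must be a power of two."""
    if order < 1 or order & (order - 1):
        raise ValueError("order must be a power of two")
    H = [[1]]
    while len(H) < order:
        # [[H, H], [H, -H]] doubles the order at each step.
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H


def hadamard_perturbations(dim):
    """Cycle of +/-1 perturbation vectors of length dim (assumed construction).

    Uses the smallest Sylvester matrix with at least dim + 1 columns,
    drops the all-ones first column, and keeps the next dim columns.
    """
    order = 1
    while order < dim + 1:
        order *= 2
    return [row[1:dim + 1] for row in hadamard(order)]


def spsa_gradient(f, theta, delta, c=0.1):
    """Two-sided SPSA estimate: only two evaluations of f, whatever the dimension."""
    plus = [t + c * d for t, d in zip(theta, delta)]
    minus = [t - c * d for t, d in zip(theta, delta)]
    diff = f(plus) - f(minus)
    return [diff / (2 * c * d) for d in delta]


def minimize(f, theta, steps=200, a=0.05, c=0.1, deterministic=True, seed=0):
    """Gradient descent driven by SPSA estimates (illustrative, constant step sizes)."""
    rng = random.Random(seed)
    cycle = hadamard_perturbations(len(theta))
    theta = list(theta)
    for n in range(steps):
        if deterministic:
            delta = cycle[n % len(cycle)]  # walk through the Hadamard rows in order
        else:
            delta = [rng.choice((-1, 1)) for _ in theta]  # random +/-1 perturbations
        grad = spsa_gradient(f, theta, delta, c)
        theta = [t - a * g for t, g in zip(theta, grad)]
    return theta


if __name__ == "__main__":
    target = [1.0, -2.0]
    cost = lambda x: sum((xi - ti) ** 2 for xi, ti in zip(x, target))
    print(minimize(cost, [0.0, 0.0]))
```

In this sketch the deterministic variant cycles through a short, fixed list of perturbation vectors, whereas the random variant draws each component independently as +1 or -1. Both need only two function evaluations per update regardless of the parameter dimension, which is the property that makes SPSA attractive for simulation-based settings.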

Item Type:	Journal Article
Publication:	Automatica
Publisher:	Elsevier
Additional Information:	Copyright of this article belongs to Elsevier.
Keywords:	Hierarchical decision making; Learning algorithms; Markov decision processes; Stochastic approximation; Optimal control
Department/Centre:	Division of Electrical Sciences > Computer Science & Automation
Date Deposited:	20 Apr 2006
Last Modified:	19 Sep 2010 04:25
URI:	http://eprints.iisc.ac.in/id/eprint/6304
