Prashanth, LA and Prasad, HL and Bhatnagar, Shalabh and Chandra, Prakash (2016) A constrained optimization perspective on actor-critic algorithms and application to network routing. In: SYSTEMS & CONTROL LETTERS, 92. pp. 46-51.
Official URL: http://dx.doi.org/10.1016/j.sysconle.2016.02.020
Abstract
We propose a novel actor-critic algorithm with guaranteed convergence to an optimal policy for a discounted reward Markov decision process. The actor incorporates a descent direction that is motivated by the solution of a certain non-linear optimization problem. We also discuss an extension to incorporate function approximation and demonstrate the practicality of our algorithms on a network routing application. © 2016 Elsevier B.V. All rights reserved.
Item Type: | Journal Article |
---|---|
Publication: | SYSTEMS & CONTROL LETTERS |
Publisher: | ELSEVIER SCIENCE BV |
Additional Information: | Copyright for this article belongs to ELSEVIER SCIENCE BV, PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS |
Keywords: | Actor-critic algorithm; Reinforcement learning; Constrained optimization |
Department/Centre: | Division of Electrical Sciences > Computer Science & Automation |
Date Deposited: | 08 Jul 2016 06:43 |
Last Modified: | 06 Nov 2018 07:39 |
URI: | http://eprints.iisc.ac.in/id/eprint/54151 |