Joseph, Ajin George and Bhatnagar, Shalabh (2017) A Model based Search Method for Prediction in Model-free Markov Decision Process. In: International Joint Conference on Neural Networks (IJCNN), May 14-19, 2017, Anchorage, AK, pp. 170-177.
Abstract
In this paper, we provide a new algorithm for the problem of prediction in the model-free MDP setting, i.e., estimating the value function of a given policy using the linear function approximation architecture, with memory and computation costs scaling quadratically in the size of the feature set. The algorithm is a multi-timescale variant of the popular cross-entropy (CE) method, a model-based search method for finding the global optimum of a real-valued function. This is the first time a model-based search method has been used for the prediction problem. A proof of convergence using the ODE method is provided. The theoretical results are supplemented with experimental comparisons. The algorithm achieves good performance fairly consistently on many benchmark problems.
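For readers unfamiliar with the cross-entropy method mentioned in the abstract, the following is a minimal sketch of the generic, single-timescale CE search over a Gaussian sampling distribution. It is not the multi-timescale algorithm from the paper, and the function names, the toy objective, and all parameter values (`n_samples`, `elite_frac`, `n_iters`) are assumptions chosen for illustration only.

```python
import numpy as np

def cross_entropy_search(objective, dim, n_samples=200, elite_frac=0.1,
                         n_iters=50, seed=0):
    """Generic CE method: maximize `objective` over R^dim (illustrative only)."""
    rng = np.random.default_rng(seed)
    mean = np.zeros(dim)          # mean of the Gaussian sampling model
    cov = 25.0 * np.eye(dim)      # broad initial covariance (assumed value)
    n_elite = max(2, int(n_samples * elite_frac))
    for _ in range(n_iters):
        # 1. Sample candidate solutions from the current model.
        samples = rng.multivariate_normal(mean, cov, size=n_samples)
        # 2. Score each candidate and keep the best-performing (elite) ones.
        scores = np.array([objective(x) for x in samples])
        elite = samples[np.argsort(scores)[-n_elite:]]
        # 3. Refit the model to the elite set; the small jitter keeps the
        #    covariance matrix positive definite as the search contracts.
        mean = elite.mean(axis=0)
        cov = np.cov(elite, rowvar=False) + 1e-6 * np.eye(dim)
    return mean

def toy_objective(x):
    # Hypothetical objective with a unique maximum at (1, -2).
    target = np.array([1.0, -2.0])
    return -np.sum((x - target) ** 2)

best = cross_entropy_search(toy_objective, dim=2)
print(best)
```

The sampling model here stores a full covariance matrix, so its memory footprint grows quadratically with the dimension of the search space. In the prediction setting described in the abstract, the search variable would be the weight vector of the linear value function approximation and the objective an error measure estimated from samples; those details are specific to the paper and are not reproduced here.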
Item Type: | Conference Proceedings |
---|---|
Series: | IEEE International Joint Conference on Neural Networks (IJCNN) |
Publisher: | IEEE, 345 E 47TH ST, NEW YORK, NY 10017 USA |
Additional Information: | Copyright for the article belongs to IEEE, 345 E 47TH ST, NEW YORK, NY 10017 USA |
Department/Centre: | Division of Interdisciplinary Sciences > Supercomputer Education & Research Centre |
Date Deposited: | 13 Apr 2018 19:56 |
Last Modified: | 23 Oct 2018 14:48 |
URI: | http://eprints.iisc.ac.in/id/eprint/59552 |