Feature Search in the Grassmanian in Online Reinforcement Learning

Bhatnagar, Shalabh and Borkar, V. S. and Prabuchandran, K. J. (2013) Feature Search in the Grassmanian in Online Reinforcement Learning. In: IEEE Journal of Selected Topics in Signal Processing, 7 (5). pp. 746-758.

Official URL: http://dx.doi.org/10.1109/JSTSP.2013.2255022

Abstract

We consider the problem of finding the best features for value function approximation in reinforcement learning and develop an online algorithm to optimize the mean square Bellman error objective. For any given feature value, our algorithm performs gradient search in the parameter space via a residual gradient scheme and, on a slower timescale, also performs gradient search in the Grassman manifold of features. We present a proof of convergence of our algorithm. We show empirical results using our algorithm as well as a similar algorithm that uses temporal difference learning in place of the residual gradient scheme for the faster timescale updates.
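The two-timescale structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm. It assumes a small, randomly generated Markov chain under a fixed policy, a tabular feature matrix `Phi` with orthonormal columns standing in for a point on the Grassmann manifold, a QR-based retraction, and hand-picked step sizes. The name `run_sketch` and all of its parameters are hypothetical. The single-sample residual gradient used here is also a biased estimate of the mean square Bellman error gradient when transitions are stochastic.

```python
import numpy as np

def run_sketch(n_states=6, k=2, gamma=0.9, steps=5000, seed=0):
    """Illustrative two-timescale feature search (assumptions throughout)."""
    rng = np.random.default_rng(seed)

    # Hypothetical Markov chain under a fixed policy: random P and rewards.
    P = rng.random((n_states, n_states))
    P /= P.sum(axis=1, keepdims=True)
    r = rng.random(n_states)

    # Feature matrix with orthonormal columns; row s is the feature of state s.
    Phi, _ = np.linalg.qr(rng.standard_normal((n_states, k)))
    w = np.zeros(k)
    s = 0

    for t in range(1, steps + 1):
        a = 0.5 / t ** 0.6   # faster timescale: weight step size
        b = 0.05 / t ** 0.9  # slower timescale: feature step size, b/a -> 0

        s_next = rng.choice(n_states, p=P[s])

        # Sampled Bellman residual for the current features and weights.
        delta = r[s] + gamma * Phi[s_next] @ w - Phi[s] @ w

        # Gradient of 0.5 * delta^2 with respect to w (residual gradient).
        grad_w = delta * (gamma * Phi[s_next] - Phi[s])

        # Gradient of 0.5 * delta^2 with respect to Phi; only the rows of
        # the visited states s and s_next are non-zero.
        G = np.zeros_like(Phi)
        G[s_next] += gamma * delta * w
        G[s] -= delta * w

        # Fast update of the weights.
        w = w - a * grad_w

        # Slow update of the features: project the gradient onto the tangent
        # space at Phi, take a step, and retract back with a QR factorization.
        G_tan = G - Phi @ (Phi.T @ G)
        Q, R = np.linalg.qr(Phi - b * G_tan)
        signs = np.sign(np.diag(R))
        signs[signs == 0] = 1.0
        Phi = Q * signs  # fix column signs so the basis varies smoothly

        s = s_next

    return Phi, w

Phi, w = run_sketch()
print(Phi.shape, w.shape)
```

Replacing `grad_w` with the temporal difference direction `-delta * Phi[s]` would give a sketch of the variant the abstract mentions for the faster timescale, again under the same assumptions.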

Item Type:	Journal Article
Publication:	IEEE Journal of Selected Topics in Signal Processing
Publisher:	IEEE (Institute of Electrical and Electronics Engineers)
Additional Information:	Copyright for this article belongs to IEEE Xplore
Keywords:	Feature adaptation; Grassman manifold; online learning; residual gradient scheme; stochastic approximation; temporal difference learning
Department/Centre:	Division of Electrical Sciences > Computer Science & Automation
Date Deposited:	25 Oct 2013 06:58
Last Modified:	27 Feb 2019 10:19
URI:	http://eprints.iisc.ac.in/id/eprint/47567
