Maity, Raj Kumar and Lakshminarayanan, Chandrashekar and Padakandla, Sindhu and Bhatnagar, Shalabh (2016) Shaping Proto-Value Functions Using Rewards. In: 22nd European Conference on Artificial Intelligence (ECAI), 29 August - 2 September 2016, The Hague, Netherlands, pp. 1690-1691.
PDF: Eur_Con_Art_Int_285_1690_2016.pdf - Published Version (restricted to registered users)
Abstract
In reinforcement learning (RL), an important sub-problem is learning the value function, and the quality of the learned value function is chiefly influenced by the architecture used to represent it. The value function is often expressed as a linear combination of a pre-selected set of basis functions. These basis functions are either selected in an ad-hoc manner or are tailored to the RL task using domain knowledge. Selecting basis functions in an ad-hoc manner does not yield a good approximation of the value function, while choosing them using domain knowledge introduces dependence on the task. Thus, a desirable scenario is to have a method of choosing basis functions that is task independent but still provides a good approximation of the value function. In this paper, we propose a novel task-independent basis function construction method that uses the topology of the underlying state space together with the reward structure to build reward-based Proto-Value Functions (RPVFs). The proposed approach yields a good approximation of the value function and enhanced learning performance, as demonstrated via experiments on grid-world tasks.
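The abstract does not spell out the RPVF construction. As a rough illustration: standard proto-value functions (Mahadevan and Maggioni) are the smoothest eigenvectors of the graph Laplacian built from the state-space topology, and one plausible way to let rewards shape them is to weight graph edges by the rewards of their endpoint states. The minimal sketch below follows that reading; the function names, the `beta` parameter, and the reward-weighting scheme are illustrative assumptions, not the paper's exact method.

```python
# A minimal sketch of proto-value function (PVF) construction on a grid world,
# using eigenvectors of the graph Laplacian as basis functions. The
# reward-augmented edge weights are an illustrative assumption only; the
# paper's exact RPVF construction may differ.
import numpy as np

def grid_laplacian(n_rows, n_cols, reward=None, beta=0.0):
    """Combinatorial Laplacian of an n_rows x n_cols grid world.

    reward : optional vector of per-state rewards (length n_rows * n_cols).
    beta   : assumed shaping strength; edges incident on high-|reward| states
             get larger weights (illustrative, not the paper's exact scheme).
    """
    n = n_rows * n_cols
    W = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            s = r * n_cols + c
            for dr, dc in ((1, 0), (0, 1)):  # down and right neighbours
                rr, cc = r + dr, c + dc
                if rr < n_rows and cc < n_cols:
                    t = rr * n_cols + cc
                    w = 1.0
                    if reward is not None:
                        w += beta * (abs(reward[s]) + abs(reward[t]))
                    W[s, t] = W[t, s] = w
    D = np.diag(W.sum(axis=1))
    return D - W

def proto_value_functions(L, k):
    """Return the k smoothest Laplacian eigenvectors as basis functions."""
    eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    return eigvecs[:, :k]                 # columns are the basis functions

# Usage: 10x10 grid, reward of +1 at the goal state, 5 basis functions.
rewards = np.zeros(100)
rewards[99] = 1.0
L = grid_laplacian(10, 10, reward=rewards, beta=1.0)
Phi = proto_value_functions(L, k=5)
# The value function is then approximated linearly as V ~ Phi @ theta,
# with theta learned by a standard algorithm such as TD(0) or LSTD.
```

With `beta=0.0` this reduces to plain, purely topology-based PVFs; a positive `beta` biases the basis toward resolving the value function near rewarding states, which is the effect the paper's reward-based shaping aims at.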
| Field | Value |
|---|---|
| Item Type | Conference Proceedings |
| Series | Frontiers in Artificial Intelligence and Applications |
| Additional Information | Copyright for this article belongs to IOS Press, Nieuwe Hemweg 6B, 1013 BG Amsterdam, Netherlands |
| Department/Centre | Division of Electrical Sciences > Computer Science & Automation |
| Date Deposited | 03 Dec 2016 09:46 |
| Last Modified | 03 Dec 2016 09:46 |
| URI | http://eprints.iisc.ac.in/id/eprint/55371 |