Optimal Recommendation to Users that React: Online Learning for a Class of POMDPs

Meshram, Rahul and Gopalan, Aditya and Manjunath, D (2017) Optimal Recommendation to Users that React: Online Learning for a Class of POMDPs. In: 55th IEEE Conference on Decision and Control (CDC), DEC 12-14, 2016, Las Vegas, NV, pp. 7210-7215.

Abstract

We describe and study a model for an Automated Online Recommendation System (AORS) in which a user's preferences can be time-dependent and can also depend on the history of past recommendations and play-outs. The three key features of the model that make it more realistic compared to existing models for recommendation systems are (1) user preference is inherently latent, (2) current recommendations can affect future preferences, and (3) it allows for the development of learning algorithms with provable performance guarantees. The problem is cast as an average-cost restless multi-armed bandit for a given user, with an independent partially observable Markov decision process (POMDP) for each item of content. We analyze the POMDP for a single arm, describe its structural properties, and characterize its optimal policy. We then develop a Thompson sampling-based online reinforcement learning algorithm to learn the parameters of the model and optimize utility from the binary responses of the users to continuous recommendations. We then analyze the performance of the learning algorithm and characterize the regret. Illustrative numerical results and directions for extension to the restless hidden Markov multi-armed bandit problem are also presented.
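
For readers unfamiliar with Thompson sampling from binary responses, the following is a minimal sketch of the textbook Beta-Bernoulli variant. It is not the algorithm of the paper: it assumes stationary, independent items with fixed response probabilities, whereas the paper's model has latent, recommendation-dependent preferences. The function name `thompson_sampling` and its parameters `true_probs`, `horizon`, and `seed` are hypothetical and chosen only for illustration.

```python
import random


def thompson_sampling(true_probs, horizon, seed=0):
    """Textbook Beta-Bernoulli Thompson sampling (illustrative assumption,
    not the POMDP-based algorithm described in the abstract)."""
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [1] * n_arms  # alpha parameters of a Beta(1, 1) prior
    failures = [1] * n_arms   # beta parameters of a Beta(1, 1) prior
    pulls = [0] * n_arms
    total_reward = 0
    for _ in range(horizon):
        # Draw one plausible response probability per item from its posterior.
        samples = [rng.betavariate(successes[a], failures[a])
                   for a in range(n_arms)]
        # Recommend the item whose sampled probability is largest.
        arm = max(range(n_arms), key=lambda a: samples[a])
        # Simulate the user's binary response (1 = positive, 0 = negative).
        reward = 1 if rng.random() < true_probs[arm] else 0
        # Update the posterior of the recommended item only.
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
        total_reward += reward
    return pulls, total_reward


pulls, total_reward = thompson_sampling([0.1, 0.9], horizon=2000)
```

In this simplified setting the recommendation counts in `pulls` concentrate on the item with the higher response probability as evidence accumulates. Handling a hidden state that changes in response to recommendations, as in the paper, would require a belief update in place of the simple Beta posterior used here.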

Item Type: Conference Proceedings
Additional Information: 55th IEEE Conference on Decision and Control (CDC), Las Vegas, NV, DEC 12-14, 2016
Department/Centre: Division of Electrical Sciences > Electrical Engineering
Depositing User: Id for Latest eprints
Date Deposited: 10 Jun 2017 04:41
Last Modified: 10 Jun 2017 04:41
URI: http://eprints.iisc.ac.in/id/eprint/57205
