A Restless Bandit With No Observable States for Recommendation Systems and Communication Link Scheduling

Meshram, Rahul and Manjunath, D and Gopalan, Aditya (2015) A Restless Bandit With No Observable States for Recommendation Systems and Communication Link Scheduling. In: 54th IEEE Conference on Decision and Control (CDC), DEC 15-18, 2015, Osaka, JAPAN, pp. 7820-7825.

Official URL: http://dx.doi.org/10.1109/CDC.2015.7403456

Abstract

A restless bandit is used to model a user's interest in a topic or item. The interest evolves as a Markov chain whose transition probabilities depend on the action (display the ad or desist) taken in each time step. A unit reward is obtained if the ad is displayed and the user clicks on it. If no ad is displayed, a fixed reward is assumed. The probability of click-through is determined by the state of the Markov chain. The recommender never observes the state, but in each time step it holds a belief, denoted by pi(t), about the state of the Markov chain. pi(t) evolves as a function of the action and the signal from each state. For the one-armed restless bandit with two states, we characterize the policy that maximizes the infinite-horizon discounted reward. We first characterize the value function as a function of the system parameters and then characterize the optimal policies for different ranges of the parameters. We show that the Gilbert-Elliott channel, in which the two states have different success probabilities, is a special case. For one special case, we argue that the optimal policy is of the threshold type with a single threshold; extensive numerical results indicate that this may be true in general.
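
As an illustration of how a belief such as pi(t) can evolve with the action and the observed signal, the following is a minimal sketch of a standard Bayes filter for a two-state hidden Markov chain. It is not taken from the paper: the convention that pi is the probability of state 0, the names update_belief and threshold_policy, the parameters rho (per-state click probabilities) and P (per-action transition matrices), the assumption that the signal is observed before the state transitions, and the direction of the threshold rule are all assumptions made for this example.

```python
def update_belief(pi, action, click, rho, P):
    """One step of a Bayes filter for a two-state hidden chain (illustrative).

    pi     : current belief that the hidden state is 0 (assumed convention)
    action : 1 = display the ad, 0 = desist
    click  : True/False when the ad was displayed; ignored when action == 0
    rho    : (rho0, rho1), assumed click probabilities in states 0 and 1
    P      : P[action][s][s2], assumed action-dependent transition probabilities
    """
    if action == 1:
        # Correct the belief using the observed signal (click or no click).
        like0 = rho[0] if click else 1.0 - rho[0]
        like1 = rho[1] if click else 1.0 - rho[1]
        norm = pi * like0 + (1.0 - pi) * like1
        if norm == 0.0:
            raise ValueError("observation has zero probability under this belief")
        post0 = pi * like0 / norm
    else:
        # No ad shown, so no signal: the belief is only propagated.
        post0 = pi

    # Propagate through the transition matrix of the chosen action.
    T = P[action]
    return post0 * T[0][0] + (1.0 - post0) * T[1][0]


def threshold_policy(pi, threshold):
    """Single-threshold rule on the belief (the direction is an assumption)."""
    return 1 if pi <= threshold else 0


# Hypothetical parameters, chosen only to exercise the functions.
RHO = (0.2, 0.8)
P = {
    0: [[0.6, 0.4], [0.1, 0.9]],  # transitions when desisting
    1: [[0.9, 0.1], [0.3, 0.7]],  # transitions when displaying
}

if __name__ == "__main__":
    pi = 0.5
    for t in range(5):
        a = threshold_policy(pi, threshold=0.5)
        # For the demo, pretend every displayed ad is clicked.
        pi = update_belief(pi, a, click=True, rho=RHO, P=P)
        print(t, a, round(pi, 4))
```

In this sketch a click shifts the belief toward the state with the higher assumed click probability before the transition is applied, while desisting yields no signal and only propagates the belief through the chain.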

Item Type:	Conference Proceedings
Additional Information:	Copyright for this article belongs to the IEEE, 345 E 47TH ST, NEW YORK, NY 10017 USA
Department/Centre:	Division of Electrical Sciences > Electrical Communication Engineering
Date Deposited:	30 Dec 2016 07:04
Last Modified:	30 Dec 2016 07:04
URI:	http://eprints.iisc.ac.in/id/eprint/55628
