MECHANISMS WITH LEARNING FOR STOCHASTIC MULTI-ARMED BANDIT PROBLEMS

Jain, Shweta and Bhat, Satyanath and Ghalme, Ganesh and Padmanabhan, Divya and Narahari, Y (2016) MECHANISMS WITH LEARNING FOR STOCHASTIC MULTI-ARMED BANDIT PROBLEMS. In: INDIAN JOURNAL OF PURE & APPLIED MATHEMATICS, 47 (2). pp. 229-272.

Ind_Jou_Pur_App_Mat_47-2_229_2016.pdf - Published Version
Restricted to Registered users only

Official URL: http://dx.doi.org/10.1007/s13226-016-0186-3

Abstract

The multi-armed bandit (MAB) problem is widely studied in the machine learning literature in the context of online learning. In this article, our focus is on a specific class of problems, namely stochastic MAB problems, where the rewards are stochastic. In particular, we emphasize stochastic MAB problems with strategic agents. Dealing with strategic agents warrants the use of mechanism design principles in conjunction with online learning, and leads to non-trivial technical challenges. We first provide three motivating problems arising from Internet advertising, crowdsourcing, and smart grids. Next, we provide an overview of stochastic MAB problems and key associated learning algorithms, including upper confidence bound (UCB) based algorithms. We provide proofs of important results related to the regret analysis of these learning algorithms. Following this, we present mechanism design for stochastic MAB problems. With the classic example of sponsored search auctions as a backdrop, we bring out key insights into important issues such as regret lower bounds, exploration-separated mechanisms, the design of truthful mechanisms, UCB-based mechanisms, and extensions to multiple-pull MAB problems. Finally, we provide a bird's eye view of recent results in the area and present a few issues that require immediate attention in future work.

Item Type: Journal Article
Publication: INDIAN JOURNAL OF PURE & APPLIED MATHEMATICS
Additional Information: Copyright for this article belongs to the INDIAN NAT SCI ACAD, BAHADUR SHAH ZAFAR MARG, NEW DELHI 110002, INDIA
Department/Centre: Division of Electrical Sciences > Computer Science & Automation
Date Deposited: 19 Aug 2016 05:48
Last Modified: 19 Aug 2016 05:48
URI: http://eprints.iisc.ac.in/id/eprint/54376
