MULTI-ARMED BANDITS BASED ON A VARIANT OF SIMULATED ANNEALING

Abdulla, Mohammed Shahid and Bhatnagar, Shalabh (2016) MULTI-ARMED BANDITS BASED ON A VARIANT OF SIMULATED ANNEALING. In: INDIAN JOURNAL OF PURE & APPLIED MATHEMATICS, 47 (2). pp. 195-212.

Official URL: http://dx.doi.org/10.1007/s13226-016-0184-5

Abstract

A variant of Simulated Annealing termed Simulated Annealing with Multiplicative Weights (SAMW) has been proposed in the literature. However, its convergence depended on a parameter beta(T), which was calculated a priori from the total number of iterations T for which the algorithm would run. We first show the convergence of SAMW even when a diminishing stepsize beta(k) -> 1 is used, where k is the iteration index. Using this SAMW as a kernel, a stochastic multi-armed bandit (SMAB) algorithm called SOFTMIX can be improved to obtain the minimum-possible log T regret, as compared to the log^2 T regret of the original. Another modification of SOFTMIX is proposed which avoids the need for a parameter that depends on the reward distribution of the arms. Further, a variant of SOFTMIX that uses a comparison term drawn from another popular SMAB algorithm, UCB1, is described. It is also shown why the proposed scheme is computationally more efficient than UCB1, and an alternative to this algorithm with simpler stepsizes is also proposed. Numerical simulations for all the proposed algorithms are then presented.
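
For readers unfamiliar with the UCB1 baseline named in the abstract, the following is a minimal Python sketch of the standard UCB1 rule: play each arm once, then pull the arm with the largest value of empirical mean plus sqrt(2 ln t / n_i). It is not the SAMW or SOFTMIX variant proposed in the article, whose stepsizes and comparison term are not given in this record. The function names, the Bernoulli reward model, and the arm probabilities are assumptions made for illustration only.

```python
import math
import random


def ucb1_index(mean, pulls, t):
    """Standard UCB1 index: empirical mean plus an exploration bonus."""
    return mean + math.sqrt(2.0 * math.log(t) / pulls)


def run_ucb1(arm_probs, horizon, seed=0):
    """Run UCB1 on Bernoulli arms (an assumed reward model) and
    return the number of times each arm was pulled."""
    rng = random.Random(seed)
    n_arms = len(arm_probs)
    pulls = [0] * n_arms   # n_i: pulls of arm i so far
    sums = [0.0] * n_arms  # cumulative reward of arm i
    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Initialisation: play every arm once.
            arm = t - 1
        else:
            # An index is recomputed for every arm in every round.
            arm = max(
                range(n_arms),
                key=lambda i: ucb1_index(sums[i] / pulls[i], pulls[i], t),
            )
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        pulls[arm] += 1
        sums[arm] += reward
    return pulls


if __name__ == "__main__":
    # Hypothetical two-armed instance with success probabilities 0.2 and 0.8.
    print(run_ucb1([0.2, 0.8], horizon=2000))
```

In this sketch the selection step evaluates the index of every arm in each round, so the per-round cost grows linearly with the number of arms.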

Item Type: Journal Article
Publication: INDIAN JOURNAL OF PURE & APPLIED MATHEMATICS
Publisher: INDIAN NAT SCI ACAD, BAHADUR SHAH ZAFAR MARG, NEW DELHI 110002, INDIA
Additional Information: Copyright for this article belongs to the INDIAN NAT SCI ACAD, BAHADUR SHAH ZAFAR MARG, NEW DELHI 110002, INDIA
Department/Centre: Division of Electrical Sciences > Computer Science & Automation
Date Deposited: 19 Aug 2016 05:46
Last Modified: 19 Aug 2016 05:46
URI: http://eprints.iisc.ac.in/id/eprint/54374
