
Model-Based Best Arm Identification for Decreasing Bandits

Takemori, S and Umeda, Y and Gopalan, A (2024) Model-Based Best Arm Identification for Decreasing Bandits. In: 27th International Conference on Artificial Intelligence and Statistics, AISTATS 2024, 2 May 2024 through 4 May 2024, Valencia, pp. 1567-1575.

Abstract

We study the problem of reliably identifying the best (lowest loss) arm in a stochastic multi-armed bandit when the expected loss of each arm is monotone decreasing as a function of its pull count. This models, for instance, scenarios where each arm itself represents an optimization algorithm for finding the minimizer of a common function, and there is a limited time available to test the algorithms before committing to one of them. We assume that the decreasing expected loss of each arm depends on the number of its pulls as a (inverse) polynomial with unknown coefficients. We propose two fixed-budget best arm identification algorithms, one for the case of sparse polynomial decay models and the other for general polynomial models, along with bounds on the identification error probability. We also derive algorithm-independent lower bounds on the error probability. These bounds are seen to be factored into the product of the usual problem complexity and the model complexity that only depends on the parameters of the model. This indicates that our methods can identify the best arm even when the budget is smaller. We conduct empirical studies of our algorithms to complement our theoretical findings. Copyright 2024 by the author(s).
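
The setting described in the abstract can be illustrated with a small simulation. The sketch below is not either of the paper's algorithms; it is a minimal illustration under stated assumptions: each arm's expected loss after n pulls is taken to be a + b / n (one simple instance of an inverse polynomial), the budget is split uniformly across arms, the coefficients are fitted by ordinary least squares, and the "best" arm is assumed to be the one with the lowest predicted loss at a chosen horizon. All function names and parameters (expected_loss, fit_two_term, identify_best_arm, horizon, noise_sd) are hypothetical.

```python
import random


def expected_loss(coeffs, n):
    """Assumed inverse-polynomial model: sum over k of coeffs[k] / n**k."""
    return sum(c / n**k for k, c in enumerate(coeffs))


def fit_two_term(pulls, losses):
    """Ordinary least-squares fit of loss ~ a + b / n; returns (a, b).

    Needs at least two distinct pull counts so the slope is defined.
    """
    xs = [1.0 / n for n in pulls]
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(losses) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, losses))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def identify_best_arm(arm_coeffs, budget, horizon, noise_sd=0.05, seed=0):
    """Uniform allocation, per-arm model fit, then pick the lowest predicted loss.

    This is an illustrative baseline, not the method proposed in the paper.
    """
    rng = random.Random(seed)
    per_arm = budget // len(arm_coeffs)  # assumed uniform split of the budget
    predictions = []
    for coeffs in arm_coeffs:
        pulls = list(range(1, per_arm + 1))
        # Simulated noisy losses: the n-th pull of an arm has mean expected_loss(coeffs, n).
        losses = [expected_loss(coeffs, n) + rng.gauss(0.0, noise_sd) for n in pulls]
        a, b = fit_two_term(pulls, losses)
        predictions.append(a + b / horizon)  # extrapolate the fitted curve
    return min(range(len(predictions)), key=predictions.__getitem__)


# Arm 0 looks better for the first few pulls, but arm 1 decays to a lower loss.
arms = [(0.5, 1.0), (0.2, 2.0)]
best = identify_best_arm(arms, budget=40, horizon=1000)
```

In this toy instance, comparing raw losses after only a few pulls would favour arm 0, whereas extrapolating the fitted curves should favour arm 1. That is the intuition behind using a model of the decay rather than the observed losses alone.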

Item Type: Conference Paper
Publication: Proceedings of Machine Learning Research
Publisher: ML Research Press
Additional Information: The copyright for this article belongs to ML Research Press.
Keywords: Artificial intelligence; Budget control; Inverse problems; Stochastic systems; Error probabilities; Expected loss; Fixed budget; Identification algorithms; Low-loss; Model-based OPC; Multiarmed bandits (MABs); Optimization algorithms; Stochastics; Unknown coefficients; Polynomials
Department/Centre: Division of Electrical Sciences > Electrical Communication Engineering
Date Deposited: 13 Aug 2024 06:25
Last Modified: 13 Aug 2024 06:25
URI: http://eprints.iisc.ac.in/id/eprint/85280
