Risk-Averse Combinatorial Semi-Bandits

Ayyagari, RS and Dukkipati, A (2023) Risk-Averse Combinatorial Semi-Bandits. In: IEEE International Symposium on Information Theory - Proceedings, 25 - 30 June 2023, Taipei, Taiwan, pp. 1472-1477.

Official URL: http://doi.org/10.1109/ISIT54713.2023.10206452

Abstract

In many practical sequential decision-making scenarios, we often face the problem of choosing a set of options rather than just one option. While sequential decision-making problems have been studied under a multi-armed bandit setting, much of the related literature deals with the simplest case where the agent chooses a single arm at each time step. The variant of the problem where the agent's task is to choose a set of arms is called a combinatorial multi-armed bandit. The main aim of this paper is to study risk-aware algorithms for these problems. We consider such a problem with stochastic rewards and semi-bandit feedback and propose algorithms that maximize the Conditional Value-at-Risk (CVaR), a risk measure that takes into account the worst-case rewards achieved by the agent, for the two cases of Gaussian and bounded arm rewards. We further analyze these algorithms and provide regret bounds. We believe that our results provide the first theoretical insights into combinatorial semi-bandit problems in the risk-aware case. Numerical experiments corroborate our theoretical findings. © 2023 IEEE.
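
As a rough illustration of the risk measure named in the abstract, the sketch below computes an empirical CVaR from a list of observed rewards. It assumes one common convention, namely that CVaR at level alpha is the mean of the worst (lowest) alpha-fraction of rewards; the function name `empirical_cvar` and this particular estimator are illustrative assumptions and are not taken from the paper, which may define and estimate CVaR differently.

```python
import math


def empirical_cvar(rewards, alpha):
    """Mean of the lowest ceil(alpha * n) rewards.

    Assumed convention (not taken from the paper): rewards are
    "higher is better", so the risky tail is the lower tail.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    if not rewards:
        raise ValueError("rewards must be non-empty")
    k = math.ceil(alpha * len(rewards))  # number of samples in the tail
    worst = sorted(rewards)[:k]          # the k lowest rewards
    return sum(worst) / k


# Hypothetical reward samples, for illustration only.
samples = [1.0, 2.0, 3.0, 4.0]
tail_mean = empirical_cvar(samples, 0.5)  # averages the two lowest rewards
full_mean = empirical_cvar(samples, 1.0)  # alpha = 1 recovers the plain mean
```

With alpha equal to 1 this estimator reduces to the ordinary sample mean, and smaller values of alpha put more weight on the worst outcomes, which is the sense in which a CVaR-maximizing agent is risk-averse.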

Item Type: Conference Paper
Publication: IEEE International Symposium on Information Theory - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Additional Information: The copyright for this conference proceeding belongs to Institute of Electrical and Electronics Engineers Inc.
Keywords: Decision making; Risk assessment; Value engineering; Bandit feedbacks; Conditional Value-at-Risk; Decision-making problem; Multiarmed bandits (MABs); Risk averse; Risk aware; Sequential decision making; Simple++; Stochastics; Time step; Stochastic systems
Department/Centre: Division of Electrical Sciences > Computer Science & Automation
Date Deposited: 24 Nov 2023 09:34
Last Modified: 24 Nov 2023 09:34
URI: https://eprints.iisc.ac.in/id/eprint/83273
