A quality assuring, cost optimal multi-armed bandit mechanism for expertsourcing

Jain, Shweta and Gujar, Sujit and Bhat, Satyanath and Zoeter, Onno and Narahari, Y (2017) A quality assuring, cost optimal multi-armed bandit mechanism for expertsourcing. In: ARTIFICIAL INTELLIGENCE, 254 . pp. 44-63.

PDF
ART_INT_254_44_2017.pdf - Published Version
Restricted to Registered users only
Download (1MB) | Request a copy

Official URL: http://dx.doi.org/10.1016/j.artint.2017.10.001

Abstract

There are numerous situations when a service requester wishes to expertsource a series of identical but non-trivial tasks from a pool of experts so as to achieve an assured accuracy level for each task, in a cost optimal way. The experts available are typically heterogeneous with unknown but fixed qualities and different service costs. The service costs are usually private to the experts and the experts could be strategic about their costs. The problem is to select for each task an optimal subset of experts so that the outcome obtained after aggregating the opinions from the selected experts guarantees a target level of accuracy. The problem is a challenging one even in a non-strategic setting since the accuracy of an aggregated outcome depends on unknown qualities. We develop a novel multi-armed bandit (MAB) mechanism for solving this problem. First, we propose a framework, Assured Accuracy Bandit (MB) framework, which leads to a MAB algorithm, Constrained Confidence Bound for Non-Strategic Setting (CCB-NS). We derive an upper bound on the number of time steps this algorithm chooses a sub-optimal set, which depends on the target accuracy and true qualities. A more challenging situation arises when the requester not only has to learn the qualities of the experts but has to elicit their true service costs as well. We modify the CCB-NS algorithm to obtain an adaptive exploration separated algorithm Constrained Confidence Bound for Strategic Setting (CCB-S). The CCB-S algorithm produces an ex-post monotone allocation rule that can then be transformed into an ex post incentive compatible and ex-post individually rational mechanism. This mechanism learns the qualities of the experts and guarantees a given target accuracy level in a cost optimal way. We also provide a lower bound on the number of times any algorithm must select a sub-optimal set and we see that the lower bound matches our upper bound up to a constant factor. We provide insights on a practical implementation of this framework through an illustrative example and demonstrate the efficacy of our algorithms through simulations. (C) 2017 Elsevier B.V. All rights reserved.

Item Type:	Journal Article
Publication:	ARTIFICIAL INTELLIGENCE
Publisher:	10.1016/j.artint.2017.10.001
Additional Information:	Copy right for this article belongs to the ELSEVIER SCIENCE BV, PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS
Department/Centre:	Division of Electrical Sciences > Computer Science & Automation
Date Deposited:	13 Jan 2018 07:40
Last Modified:	23 Oct 2018 14:50
URI:	http://eprints.iisc.ac.in/id/eprint/58568

Actions (login required)

View Item


	Powered by EPrints		A service from The J.R.D. Tata Memorial Library Indian Institute of Science, Bengaluru-560012, India