ACTIVETHIEF: Model extraction using active learning and unannotated public data

Pal, S and Gupta, Y and Shukla, A and Kanade, A and Shevade, S and Ganapathy, V (2020) ACTIVETHIEF: Model extraction using active learning and unannotated public data. In: AAAI 2020 - 34th AAAI Conference on Artificial Intelligence, 7 - 12 February 2020, New York, pp. 865-872.

Official URL: https://doi.org/10.1609/aaai.v34i01.5432

Abstract

Machine learning models are increasingly being deployed in practice. Machine Learning as a Service (MLaaS) providers expose such models to queries by third-party developers through application programming interfaces (APIs). Prior work has developed model extraction attacks, in which an attacker extracts an approximation of an MLaaS model by making black-box queries to it. We design ACTIVETHIEF – a model extraction framework for deep neural networks that makes use of active learning techniques and unannotated public datasets to perform model extraction. It does not expect strong domain knowledge or access to annotated data on the part of the attacker. We demonstrate that (1) it is possible to use ACTIVETHIEF to extract deep classifiers trained on a variety of datasets from image and text domains, while querying the model with as few as 10-30% of samples from public datasets, (2) the resulting model exhibits a higher transferability success rate of adversarial examples than prior work, and (3) the attack evades detection by the state-of-the-art model extraction detection method, PRADA.
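
The idea in the abstract, an attacker labelling a small, actively chosen subset of an unannotated pool through black-box queries, can be illustrated with a toy sketch. This is not the ACTIVETHIEF implementation: the abstract does not say which active learning strategies are used, so uncertainty sampling is assumed here, and the linear "victim" model, the logistic-regression substitute, the function names, and all parameter values are hypothetical choices made only for illustration.

```python
import numpy as np

def victim_predict(x, w_secret):
    """Hypothetical black-box API: returns only the predicted label (0 or 1)."""
    return (x @ w_secret > 0).astype(int)

def train_substitute(x, y, steps=500, lr=0.5):
    """Fit a logistic-regression substitute with plain gradient descent."""
    w = np.zeros(x.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w)))
        w -= lr * x.T @ (p - y) / len(y)
    return w

def extract(pool, w_secret, seed_size=20, batch=20, rounds=9, seed=0):
    """Query the victim on an actively selected subset of the unlabeled pool."""
    rng = np.random.default_rng(seed)
    # Start from a small random seed set labelled by the victim.
    queried = rng.choice(len(pool), size=seed_size, replace=False).tolist()
    labels = victim_predict(pool[queried], w_secret).tolist()
    w = train_substitute(pool[queried], np.array(labels))
    for _ in range(rounds):
        remaining = np.setdiff1d(np.arange(len(pool)), queried)
        p = 1.0 / (1.0 + np.exp(-(pool[remaining] @ w)))
        # Uncertainty sampling (assumed strategy): pick the points whose
        # predicted probability is closest to 0.5 under the substitute.
        pick = remaining[np.argsort(np.abs(p - 0.5))[:batch]]
        queried.extend(pick.tolist())
        labels.extend(victim_predict(pool[pick], w_secret).tolist())
        w = train_substitute(pool[queried], np.array(labels))
    return w, len(queried)

# Synthetic stand-ins for the public pool and the secret model.
rng = np.random.default_rng(1)
pool = rng.normal(size=(1000, 5))
w_secret = rng.normal(size=5)

w_sub, n_queries = extract(pool, w_secret)

# Agreement: fraction of fresh inputs on which substitute and victim match.
test_x = rng.normal(size=(2000, 5))
agreement = np.mean(
    victim_predict(test_x, w_secret) == (test_x @ w_sub > 0).astype(int)
)
print(n_queries, float(agreement))
```

With these made-up settings the sketch queries 200 of the 1000 pool samples, which sits inside the 10-30% range quoted in the abstract; the agreement it prints applies only to this toy setting and says nothing about the paper's results.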

Item Type:	Conference Paper
Publication:	AAAI 2020 - 34th AAAI Conference on Artificial Intelligence
Publisher:	AAAI Press
Additional Information:	The copyright for this article belongs to AAAI Press.
Keywords:	Application programming interfaces (API); Classification (of information); Deep learning; Deep neural networks; Extraction; Active Learning; Detection methods; Developed model; Domain knowledge; Machine learning models; Model extraction; State of the art; Third parties; Learning systems
Department/Centre:	Division of Electrical Sciences > Computer Science & Automation
Date Deposited:	07 Feb 2023 06:19
Last Modified:	07 Feb 2023 06:19
URI:	https://eprints.iisc.ac.in/id/eprint/79989
