ePrints@IISc

Zero-shot knowledge distillation in deep networks

Nayak, GK and Mopuri, KR and Shaj, V and Venkatesh Babu, R and Chakraborty, A (2019) Zero-shot knowledge distillation in deep networks. In: 36th International Conference on Machine Learning, ICML 2019, 9-15 June 2019, Long Beach, United States, pp. 8317-8325.

PDF: icml_8317-8325_2019.pdf - Published Version (1MB; restricted to registered users)
Official URL: http://proceedings.mlr.press/v97/nayak19a/nayak19a...

Abstract

Knowledge distillation deals with the problem of training a smaller model (Student) from a high-capacity source model (Teacher) so as to retain most of its performance. Existing approaches use either the training data or meta-data extracted from it in order to train the Student. However, accessing the dataset on which the Teacher has been trained may not always be feasible if the dataset is very large or if it poses privacy or safety concerns (e.g., biometric or medical data). Hence, in this paper, we propose a novel data-free method to train the Student from the Teacher. Without using any meta-data, we synthesize Data Impressions from the complex Teacher model and utilize them as surrogates for the original training data samples to transfer its learning to the Student via knowledge distillation. We therefore dub our method "Zero-Shot Knowledge Distillation" and demonstrate on multiple benchmark datasets that our framework achieves generalization performance competitive with that of distillation using the actual training data samples.
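For illustration, the following PyTorch-style sketch outlines the two stages described in the abstract: synthesizing Data Impressions from a trained Teacher and then distilling them into a Student. This is a minimal sketch, not the authors' implementation; the function names, hyperparameter values, and the source of the target softmax vectors (which the paper obtains by sampling from a Dirichlet fitted to the Teacher's class-similarity structure) are assumptions made for this example.

import torch
import torch.nn.functional as F

def synthesize_data_impressions(teacher, target_probs, input_shape,
                                steps=1500, lr=0.01, temperature=20.0):
    # Optimise random inputs so the Teacher's softened softmax output matches
    # the given target probability vectors (one Data Impression per row).
    # `target_probs` is assumed to be a (batch, num_classes) tensor of softmax
    # targets; how they are sampled is taken from the paper, not shown here.
    teacher.eval()
    x = torch.randn(target_probs.size(0), *input_shape, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)  # only the inputs are optimised
    for _ in range(steps):
        opt.zero_grad()
        log_p = F.log_softmax(teacher(x) / temperature, dim=1)
        loss = -(target_probs * log_p).sum(dim=1).mean()  # soft cross-entropy
        loss.backward()
        opt.step()
    return x.detach()

def distill_on_impressions(student, teacher, impressions, optimizer,
                           temperature=20.0):
    # One knowledge-distillation update on the synthesized Data Impressions:
    # match the Student's softened outputs to the Teacher's via KL divergence.
    teacher.eval()
    with torch.no_grad():
        t_logits = teacher(impressions)
    s_logits = student(impressions)
    loss = F.kl_div(F.log_softmax(s_logits / temperature, dim=1),
                    F.softmax(t_logits / temperature, dim=1),
                    reduction="batchmean") * temperature ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

The key property, as stated in the abstract, is that neither stage touches the original training data: the Teacher alone supplies both the synthetic inputs and the soft targets used to train the Student.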

Item Type: Conference Paper
Publication: 36th International Conference on Machine Learning, ICML 2019
Publisher: International Machine Learning Society (IMLS)
Additional Information: Conference of the 36th International Conference on Machine Learning, ICML 2019; Conference Date: 9-15 June 2019; Conference Code: 156104
Keywords: Benchmarking; Distillation; Large dataset; Machine learning; Metadata; Students; Benchmark datasets; Generalization performance; High capacity; Medical data; Safety concerns; Source modeling; Teacher models; Training data; Personnel training
Department/Centre: Division of Interdisciplinary Sciences > Computational and Data Sciences
Date Deposited: 04 Feb 2020 10:06
Last Modified: 04 Feb 2020 10:06
URI: http://eprints.iisc.ac.in/id/eprint/64495
