Panahi, A and Rahbar, A and Bhattacharyya, C and Dubhashi, D and Haghir Chehreghani, M (2022) Analysis of Knowledge Transfer in Kernel Regime. In: 31st ACM International Conference on Information and Knowledge Management, CIKM 2022, 17 - 21 October 2022, Atlanta, pp. 1615-1624.
Abstract
Knowledge transfer has been shown to be a very successful technique for training neural classifiers: together with the ground truth data, it uses the "privileged information" (PI) obtained by a "teacher" network to train a "student" network. It has been observed that classifiers learn much faster and more reliably via knowledge transfer. However, there has been little or no theoretical analysis of this phenomenon. To bridge this gap, we propose to approach the problem of knowledge transfer by regularizing the fit between the teacher and the student with PI provided by the teacher. Using tools from dynamical systems theory, we show that when the student is an extremely wide two-layer network, we can analyze it in the kernel regime and show that it is able to interpolate between PI and the given data. This characterization sheds new light on the relation between the training error and the capacity of the student relative to the teacher. Another contribution of the paper is a quantitative statement on the convergence of the student network. We prove that the teacher reduces the number of iterations required for a student to learn, and consequently improves the generalization power of the student. We give a corresponding experimental analysis that validates the theoretical results and yields additional insights.
Item Type: | Conference Paper |
---|---|
Publication: | International Conference on Information and Knowledge Management, Proceedings |
Publisher: | Association for Computing Machinery |
Additional Information: | The copyright for this article belongs to Association for Computing Machinery. |
Keywords: | Classification (of information); Dynamical systems; Knowledge management; Network layers; Personnel training, Dynamical system theory; Ground truth data; Kernel regime; Knowledge transfer; Learn+; Neural classifiers; Student network; Teachers'; Training errors; Two-layer network, Students |
Department/Centre: | Division of Electrical Sciences > Computer Science & Automation |
Date Deposited: | 29 Nov 2022 09:27 |
Last Modified: | 29 Nov 2022 09:27 |
URI: | https://eprints.iisc.ac.in/id/eprint/77888 |