Patel, D and Sastry, PS (2021) Memorization in Deep Neural Networks: Does the Loss Function Matter? In: 25th Pacific-Asia Conference on Knowledge Discovery and Data Mining, 11-14 May 2021, pp. 131-142.
Full text not available from this repository.

Abstract
Deep neural networks, often owing to overparameterization, have been shown to be capable of exactly memorizing even randomly labelled data. Empirical studies have also shown that none of the standard regularization techniques mitigate such overfitting. We investigate whether the choice of loss function can affect this memorization. We show empirically, on the benchmark data sets MNIST and CIFAR-10, that a symmetric loss function, as opposed to either cross entropy or squared error loss, results in a significant improvement in the ability of the network to resist such overfitting. We then provide a formal definition of robustness to memorization and a theoretical explanation of why symmetric losses provide this robustness. Our results clearly bring out the role that loss functions alone can play in this phenomenon of memorization. © 2021, Springer Nature Switzerland AG.
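As a rough illustration of the contrast the abstract draws, the sketch below compares standard cross entropy with mean absolute error (MAE), a loss commonly cited as satisfying the symmetry condition that the losses over all class labels sum to a constant. The paper's exact loss, architecture, and training setup are not given in this record, so the PyTorch framing and function names here are assumptions, not the authors' implementation.

```python
# Minimal sketch (not the paper's exact setup): a bounded, symmetric loss (MAE)
# versus standard cross entropy, for a random-label memorization comparison.
import torch
import torch.nn.functional as F

def cross_entropy_loss(logits, targets):
    # Standard cross entropy: per-sample loss is unbounded, so a network can
    # keep reducing it by fitting even randomly assigned labels.
    return F.cross_entropy(logits, targets)

def mae_loss(logits, targets):
    # MAE between the softmax output and the one-hot target; it is bounded and
    # symmetric, which limits how much any single (possibly mislabelled)
    # example can dominate the gradient.
    probs = F.softmax(logits, dim=1)
    one_hot = F.one_hot(targets, num_classes=probs.size(1)).float()
    return (probs - one_hot).abs().sum(dim=1).mean()

# Hypothetical usage: train the same network twice on a randomly labelled copy
# of the training set, once with each loss, and compare training accuracy over
# epochs; a symmetric loss is expected to resist memorizing the random labels
# far longer than cross entropy.
```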
| Item Type: | Conference Paper |
|---|---|
| Publication: | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
| Publisher: | Springer Science and Business Media Deutschland GmbH |
| Additional Information: | The copyright for this article belongs to Springer Science and Business Media Deutschland GmbH. |
| Keywords: | Data mining; Deep neural networks; Benchmark data; Empirical studies; Formal definition; Loss functions; Overparameterization; Regularization technique; Squared error loss; Symmetric loss function; Neural networks |
| Department/Centre: | Division of Physical & Mathematical Sciences > Centre for High Energy Physics |
| Date Deposited: | 29 Nov 2021 11:18 |
| Last Modified: | 29 Nov 2021 11:18 |
| URI: | http://eprints.iisc.ac.in/id/eprint/70009 |