Patel, D and Sastry, PS (2021) Memorization in Deep Neural Networks: Does the Loss Function Matter? In: 25th Pacific-Asia Conference on Knowledge Discovery and Data Mining, 11-14 May 2021, pp. 131-142.
Full text not available from this repository.

Abstract
Deep neural networks, often owing to overparameterization, have been shown to be capable of exactly memorizing even randomly labelled data. Empirical studies have also shown that none of the standard regularization techniques mitigate such overfitting. We investigate whether the choice of loss function can affect this memorization. We show empirically, on the benchmark data sets MNIST and CIFAR-10, that a symmetric loss function, as opposed to either cross entropy or squared error loss, results in a significant improvement in the ability of the network to resist such overfitting. We then provide a formal definition of robustness to memorization and a theoretical explanation of why symmetric losses provide this robustness. Our results clearly bring out the role that loss functions alone can play in this phenomenon of memorization. © 2021, Springer Nature Switzerland AG.
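As a rough illustration of the contrast the abstract draws, the sketch below compares standard cross entropy with mean absolute error (MAE), a loss commonly cited as satisfying the symmetry condition that the losses over all class labels sum to a constant. The paper's exact loss, architecture, and training setup are not given in this record, so the PyTorch framing and function names here are assumptions, not the authors' implementation.

```python
# Minimal sketch (not the paper's exact setup): a bounded, symmetric loss (MAE)
# versus standard cross entropy, for a random-label memorization comparison.
import torch
import torch.nn.functional as F

def cross_entropy_loss(logits, targets):
    # Standard cross entropy: per-sample loss is unbounded, so a network can
    # keep reducing it by fitting even randomly assigned labels.
    return F.cross_entropy(logits, targets)

def mae_loss(logits, targets):
    # MAE between the softmax output and the one-hot target; it is bounded and
    # symmetric, which limits how much any single (possibly mislabelled)
    # example can dominate the gradient.
    probs = F.softmax(logits, dim=1)
    one_hot = F.one_hot(targets, num_classes=probs.size(1)).float()
    return (probs - one_hot).abs().sum(dim=1).mean()

# Hypothetical usage: train the same network twice on a randomly labelled copy
# of the training set, once with each loss, and compare training accuracy over
# epochs; a symmetric loss is expected to resist memorizing the random labels
# far longer than cross entropy.
```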
| Item Type: | Conference Paper |
|---|---|
| Publication: | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
| Publisher: | Springer Science and Business Media Deutschland GmbH |
| Additional Information: | The copyright for this article belongs to Springer Science and Business Media Deutschland GmbH. |
| Keywords: | Data mining; Deep neural networks; Benchmark data; Empirical studies; Formal definition; Loss functions; Overparameterization; Regularization technique; Squared error loss; Symmetric loss function; Neural networks |
| Department/Centre: | Division of Physical & Mathematical Sciences > Centre for High Energy Physics |
| Date Deposited: | 29 Nov 2021 11:18 |
| Last Modified: | 29 Nov 2021 11:18 |
| URI: | http://eprints.iisc.ac.in/id/eprint/70009 |