Rishi, J and Gafoor, A and Kumar, S and Subramani, D (2024) On the Training Efficiency of Shallow Architectures for Physics Informed Neural Networks. In: 24th International Conference on Computational Science, ICCS 2024, 2 July 2024 through 4 July 2024, Malaga, pp. 363-377.
Abstract
Physics-informed Neural Networks (PINNs), a class of neural models trained by minimizing a combination of the residual of the governing partial differential equation and the initial and boundary data, have gained immense popularity in the natural and engineering sciences. Despite their observed empirical success, the training efficiency of residual-driven PINNs at different architecture depths is poorly documented. Neural models used for machine learning tasks such as computer vision and natural language processing usually have deep architectures, that is, a larger number of hidden layers. In PINNs, we show that for a given trainable parameter count (model size), a shallow network (fewer layers) converges faster than a deep network (more layers) for the same error characteristics. To illustrate this, we examine the one-dimensional Poisson's equation and evaluate the gradient of the residual and boundary loss terms. We show that the characteristics of the gradient of the loss function are such that, for the residual loss, shallow architectures converge faster. Empirically, we demonstrate the implications of our theory through various experiments. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
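A minimal sketch, assuming PyTorch, of the residual-plus-boundary training setup the abstract describes, applied to a shallow (single hidden layer) network on a one-dimensional Poisson problem. The manufactured source term f(x) = -π² sin(πx), the hidden width of 64, and the Adam hyperparameters are illustrative assumptions, not the paper's settings:

```python
import torch

# Manufactured problem: u''(x) = f(x) on (0, 1) with u(0) = u(1) = 0.
# With f(x) = -pi^2 sin(pi x), the exact solution is u(x) = sin(pi x).
def f(x):
    return -torch.pi**2 * torch.sin(torch.pi * x)

# Shallow PINN: one hidden layer; width is a stand-in for a fixed parameter budget.
model = torch.nn.Sequential(
    torch.nn.Linear(1, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1)
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x_bc = torch.tensor([[0.0], [1.0]])  # boundary points

for step in range(5000):
    opt.zero_grad()
    # Residual loss on random collocation points; u'' is obtained by
    # differentiating through the network twice with autograd.
    x = torch.rand(128, 1, requires_grad=True)
    u = model(x)
    du = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    d2u = torch.autograd.grad(du, x, torch.ones_like(du), create_graph=True)[0]
    loss_res = ((d2u - f(x)) ** 2).mean()
    # Boundary loss enforcing u(0) = u(1) = 0.
    loss_bc = (model(x_bc) ** 2).mean()
    (loss_res + loss_bc).backward()
    opt.step()
```

A deeper comparison network for the same experiment would stack more, narrower hidden layers so that the total trainable parameter count stays matched.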
Item Type: Conference Paper
Publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer Science and Business Media Deutschland GmbH
Additional Information: The copyright for this article belongs to the publishers.
Keywords: Efficiency; Learning algorithms; Learning systems; Natural language processing systems; Network architecture; Network layers; Poisson equation; Boundary data; Deep learning; Engineering science; Learning tasks; Machine learning; Natural languages; Neural modelling; Neural networks; Physics-informed neural network; Training efficiency
Department/Centre: Division of Interdisciplinary Sciences > Computational and Data Sciences
Date Deposited: 19 Oct 2024 04:50
Last Modified: 19 Oct 2024 04:50
URI: http://eprints.iisc.ac.in/id/eprint/86415