ePrints@IISc

Characterizing the Performance of Accelerated Jetson Edge Devices for Training Deep Learning Models

Prashanthi, SK and Kesanapalli, SA and Simmhan, Y (2022) Characterizing the Performance of Accelerated Jetson Edge Devices for Training Deep Learning Models. In: Proceedings of the ACM on Measurement and Analysis of Computing Systems, 6 (3).

PDF (Published Version): ori_ACM_mea_6-3_2022.pdf (1MB)
Official URL: https://doi.org/10.1145/3570604

Abstract

Deep Neural Networks (DNNs) have had a significant impact on domains like autonomous vehicles and smart cities through low-latency inferencing on edge computing devices close to the data source. However, DNN training on the edge is poorly explored. Techniques like federated learning and the growing capacity of GPU-accelerated edge devices like NVIDIA Jetson motivate the need for a holistic characterization of DNN training on the edge. Training DNNs is resource-intensive and can stress an edge device's GPU, CPU, memory and storage capacities. Edge devices also have different resources compared to workstations and servers, such as slower shared memory and diverse storage media. Here, we perform a principled study of DNN training on individual devices of three contemporary Jetson device types: AGX Xavier, Xavier NX and Nano, for three diverse DNN model-dataset combinations. We vary device and training parameters such as I/O pipelining and parallelism, storage media, mini-batch sizes and power modes, and examine their effect on CPU and GPU utilization, fetch stalls, training time, energy usage, and variability. Our analysis exposes several resource inter-dependencies and counter-intuitive insights, while also helping quantify conventional wisdom. Our rigorous study can help tune training performance on the edge, trade off time and energy usage on constrained devices, and even select ideal edge hardware for a DNN workload, and, in the future, extend to federated learning as well. As an illustration, we use these results to build a simple model to predict the training time and energy per epoch for any given DNN across different power modes, with minimal additional profiling.
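To make the methodology concrete, below is a minimal sketch, not the authors' actual harness, of the kind of parameter sweep the abstract describes: varying the mini-batch size and DataLoader worker parallelism while timing each epoch. The model, synthetic dataset, and all hyperparameter values are illustrative placeholders; Jetson power modes are assumed to be set externally (e.g., with `sudo nvpmodel -m <mode>`) before each run, since they cannot be switched from inside a training script.

```python
# Illustrative sweep over mini-batch size and I/O parallelism (DataLoader
# workers), timing one training epoch per configuration. Placeholder model
# and synthetic data stand in for the paper's real model-dataset pairs.
import itertools
import time

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

device = "cuda" if torch.cuda.is_available() else "cpu"

# Synthetic stand-in for an image classification dataset.
data = TensorDataset(torch.randn(4096, 3, 32, 32),
                     torch.randint(0, 10, (4096,)))
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).to(device)
opt = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

for batch_size, workers in itertools.product([16, 64, 256], [0, 2, 4]):
    # pin_memory enables pipelined host-to-GPU copies when CUDA is available.
    loader = DataLoader(data, batch_size=batch_size, num_workers=workers,
                        pin_memory=(device == "cuda"), shuffle=True)
    start = time.time()
    for x, y in loader:  # one epoch
        x, y = x.to(device), y.to(device)
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    print(f"batch={batch_size} workers={workers} "
          f"epoch_time={time.time() - start:.2f}s")
```

Per-epoch times from such a sweep, paired with power readings sampled externally (e.g., via the Jetson `tegrastats` utility), are the kind of measurements on which a simple per-epoch time and energy prediction model, like the one the abstract mentions, could be fit.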

Item Type: Journal Article
Publication: Proceedings of the ACM on Measurement and Analysis of Computing Systems
Publisher: Association for Computing Machinery
Additional Information: The copyright for this article belongs to the authors.
Keywords: Digital storage; Economic and social effects; Learning systems; DNN training; Edge accelerator; Energy usage; Learning models; Neural network training; Performance; Performance characterization; Power modes; Storage medium; Training time; Deep neural networks
Department/Centre: Division of Interdisciplinary Sciences > Computational and Data Sciences
Date Deposited: 07 Feb 2023 06:56
Last Modified: 07 Feb 2023 06:56
URI: https://eprints.iisc.ac.in/id/eprint/80120
