
Qespera: an adaptive framework for prediction of queue waiting times in supercomputer systems

Murali, Prakash and Vadhiyar, Sathish (2016) Qespera: an adaptive framework for prediction of queue waiting times in supercomputer systems. In: Concurrency and Computation: Practice and Experience, 28 (9). pp. 2685-2710.

Con_Com_28-9_2685_2016.pdf - Published Version
Restricted to Registered users only

Official URL: http://dx.doi.org/10.1002/cpe.3735

Abstract

Production parallel systems are space-shared, and resource allocation on such systems is usually performed using a batch queue scheduler. Jobs submitted to the batch queue experience a variable delay before the requested resources are granted. Predicting this delay can assist users in planning experiment time-frames and choosing sites with lower turnaround times, and can also help meta-schedulers make scheduling decisions. In this paper, we present an integrated adaptive framework, Qespera, for prediction of queue waiting times on parallel systems. We propose a novel algorithm based on spatial clustering for predictions using the history of job submissions and executions. The framework uses an adaptive set of strategies for choosing either distributions or summaries of features to represent the system state and to compare with history jobs, varying the weights associated with the features for each job prediction, and selecting a particular algorithm dynamically for performing the prediction depending on the characteristics of the target and history jobs. Our experiments with real workload traces from different production systems demonstrate up to a 22% reduction in average absolute error and up to a 56% reduction in percentage prediction error over existing techniques. We also report prediction errors of less than 1 h for a majority of the jobs. Copyright (c) 2015 John Wiley & Sons, Ltd.
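
To make the idea of history-based prediction with per-feature weights concrete, the following is a minimal sketch only. It is not the spatial clustering algorithm proposed in the paper, whose details the abstract does not give. It substitutes a plain weighted nearest-neighbour lookup over past jobs. The feature tuple (requested processors, requested walltime in hours, jobs ahead in the queue), the function names, and the sample history are all hypothetical, chosen purely for illustration.

```python
from math import sqrt


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two equal-length feature tuples."""
    return sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def predict_wait(target, history, weights, k=3):
    """Predict a waiting time as the mean wait of the k most similar history jobs.

    history is a list of (features, observed_wait_seconds) pairs.
    This is an illustrative stand-in, not the method from the paper.
    """
    if not history:
        raise ValueError("history must contain at least one job")
    ranked = sorted(history, key=lambda job: weighted_distance(target, job[0], weights))
    nearest = ranked[:k]
    return sum(wait for _, wait in nearest) / len(nearest)


def average_absolute_error(predicted, actual):
    """Mean of |predicted - actual| over paired predictions and observations."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical history: (processors, walltime hours, jobs ahead) -> wait in seconds.
history = [
    ((16, 1.0, 2), 600.0),
    ((16, 2.0, 3), 900.0),
    ((256, 12.0, 20), 14400.0),
    ((32, 1.0, 2), 1200.0),
]
weights = (1.0, 1.0, 1.0)  # assumed equal weights; an adaptive scheme would vary these per job

estimate = predict_wait((16, 1.5, 2), history, weights, k=2)
error = average_absolute_error([estimate], [600.0])
```

In this sketch the features are left unnormalised, so the processor count dominates the distance. A practical version would scale each feature or adjust the weights, which is the kind of per-job adaptation the abstract alludes to.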

Item Type: Journal Article
Additional Information: Copyright for this article belongs to WILEY-BLACKWELL, 111 RIVER ST, HOBOKEN 07030-5774, NJ USA
Keywords: queue waiting time; batch queue; scheduling in supercomputers
Department/Centre: Division of Interdisciplinary Research > Supercomputer Education & Research Centre
Depositing User: Id for Latest eprints
Date Deposited: 24 Aug 2016 10:35
Last Modified: 24 Aug 2016 10:35
URI: http://eprints.iisc.ac.in/id/eprint/54410
