A checkpointing algorithm for an SCI based distributed shared memory system

Kalaiselvi, S and Rajaraman, V (1999) A checkpointing algorithm for an SCI based distributed shared memory system. In: Microprocessors and Microsystems, 22 (9). pp. 515-522.

Official URL: http://www.sciencedirect.com/science?_ob=MImg&_ima...

Abstract

Distributed Shared Memory (DSM) systems combine the ease of programming of shared memory parallel computers with the scalability of message passing multicomputers. IEEE has proposed an interface standard, known as the SCI standard, for constructing DSM systems. As the number of processors in a parallel computer increases, it becomes imperative to build in fault tolerance. This article presents an algorithm for checkpointing and rollback recovery of an SCI based DSM system using the provisions of the standard. It is shown that this checkpointing and rollback recovery procedure judiciously combines the features of both shared memory and message passing distributed memory systems. © 1999 Elsevier Science B.V. All rights reserved.

Item Type: Journal Article
Publication: Microprocessors and Microsystems
Publisher: Elsevier Science B.V.
Additional Information: Copyright of this article belongs to Elsevier Science B.V.
Keywords: Fault tolerant computing; Checkpointing; Rollback recovery; SCI standard
Department/Centre: Division of Interdisciplinary Sciences > Supercomputer Education & Research Centre
Date Deposited: 02 Jan 2009 16:48
Last Modified: 19 Sep 2010 04:58
URI: http://eprints.iisc.ac.in/id/eprint/17722
