
Compressing XML Documents Using Recursive Finite State Automata

Subramanian, Hariharan and Shankar, Priti (2006) Compressing XML Documents Using Recursive Finite State Automata. In: Lecture Notes in Computer Science, 3845, pp. 282-293.



We propose a scheme for automatically generating compressors for XML documents from Document Type Definition (DTD) specifications. Our algorithm is lossless and adaptive: the model used for compression and decompression is generated automatically from the DTD and is used in conjunction with an arithmetic compressor to produce a compressed version of the document. The structure of the model mirrors the syntactic specification of the document. Our compression scheme is on-line, that is, it can compress the document as it is being read. We have implemented the compressor generator and report the results of experiments on some large XML databases whose DTDs are specified. We note that the average compression is better than that of XMLPPM, the only other on-line tool we are aware of. The tool is also able to compress massive documents on which XMLPPM failed because it ran out of memory. We believe the main appeal of this technique is that the underlying model is so simple and yet so effective.
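
The idea of a DTD-derived adaptive model can be illustrated with a small sketch. This is not the authors' implementation: the toy DTD, the hand-built automaton, the `END` marker, and the function name `code_length_bits` are all assumptions made for illustration. The sketch handles a single element's content model only (no recursion into child elements), and it computes the ideal code length an arithmetic coder would approach rather than producing an actual bitstream.

```python
import math

# Hypothetical toy DTD (an assumption, not taken from the paper):
#   <!ELEMENT book (title, author+, chapter*)>
# Hand-built automaton for that content model: state -> {symbol: next_state}.
# "END" is an illustrative marker for the closing tag of <book>.
BOOK_AUTOMATON = {
    0: {"title": 1},
    1: {"author": 2},
    2: {"author": 2, "chapter": 3, "END": 4},
    3: {"chapter": 3, "END": 4},
}


def code_length_bits(automaton, symbols, start=0):
    """Return the ideal adaptive code length, in bits, of a child sequence.

    Each state keeps its own frequency table over the symbols that are
    legal in that state. A symbol costs -log2(p) bits under the current
    table, and the table is updated after every symbol, so a decoder that
    performs the same updates stays in sync with the encoder.
    """
    # Every legal symbol starts with count 1, so nothing has probability 0.
    counts = {s: {sym: 1 for sym in edges} for s, edges in automaton.items()}
    state, bits = start, 0.0
    for sym in symbols:
        table = counts.get(state, {})
        if sym not in table:
            raise ValueError(f"{sym!r} is not allowed in state {state}")
        total = sum(table.values())
        bits += -math.log2(table[sym] / total)  # cost under the current model
        table[sym] += 1                         # adaptive update
        state = automaton[state][sym]           # follow the DTD-derived edge
    return bits


doc = ["title", "author", "author", "chapter", "chapter", "chapter", "END"]
bits = code_length_bits(BOOK_AUTOMATON, doc)
print(f"{bits:.3f} bits for {len(doc)} structural symbols")
```

In this sketch, states with a single outgoing edge (such as the mandatory `title`) cost zero bits, because the DTD already determines what comes next; only genuine choices consume code space. Processing one symbol at a time is also what makes an on-line scheme possible.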

Item Type: Journal Article
Publication: Lecture Notes in Computer Science
Publisher: Springer
Additional Information: Copyright of this article belongs to Springer.
Department/Centre: Division of Electrical Sciences > Computer Science & Automation
Date Deposited: 16 May 2006
Last Modified: 19 Sep 2010 04:26
URI: http://eprints.iisc.ac.in/id/eprint/6588
