Turbo-charging vertical mining of large databases

Shenoy, Pradeep and Haritsa, Jayant R and Sudarshan, S and Bhalotia, Gaurav and Bawa, Mayank and Shah, Devavart (2000) Turbo-charging vertical mining of large databases. In: International Conference on Management of Data, MAY 16-18, 2000, Dallas, Texas, pp. 22-33.

Official URL: http://delivery.acm.org/10.1145/340000/335376/p22-...

Abstract

In a vertical representation of a market-basket database, each item is associated with a column of values representing the transactions in which it is present. The association-rule mining algorithms that have been recently proposed for this representation show performance improvements over their classical horizontal counterparts, but are either efficient only for certain database sizes, or assume particular characteristics of the database contents, or are applicable only to specific kinds of database schemas. We present here a new vertical mining algorithm called VIPER, which is general-purpose, making no special requirements of the underlying database. VIPER stores data in compressed bit-vectors called "snakes" and integrates a number of novel optimizations for efficient snake generation, intersection, counting and storage. We analyze the performance of VIPER for a range of synthetic database workloads. Our experimental results indicate significant performance gains, especially for large databases, over previously proposed vertical and horizontal mining algorithms. In fact, there are even workload regions where VIPER outperforms an optimal, but practically infeasible, horizontal mining algorithm.
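
To make the vertical representation concrete, the following is a minimal sketch of the general idea only, not of VIPER itself. It assumes a tiny in-memory database and uses plain Python integers as uncompressed bit-vectors, one per item, with bit t set when transaction t contains the item. The names `to_vertical` and `support` and the sample transactions are hypothetical, and the snake compression and other optimizations mentioned in the abstract are not reproduced here.

```python
from collections import defaultdict


def to_vertical(transactions):
    """Build one bit-vector per item; bit `tid` is set if that transaction has the item.

    Illustrative only: the vectors are uncompressed Python ints, not VIPER snakes.
    """
    columns = defaultdict(int)
    for tid, items in enumerate(transactions):
        for item in items:
            columns[item] |= 1 << tid
    return dict(columns)


def support(columns, itemset):
    """Count the transactions containing every item, by ANDing the item columns."""
    items = list(itemset)
    if not items:
        raise ValueError("itemset must be non-empty")
    acc = columns.get(items[0], 0)
    for item in items[1:]:
        acc &= columns.get(item, 0)
    # The number of set bits is the number of matching transactions.
    return bin(acc).count("1")


# Hypothetical four-transaction market-basket database.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "butter", "bread"},
    {"milk"},
]
columns = to_vertical(transactions)
pair_support = support(columns, ["bread", "milk"])
```

In this sketch, counting the support of an itemset reduces to a bitwise AND followed by a population count, which is the basic operation that vertical mining approaches build on. How the vectors are compressed, generated, and stored is where an algorithm such as the one described above would differ.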

Item Type: Conference Paper
Publication: ACM SIGMOD Record
Publisher: ACM
Additional Information: Copyright of this article belongs to ACM.
Department/Centre: Division of Interdisciplinary Sciences > Supercomputer Education & Research Centre
Date Deposited: 20 Nov 2009 06:48
Last Modified: 19 Sep 2010 05:01
URI: http://eprints.iisc.ac.in/id/eprint/18308
