
Multi-dimensional semantic clustering of large databases for association rule mining

Ananthanarayanan, VS and Murty, Narasimha M and Subramanian, DK (2001) Multi-dimensional semantic clustering of large databases for association rule mining. In: Pattern Recognition, 34 (4). pp. 939-941.

Official URL: http://www.sciencedirect.com/science?_ob=ArticleUR...


Clustering is an activity of finding abstractions from data [1]. These abstractions are mainly used for (i) identifying outliers and (ii) other decision-making activities. In this paper, we propose a novel application, association rule mining (ARM), based on abstractions/cluster descriptions obtained using clustering. A majority of the ARM algorithms proposed in the literature mine intra-transaction association rules. Typically, ARM algorithms find large itemsets in the transaction database based on the frequency of co-occurrence of items. ARM involves two main steps [3]: (i) generating large itemsets, whose frequency of co-occurrence (support) is greater than or equal to the user-defined minimum support (σ), and (ii) generating, from the large itemsets, association rules that satisfy the user-defined minimum confidence (c). Mining a complete set of inter-transaction association rules along a single dimension, the time axis, is proposed in [3]. In this paper, we propose a clustering scheme based on multiple dimensions for mining a complete set of inter-transaction association rules. This scheme has two components: (i) generation of descriptions of clusters based on multi-dimensional semantic grouping, for which our algorithm needs at most two database scans, and (ii) exploring associations between cluster descriptions/abstractions.

2. Clustering process

Clustering is a subjective process. It employs knowledge to group data. This knowledge is in the form of the similarity measure used, the values assigned to parameters like the number of clusters and assumptions on the nature of clusters, and structures that capture explicit knowledge [1]. We call clustering based on knowledge semantic clustering.
In this paper, we discuss clustering based on multiple dimensions, including size of transaction, cost of transaction, and their combinations, to generate inter-transaction associations.

[3] A.K.H. Tung, H. Lu, J. Han, L. Feng, Breaking the barrier of transactions: mining inter-transaction association rules, Proceedings of the 1999 International Conference on Knowledge Discovery and Data Mining, August 1999.
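The two ARM steps described in the abstract (large itemsets with support at least σ, then rules with confidence at least c) can be illustrated with a minimal Python sketch. It covers only classical intra-transaction mining by brute-force enumeration; it is not the paper's clustering scheme or its two-scan algorithm. The function names, the toy transactions, and the threshold values are all assumptions made for illustration.

```python
from itertools import combinations


def large_itemsets(transactions, min_support):
    """Step (i): map each itemset to its support, keeping only itemsets
    whose support (fraction of transactions containing them) is at least
    min_support (sigma in the abstract)."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            support = sum(1 for t in transactions if cset <= t) / n
            if support >= min_support:
                result[cset] = support
                found = True
        if not found:
            # No large itemset of this size exists, so no larger one can:
            # every subset of a large itemset is itself large.
            break
    return result


def association_rules(itemsets, min_confidence):
    """Step (ii): derive rules (antecedent, consequent, confidence) from
    the large itemsets, keeping those with confidence >= min_confidence
    (c in the abstract)."""
    rules = []
    for itemset, support in itemsets.items():
        for k in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), k):
                a = frozenset(antecedent)
                # confidence(A -> B) = support(A and B) / support(A)
                confidence = support / itemsets[a]
                if confidence >= min_confidence:
                    rules.append((a, itemset - a, confidence))
    return rules


# Hypothetical toy database; the items and thresholds are made up.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]
frequent = large_itemsets(transactions, min_support=0.5)
for antecedent, consequent, confidence in association_rules(frequent, 0.75):
    print(sorted(antecedent), "->", sorted(consequent), confidence)
```

Exhaustive enumeration is exponential in the number of items and rescans the database for every candidate, so it is suitable only for tiny examples like this one.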

Item Type: Journal Article
Publication: Pattern Recognition
Publisher: Elsevier Science
Additional Information: Copyright of this article belongs to Elsevier Science.
Department/Centre: Division of Electrical Sciences > Computer Science & Automation
Date Deposited: 04 Feb 2010 04:53
Last Modified: 19 Sep 2010 04:55
URI: http://eprints.iisc.ac.in/id/eprint/17160
