An Efficient Technique for Protein Sequence Clustering and Classification

Vijaya, PA and Murty, Narasimha M and Subramanian, DK (2004) An Efficient Technique for Protein Sequence Clustering and Classification. In: 17th International Conference on Pattern Recognition, 2004. ICPR 2004, 23-26 August, Cambridge, UK, Vol. 2, pp. 447-450.

Abstract

This paper presents a technique for reducing the time and space required for protein sequence clustering and classification. During the training and testing phases, the similarity score between a pair of sequences is computed from a selected portion of each sequence rather than from the entire sequence, which is analogous to selecting a subset of features for sequence data sets. Experimental results show that the classification accuracy (CA) obtained with the prototypes generated or used does not degrade much, while the training and testing times are reduced significantly. The results thus indicate that the similarity score need not be calculated over the entire length of the sequence to achieve a good CA. The space requirement during execution is also reduced. The technique was tested with K-medians, supervised K-medians and nearest neighbour classifier (NNC) techniques.
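
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which similarity score is used or which portion of a sequence is selected, so the sketch assumes a simple global alignment score (match +1, mismatch -1, gap -2) and assumes the portion is a fixed-length prefix. The function names `alignment_score`, `portion_score`, `cluster_median` and `nnc_predict`, the `portion` parameter, and the toy sequences are all hypothetical.

```python
from typing import List, Optional, Tuple


def alignment_score(a: str, b: str, match: int = 1,
                    mismatch: int = -1, gap: int = -2) -> int:
    """Global alignment score, keeping only two rows of the DP table.

    The scoring values are illustrative assumptions, not taken from the paper.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[len(b)]


def portion_score(a: str, b: str, portion: Optional[int] = None) -> int:
    """Score only the first `portion` residues of each sequence.

    Assumption: the selected portion is a prefix. None means full length.
    """
    if portion is not None:
        a, b = a[:portion], b[:portion]
    return alignment_score(a, b)


def cluster_median(cluster: List[str], portion: Optional[int] = None) -> str:
    """Return the member with the highest total similarity to the others."""
    def total(i: int) -> int:
        return sum(portion_score(cluster[i], t, portion)
                   for j, t in enumerate(cluster) if j != i)
    return cluster[max(range(len(cluster)), key=total)]


def nnc_predict(query: str, prototypes: List[Tuple[str, str]],
                portion: Optional[int] = None) -> str:
    """Label the query with the class of its most similar prototype."""
    _, label = max(prototypes,
                   key=lambda p: portion_score(query, p[0], portion))
    return label


# Toy data, made up for illustration only.
prototypes = [("MKTAYIAKQR", "A"), ("GSHMLEDPVV", "B")]
query = "MKTAYLAKQRQISFVK"
print(nnc_predict(query, prototypes, portion=6))
print(nnc_predict(query, prototypes))
```

Because the alignment table grows with the product of the two lengths being compared, truncating both sequences to `portion` residues shrinks the work and memory per comparison, which is the kind of saving the abstract describes. Whether accuracy holds up depends on the data and on how the portion is chosen, which this sketch does not address.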

Item Type:	Conference Paper
Publisher:	IEEE
Additional Information:	©1990 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Department/Centre:	Division of Electrical Sciences > Computer Science & Automation
Date Deposited:	07 Dec 2005
Last Modified:	19 Sep 2010 04:21
URI:	http://eprints.iisc.ac.in/id/eprint/4329
