
CLAP: A web-server for automatic classification of proteins with special reference to multi-domain proteins

Gnanavel, Mutharasu and Mehrotra, Prachi and Rakshambikai, Ramaswamy and Martin, Juliette and Srinivasan, Narayanaswamy and Bhaskara, Ramachandra M (2014) CLAP: A web-server for automatic classification of proteins with special reference to multi-domain proteins. In: BMC BIOINFORMATICS, 15.

Official URL: http://dx.doi.org/10.1186/1471-2105-15-343

Abstract

Background: The function of a protein can be deciphered with higher accuracy from its structure than from its amino acid sequence. Owing to the huge gap between the available protein sequence space and structure space, tools that can generate functionally homogeneous clusters using only sequence information are of great importance. Traditional alignment-based tools, which cluster proteins on the basis of sequence similarity, work well in most cases. In multi-domain proteins, however, the alignment quality may be poor because of varied protein lengths, domain shuffling or circular permutations. Multi-domain proteins are ubiquitous in nature; hence alignment-free tools, which overcome the shortcomings of alignment-based protein comparison methods, are required. Further, existing tools classify proteins using only domain-level information and hence miss the information encoded in the tethered regions or accessory domains. Our method, on the other hand, takes into account the full-length sequence of a protein, consolidating the complete sequence information to understand a given protein better.
Results: Our web-server, CLAP (Classification of Proteins), is one such alignment-free software for automatic classification of protein sequences. It uses a pattern-matching algorithm that assigns local matching scores (LMS) to residues that are part of the matched patterns between the two sequences being compared. CLAP works on full-length sequences and does not require prior domain definitions. Previous pilot studies on protein kinases and immunoglobulins have shown that CLAP yields clusters with high functional and domain architectural similarity. Moreover, parsing at a statistically determined cut-off resulted in clusters that agreed with the sub-family level classification of that particular domain family.
Conclusions: CLAP is a useful protein-clustering tool, independent of domain assignment, domain order, sequence length and domain diversity. Our method can be used for any set of protein sequences, yielding functionally relevant clusters with high domain architectural homogeneity. The CLAP web server is freely available for academic use at http://nslab.mbu.iisc.ernet.in/clap/.
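
Illustrative sketch: the abstract describes an alignment-free comparison in which residues belonging to patterns shared by two sequences receive scores. The Python below is a deliberately simplified stand-in for that idea and is not the CLAP algorithm. It assumes exact k-mer matches as the "patterns", and the function names (shared_kmers, residue_scores, toy_distance) and the parameter k are hypothetical. It is meant only to show why such a comparison can be insensitive to domain order.

```python
def shared_kmers(seq_a, seq_b, k=3):
    """Return the length-k substrings that occur in both sequences."""
    kmers_a = {seq_a[i:i + k] for i in range(len(seq_a) - k + 1)}
    kmers_b = {seq_b[i:i + k] for i in range(len(seq_b) - k + 1)}
    return kmers_a & kmers_b


def residue_scores(seq, patterns, k=3):
    """Score each residue by the number of shared k-mers covering it.

    Assumption: this count is a toy replacement for a local matching
    score; the published method defines its scores differently.
    """
    scores = [0] * len(seq)
    for i in range(len(seq) - k + 1):
        if seq[i:i + k] in patterns:
            for j in range(i, i + k):
                scores[j] += 1
    return scores


def toy_distance(seq_a, seq_b, k=3):
    """Distance in [0, 1]: one minus the fraction of residues, over both
    sequences, that lie inside at least one shared k-mer."""
    total = len(seq_a) + len(seq_b)
    if total == 0:
        return 1.0
    patterns = shared_kmers(seq_a, seq_b, k)
    covered_a = sum(1 for s in residue_scores(seq_a, patterns, k) if s > 0)
    covered_b = sum(1 for s in residue_scores(seq_b, patterns, k) if s > 0)
    return 1.0 - (covered_a + covered_b) / total


if __name__ == "__main__":
    # Two made-up "domains" arranged in opposite orders.
    domain_1, domain_2 = "ACDEFGHIKL", "MNPQRSTVWY"
    print(toy_distance(domain_1 + domain_2, domain_2 + domain_1))
```

In this toy setting, swapping the order of the two made-up domains leaves every residue covered by a shared pattern, so the two arrangements are treated as equivalent without any alignment. A matrix of such pairwise distances could then be passed to a standard clustering routine; the cut-off used to parse clusters in the paper is not reproduced here.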

Item Type: Journal Article
Publication: BMC BIOINFORMATICS
Publisher: BIOMED CENTRAL LTD
Additional Information: Copyright for this article belongs to the BIOMED CENTRAL LTD, 236 GRAYS INN RD, FLOOR 6, LONDON WC1X 8HL, ENGLAND
Keywords: Alignment-free comparison; Domain architectures; Multi-domain proteins; Protein classification
Department/Centre: Division of Biological Sciences > Molecular Biophysics Unit
Date Deposited: 20 Dec 2014 05:33
Last Modified: 20 Dec 2014 05:33
URI: http://eprints.iisc.ac.in/id/eprint/50484
