Gnanavel, Mutharasu and Mehrotra, Prachi and Rakshambikai, Ramaswamy and Martin, Juliette and Srinivasan, Narayanaswamy and Bhaskara, Ramachandra M (2014) CLAP: A web-server for automatic classification of proteins with special reference to multi-domain proteins. In: BMC BIOINFORMATICS, 15.
Abstract
Background: The function of a protein can be deciphered with higher accuracy from its structure than from its amino acid sequence. Because of the huge gap between the available protein sequence space and structural space, tools that can generate functionally homogeneous clusters using sequence information alone are of great importance. For this, traditional alignment-based tools work well in most cases, clustering proteins on the basis of sequence similarity. In multi-domain proteins, however, alignment quality may be poor because of varied protein lengths, domain shuffling or circular permutations. Multi-domain proteins are ubiquitous in nature; hence alignment-free tools, which overcome the shortcomings of alignment-based protein comparison methods, are required. Further, existing tools classify proteins using only domain-level information and hence miss the information encoded in the tethered regions or accessory domains. Our method, on the other hand, takes into account the full-length sequence of a protein, consolidating the complete sequence information to understand a given protein better.
Results: Our web-server, CLAP (Classification of Proteins), is one such alignment-free software for automatic classification of protein sequences. It uses a pattern-matching algorithm that assigns local matching scores (LMS) to residues that are part of the matched patterns between the two sequences being compared. CLAP works on full-length sequences and does not require prior domain definitions. Pilot studies undertaken previously on protein kinases and immunoglobulins have shown that CLAP yields clusters that have high functional and domain architectural similarity. Moreover, parsing at a statistically determined cut-off resulted in clusters that agreed with the sub-family level classification of that particular domain family.
Conclusions: CLAP is a useful protein-clustering tool, independent of domain assignment, domain order, sequence length and domain diversity. Our method can be used for any set of protein sequences, yielding functionally relevant clusters with high domain architectural homogeneity. The CLAP web server is freely available for academic use at http://nslab.mbu.iisc.ernet.in/clap/.
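The general idea of alignment-free comparison summarized in the abstract can be illustrated with a minimal Python sketch. This is not CLAP's actual algorithm or scoring scheme, which the abstract does not specify. It assumes, purely for illustration, that "matched patterns" are shared k-mers, that a residue scores 1 if it lies inside any shared k-mer, and that clusters are formed by single linkage at a fixed cut-off. The function names, `k=3`, and `cutoff=0.5` are all hypothetical choices.

```python
from itertools import combinations


def kmer_positions(seq, k):
    """Map each k-mer in seq to the list of start positions where it occurs."""
    positions = {}
    for i in range(len(seq) - k + 1):
        positions.setdefault(seq[i:i + k], []).append(i)
    return positions


def local_matching_scores(a, b, k=3):
    """Illustrative per-residue scores for `a` (an assumption, not CLAP's LMS):
    a residue scores 1 if it is covered by any k-mer that also occurs in `b`."""
    b_kmers = set(kmer_positions(b, k))
    scores = [0] * len(a)
    for kmer, starts in kmer_positions(a, k).items():
        if kmer in b_kmers:
            for start in starts:
                for j in range(start, start + k):
                    scores[j] = 1
    return scores


def similarity(a, b, k=3):
    """Symmetric similarity in [0, 1]: mean fraction of covered residues."""
    if not a or not b:
        return 0.0
    frac_a = sum(local_matching_scores(a, b, k)) / len(a)
    frac_b = sum(local_matching_scores(b, a, k)) / len(b)
    return (frac_a + frac_b) / 2


def cluster(seqs, cutoff=0.5, k=3):
    """Single-linkage clustering: join any pair with similarity >= cutoff.
    `seqs` maps a sequence name to its sequence string."""
    names = list(seqs)
    parent = {name: name for name in names}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for x, y in combinations(names, 2):
        if similarity(seqs[x], seqs[y], k) >= cutoff:
            parent[find(x)] = find(y)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())
```

Because the score depends only on which patterns are shared and not on where they occur, two toy sequences built from the same two segments in opposite order receive the maximum similarity under this sketch, which mirrors the abstract's point about independence from domain order.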
Item Type: | Journal Article |
---|---|
Publication: | BMC BIOINFORMATICS |
Publisher: | BIOMED CENTRAL LTD |
Additional Information: | Copyright for this article belongs to the BIOMED CENTRAL LTD, 236 GRAYS INN RD, FLOOR 6, LONDON WC1X 8HL, ENGLAND |
Keywords: | Alignment-free comparison; Domain architectures; Multi-domain proteins; Protein classification |
Department/Centre: | Division of Biological Sciences > Molecular Biophysics Unit |
Date Deposited: | 20 Dec 2014 05:33 |
Last Modified: | 20 Dec 2014 05:33 |
URI: | http://eprints.iisc.ac.in/id/eprint/50484 |