
The relationship between classification of multi-domain proteins using an alignment-free approach and their functions: a case study with immunoglobulins

Bhaskara, Ramachandra M and Mehrotra, Prachi and Rakshambikai, Ramaswamy and Gnanavel, Mutharasu and Martin, Juliette and Srinivasan, Narayanaswamy (2014) The relationship between classification of multi-domain proteins using an alignment-free approach and their functions: a case study with immunoglobulins. In: Molecular BioSystems, 10 (5). pp. 1082-1093.

Official URL: http://dx.doi.org/10.1039/c3mb70443b


Establishing functional relationships between multi-domain protein sequences is a non-trivial task. Traditionally, assigning functions to proteins and delineating the relationships between them requires domain assignment as a prerequisite. This process is sensitive to alignment quality and domain definitions, and in multi-domain proteins the quality of alignments is poor for multiple reasons. We report the correspondence between the classification of proteins represented as full-length gene products and their functions. Our approach differs fundamentally from traditional methods in that it does not perform the classification at the level of domains. The method is based on an alignment-free computation of local matching scores (LMS) at the amino-acid sequence level, followed by hierarchical clustering. As there are no gold standards for full-length protein sequence classification, we resorted to Gene Ontology and domain-architecture-based similarity measures to assess our classification. The final clusters obtained using LMS show high functional and domain-architectural similarities. Comparison of the current method with alignment-based approaches at both the domain and full-length protein levels showed the superiority of LMS. Using this method, we have recreated objective relationships among different protein kinase sub-families and also classified immunoglobulin-containing proteins, for which sub-family definitions do not currently exist. This method can be applied to any set of protein sequences and hence will be instrumental in the analysis of large numbers of full-length protein sequences.
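
The two-stage pipeline described in the abstract (an alignment-free pairwise score on full-length sequences, followed by hierarchical clustering) can be illustrated with a minimal Python sketch. The abstract does not specify how LMS is computed, so the sketch substitutes a simple Jaccard distance on overlapping k-mers as a stand-in; it is not the authors' LMS. The function names, the k-mer length, the distance threshold, and the toy sequences are all assumptions made for illustration.

```python
from itertools import combinations


def kmer_set(seq, k=3):
    """Return the set of overlapping k-mers in an amino-acid sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_distance(a, b, k=3):
    """Jaccard distance between k-mer sets.

    This is an assumed stand-in for an alignment-free score,
    NOT the local matching score (LMS) used in the paper.
    """
    sa, sb = kmer_set(a, k), kmer_set(b, k)
    union = sa | sb
    if not union:
        return 0.0
    return 1.0 - len(sa & sb) / len(union)


def average_linkage(seqs, threshold, k=3):
    """Agglomerative clustering with average linkage.

    seqs maps a sequence name to its amino-acid string. Clusters are
    merged bottom-up until the closest pair is farther apart than
    `threshold`; the result is a flat cut, not a full dendrogram.
    """
    names = list(seqs)
    # Precompute all pairwise distances on the full-length sequences.
    dist = {frozenset(p): kmer_distance(seqs[p[0]], seqs[p[1]], k)
            for p in combinations(names, 2)}
    clusters = [[n] for n in names]

    def linkage(c1, c2):
        # Average pairwise distance between members of two clusters.
        total = sum(dist[frozenset((x, y))] for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the closest pair of clusters (i < j by construction).
        (i, j), d = min(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy sequences: two near-identical pairs with no shared 3-mers between pairs.
toy = {
    "p1": "MKTAYIAKQRQISFVK",
    "p2": "MKTAYIAKQRQLSFVK",
    "p3": "GSSGSSGLLPPWWHHE",
    "p4": "GSSGSSGLLPPWWHHD",
}
print(average_linkage(toy, threshold=0.5))
```

No alignment or domain assignment is performed at any point; each sequence is compared as a whole. In practice, the resulting clusters would then be assessed against external annotation such as Gene Ontology terms or domain architectures, as the abstract describes.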

Item Type: Journal Article
Additional Information: Copyright for this article belongs to the Royal Society of Chemistry, Thomas Graham House, Science Park, Milton Rd, Cambridge CB4 0WF, Cambs, England.
Department/Centre: Division of Biological Sciences > Molecular Biophysics Unit
Date Deposited: 28 May 2014 05:47
Last Modified: 28 May 2014 05:47
URI: http://eprints.iisc.ac.in/id/eprint/49058
