
Clustering of multi-domain protein sequences

Mehrotra, Prachi and Ami, Vimla Kany G and Srinivasan, Narayanaswamy (2018) Clustering of multi-domain protein sequences. In: PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 86 (7). pp. 759-776.

Official URL: https://dx.doi.org/10.1002/prot.25510


The overall function of a multi-domain protein is determined by the functional and structural interplay of its constituent domains. Traditional sequence alignment-based methods commonly utilize domain-level information and provide classification only at the level of domains. Such methods cannot take into account the contributions of the other domains in a protein or of its domain-linker regions, and so cannot classify multi-domain proteins as a whole. An alignment-free protein sequence comparison tool, CLAP (CLAssification of Proteins), was previously developed in our laboratory specifically to handle multi-domain protein sequences without requiring domain boundaries or the sequential order of domains to be defined. With this method we aim to achieve a biologically meaningful classification scheme for multi-domain protein sequences. In this article, CLAP-based classification is explored on 5 datasets of multi-domain proteins, and we present a detailed analysis for proteins containing (1) tyrosine phosphatase and (2) SH3 domains. At the domain level, the CLAP-based classification scheme resulted in a clustering similar to that obtained from an alignment-based method. CLAP-based clusters obtained for full-length datasets were shown to comprise proteins with similar functions and domain architectures. Our study demonstrates that multi-domain proteins can be classified effectively by considering full-length sequences, without requiring identification of the domains in the sequence.
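The general idea of alignment-free comparison of full-length sequences can be illustrated with a minimal sketch. It is not the CLAP algorithm, whose scoring scheme is not described in this abstract. It assumes a generic stand-in technique: each sequence is reduced to its set of overlapping k-mers, pairs are compared by Jaccard distance, and sequences are grouped by single-linkage clustering. The function names, the `k=3` and `threshold=0.7` settings, and the toy sequences are all hypothetical choices made for illustration.

```python
from itertools import combinations


def kmer_set(seq, k=3):
    """Return the set of overlapping k-mers in a protein sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(a, b, k=3):
    """1 - |A & B| / |A | B| over k-mer sets (0.0 = same k-mer content)."""
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    union = ka | kb
    if not union:  # both sequences shorter than k
        return 0.0
    return 1.0 - len(ka & kb) / len(union)


def single_linkage_clusters(seqs, threshold=0.7, k=3):
    """Cluster sequence names; a pair is linked if its distance <= threshold.

    `seqs` maps a name to a full-length sequence. No domain boundaries or
    domain order are supplied: only whole sequences are compared.
    """
    parent = {name: name for name in seqs}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if jaccard_distance(seqs[a], seqs[b], k) <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in seqs:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Toy input (invented sequences, not taken from the study's datasets).
toy = {
    "p1": "MKTAYIAKQRQISFVK",
    "p2": "MKTAYIAKQRQLSFVK",  # one substitution relative to p1
    "p3": "GSHWWDEPLNCCGSHW",  # unrelated k-mer content
}

if __name__ == "__main__":
    print(single_linkage_clusters(toy))
```

Because the comparison works on k-mer content rather than on an alignment, a rearranged domain order changes only the k-mers that span domain junctions, which is one reason alignment-free measures are attractive for multi-domain proteins. A realistic analysis would need a more sensitive similarity score and a threshold tuned on curated data.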

Item Type: Journal Article
Publisher: WILEY, 111 RIVER ST, HOBOKEN 07030-5774, NJ USA
Additional Information: Copyright of this article belongs to WILEY, 111 RIVER ST, HOBOKEN 07030-5774, NJ USA
Department/Centre: Division of Biological Sciences > Molecular Biophysics Unit
Division of Physical & Mathematical Sciences > Mathematics
Date Deposited: 24 Jul 2018 15:15
Last Modified: 25 Aug 2022 11:28
URI: https://eprints.iisc.ac.in/id/eprint/60288
