Kumar, G and Srinivasan, N and Sandhya, S (2020) Artificial protein sequences enable recognition of vicinal and distant protein functional relationships. In: Proteins: Structure, Function and Bioinformatics.
PDF: pro_str_fun_bio_2020.pdf - Published Version, restricted to registered users only (3MB)
Microsoft Excel: prot25986-sup-0001-tables1.xls - Published Supplemental Material, restricted to registered users only (313kB)
Microsoft Excel: prot25986-sup-0002-tables2.xls - Published Supplemental Material, restricted to registered users only (826kB)
Microsoft Excel: prot25986-sup-0003-tables3.xls - Published Supplemental Material, restricted to registered users only (413kB)
Microsoft Excel: prot25986-sup-0004-tables4.xls - Published Supplemental Material, restricted to registered users only (4MB)
Microsoft Excel: prot25986-sup-0005-tables5.xls - Published Supplemental Material, restricted to registered users only (1MB)
PDF: prot25986-sup-0006-figures.pdf - Published Supplemental Material, restricted to registered users only (1MB)
Abstract
High divergence in protein sequences makes the detection of distant protein relationships through homology-based approaches challenging. Grouping protein sequences into families, through similarities in either sequence or 3-D structure, facilitates improved recognition of protein relationships. In addition, strategically designed protein-like sequences have been shown to bridge distant structural domain families by serving as artificial linkers. In this study, we have augmented a search database of known protein domain families with such designed sequences, with the intention of providing functional clues to domain families of unknown structure. When assessed using representative query sequences from each family, we obtain a success rate of 94% in protein domain families of known structure. Further, we demonstrate that the augmented search space enabled fold recognition for 582 families with no structural information available a priori. Additionally, we were able to provide reliable functional relationships for 610 orphan families. We discuss the application of our method in predicting functional roles through select examples for DUF4922, DUF5131, and DUF5085. Our approach also detects new associations between families that were previously not known to be related, as demonstrated through new sub-groups of the RNA polymerase domain among three distinct RNA viruses. Taken together, search databases augmented with designed sequences direct the detection of meaningful relationships between distant protein families. In turn, they enable fold recognition and offer reliable pointers to potential functional sites that may be probed further through direct mutagenesis studies. © 2020 Wiley Periodicals LLC
Item Type: | Journal Article |
---|---|
Publication: | Proteins: Structure, Function and Bioinformatics |
Publisher: | John Wiley and Sons Inc. |
Additional Information: | The copyright of this article belongs to John Wiley and Sons Inc. |
Department/Centre: | Division of Biological Sciences > Molecular Biophysics Unit |
Date Deposited: | 28 Sep 2020 11:13 |
Last Modified: | 28 Sep 2020 11:13 |
URI: | http://eprints.iisc.ac.in/id/eprint/66545 |