
NrichD database: sequence databases enriched with computationally designed protein-like sequences aid in remote homology detection

Mudgal, Richa and Sandhya, Sankaran and Kumar, Gayatri and Sowdhamini, Ramanathan and Chandra, Nagasuma R and Srinivasan, Narayanaswamy (2015) NrichD database: sequence databases enriched with computationally designed protein-like sequences aid in remote homology detection. In: NUCLEIC ACIDS RESEARCH, 43 (D1). D300-D305.

Official URL: http://dx.doi.org/10.1093/nar/gku888

Abstract

NrichD (http://proline.biochem.iisc.ernet.in/NRICHD/) is a database of computationally designed protein-like sequences, augmented into natural sequence databases that can perform hops in protein sequence space to assist in the detection of remote relationships. Establishing protein relationships in the absence of structural evidence or natural 'intermediately related sequences' is a challenging task. Recently, we have demonstrated that the computational design of artificial intermediary sequences/linkers is an effective approach to fill naturally occurring voids in protein sequence space. Through a large-scale assessment we have demonstrated that such sequences can be plugged into commonly employed search databases to improve the performance of routinely used sequence search methods in detecting remote relationships. Since it is anticipated that such data sets will be employed to establish protein relationships, two databases that have already captured these relationships at the structural and functional domain level, namely, the SCOP database and the Pfam database, have been 'enriched' with these artificial intermediary sequences. NrichD database currently contains 3 611 010 artificial sequences that have been generated between 27 882 pairs of families from 374 SCOP folds. The data sets are freely available for download. Additional features include the design of artificial sequences between any two protein families of interest to the user.

Item Type: Journal Article
Publication: NUCLEIC ACIDS RESEARCH
Publisher: OXFORD UNIV PRESS
Additional Information: Copyright for this article belongs to OXFORD UNIV PRESS, GREAT CLARENDON ST, OXFORD OX2 6DP, ENGLAND
Keywords: HIDDEN MARKOV-MODELS; PSI-BLAST; EVOLUTIONARY INFORMATION; ARTIFICIAL SEQUENCES; FAMILIES DATABASE; SEARCH TOOL; SELECTION; SERVER; SPACE; FOLD
Department/Centre: Division of Biological Sciences > Biochemistry
Division of Biological Sciences > Molecular Biophysics Unit
Date Deposited: 21 Apr 2015 07:45
Last Modified: 21 Apr 2015 07:45
URI: http://eprints.iisc.ac.in/id/eprint/51345
