Protein sequence design based on the topology of the native state structure

Jha, Anupam Nath and Ananthasuresh, GK and Vishveshwara, Saraswathi (2007) Protein sequence design based on the topology of the native state structure. In: Journal of Theoretical Biology, 248 (1). pp. 81-90.

Abstract

Computational design of sequences for a given structure is generally studied by exhaustively enumerating the sequence space or by searching in such a large space, which is prohibitively expensive. However, we point out that the protein topology has a wealth of information, which can be exploited to design sequences for a chosen structure. In this paper, we present a computationally efficient method for ranking the residue sites in a given native-state structure, which enables us to design sequences for a chosen structure. The premise for the method is that the topology of the graph representing the energetically interacting neighbours in a protein plays an important role in the inverse-folding problem. While our previous work (which was also based on topology) used eigenspectral analysis of the adjacency matrix of interactions for ranking the residue sites in a given chain, here we use a simple but effective way of assigning weights to the nodes on the basis of secondary connections, along with primary connections. This indirectly accounts for the edge weight in the graph and removes degeneracy in the degree. The new scheme needs only a few multiplications and additions to compute the preferred ranking of the residue sites even for structures of real proteins of sizes of a few hundred amino acid residues. We use HP lattice model examples (for which exhaustive enumeration of sequences is practical) to validate our ranking approach in obtaining sequences of lowest energy for any H–P residue composition for a given native-state structure. Some examples of native structures of real proteins are also included. Quantitative comparison of the efficacy of the new scheme with the earlier schemes is made. The new scheme consistently performs better and with much lower computational cost. 
In a few rare cases wherein the new scheme fails to provide the best sequence, an optimization procedure is added to work with the new scheme.
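
The ranking idea described in the abstract can be illustrated with a minimal sketch on a toy H–P lattice fold. The abstract does not state the weighting formula, so the one used here (a site's degree plus a factor alpha times the sum of its neighbours' degrees) is only an assumption chosen to show how secondary connections can break ties in the degree. The function names, the alpha parameter, and the 3x3 example fold are all hypothetical and are not taken from the paper.

```python
from itertools import combinations

def contact_graph(coords):
    """Adjacency sets of non-bonded contacts: lattice distance 1, not chain neighbours."""
    adj = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if j - i > 1:
            dist = sum(abs(a - b) for a, b in zip(coords[i], coords[j]))
            if dist == 1:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def node_weights(adj, alpha=0.5):
    """ASSUMED weighting: primary connections (degree) plus alpha times
    the secondary connections (sum of the neighbours' degrees)."""
    deg = {i: len(nb) for i, nb in adj.items()}
    return {i: deg[i] + alpha * sum(deg[j] for j in adj[i]) for i in adj}

def design(adj, n_h, alpha=0.5):
    """Place H on the n_h highest-weighted sites and P elsewhere."""
    w = node_weights(adj, alpha)
    ranked = sorted(adj, key=lambda i: (-w[i], i))  # ties broken by chain index
    h_sites = set(ranked[:n_h])
    return "".join("H" if i in h_sites else "P" for i in range(len(adj)))

def hp_energy(seq, adj):
    """H-P model energy: -1 for every non-bonded H-H contact."""
    return -sum(1 for i in adj for j in adj[i]
                if i < j and seq[i] == "H" and seq[j] == "H")

def best_energy_exhaustive(adj, n_h):
    """Lowest energy over all sequences with exactly n_h H residues."""
    best = 0
    for h_sites in combinations(range(len(adj)), n_h):
        s = set(h_sites)
        e = -sum(1 for i in s for j in adj[i] if i < j and j in s)
        best = min(best, e)
    return best

# Hypothetical 9-residue chain folded as a snake on a 3x3 square lattice.
coords = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (0, 1), (0, 2), (1, 2), (2, 2)]
adj = contact_graph(coords)
seq = design(adj, n_h=3)
print(seq, hp_energy(seq, adj), best_energy_exhaustive(adj, 3))
```

In this toy fold, six sites share degree 1, and the secondary term is what separates them. With three H residues the ranked design reaches the enumerated minimum, whereas with five H residues this assumed weighting misses it, which is the kind of case the abstract's additional optimization procedure is meant to handle.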

Item Type:	Journal Article
Publication:	Journal of Theoretical Biology
Publisher:	Elsevier
Additional Information:	Copyright of this article belongs to Elsevier.
Keywords:	Protein structure graph; Topological index; Node weights; H–P lattice model; Secondary connections
Department/Centre:	Division of Biological Sciences > Molecular Biophysics Unit; Division of Mechanical Sciences > Mechanical Engineering
Date Deposited:	01 Aug 2008
Last Modified:	19 Sep 2010 04:48
URI:	http://eprints.iisc.ac.in/id/eprint/15411
