
 

Volume 24 Issue 2-3 - Publication Date: 1 February-March 2005
 
A Deterministic Optimization Approach to Protein Sequence Design Using Continuous Models
 
S. Koh (Mechanical Engineering and Applied Mechanics, University of Pennsylvania, Philadelphia, 19104-6315, USA); G.K. Ananthasuresh (Mechanical Engineering and Applied Mechanics, University of Pennsylvania, Philadelphia, 19104-6315, USA, and Mechanical Engineering, Indian Institute of Science, Bangalore 560 012, India); and S. Vishveshwara (Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560 012, India)
 
Determining the sequence of amino acid residues in a heteropolymer chain of a protein with a given conformation is a discrete combinatorial problem that is not generally amenable to gradient-based continuous optimization algorithms. In this paper we present a new approach to this problem using continuous models. In this model, continuous "state functions" are proposed to designate the type of each residue in the chain. Such a continuous model defines a continuous sequence space in which a chosen criterion is optimized to find the most appropriate sequence. Searching this continuous sequence space with a deterministic optimization algorithm makes it possible to find the optimal sequences with much less computation than many other approaches. The computational efficiency of the method is further improved by combining it with a graph spectral method, which explicitly takes into account the topology of the desired conformation and also makes the combined method more robust. The continuous modeling used here appears to have additional advantages in mimicking the folding pathways and in creating energy landscapes that help find sequences with high stability and kinetic accessibility. To illustrate the new approach, a widely used simplifying assumption is made by considering only two types of residues: hydrophobic (H) and polar (P). Self-avoiding compact lattice models are used to validate the method against known results in the literature and against data that can be practically obtained by exhaustive enumeration on a desktop computer. We also present examples of sequence design for the HP models of some real proteins, which are solved in less than five minutes on a single-processor desktop computer. Some open issues and future extensions are noted.
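
The idea of relaxing a discrete HP sequence into continuous variables can be illustrated with a small sketch. The abstract does not give the paper's state functions or design criterion, so everything below is an assumption made for illustration only: each residue i gets a variable x_i in [0, 1] (1 for H, 0 for P), the objective is a simple contact energy that rewards H-H contacts, the number of H residues is held fixed, and projected gradient descent stands in for the deterministic optimizer. The 3x3 conformation and all function names are hypothetical. The result is checked against exhaustive enumeration, in the spirit of the validation described above.

```python
from itertools import combinations

# Hypothetical 3x3 compact self-avoiding walk on a square lattice.
COORDS = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (0, 1), (0, 2), (1, 2), (2, 2)]


def contacts(coords):
    """Pairs of residues that are lattice neighbours but not chain neighbours."""
    pairs = []
    for i in range(len(coords)):
        for j in range(i + 2, len(coords)):
            dx = abs(coords[i][0] - coords[j][0])
            dy = abs(coords[i][1] - coords[j][1])
            if dx + dy == 1:
                pairs.append((i, j))
    return pairs


def energy(x, pairs):
    """Assumed objective: minus the sum of x_i * x_j over all contacts."""
    return -sum(x[i] * x[j] for i, j in pairs)


def project(y, n_h):
    """Project y onto {0 <= x_i <= 1, sum(x) = n_h} by bisection on a shift."""
    lo, hi = min(y) - 1.0, max(y)
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        total = sum(min(1.0, max(0.0, v - mid)) for v in y)
        if total > n_h:
            lo = mid
        else:
            hi = mid
    return [min(1.0, max(0.0, v - hi)) for v in y]


def design(coords, n_h, steps=200, lr=0.1):
    """Projected gradient descent in the relaxed space, then round to H/P."""
    pairs = contacts(coords)
    n = len(coords)
    x = [n_h / n] * n  # start from the uniform, fully "undecided" sequence
    for _ in range(steps):
        grad = [0.0] * n
        for i, j in pairs:  # d(energy)/dx_i = -sum of x_j over contacts of i
            grad[i] -= x[j]
            grad[j] -= x[i]
        x = project([xi - lr * g for xi, g in zip(x, grad)], n_h)
    top = set(sorted(range(n), key=lambda i: -x[i])[:n_h])
    return "".join("H" if i in top else "P" for i in range(n))


def sequence_energy(seq, pairs):
    """Contact energy of a discrete H/P string under the same assumed objective."""
    return energy([1 if c == "H" else 0 for c in seq], pairs)


def brute_force_best(coords, n_h):
    """Lowest energy over all sequences with exactly n_h H residues."""
    pairs = contacts(coords)
    n = len(coords)
    return min(
        energy([1 if i in chosen else 0 for i in range(n)], pairs)
        for chosen in combinations(range(n), n_h)
    )


if __name__ == "__main__":
    seq = design(COORDS, n_h=3)
    print(seq, sequence_energy(seq, contacts(COORDS)), brute_force_best(COORDS, 3))
```

On a chain this small, enumeration is trivial, so the comparison only serves as a sanity check. The relaxed objective is not convex, and a uniform starting point can stall on highly symmetric conformations, so this sketch should not be read as reproducing the robustness or efficiency reported for the paper's combined method.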
 