Determining the sequence
of amino acid residues in a heteropolymer chain of a protein with a given
conformation is a discrete combinatorial problem that is not generally
amenable to gradient-based continuous optimization algorithms. In this
paper we present a new approach to this problem using continuous models.
In these models, continuous "state functions" designate the type of
each residue in the chain. Such a continuous
model helps define a continuous sequence space in which a chosen criterion
is optimized to find the most appropriate sequence. Searching a continuous
sequence space using a deterministic optimization algorithm makes it possible
to find the optimal sequences with much less computation than many other
approaches. The computational efficiency of this method is further improved
by combining it with a graph spectral method, which explicitly takes into
account the topology of the desired conformation and also helps make the
combined method more robust. The continuous modeling used here appears
to have additional advantages in mimicking folding pathways and in
creating energy landscapes that help find sequences with high stability
and kinetic accessibility. To illustrate the new approach, a widely used
simplifying assumption is made by considering only two types of residues:
hydrophobic (H) and polar (P). Self-avoiding compact lattice models are
used to validate the method against known results in the literature and
against data that can practically be obtained by exhaustive enumeration on
a desktop computer. We also present examples of sequence design for the HP models
of some real proteins, which are solved in less than five minutes on a
single-processor desktop computer. Some open issues and future extensions
are noted.
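
As a rough illustration of the idea of a continuous sequence space, the following minimal sketch relaxes the binary H/P label of each residue to a value s_i in [0, 1] and minimizes a relaxed HP contact energy by projected gradient descent. It is not the state-function formulation or the optimization algorithm of this paper, and it omits the graph spectral step. Everything specific in it is an assumption made for the example: the 9-residue conformation on a 3x3 lattice, the energy of -1 per H-H contact, the soft quadratic penalty that fixes the number of H residues, and the names COORDS, contact_matrix, hp_energy, design, to_sequence and enumerate_best, as well as the parameters lam, step and iters.

```python
import itertools
import numpy as np

# Hypothetical 9-residue compact conformation: a snake path on a 3x3 lattice.
COORDS = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (0, 1), (0, 2), (1, 2), (2, 2)]


def contact_matrix(coords):
    """C[i, j] = 1 if residues i and j are lattice neighbours but not chain neighbours."""
    n = len(coords)
    c = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 2, n):
            dist = abs(coords[i][0] - coords[j][0]) + abs(coords[i][1] - coords[j][1])
            if dist == 1:
                c[i, j] = c[j, i] = 1.0
    return c


def hp_energy(s, c):
    """Relaxed HP energy; equals -1 per H-H contact when s is binary (1 = H, 0 = P)."""
    return -0.5 * s @ c @ s


def design(c, n_h, lam=1.0, step=0.05, iters=2000):
    """Projected gradient descent on s in [0, 1]^n (assumed relaxation, not the paper's).

    The objective is hp_energy(s, c) + lam * (sum(s) - n_h)**2, where the
    penalty softly fixes the number of H residues to n_h.
    """
    s = np.full(c.shape[0], 0.5)  # start from the fully ambiguous sequence
    for _ in range(iters):
        grad = -c @ s + 2.0 * lam * (s.sum() - n_h)
        s = np.clip(s - step * grad, 0.0, 1.0)  # project back onto the box
    return s


def to_sequence(s):
    """Round the continuous states to a discrete HP string."""
    return "".join("H" if x > 0.5 else "P" for x in s)


def enumerate_best(c, n_h):
    """Exhaustive check: lowest energy over all sequences with exactly n_h H residues."""
    n = c.shape[0]
    best = 0.0
    for hs in itertools.combinations(range(n), n_h):
        s = np.zeros(n)
        s[list(hs)] = 1.0
        best = min(best, hp_energy(s, c))
    return best


if __name__ == "__main__":
    C = contact_matrix(COORDS)
    relaxed = design(C, n_h=3)
    seq = to_sequence(relaxed)
    binary = np.array([1.0 if ch == "H" else 0.0 for ch in seq])
    print(seq, hp_energy(binary, C), enumerate_best(C, 3))
```

For a chain this short, the rounded design can be compared directly with exhaustive enumeration, in the spirit of the validation described above. The deterministic descent needs only a fixed number of matrix-vector products, whereas enumeration grows combinatorially with chain length.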