Mapping the landscape of possible macromolecular
polymer sequences to their fitness in performing
biological functions is a challenge across the biosciences.
A paradigm is the case of aptamers,
nucleic acids that can be selected to bind particular
target molecules. We have characterized the
sequence-fitness landscape for aptamers binding
allophycocyanin (APC) protein via a novel Closed
Loop Aptameric Directed Evolution (CLADE)
approach. In contrast to the conventional SELEX
methodology, selection and mutation of aptamer
sequences were carried out in silico, with explicit fitness
assays for 44 131 aptamers of known sequence
using DNA microarrays in vitro. We capture the
landscape using a predictive machine learning
model linking sequence features and function and
validate this model using 5500 entirely separate
test sequences, which give an observed
versus predicted correlation of 0.87. This approach
reveals a complex sequence-fitness mapping and
suggests hypotheses for the physical basis of aptameric binding;
it also enables rapid design of novel aptamers
with desired binding properties. We demonstrate an
extension to the approach by incorporating prior
knowledge into CLADE, which yields some of the
tightest-binding sequences.