The above work forms the basis for my proposed future research, “Foundation of Information Directed Molecular Technology: Programming Nucleic Acid Self-Assembly”. The proposal is ultimately inspired by biology’s remarkable demonstration of information technology and molecular technology: the information stored in a tiny genome specifies the development of a gigantic molecular machine, i.e. a living organism. As a molecular engineer with formal training in biology (B.S., M.S.) and computer science (Ph.D.), it is my natural passion to integrate present molecular technology with information technology. My vision for the resultant Information Directed Molecular Technology is as follows: by programming a user-friendly “molecular controller”, a human can freely specify his or her functional needs in the larger molecular world. His or her desire could be to construct a solar energy harvester, a molecular factory for producing an anti-malaria drug, or a molecular predator that navigates to and destroys cancer cells.

In pursuit of information directed molecular technology, I will use information in a user-friendly fashion to direct the kinetic self-assembly of DNA/RNA structures and devices, and exploit these structures and devices to do useful molecular work, e.g. organizing functional molecular materials (such as carbon nanotubes and proteins) and probing and programming biological processes for bioimaging and therapeutic applications. Specifically, my research will focus on three directions.

Aim 1: Developmental self-assembly for molecular construction. Based on my work on programming the kinetic pathways for self-assembling DNA structures (“Programming Biomolecular Self-Assembly Pathways”, Yin et al., Nature, 2008), I will develop a new paradigm in synthetic molecular self-assembly, which I call developmental self-assembly. Following an explicitly designed kinetic pathway, a synthetic molecular structure grows in an isothermal, kinetically controlled fashion, much as a living organism develops from a genome. This paradigm is fundamentally different from, and conceptually more powerful than, the currently dominant paradigm based on the thermodynamic design of the target structure.

Because the growth process, i.e. the kinetic self-assembly pathway, can be carefully laid out, we will be able to construct complex 3D structures (e.g. a ball in a cage) that are difficult to access by traditional thermal annealing. Further, my vision is that growing molecular structures can respond to their molecular environments and differentiate; they can talk to each other, cooperate, and coevolve; their growth can be regulated by molecular circuits or assisted by synthetic molecular motors; and computation and algorithms can be used to direct their growth. In short, they grow as biological systems do. Once mature, they can be static target structures (e.g. a sophisticated 3D shape) or, alternatively, they can function as continuous molecular machines. Three concrete intermediate goals are: growing arbitrary wire-frame structures, engineering differentiation, and embedding computation and algorithms in the growth process.

By interfacing with the larger molecular world (e.g. materials, biology), the new paradigm of developmental self-assembly promises numerous applications in nanotechnology, materials science, and biomedical sciences. For example, the dynamic process of developmental molecular growth could enable the construction of novel molecular instrumentation and therapeutic devices for biomedical applications, as described in Aim 2. In addition, thanks to the rich attachment chemistry of nucleic acids, the novel self-assembled DNA structures can serve as scaffolds for organizing functional molecular entities with nanometer precision, e.g. proteins, gold nanoparticles, quantum dots, and carbon nanotubes. Finally, as DNA duplexes and tubes can be metalized into nanowires, novel molecular electronic components and circuits could be fabricated through controlled metalization of the assembled DNA structures. The paradigm of developmental self-assembly will open new horizons in synthetic molecular self-assembly, with profound technological implications.

Aim 2: Engineering molecular devices to probe and program biology. Life at its finest scale can be viewed as dynamic self-assembling molecular systems. A crucial goal in my vision for information directed molecular technology is to interface the synthetic nucleic acid devices with biological molecular systems. A molecular device can take an input from a human user (e.g. in the form of light, small molecules) or from the biological world (e.g. detection of a target mRNA, a target protein), process the input, and produce an observable readout to a human user (e.g. a fluorescence signal) or actuate a biological response (e.g. trigger an antisense response). By probing and directing the dynamical molecular behaviors in biological systems, these devices promise powerful experimental and therapeutic tools that could have transformative impact on synthetic biology, systems biology, developmental biology, clinical biology, and structural biology research.

A representative device is a conditional gene silencer: if gene A is detected, silence an independent gene B via triggered RNA interference. This simple device could open numerous doors for biological research and for therapeutic applications. For example, in the case where gene B encodes a repressor for a green fluorescent protein, the function becomes: if gene A is detected, increase the green fluorescence signal, enabling fluorescent imaging of gene A. This technology has key conceptual advantages over present GFP technology, as detailed in the full proposal. In the case where gene A encodes a disease marker and gene B is a housekeeping gene, the function becomes: if a disease marker is detected, kill the cell, enabling powerful disease diagnosis and treatment with single-cell precision.
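The logic of the conditional silencer can be written down as a small Python sketch. This is only an abstraction of the input/output behavior described above, not a model of the underlying chemistry; the function names and the gene labels "A" and "B" are hypothetical and chosen for illustration.

```python
def conditional_silencer(expressed, trigger, target):
    """Return the set of genes still expressed after the device acts.

    If `trigger` is among the expressed genes, `target` is silenced;
    otherwise the expression state is left unchanged.
    """
    remaining = set(expressed)
    if trigger in remaining:
        remaining.discard(target)
    return remaining


def gfp_on(expressed):
    """Imaging use case (assumed): gene B represses GFP, so GFP is on only if B is absent."""
    return "B" not in expressed


# A cell expressing gene A: B is silenced, so the GFP repressor is gone.
cell_with_a = conditional_silencer({"A", "B"}, trigger="A", target="B")

# A cell lacking gene A: B stays expressed and keeps GFP repressed.
cell_without_a = conditional_silencer({"B"}, trigger="A", target="B")
```

Under these assumptions, `gfp_on(cell_with_a)` is true and `gfp_on(cell_without_a)` is false, which mirrors the imaging function "if gene A is detected, increase the green fluorescence signal".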

Another device is a triggered protein organizer, which implements the function: if light of a particular wavelength or a target mRNA is detected, arrange proteins of interest into specified spatial patterns. This technology could provide a powerful tool for human users to manipulate the spatial organization of proteins in real time using light, or allow the flexible logical linking of arbitrary gene expression to the organization of an independent set of proteins. The spatial organization of proteins could in turn have important biological consequences. For example, the proteins that comprise a signaling pathway are often organized by scaffold proteins, and modifying the spatial organization of the signaling proteins changes cellular behavior. In the case where the proteins are fluorescent proteins (FPs), the presence of a target mRNA could be detected either through increased fluorescence intensity from the aggregation of same-color FPs, or through emergent fluorescence correlation or Förster resonance energy transfer (FRET) resulting from the co-localization of distinct FPs.

Other examples include an in situ signal amplifier for gene expression imaging, a triggered geometrical barcode, and in vivo gene expression recorders.

Aim 3: Molecular foundation for molecular programming. The goal here is to transform molecular engineering into computer programming. This will be achieved by inventing novel functional molecular motifs, developing associated molecular programming languages, and constructing “molecular compilers” that can translate high-level programs into low-level implementations. Ultimately, I envision that we could design complex molecular systems just as we write computer programs, using similarly user-friendly high-level languages and eventually achieving similar information complexity.