Kevin Davies : Two weeks ago, I was making conversation with guests at the Burrill Personalized Medicine Meeting in San Francisco, when I casually asked Dietrich Stephan (co-founder of Navigenics) what he’d been up to lately.

Stephan replied that he was involved in a couple of exciting new ventures. One was a still unnamed company just getting off the ground in the genome interpretation space. The other was a “4th-generation” nanopore sequencing start-up called Genia. He asked if I’d like to meet the CEO, Stefan Roever, and the next day, we were holding an interesting conversation in the hotel bar about Genia’s platform and prospects. The full story can be found over at bio-itworld.com today.

Roever says Genia can be considered a 4th-generation platform as follows: “If Ion Torrent (electrical detection but requiring amplification) and Pacific Biosciences (single-molecule but optical) are 3rd-generation [sequencing technologies], then we're 4th-generation (single molecule, electrical detection). That's the holy grail, because it combines low-cost instruments with simple sample prep. So we'd like to think of it as last-gen!”

Roever has enjoyed considerable success in his entrepreneurial career, but has never taken on something like Genia. The touted hallmarks of the platform include high parallelization, a base/second read out, and control over DNA traversing the nanopores. These are still very early days, however, and it will likely be a year or two before we see much in the way of real sequence data, let alone a working prototype instrument.

That hasn’t stopped Life Technologies from placing a sizeable bet (Stephan says “double digit” millions of dollars) on Genia, as the sequencing giants jockey for the next big (or little) thing in next-gen sequencing (NGS).

Speaking at the Burrill meeting in early October, Life Technologies CEO Greg Lucier was asked about the status of single-molecule sequencing technologies such as the “StarLite” program (brought about by the union of acquired technologies from Quantum Dot and VisiGen), which was presented at a couple of high-profile conferences in 2010. Lucier replied that single-molecule sequencing was still very active in the company, but if used, it would likely be on a derivative of the Ion Torrent chip.

Whether Genia will be a major player in the 4th-generation of NGS platforms will be a fascinating question in the years ahead. Still, the arrival of a new player is to be welcomed. As far as new NGS platforms go, the field has been in a bit of a lull lately, waiting impatiently for the long-anticipated launch of new platforms from the likes of Oxford Nanopore and NABsys.

At CHI’s NGx conference in Providence, RI, last month, there was a pair of intriguing presentations from GnuBio and another nanopore technology, NobleGen. Another company, Intelligent Bio-Systems, has also been demoing its Max-Seq Genome Sequencer instrument (distributed by Azco Biotech).

Keith Robison :(Part 2 of 2) The following joint session was on next-next-generation sequencing, the possible technologies of the future. Two of the talks were intriguing but described very distant futures: single-molecule reversible terminator sequencing with optical tweezers and an optical variant on nanopore sequencing. The talk I would have loved to yank into my session was a natural follow-on, with John Healy of GnuBIO describing their progress on a platform they hope to launch in the first half of next year.

In my talk I described a process of PCR amplifying samples in a thermocycler and then shipping them off to a vendor to run on the PGM. Barcoding was a natural inclination, given a sometime need to analyze multiple samples and amortize the run cost over those many samples. Gnu’s vision compresses this and simplifies it, as a single instrument both performs the amplifications and analyzes them, returning to the user a list of differences rather than data which must be processed further. All from a box slated to cost $50K and with consumables in the $50-$100 per sample range with sample-to-data times projected at 3 hours.

Gnu’s approach is based on picodroplet technology and continues the trend of pushing most of the guts of the technology into the disposable cartridge. Microfluidic channels in the cartridge convert samples to large droplets which are merged with droplets bearing primer pairs; imagine one of those droplet-generating desk toys shrunk a few zillion-fold. A serpentine channel moves each such amplification droplet through hot and cold zones to perform PCR. Each droplet is then split further into smaller droplets, which are in turn merged with detection droplets. Each detection droplet contains a specific hexameric detection probe, and is also color-coded. After performing a primer extension reaction, the color codes and signal are read. All of the microfluidics are in the disposable card, with new applications requiring new card designs.

The detection chemistry is a simple variant on sequencing-by-hybridization (SBH). Hybridized hexameric probes are extended, resulting in the activation of a dye molecule. GNU is aiming this towards the resequencing market, which greatly simplifies SBH. By examining both the positive and negative signals and comparing against the known sequences of the amplicons, variants can be detected as unusual signatures. This is very much easier than attempting de novo SBH, which must reconstruct the sequence from the signals. Furthermore, the system should be robust to a certain degree of false positive and false negative probes; the complete ensemble of probe signals can be used to weight various models of what sequence the amplicon contains.
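To make the resequencing idea concrete, here is a minimal sketch of comparing expected and observed probe signals. It assumes a single-stranded, error-free model in which a probe lights up whenever its hexamer occurs in the amplicon; the sequence and the function names are invented for illustration and are not GnuBIO's software.

```python
K = 6  # probe length (hexamers), as described in the text


def kmer_set(seq, k=K):
    """Return the set of k-mers present in seq (presence only, no counts)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def signature_diff(reference, sample, k=K):
    """Compare the probes expected from the reference with those seen in a sample.

    Returns (lost, gained): probes expected positive but absent, and probes
    expected negative but present.
    """
    expected, observed = kmer_set(reference, k), kmer_set(sample, k)
    return expected - observed, observed - expected


# Made-up amplicon and a single-base substitution (G -> A at index 9).
reference = "ACGTTGCATGGACCTAGT"
variant = reference[:9] + "A" + reference[10:]

lost, gained = signature_diff(reference, variant)
print("probes lost:  ", sorted(lost))
print("probes gained:", sorted(gained))
```

In this toy case a single substitution flips up to six probes off and six on, which is the kind of redundancy that should let the full ensemble of signals tolerate a few false positives or negatives.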

Note also that the signals are purely binary; presence of two copies of a hexamer does not lead to a different signal from an amplicon containing only one. The “read length” of the system is currently approaching a kilobase, but it is important to note the differences from reads on other platforms. SBH cannot read any repeat longer than the probe length (6), and it is easy to construct variant scenarios that cannot be distinguished. Read lengths here are really amplicon lengths, with the limitation being the complexity of the probe library required. Healy proposed another four-fold increase in amplicon size, achieved by lengthening the probes to heptamers and using four cards, each with a different quarter of the library, to sequence these. But, while these are important caveats, they simply rule out a small number of genotyping or mutation detection applications. Also, in the cancer space one generally works with FFPE DNA, which is inherently fragmented to about 450 base pairs; GNU is already reading much longer than this.
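Two of these points are easy to check with a few lines of code. The sketch below counts all possible probes of a given length (whether the real cards carry the complete library is an assumption on my part) and shows two made-up amplicons, differing only in repeat copy number, that give identical binary signals.

```python
def library_size(k):
    """Number of distinct DNA k-mers, i.e. the size of a complete probe library."""
    return 4 ** k


def kmer_set(seq, k=6):
    """Binary probe signal: the set of k-mers present in seq, ignoring counts."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


hexamers = library_size(6)       # complete hexamer library
heptamers = library_size(7)      # complete heptamer library, four times larger
probes_per_card = heptamers // 4  # a quarter of the heptamers on each of four cards

# Invented sequences: same flanks, four versus six copies of a CA repeat.
short_repeat = "GGT" + "CA" * 4 + "TGG"
long_repeat = "GGT" + "CA" * 6 + "TGG"

print(hexamers, heptamers, probes_per_card)
print(kmer_set(short_repeat) == kmer_set(long_repeat))
```

Splitting the heptamers over four cards puts as many probes on each card as the full hexamer library, and the two repeat alleles are indistinguishable because every hexamer inside the longer repeat already occurs in the shorter one.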

A number of strengths from this approach are apparent. In my PGM experiments differential primer performance led to different sensitivities for different amplicons; in GNU’s approach each amplicon is analyzed in its own picodroplet, eliminating cross-talk. The proposed cost of the consumables pretty much eliminates the temptation to bar-code; why bother when the run cost is so small? However, on the negative side the current system can only detect minor alleles down to a frequency somewhere around 20-30%, about that of Sanger sequencing. This is because each original amplification droplet is expected to carry many copies of each amplification template. However, in response to my question on this topic, Healy commented that by diluting the sample so each droplet contains about half a genome, very high sensitivity should be achievable. It is also easy to imagine how the system could be used for other detection modalities, such as single-base extension or a probe ligation assay or just a probe-hybridization assay. The system would also naturally be able to slide into digital PCR. Could COLD-PCR be adapted to this platform? That would require the hinted common thermal profile across amplicons, but once this is accomplished it would appear to be a natural fit and another possible approach to detecting rare alleles.
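The dilution argument can be illustrated with a back-of-the-envelope calculation. This sketch assumes that "half a genome per droplet" means an average of 0.5 template copies per droplet and that loading follows a Poisson distribution, as is usually assumed for digital PCR; neither detail was stated in the talk.

```python
import math


def poisson_pmf(k, mean):
    """Probability that a droplet holds exactly k template copies."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


mean_copies = 0.5  # assumed average templates per droplet

p_empty = poisson_pmf(0, mean_copies)
p_single = poisson_pmf(1, mean_copies)
p_multiple = 1.0 - p_empty - p_single

# Among droplets that contain any template, the fraction holding exactly one.
single_given_occupied = p_single / (1.0 - p_empty)

print(f"empty: {p_empty:.3f}  single: {p_single:.3f}  multiple: {p_multiple:.3f}")
print(f"single template among occupied droplets: {single_given_occupied:.3f}")
```

Under these assumptions roughly three quarters of the occupied droplets start from a single template, so a mutant molecule is read in a droplet of its own rather than being averaged against many wild-type copies.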

Whether any or all of this will come to pass will depend on successful launch of their platform next year. Until then and probably continuing afterwards, many will be performing experiments similar to mine to develop the next generation of cancer mutation assays, as well as using rapid platforms such as PGM and MiSeq to follow-up on discoveries from germline and somatic exome or whole-genome sequencing.

Keith Robison :(Part 1 of 2) I had an opportunity recently to chair a session at CHI’s Applying Next-Generation Sequencing meeting in Providence RI which was a nice microcosm of cancer genomics, covering in three talks one important corner of that vast space. I’ll also steal a bit from a talk in the joint session with Next-Generation Data Management which followed, as if I had been an omniscient and all-powerful chair, I would have yanked it into my session.

Giulia Fabbri of Columbia University led off by describing a successful scan of chronic lymphocytic leukemia (CLL) patient samples to identify recurrent causative mutations. Using exome sequencing on five CLL samples, a set of 48 candidate mutations was identified and then validated on a larger panel. The big fish netted by this campaign are activating mutations in NOTCH1. The most common type of NOTCH1 mutation seen is interesting: these are frameshifts near the C-terminus, which is the location of a destabilizing domain that targets NOTCH1 protein to the proteasome. Hence, the mutations remove a negative influence, enabling a higher degree of oncogenic signaling. Other NOTCH1 mutations seen in CLL apparently have a similar effect, as Fabbri reported that so far there seems to be no difference in the clinical course of the truncating mutations versus other mutations. Any NOTCH1 mutation appears to be a serious negative prognostic for the patient. One striking bit of luck is that the frequency of NOTCH1 mutations in CLL is about 1%, which means a set of 5 samples could well have not included this mutation. Further studies will use larger cohorts to scan for other mutations, as well as cohorts focused on more aggressive forms of the disease.
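The "bit of luck" can be put into numbers. This is a simple sketch that treats the samples as independent draws from a population with the mutation frequency quoted above; the function names are mine.

```python
def prob_none_carry(freq, n_samples):
    """Chance that none of n independent samples carries the mutation."""
    return (1.0 - freq) ** n_samples


def prob_at_least_one(freq, n_samples):
    """Chance that at least one of n independent samples carries the mutation."""
    return 1.0 - prob_none_carry(freq, n_samples)


freq = 0.01    # mutation frequency as quoted in the text
n_samples = 5  # size of the discovery set

print(f"no carrier in the set:       {prob_none_carry(freq, n_samples):.3f}")
print(f"at least one carrier in set: {prob_at_least_one(freq, n_samples):.3f}")
```

The same two functions make it easy to see how quickly the odds improve as the discovery cohort grows.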

Mike Makrigiorgos of the Dana-Farber followed with a discussion of his COLD-PCR technique and its extension to second-generation sample preparation. COLD-PCR is a twist on standard PCR in which each cycle contains two extra steps which keep perfectly matched sequences double-stranded and encourage mutation-bearing fragments to be single-stranded. This is much like the reverse of consensus methods in synthetic biology for reducing errors; whereas in gene synthesis the common, correct matches are desired, in hunting rare mutations from clinical cancer samples there is a desire to prejudice the amplification against the wild-type. Makrigiorgos showed data on the Illumina platform in which samples with a minor allele frequency of a few percent were buried in noise with standard PCR, but became very strong signals with COLD-PCR sample preparation. Many of the questions after the talk were aimed at better understanding the degree to which each amplicon must have a custom COLD-PCR protocol; the early experiments benefitted from a thermocycler capable of running each well on a different program. But Makrigiorgos gave brief sketches of extending COLD-PCR to emulsion PCR and hinted at changes which will allow each sample to be run in a common thermocycling scheme.

The final talk of the session was given by me describing my work recently at Infinity Pharmaceuticals on using PCR and the Ion Torrent PGM platform to search for mutations. I described some of the successes and challenges I’ve discovered, some of which will be familiar to readers of my Omics! Omics! blog. An attraction of the platform is the ability to quickly design and validate assays. We’ve had success detecting mutations spiked into samples at less than 1% frequency, can detect many alleles simultaneously and generally observed good linearity between the predicted number of sequencing reads and the observed number of reads. The sensitivity we’ve observed on PGM is much better than Makrigiorgos was reporting for Illumina, though other groups have achieved very high sensitivity on the Illumina as well with standard PCR. On the other hand, we’ve also had great variability in the number of reads coming off the instrument and sometimes the reads decay badly before they get to the region bearing the interesting alleles. Attempts to barcode samples were successful, but different barcodes sometimes amplified at wildly different frequencies and a first attempt to correct this did not yield much improvement. Primer designs for the same amplicon also behaved wildly differently, suggesting that empirical validation will be necessary. We also observed some unexpected KRAS alleles in a mixed cell line experiment; some of these appear to be noise that may result in different sensitivities for different alleles, whereas another unexpected allele appears to be the result of an unusual compound heterozygote in KRAS.

I also displayed one case of trying to find a 5-bp deletion with the PGM which was complicated by the homopolymer-calling errors on that platform. While the most commonly called allele was the correct one, several other deletion alleles were called in the neighborhood which could be rationalized as results of homopolymer miscalls. Also, inconsistency in the alignment of some nucleotides contributed to confusion in this case. For example, the COSMIC-annotated deletion and the most common call were not identical but, given the surrounding sequence, were equivalent. This points to the need for better indel-calling software which can collapse such results. One general advantage of the amplicon approach is it is not wedded to a platform; once MiSeq is available it should be straightforward to convert these amplicons to ones suitable for that instrument.
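One common way to collapse equivalent deletion calls is to shift each one as far left as the sequence allows before comparing them. The sketch below shows the idea on a made-up reference with a run of T's; it is not the KRAS case from my talk, and the function names are hypothetical.

```python
def apply_deletion(reference, start, length):
    """Return the sequence that results from deleting reference[start:start+length]."""
    return reference[:start] + reference[start + length:]


def left_align_deletion(reference, start, length):
    """Shift a deletion leftward while the resulting sequence stays the same.

    Moving the deleted window one base to the left leaves the outcome unchanged
    exactly when the base entering the window equals the base leaving it.
    """
    while start > 0 and reference[start - 1] == reference[start + length - 1]:
        start -= 1
    return start, length


reference = "ACGTTTTTGCA"  # invented sequence with a five-base T homopolymer

# Two differently placed 2-bp deletions inside the T run (0-based start, length).
call_a = (5, 2)
call_b = (3, 2)

print(apply_deletion(reference, *call_a) == apply_deletion(reference, *call_b))
print(left_align_deletion(reference, *call_a), left_align_deletion(reference, *call_b))
```

Once both calls are reduced to the same leftmost position, they can be counted as a single allele rather than as two competing ones.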

To summarize so far, Fabbri led us off with finding mutations by exome sequencing and then being faced with the challenge of scanning a larger sample cohort to assess frequency and relation to clinical characteristics. Performing that scanning, and later routine testing in the clinic, is most commonly via PCR. Makrigiorgos proposed COLD-PCR as an approach to enrich that signal, since tumor specimens are always heterogeneous. I described a more head-on assault using Ion Torrent, though COLD-PCR might be an interesting twist, except when quantitative results are required (any allele-biased enrichment, by definition, distorts the allele frequency).