A computational biologist's personal views on new technologies & publications on genomics & proteomics and their impact on drug discovery

Wednesday, February 23, 2011

Has Ion Torrent Taken A 318-Sized Lead over MiSeq?

About a week ago, Ion Torrent's President Greg Fergus and Head of Marketing Manesh Jain were kind enough to engage me in nearly an hour of discussion about the Ion Torrent platform. One agreement prior to our discussion was that I would withhold this piece until their announcement today of the 318 chip for the system (I also volunteered to let them see a draft of this in advance to ensure I had not misrepresented anything).

A key theme on their side is a certain degree of feeling that the wrong questions are being asked in the analysis of PGM versus MiSeq -- and an eagerness to shift the discussion. They wished to emphasize a number of points, and after the discussion I can see the validity of many of these.
First is the sample prep question. MiSeq is generally viewed as faster and requiring less hands-on time in this department, due to the bridge PCR approach versus Ion Torrent's emPCR (as well as, for genomic fragment libraries, the Nextera transposons vs. conventional shear-and-ligate library construction). A clear message from the Ion Torrent team is that they are working intensely to cut down the current 6 hour time to more on the order of three hours. Some of this comes from simplifying certain steps (the emulsion breaking), and some from cutting down the number of PCR cycles. They also expressed plans to have an integrated prep device by year's end to cut down hands-on time. Furthermore, they pointed out that one scientist can process 6-8 samples off-line from the sequencer and in parallel as the sequencer is running; MiSeq can prep only one at a time and the bridge PCR is integrated with the sequencer, preventing off-line usage. The Ion Torrent team stated a goal of making it "irrelevant" to the user whether they were using bridge PCR (Illumina) or emPCR (PGM) in terms of hands-on time or difficulty.

Second, they wanted to contrast the advantage of working on a very new system ("just being born") versus one which may be approaching maturity. Given that they have just started trying to optimize the various parameters (read length, cycle time, accuracy, etc.), there is much to be done. In general, they are trying to tackle one major issue at a time, to avoid over-extending themselves. The current ~100 nucleotide read length requires a two hour run; with further adjustments they expect to have 200 basepair runs in the wild later this year but keeping the 2 hour time. This will be valuable as they push read lengths farther to perhaps 400 bases or beyond at 100 bases/hr.

The contrast they laid out, and I've pondered exploring before, is that Illumina may be reaching maturity. Given that the current paired end sequencing covers the entire insert, longer reads won't really add value unless the insert sizes are increased, which could have other ramifications (for example, longer inserts are reputed to give fatter, less intense clusters). Illumina has mentioned increasing the flowcell surface size, but that would seem to enable only a relatively modest increase and certainly not orders of magnitude. Packing clusters in tighter and/or ordering them would be another route, with significant challenges. I'm not claiming Illumina is done innovating on their platform, but it would seem that the slope of their improvement curve is unlikely to be very steep.

On the front of accuracy, Ion Torrent made the claim of 6X higher accuracy at the 100th base than Illumina. Overall, they state a raw accuracy of 99.5% (phred 23). On the homopolymer issue, they mentioned data presented at AGBT showing high consensus accuracy in reading runs of 8 to 9 bases. A more stringent test would be the detection of indels in such a run; I hope to run such an experiment in the not too distant future. GC bias is claimed to be minimal; this is ascribed to the use of native nucleotides (though some GC bias can enter through PCR; something to explore is whether emPCR is less subject to this than cluster generation by bridge PCR).
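For readers who want to check the "99.5% (phred 23)" equivalence, here is a minimal sketch of the standard Phred conversion, Q = -10 * log10(error rate). The function names are my own for illustration and are not from any Ion Torrent or Illumina software.

```python
import math


def phred_from_accuracy(accuracy):
    """Convert a per-base accuracy (e.g. 0.995) to a Phred quality score."""
    error_rate = 1.0 - accuracy
    return -10.0 * math.log10(error_rate)


def accuracy_from_phred(q):
    """Convert a Phred quality score back to a per-base accuracy."""
    return 1.0 - 10 ** (-q / 10.0)


# The raw accuracy quoted in the post.
print(round(phred_from_accuracy(0.995), 1))  # 23.0
```

The same function can be used to translate any other quoted accuracy figure into the Phred scale for a like-for-like comparison.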

I also got some better clarity on how one might build out a larger facility of Ion Torrents. You need one $16.5K Ion Server (Linux box, with lots of storage) for every three $50K PGMs. The current emulsion prep device is quite inexpensive ($1K) and isn't used very long per sample; one of these could handle a small army of PGMs. So the entry price for a PGM is about $68K, which requires saving a few more pennies than the oft-touted $50K price but is quite a bit under the $125K for MiSeq.
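The build-out arithmetic can be sketched in a few lines. This is only an illustration using the prices quoted above; the assumption that a single emulsion prep device is counted in every install is mine, chosen because it reproduces the roughly $68K entry figure.

```python
import math

# Prices as quoted in the post (US dollars).
PGM_PRICE = 50_000
SERVER_PRICE = 16_500
PGMS_PER_SERVER = 3
EMULSION_DEVICE_PRICE = 1_000


def install_cost(n_pgms, n_emulsion_devices=1):
    """Rough hardware cost for a facility with n_pgms sequencers.

    Assumes one Ion Server per three PGMs (rounded up) and, by default,
    a single shared emulsion prep device (an assumption for this sketch).
    """
    servers = math.ceil(n_pgms / PGMS_PER_SERVER)
    return (n_pgms * PGM_PRICE
            + servers * SERVER_PRICE
            + n_emulsion_devices * EMULSION_DEVICE_PRICE)


for n in (1, 2, 3):
    print(n, install_cost(n))
```

With these inputs a single PGM comes to $67,500, and the per-instrument cost falls as the shared server is spread over two or three machines.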

One other frustration I've had is finding important details for experiment design, in particular the sequences one would design into fusion primers. They apologized for this and stated it was due to still changing sequences on the beads; all of the necessary protocols would be in an online Ion Community in a matter of weeks. I also brought up the constellation of third-party library prep and enrichment tools customized for Illumina; they are working with many partners to bring those to the platform.

Of course, their big excitement was around the new 318 chip, which will generate about 1Gbase using 4-8 million reads of length 100-200. This would, of course, directly challenge the output of the MiSeq, but in a long work day from input DNA to data (with current sample prep) or perhaps by then just a long day. It's a bit challenging to make an apples to apples comparison, but to get 1Gb with 2x150 bases on MiSeq is projected at 27 hours.

Let's see the ramifications of this. Suppose a PGM with a 318 is put in a race against a MiSeq. The staff works a strict 8 hour day, the project is sequencing PCR amplicons with appropriate fusion primers, and we make the pessimistic assumption that Ion Torrent's sample prep is still 6 hours.

On Monday, both racers start. The Ion Torrent prep takes most of the day whereas the sample is loaded on the MiSeq and runs. However, Ion Torrent loads at the end of day for an unattended run into the night.

On Tuesday, the MiSeq finishes its first sample around lunchtime. A second sample is loaded onto MiSeq which will run until late afternoon on Wednesday. In the eight hours in between, the Ion Torrent entry bumps off 4 more samples plus loads another to run into the night.

On Wednesday, Ion Torrent finishes the batch before lunch with two morning runs. MiSeq accepts sample #3 in the late afternoon -- but it will finish after closing time on Thursday. Even if someone stays late for 5 minutes of loading, sample #4 would run until late in the evening on Friday. If there is nobody coming in on the weekend or Friday nights, then Mon-Tue-Wed-Thu of the next week will be needed to finish the remaining samples.
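The MiSeq side of this race can be sketched as a small scheduling simulation. This is a simplified illustration, not a reproduction of the day-by-day narrative: it assumes a 09:00-17:00 Monday-to-Friday day (my choice of hours), that runs can only be loaded while staff are present, that loading takes no time, and that each run takes the projected 27 hours. Times are counted in hours since Monday 00:00.

```python
RUN_HOURS = 27               # projected MiSeq time for 1 Gb at 2x150 (from the post)
DAY_START, DAY_END = 9, 17   # assumed staffed hours, 09:00-17:00
WORKDAYS_PER_WEEK = 5        # Monday to Friday


def next_staffed_hour(t):
    """Earliest hour >= t (hours since Monday 00:00) when staff are present."""
    while True:
        day, hour = divmod(t, 24)
        if day % 7 < WORKDAYS_PER_WEEK and DAY_START <= hour < DAY_END:
            return t
        if hour < DAY_START:
            t = day * 24 + DAY_START        # wait for opening time today
        else:
            t = (day + 1) * 24 + DAY_START  # wait for opening time tomorrow


def miseq_finish_hours(n_samples):
    """Finish time of each back-to-back run, loading only during staffed hours."""
    finishes = []
    t = DAY_START  # first load on Monday at opening time
    for _ in range(n_samples):
        start = next_staffed_hour(t)
        t = start + RUN_HOURS
        finishes.append(t)
    return finishes


def label(t):
    """Human-readable week/day/hour for an hour count since Monday 00:00."""
    day, hour = divmod(t, 24)
    names = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
    return f"week {day // 7 + 1} {names[day % 7]} {hour:02d}:00"


for i, t in enumerate(miseq_finish_hours(8), start=1):
    print(i, label(t))
```

Under these strict assumptions the third run ends at 18:00 on Thursday, after closing, so the instrument sits idle until Friday morning; that idle gap is the bottleneck described above. Allowing a late-evening load, as the narrative does, shortens the timeline somewhat.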

Now, I've deliberately set up a particular comparison and due to instrument differences, they aren't completely comparable. MiSeq will have generated 2x150 runs rather than 1x100 -- though again, some of that sequence overlaps, which generates greater confidence but lower sampling. Perhaps some read length could be sacrificed to achieve 24 hour run times, fitting the very rigid employee schedule. For some applications, a race of Illumina in 3-hour 1x35 mode might be appropriate (but this time giving the data edge to Ion Torrent). On the flip side, an intensely-anxious graduate student could pull a PGM all-nighter and sprint through a bunch of samples. Also, the announced cost of a MiSeq is somewhat higher than a 2 PGM full install ($118K); with two PGMs, the Ion Torrent team would finish before tea time on Tuesday.

But it does illustrate what the Ion Torrent folks tried to highlight: PGM is potentially better suited to marching through samples in a given time, though with more hands on time. Also, if you were going for high-throughput then the same tech could presumably be prepping another batch of eight samples on Tuesday to keep the PGM humming. I can start picturing how a quite small lab with 2-3 techs and 1-3 PGMs could achieve a steady-state of about 4 Gbp/day -- with perhaps 8-16 Gbp possible with shorter cycle times and longer read lengths. HiSeq 2000 is spec-ed currently at 25Gb/day (clearly with much less hands on activity), giving some idea of the possible throughput. I've been scribbling scenarios right-and-left (none of which are quite apples-to-apples comparisons), which I'll try to surface later this week.

What would the 318 chip be good for? At 1Gb, it's probably just a little short for the typical 50Mb human exome kit -- though that could be either covered in two runs or by the read length going in the 200-300 range (yielding 2X-3X the data I've been penciling in). But, for a more focused custom design it could work well. Also, that would be a yeast-sized genome at 80+X. With around 10M reads, it's getting into the neighborhood needed for a lot of counting applications, such as ChIP-Seq or RNA-Seq. So it won't be ideal for every experiment, but will be a plausible option for many.
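The coverage figures above are simple division, sketched here. The yeast genome size of about 12.1 Mb is my assumption for the illustration (the post only says "yeast-sized"), and the calculation ignores on-target rate, duplicates and unmapped reads, so real coverage would be lower.

```python
def fold_coverage(yield_bases, target_bases):
    """Idealized average fold coverage: total sequenced bases / target size."""
    return yield_bases / target_bases


CHIP_318_YIELD = 1e9    # about 1 Gb per run, as announced
EXOME_TARGET = 50e6     # typical 50 Mb human exome kit (from the post)
YEAST_GENOME = 12.1e6   # assumed genome size, not stated in the post

print(fold_coverage(CHIP_318_YIELD, EXOME_TARGET))         # 20.0
print(round(fold_coverage(CHIP_318_YIELD, YEAST_GENOME)))  # 83
```

The 20X idealized figure for an exome, before any capture losses, is why one run looks a little short, and why two runs or 2X-3X more data from longer reads would close the gap.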

So, I guess I'm back to "ping" in my mental table tennis on these two platforms (But who knows? Perhaps someone from the competition will illuminate me as to why I should push my focus back to MiSeq). I think the MiSeq will be a valuable addition to the roster, particularly for shops very invested in Illumina technology, but it would appear that Ion Torrent is poised for enormous growth. MiSeq will also be favored for groups anticipating running smaller batches of sequences with far less hands on time. If Ion Torrent can launch a chip (presumably named 320) by end-of-year with 10Gb per run, then PGM really does start challenging HiSeq 2000 (though with much more labor), though HiSeq may go to about 1000Gb per 8 day run this year. But, a 100Gb chip in summer 2012 would actually approach one human genome per run (the promised 200-400bp reads by then would push it over). Ion Torrent also needs to drive hard to get all the subsidiary kits operating with PGM -- mate pairs, hybridization selection, etc.

Ion Torrent would definitely need a 320 chip to really go head-to-head with HiSeq 2000 on cost. But, with the 318 chip it will approach the weekly throughput (assuming 8-hour workdays) of the venerable GAIIx -- again, with GAIIx running unattended. I doubt many shops will plan to run their PGMs full tilt like that, but as a burst capability that is quite impressive. And for a service provider looking to provide inexpensive, rapid-turnaround sequencing, a fleet of PGMs could enable a very flexible toolset to handle projects.

In any case, the arrival of more PGMs in the field should start being reflected in more information on sites such as SEQAnswers. Plus, one of the winners of the European PGM giveaway is part of the blogging team at Pathogens: Genes and Genomics. And, I have been contacted by a service provider who is planning to launch a PGM service (atop their already successful second-gen service using another platform) about two months from now -- and I'm hoping to be very early in their queue (plus I think I've found a core lab that will run outside jobs). Looking forward to some regular real-world updates on the system.

12 comments:

Can't say that I am too excited about the 318 chip. Oh, the chip itself is fine. But announcing now (Feb) that the chip will be available in 7 months (Sep) and only then for "early access" seems to me to be grasping at straws in order to make the Ion Torrent seem to be useful and productive. So much can change in those 7+ months. Might as well announce the arrival of the iPad3 in September -- vapor is vapor.

I have to agree with Rick, a killer move would be to announce the chip well before it is anticipated and deliver it the following week. I still like the PGM approach, the issue however is that Life Tech has had a dreadful track record at execution (you may argue not so much with Solid but reading all the single molecule PR is truly scary). Also, multiple sample work and quick turn around is definitely an important need out there. A lot of behemoth centers might just get in trouble by massive approaches, yet the simple chemistry approach on the PGM doesn't give a lot of room for MID usage right now... So there are still challenges, I'm all for PGM because I think this is how next gen should truly be, it just needs to be more sophisticated and less corporate mumbo jumbo.

I am never surprised about the implication that Illumina has reached its peak and now some new technology is going to take over. As a researcher it comes across as a marketing tactic rather than a scientific truth. All one had to do was attend AGBT to see the real truth on the Ion Torrent and that was there wasn't much to see. Their claims are based on their internal data and not coming from the researchers.

The entry price (1 PGM, 1 server) in China is being quoted at 135K USD, almost twice the US price. Why? They won't explain. Also, it's not clear that their emPCR process is able to accept amplicons of over 100 bp length, or at least that is what their slides are saying. This obviously must be changing, since they are claiming longer read lengths, but my lab wants to use amplicons of 707bp length (we only need a few hundred to be read, but that's our amplicon). Will that be possible? Not clear at all.

The current 1.5M well chip gives <100K reads with <50% mapping. Why would a 15M well chip be expected to give >1M reads? Have they explained why the current chip performance hasn't increased, they just keep promising more and more wells. If 7% sequence and 3% have quality data, they need a lot more wells to compete with MiSeq and never with a HiSeq.

There are a lot of reasons amplicon size is limited, just think back to 454 without the amplification of signal with the enzyme cascade. How many large templates fit on a bead?

This PGM system will be superseded before it has a chance to gain significant market traction in exactly the same way the 454 system was pushed out by Solexa. Poor Jonathan, he will end up with egg on his face.

Your comparison of the daily workflow is a bit disingenuous. No matter what LIFE says, they will not get ePCR and enrichment under 6 hours (or if they do, their insert lengths will not be longer than ~100bp), nor will it be efficiently automated by a device whose cost is remotely compatible with being an accessory to a $50k PGM. (maybe 10 of them).

The comparison here also neglects to really appreciate the cost of having a skilled tech banging through ePCRs CONSTANTLY (and not cross contaminating, etc). LIFE has tried to pull the classic marketing tactic of convincing people that their greatest weakness is actually a strength. Assuming the library prep sides are equal (which again, they're not with Nextera but...), the ePCR vs. drop-library-into-MiSeq is a no-brainer. What else could your frazzled grad student be doing with their time?

Also, while I can only guess what it would be, ILMN is certainly not going to be content with "maturing".

Time will tell, I suppose, however given the track record of the ePCR workflow vs. cluster generation...my money is on ILMN.

I'm not convinced by the MiSeq. 'Proven technology': I can see a lot of potential issues by not using a laser, but LEDs. Looking at track records as was mentioned before. Look at 2010 and the HiSeq promises. Many labs are faced now with an expensive (late delivered) machine that doesn't perform as was promised, lots of issues with the flowcells, reproducible runs, reagent delivery, support, not to speak about the poor accuracy compared to other platforms. Even the GA, which is supposed to be the same 'proven' technology has better quality reads: not mentioning the HiSeq's poor data quality over 70bp. Doesn't look good for the 'new' MiSeq… At AGBT the Ion Torrent was seen as a great tool, presented by a number of early access labs. Good quality data, 3 to 4 fold higher output than the specified 10Mb with quite amazing accuracy at such an early stage. The best reason for me to go for the Ion Torrent (PGM) is the fact that it is available now, bearing in mind the delivery times on the HiSeq: Who knows when a (viable) MiSeq would be available? 2012? No contest for me here: Ion Torrent all the way!!!

I am sorry, but the data quality shown at AGBT wasn't that great and same for accuracy. I can certainly agree with you on Illumina but your claim as far as AGBT dataset is pure fiction... In fact I saw one slide that specifically showed a cell below signal specifications (understandable, early adopter). If it was so good, Life would have data files of any simple model organism. I guess I cannot condemn anon marketing input in a blog because it would simply decrease the flow of information, but we are scientists here, data doesn't lie... I am sure PGM will be great, just show us how. I am mature enough to understand why early data is not so great, IT was mostly a technology driven company with little chemistry, now Life will likely change that... Let's stick to facts.

When "IT's confidence they can bring these improvements to market" depends on their receiving $350M, everything they say without data to back it up is suspect. This is the biggest conflict of interest I have seen in the genomics field.

The media should be reporting on published data, not treating their crafted press releases as gospel truth. Whether they get the $350M or not, I bet the reports come back to reality when real versus press release data is shown. What ever happened to peer review?

"Over the past few years, Life Technologies (LIFE - Analyst Report) has been expanding its product portfolio through acquisitions; the latest being privately held Ion Torrent, a DNA sequencing company, for $375 million in cash and stock. In addition, Life is also liable to pay $350 million if certain milestones are met through 2012."

Ion Torrent is doing a good job of promoting and selling the trajectory....but where's the beef? Just because one lab is able to achieve something doesn't mean it should be on the Ion Torrent spec sheet. I think IT/Life is following the same path as Solid--they never did meet their specs, and continue to overpromise and underdeliver. Plus the IT is very much like a 454, but uses a pH meter rather than a light for detection. IT and 454 are both bead based technologies, and there are inherent problems with bead based technologies. Solid, too.

Wait and see...is the best approach for me. On the LEDs--in real time PCR--they outperformed laser technology consistently in accuracy and sensitivity and are easier to maintain- far better technology in my opinion. If I were to place a bet--I would bet Illumina.

Sanger got 1.6 Gb out of their very first run on their spanking new MiSeq, just after it was installed, on time and tested, successfully. So using the glitches HiSeq had at launch, knowing that nothing out there ever matched its throughput, is highly unfair.

Furthermore I can't see where LEDs could be a problem. On the contrary, being less aggressive on clusters one can hope the polymerase survives longer and yields reads longer than 150bp. Which would solve once and for all Solexa's "short read" issue.

About Me

Dr. Robison spent 10 years at Millennium Pharmaceuticals working with various genomics & proteomics technologies & working on multiple teams attempting to apply these throughout the drug discovery process. He spent 2 years at Codon Devices working on a variety of protein & metabolic engineering projects as well as monitoring a high-throughput gene synthesis facility. After a brief bit of consulting, he rejoined the cancer drug discovery field at Infinity Pharmaceuticals in May 2009. In September 2011 he joined Warp Drive Bio, a startup applying genomics to natural product drug discovery. Other recurring characters in this blog are his loyal Shih Tzu Amanda and his teenaged son alias TNG (The Next Generation).
Dr. Robison can be reached via his Gmail account, keith.e.robison@gmail.com
You can also follow him on Twitter as @OmicsOmicsBlog.