This is my first time posting here. In our lab, we mostly cluster at a library concentration of 10pM. However, we have a few libraries that we tried to cluster at 10pM, but they showed such high cluster densities in the first base reports that we had to abort the run. The concentrations were all fine, because we re-ran the libraries on the Bioanalyzer to make sure we had the right quantification.

My question: if the concentration at which to cluster a library also depends on the kind of library, what is it about a library that determines this concentration? In other words, if I have two libraries and I sequence both at 10pM, and one gives an optimal cluster density while the other is far too high, what could the difference be between the types/make of these two libraries?

The Bioanalyzer will count DNA that will not form clusters, but that results in overestimating the library concentration. That is an issue, as mentioned above, but in this case the library concentration was underestimated. Here there may be a significant amount of DNA running above the region where you quantitate your smear, coming from DNA molecules annealed to each other at the adapters.

Bottom line, library quantitation is more of an art than a science at this point. The Bioanalyzer, in my opinion, is not the best way to quantitate libraries. I would avoid Qubit as well, as significant amounts of your library may be ssDNA. Your best bet is qPCR, but even this sometimes needs to be scaled by a 'magic number' specific to the type of library.

We found that a combination of Qubit, Bioanalyser and knowledge about the sample type seems to work fairly well for us. Qubit should only really measure dsDNA with minimal ssDNA crosstalk. ssDNA may show in a DNA Bioanalyser trace but run faster.
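For anyone combining a Qubit reading with a Bioanalyzer size, here is a minimal sketch of the usual mass-to-molarity conversion. It assumes fully double-stranded DNA at roughly 660 g/mol per base pair, which is exactly the assumption that breaks down for over-amplified, partly single-stranded libraries. The function names and the example numbers are made up for illustration.

```python
# Commonly used approximation for the average mass of one dsDNA base pair (g/mol).
DSDNA_G_PER_MOL_PER_BP = 660.0


def library_molarity_nM(conc_ng_per_ul: float, mean_size_bp: float) -> float:
    """Convert a dsDNA mass concentration (e.g. from Qubit) and a mean
    fragment size (e.g. from a Bioanalyzer trace) to a molarity in nM.

    Assumes the whole library is double stranded; ssDNA or bubble
    products violate this assumption.
    """
    if conc_ng_per_ul < 0 or mean_size_bp <= 0:
        raise ValueError("concentration must be >= 0 and size must be > 0")
    # ng/uL -> g/L is a factor of 1e-3; mol/L -> nmol/L is a factor of 1e9.
    return conc_ng_per_ul / (DSDNA_G_PER_MOL_PER_BP * mean_size_bp) * 1e6


def dilution_factor(stock_nM: float, target_nM: float) -> float:
    """Fold dilution needed to bring a stock down to a target molarity."""
    if stock_nM <= 0 or target_nM <= 0:
        raise ValueError("molarities must be > 0")
    return stock_nM / target_nM


if __name__ == "__main__":
    # Hypothetical library: 2 ng/uL with a 400 bp mean size.
    stock = library_molarity_nM(2.0, 400)
    print(f"stock: {stock:.2f} nM, dilute {dilution_factor(stock, 2.0):.2f}x to reach 2 nM")
```

With these made-up inputs the stock works out to about 7.58 nM. The "knowledge about the sample type" part is not captured here; it would have to enter as a correction applied on top of this number.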

Were these TruSeq libraries? We always get better clustering from the TruSeq kits for some reason, so we've dropped the amount we load of all TruSeq assays by 2pM to compensate.

The problem with Qubit and Bioanalyzer is that the default for most people is to over-amplify their libraries. This results in DNA molecules that are hybridized at the adapters but single stranded in the middle. These molecules run aberrantly on the Bioanalyzer and will not be quantitated properly by Qubit because most of the insert section is single stranded. Qubit and Bioanalyzer will work as long as you are careful not to over-amplify your libraries.

I can't see how you could get this. Are you saying that you get primers annealing to the denatured libraries but not extending during PCR?

I would argue that more amplification actually gives you more accurate quantification, as the PCR itself is an enrichment for fragments with adapters at both ends. Unfortunately, more PCR leads to more bias and potentially to more duplicate sequences that have to be removed.

The Qubit part stands to reason, but I'm not sure about the bioanalyzer part. When we ran the denatured RNA size standard on a DNA chip our results were unexpected to me. Instead of migrating much more quickly than a double stranded molecule of the same length (but twice the molecular weight), they either ran at about the same speed (200, 500 and 1000 nucleotide ladder fragments ran at 236, 426 and 902 bp on a high sensitivity DNA chip) or much slower (2000 and 4000 nucleotide ladder fragments ran at 10 kb and 26 kb).

Since Illumina libraries tend to be 500 bp or less, I don't expect the single stranded region to have much effect on their migration. Modulo my misinterpreting the High Sensitivity DNA chip results. HESmith suggested that the ssRNA ladder was overloaded and this may have altered its migration. (You might take this as evidence that the "bubble product" hypothesis for the cause of double peaking is wrong. If "daisy-chaining" is the cause of the double peak, then the library need not be single stranded.)

Back to the Qubit. Double-strand-specific fluorimetry dyes would tend to drastically underestimate the titre of a heavily "bubble-product" library. Do people see that? We use qPCR now -- no fluorimetry. But this does sound like a major issue.

What happens when the PCR goes for too many cycles is the PCR primers become limiting or used up completely and the amplified DNA molecules bind to each other by their complementary adapter sequences. However, the insert sequences are not complementary so they remain single-stranded. Kinetics wins over thermodynamics here. The complexity of the inserts is too high for them to find their perfectly matched partner in our lifetime. This is what Phillip is talking about when he mentions 'daisy chains' and bubble shaped molecules.

Hi Tony,
Ethan is alluding to the "double peaking" phenomenon one frequently sees with TruSeq libraries. I think the physical cause of this phenomenon is open to interpretation. But the general consensus is that the apparently high molecular weight peak that appears with extra cycles of enrichment PCR is a "bubble product". That is, two disparate library molecules annealed at their adapters, but not along their middle -- where they share no sequence similarity. Ostensibly these form when the concentration of primers drops below a threshold where they compete effectively against product-product annealing events. As to why they migrate more slowly, well one might speculate that the ss region causes more "drag" during electrophoresis.

I favor the "daisy-chain" hypothesis. This would just be two double stranded molecules annealed at one (adapter) end. In a sense they really are twice the molecular weight.

The Qubit part stands to reason, but I'm not sure about the bioanalyzer part.

Since Illumina libraries tend to be 500 bp or less, I don't expect the single stranded region to have much effect on their migration.

Actually, on second thought, I have this wrong. Each single strand seems to migrate at the same molecular weight as a double stranded molecule. So two largely single stranded molecules annealed only at their termini might be expected to run at roughly 2x the molecular weight.

The libraries we used were Illumina libraries; we ran them again at 3 pM and they appeared fine. So, if the libraries were over-amplified, they would form the bubble structure, but their concentration still would not be underestimated - right?

That's a big drop, 10pM to 3pM. Do you know what the cluster density was on your 3pM run?
We have some TruSeq libraries that over clustered recently and it looks like we need to run them at a much lower concentration. Our FAS suggested to reduce concentration by at least 30%.
We've requantified again and it seems our Qubit and Bioanalyser traces were correct. We even double checked by qPCR against a library we already sequenced and results were similar. We're not sure why these libraries are over clustering.

Ya, that's how big a difference it was, and it has been quantitated multiple times with the same results. And over-clustering is exactly what we have been going through. Maybe we should do a qPCR to see the difference. But why would a qPCR show a higher concentration than what Bioanalyzer does? I mean, what DNA would Bioanalyzer not take into account?

What type of qPCR? If SYBR green, you can still run into adapter dimer issues. See this for an example of some libraries that show absolutely no adapter dimer on a high sensitivity DNA chip, but some when strand denatured and run on an RNA pico chip. The library portrayed in the link above was fine -- the adapter dimer amounts were low. But we have seen other libraries where the adapter dimers (or maybe they were primer dimers) were a substantial molar contributor to the library. As a result we overclustered them by quite a bit initially.

Since SYBR green qPCR relies on an average construct size to determine concentration, having even 10% (by mass) of your library as adapter dimer can really throw your clustering off.
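To put a rough number on the 10%-by-mass point, here is a small sketch. It assumes a simplified two-population library (adapter dimer plus full-length construct) and that moles scale as mass divided by length. The 120 bp dimer and 400 bp construct sizes are illustrative assumptions, not values from any particular library.

```python
def dimer_molar_fraction(mass_fraction_dimer: float,
                         dimer_bp: float,
                         library_bp: float) -> float:
    """Fraction of molecules that are adapter dimer, given the dimer's
    fraction of the total DNA mass. Moles are proportional to mass / length."""
    if not 0.0 <= mass_fraction_dimer <= 1.0:
        raise ValueError("mass fraction must be between 0 and 1")
    dimer_moles = mass_fraction_dimer / dimer_bp
    library_moles = (1.0 - mass_fraction_dimer) / library_bp
    return dimer_moles / (dimer_moles + library_moles)


def molarity_underestimate(mass_fraction_dimer: float,
                           dimer_bp: float,
                           library_bp: float) -> float:
    """Ratio of the true molarity to the molarity you would compute by
    assuming every molecule has the full construct length."""
    true_moles_per_mass = (mass_fraction_dimer / dimer_bp
                           + (1.0 - mass_fraction_dimer) / library_bp)
    assumed_moles_per_mass = 1.0 / library_bp
    return true_moles_per_mass / assumed_moles_per_mass


if __name__ == "__main__":
    # Assumed sizes: 120 bp adapter dimer, 400 bp average construct.
    frac = dimer_molar_fraction(0.10, 120, 400)
    ratio = molarity_underestimate(0.10, 120, 400)
    print(f"dimer is {frac:.0%} of molecules; true molarity is {ratio:.2f}x the estimate")
```

Under these assumptions, 10% by mass becomes roughly 27% of the molecules, and the real molar concentration is about 1.23x what the average-size calculation reports. The shorter the dimer relative to the construct, the worse this gets.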

Oh, also we noticed that ethidium bromide seemed to wreck picogreen fluorimetry entirely and appeared to throw off qPCR as well. We frequently do a Pippin prep size selection on libraries, so this is an issue for us.

Finally, you are aware that something about the TruSeq library method causes them to out "perform" other libraries as far as clusters/pmol of library added, since you mention this above. Possibly the extra magic comes from some modification to the enrichment PCR primers?

At first I thought we hadn't quantified our libraries correctly. I ran Kapa and Agilent qPCR kits on a set of samples and feel our quantification was correct. Fresh 2 nM dilutions made using the qPCR results still gave cluster densities that were too high.

What we think is happening is that through HiSeq software upgrades, more clusters are being called, but we are not being told to adjust our input concentration to still hit the target number of clusters. We can all discuss the initial quantitation methods, but even if the sample pM is correct there can be another factor (i.e. cluster calling). I think the cluster densities are becoming a moving target. I discussed with Illumina that pulling more data out of the flow cell is only helpful if it is quality data, not if we still should always hit less than ~800K/mm2.

Our interpretation could be totally off; I would be happy to hear others' experience. There is also the thought that some over-PCR'd RNA samples have a population of ssDNA that doesn't quantify correctly by qPCR (compared to a fully dsDNA ladder) but still seeds clusters. It could be a combination of factors, but I think the suggested pM load should be reduced.
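If you do end up lowering the load, one rough way to pick a starting point is to rescale from a previous run. This is only a sketch: it assumes cluster density scales linearly with loaded pM, which is an approximation (called density tends to flatten out or drop when a lane is badly overloaded), and the numbers are hypothetical.

```python
def rescaled_load_pM(previous_load_pM: float,
                     observed_density_k_per_mm2: float,
                     target_density_k_per_mm2: float) -> float:
    """Estimate a new loading concentration from a previous run.

    ASSUMPTION: density is proportional to loaded concentration.
    Treat the result as a starting point, not a guarantee.
    """
    if min(previous_load_pM, observed_density_k_per_mm2,
           target_density_k_per_mm2) <= 0:
        raise ValueError("all inputs must be > 0")
    return previous_load_pM * target_density_k_per_mm2 / observed_density_k_per_mm2


if __name__ == "__main__":
    # Hypothetical run: 10 pM gave 1000 K/mm2, and we want ~800 K/mm2.
    print(f"try loading about {rescaled_load_pM(10, 1000, 800):.1f} pM")
```

Because the linear assumption is weakest exactly when a run has over-clustered, a density from a run that clustered in range (such as the 3pM rerun mentioned earlier) would be a safer basis than one from an aborted run.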

It stands to reason and experience with other electrophoresis platforms. But this did not seem to be the case when we tried it. Almost looks like bioanalyzer chips are "tuned" to yield approximately the same length whether running single stranded or double stranded polynucleotides--at least at the smaller sizes. But it looks like somewhere above 1000 nucleotides the single stranded molecules run much, much more slowly (larger) than double stranded molecules.

But why would a qPCR show a higher concentration than what Bioanalyzer does? I mean, what DNA would Bioanalyzer not take into account?

Realtime PCR?
Short (adapter dimer or primer dimer) single stranded molecules that have annealed to a longer (library) template molecule. Even SYBR green qPCR may be confounded by this effect. See this for a case we ran into. Note that the bioanalyzer trace in the middle of the post is strand denatured library run on an RNA pico chip. Hence it was possible to visualize the, otherwise hidden, dimers.

We tested ssDNA (primer size, <100 nt) on the Bioanalyzer DNA assays, and it does run slower than dsDNA of the same size (by roughly 25-30%).
Also, the quantification for the ssDNA is off (may be around 50%).
So, for sure you are able to see ssDNA on the Bioanalyzer DNA assays.