I received an email a while back from a reader wondering why his friend has had to submit multiple saliva samples to personal genomics company 23andMe without getting a result back. Customers in a similar position may be reassured by a lengthy explanation posted yesterday on 23andMe’s blog about their sample processing protocol, penned by the company’s Director of Operations.

(Other potential customers may also be reassured to hear that this type of failure is apparently “quite rare”, although 23andMe haven’t responded yet to my queries regarding the frequency of sample failures; and that sample repeats are provided free of charge to customers.)

There are several points at which a saliva sample can fail to yield high-quality genetic data. Firstly, the saliva sample may have been compromised, either by the collection tube leaking in transit or by a failure of the preservative solution to mix with the saliva after collection. Secondly, the saliva may not contain enough useful DNA (a point I’ll return to below), or the DNA may be too degraded to use. Finally, there might be a problem with the genotyping process that converts DNA into 580,000 pieces of genetic information.

High-throughput genotyping has become so routine now (thanks to massive genome-wide association studies) that the systems for sample processing are well-developed and robust, meaning that genotyping failures are likely to be extremely rare. Situations where a customer has to submit multiple samples are therefore most likely to arise from problems with the sample itself.

If the issue is with poor sample submission (not providing enough saliva, or failure to mix the saliva with the preservative in the tube) then that’s easily fixed. But there’s a more interesting potential problem: 23andMe notes that “[s]ome people seem to have less DNA in their spit, though almost everyone has enough for the purposes of our analysis.”

It strikes me that variation in saliva DNA content could be driven by a number of different factors (for instance: variation in saliva production and composition, epithelial cell shedding, or oral microflora), all of which are likely to be determined to some extent by genetic factors. What a delightful irony it would be if there were genetic variants associated with an inability to undergo a genome scan…

(If anyone working in the field can provide good estimates of the variation in DNA content in saliva between individuals, please comment below.)

One final note while I’m on the topic of 23andMe: several people have noted that the turnaround time for the company has increased considerably in recent months, with customers now receiving emails advising of an 8-10 week wait rather than the original 4-6 weeks (e.g. here). I gather this is due to a substantial surge in demand following the appearance of the 23andMe co-founders on Oprah – so new customers are (temporarily) paying the price of 23andMe’s success.

Not for genome-wide association studies, at least in terms of creating false positives. Aggressive quality control gets rid of most poorly-performing SNPs (and poor quality DNA samples) at an early stage. Once associations are found, researchers manually inspect the raw data for the associated SNPs and pick up most obvious sources of genotyping bias (weird clustering artefacts, batch effects, etc.). Validation and independent replication in a separate cohort are then usually performed using a separate genotyping platform, with a different pattern of error.
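To make the early quality-control step concrete, here is a minimal sketch of a call-rate filter, one common form of that QC. It is an illustration only, not any particular study's pipeline: the function names, the genotype encoding (0, 1 or 2 copies of an allele, with `None` for a failed call) and the thresholds are all assumptions, and real studies typically apply further filters as well.

```python
# Hypothetical genotype matrix: rows are DNA samples, columns are SNPs.
# Each call is 0, 1 or 2, or None when the genotyping call failed.

def call_rate(calls):
    """Return the fraction of non-missing calls (0.0 for an empty list)."""
    if not calls:
        return 0.0
    return sum(c is not None for c in calls) / len(calls)


def filter_by_call_rate(genotypes, sample_min=0.95, snp_min=0.95):
    """Drop poor samples first, then poor SNPs among the remaining samples.

    The default thresholds are illustrative assumptions.
    Returns (kept_sample_indices, kept_snp_indices).
    """
    if not genotypes:
        return [], []
    kept_samples = [
        i for i, row in enumerate(genotypes) if call_rate(row) >= sample_min
    ]
    n_snps = len(genotypes[0])
    kept_snps = [
        j
        for j in range(n_snps)
        if call_rate([genotypes[i][j] for i in kept_samples]) >= snp_min
    ]
    return kept_samples, kept_snps


# Toy data: the last sample failed entirely, the last SNP never worked.
genotypes = [
    [0, 1, 2, None],
    [1, 1, None, None],
    [2, 0, 1, None],
    [None, None, None, None],
]
kept_samples, kept_snps = filter_by_call_rate(genotypes, sample_min=0.5, snp_min=0.9)
print(kept_samples, kept_snps)
```

Filtering samples before SNPs means a single failed DNA sample does not drag down the call rate of otherwise well-behaved SNPs.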

Given all that, I’d be pretty confident that essentially none of the results from recent large genome-wide association studies are false positives due to genotyping error. However, the very stringent approach taken to quality control presumably means that real signals are sometimes (often?) filtered out.

For candidate gene studies, which are still being published at an alarming rate, genotyping error is definitely still an issue – especially when the genotyping is being done by people who are not experienced in the field (e.g. many groups working in psychiatric genetics).

23andMe uses a company named DNA Genotek for their saliva sample collection kits (or at least they did when they first started). Here is a link to a white paper DNA Genotek published on DNA yield from their saliva kits: http://www.dnagenotek.com/pdf_files/PDWP001_DNAYield.pdf
In my experience, these same kits yielded anywhere from 0.1 µg to 200 µg. I found that DNA yield from an individual was extremely reproducible. The people in my lab whose yield was too low to genotype always gave yields too low to genotype, no matter what they did.

I am using the same DNA Genotek collection device that 23andMe uses.

The DG device (not to be confused with D&G) we’ve been using is the 1 ml spit collection version (maybe 23andMe is using the 2 ml version). The procedure makes it clear how to collect the maximum number of buccal cells. Not a source of error.

To this 1 ml of spit, 1 ml of lysis/preservation buffer is released when the tube is closed. This is easy to check, so not a source of error.

The device’s cap is well designed and closes the tube firmly. Not a source of error.

We have been using Agencourt GenFind or Agencourt DNAdvance for NA extractions. From the 2 ml (total volume in the device) we regularly collect 15 to 100 µg of DNA in total (with 3 × 600 µl DNA extractions). But the average is closer to 30 µg with this 1 ml collection device and Agencourt GenFind.

The DNAdvance option allows us to collect more concentrated DNA, at 300 µg/ml on average.
Very convenient in various situations.

I can confirm that, in general, yield is very reproducible for a given individual.

My guess is that 23andMe had a failure in their genotyping procedures. Or, wait, maybe a PCR contamination in the whole lab? lol