Determining sequence length or content in zero, one, and two dimensions

Abstract: High-throughput assays are essential for the practical application of mutation detection in medicine and research. Such assays should also produce informative, high-quality data with a low error rate at low cost. Unfortunately, this is not currently the case. Instead, we typically see legions of people reviewing imperfect data at astronomical expense, yielding uncertain results. To address this problem, we have spent the past decade developing methods that exploit the inherently quantitative nature of DNA experiments. When the data are of high quality, careful quantification of the DNA signal permits robust analysis that determines the true alleles together with measures of certainty. We will explore several assays and methods. In a one-dimensional readout, short tandem repeat (STR) data display interesting artifacts. Even with high-quality data, PCR artifacts such as stutter and relative amplification can confound correct or automated scoring. With appropriate mathematical analysis, however, these artifacts can be essentially removed from the data. The result is fully automated data scoring, quality assessment, and new types of DNA analysis. These approaches enable the accurate analysis of pooled DNA samples for both genetic and forensic applications. On a two-dimensional surface (composed of zero-dimensional spots), one can perform assays of extremely high throughput at low cost. The question is how to determine DNA sequence length or content from nonelectrophoretic intensity data. Here again, mathematical analysis of highly quantitative data provides a solution. We will discuss new lab assays that can produce data containing such information; a mathematical transformation of those data then determines DNA length or content.
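The abstract does not specify the mathematical analysis used to remove stutter, so the following is only an illustrative sketch under simplifying assumptions, not the authors' method. It assumes that each allele loses a known, constant fraction of its signal to a single stutter product exactly one repeat unit shorter, and that peak heights are indexed by repeat number. The function names and the stutter fraction of 0.1 are hypothetical. Under that model the observed peaks are a linear mixture of the true allele amounts, which can be undone by back-substitution starting from the longest allele.

```python
def add_stutter(true_amounts, stutter_fraction):
    """Forward model (assumed): each allele keeps (1 - f) of its signal and
    sends f of it to the peak one repeat unit shorter."""
    observed = [0.0] * len(true_amounts)
    for i, amount in enumerate(true_amounts):
        observed[i] += (1.0 - stutter_fraction) * amount
        if i > 0:
            # Stutter product lands on the next-shorter repeat length.
            observed[i - 1] += stutter_fraction * amount
    return observed


def remove_stutter(observed, stutter_fraction):
    """Invert the assumed model by back-substitution from the longest allele.
    Negative estimates (from noise or rounding) are clipped to zero."""
    estimate = [0.0] * len(observed)
    longer = 0.0  # estimated amount of the allele one repeat longer
    for i in range(len(observed) - 1, -1, -1):
        value = (observed[i] - stutter_fraction * longer) / (1.0 - stutter_fraction)
        estimate[i] = max(value, 0.0)
        longer = estimate[i]
    return estimate


if __name__ == "__main__":
    # Hypothetical two-allele sample: amounts indexed by repeat number.
    true_amounts = [0.0, 0.0, 100.0, 0.0, 50.0, 0.0]
    observed = add_stutter(true_amounts, 0.1)
    print("observed :", [round(v, 3) for v in observed])
    print("recovered:", [round(v, 3) for v in remove_stutter(observed, 0.1)])
```

Real STR data would need a stutter fraction estimated per marker (and typically per allele length), plus a noise model; with several contributors in a pooled sample, the same linear structure suggests a constrained least-squares fit rather than exact back-substitution.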