Note: Heterozygotes are not considered in the present version, so the scripts are only suitable for haploid genomes (mtDNA, Y, and X in males).

Test data:

A test dataset (134 whole mitochondrial genome sequencing datasets, 3 artificially mixed samples) is publicly available from the European Nucleotide Sequence Read Archive (http://www.ebi.ac.uk/ena/) under accession number ERP000879.

More data (double-indexed, paired-end) will be released soon, together with the original project.

Version:

v0.7 (Sep 17 2012) The default value for -r is now 0 (deactivated), because we found that the distribution of reads across strands is not as random as is commonly assumed. A bug in filter_and_summary.pl has been fixed: in previous versions, a mismatch number greater than 9 was read as 0. Please note that when you run BWA in paired-end mode, the mismatch number can be much greater than the limit you set. The number of distinct reads for the major allele has to be greater than 10 on each strand.
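
As an illustration of reading a multi-digit mismatch number, here is a minimal Python sketch. It is not the code of filter_and_summary.pl (which is a Perl script), and it assumes the mismatch number is taken from the NM:i: tag of a SAM alignment line, which BWA writes as the edit distance. The function name mismatch_count is hypothetical.

```python
import re

# Matches the optional SAM tag NM:i:<value>; \d+ captures every digit,
# so values of 10 or more are read in full.
NM_TAG = re.compile(r"^NM:i:(\d+)$")


def mismatch_count(sam_line):
    """Return the NM value of one SAM alignment line, or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    for field in fields[11:]:  # optional tags start at column 12
        match = NM_TAG.match(field)
        if match:
            return int(match.group(1))
    return None
```

A read whose NM value exceeds the chosen limit can then be dropped explicitly, whatever limit was passed to the aligner.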

v0.6 (Jun 7 2012) A range can be given to filter_and_summary.pl (-a) in the format ChrX:Y-X, e.g., MT:100-1500; all subsequent analysis is then restricted to this region. A bug has been fixed in Dreep_poisson.pl and Dreep_fisher.pl: if there are no reads in one direction (normally at the two ends of the reference sequence), a quality score of "0" is now given instead of nothing.
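
To make the range format concrete, the following Python sketch parses a string such as MT:100-1500 and tests whether a position falls inside it. It is only an illustration, not the parsing code of filter_and_summary.pl. The function names parse_region and in_region are hypothetical, and treating the coordinates as 1-based and inclusive at both ends is an assumption.

```python
import re

# <chromosome>:<start>-<end>, e.g. MT:100-1500
REGION = re.compile(r"^([^:\s]+):(\d+)-(\d+)$")


def parse_region(text):
    """Split a region string into (chromosome, start, end)."""
    match = REGION.match(text)
    if match is None:
        raise ValueError("expected Chr:start-end, got %r" % text)
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError("start is greater than end in %r" % text)
    return chrom, start, end


def in_region(region, chrom, pos):
    """True if pos lies on the region's chromosome within [start, end]."""
    region_chrom, start, end = region
    return chrom == region_chrom and start <= pos <= end
```

For example, parse_region("MT:100-1500") gives ("MT", 100, 1500), and position 1501 on MT is then outside the region.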

v0.5 (May 30 2012) A Fisher exact test has been added to Dreep_result_filter.pl (-r) to quantify how strongly the minor allele counts and major allele counts differ in their distribution across the two strands.

v0.4 (May 7 2012) The two-tailed Fisher exact test has been replaced by a one-tailed Fisher exact test (taking the direction of the null hypothesis into account). The -b option has been added to output the original bias statistics rather than the Phred-scaled quality score. The Dreep script itself no longer outputs any LLM candidates (only log files are generated); please use Dreep_result_filter.pl to specify your own criteria for calling LLMs. For this update, we greatly appreciate the comments of the third reviewer at Genome Biology.
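
For readers unfamiliar with the two quantities mentioned here, the Python sketch below computes a one-tailed Fisher exact p-value for a 2x2 table of allele counts by strand and converts a p-value to a Phred-scaled score. It is not taken from the Dreep scripts. The table layout, the choice of tail (excess of minor-allele reads on the forward strand), the cap of 255, and the function names are all assumptions made for illustration. It requires Python 3.8 or later for math.comb.

```python
from math import comb, log10


def fisher_one_tailed(minor_fwd, major_fwd, minor_rev, major_rev):
    """Upper-tail p-value: chance of seeing at least minor_fwd minor-allele
    reads on the forward strand, given the row and column totals."""
    fwd_total = minor_fwd + major_fwd
    rev_total = minor_rev + major_rev
    minor_total = minor_fwd + minor_rev
    denominator = comb(fwd_total + rev_total, minor_total)
    numerator = 0
    # Sum hypergeometric terms from the observed count up to the maximum.
    for k in range(minor_fwd, min(fwd_total, minor_total) + 1):
        numerator += comb(fwd_total, k) * comb(rev_total, minor_total - k)
    return numerator / denominator


def phred(p_value, cap=255.0):
    """Phred scaling, -10 * log10(p), limited to [0, cap]."""
    if p_value <= 0.0:
        return cap
    return min(cap, max(0.0, -10.0 * log10(p_value)))
```

With 3 minor-allele reads on the forward strand only and 3 major-allele reads on the reverse strand only, fisher_one_tailed(3, 0, 0, 3) is 1/20 = 0.05, which corresponds to a Phred-scaled score of about 13.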

v0.3 (Apr 4 2012) Added the option -p in Dreep_result_filter_v2.pl to remove mutations adjacent to an indel (in either the major or the minor allele, within 10 bp in both directions). Thanks to the reviewer of our manuscript.

v0.2 The EMP method has a similar pipeline to the POISSON method, and the output files have the same format. The thresholds for standard output (on screen in Dreep_poisson.pl) were updated according to the results of the NUMTs project (MAF >= 0.02, DQS >= 10, frequency of other alleles, i.e. those other than the major and secondary alleles, <= 20%).
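
The three thresholds can be written as a single check, sketched here in Python. This is an illustration of the stated criteria only, not the code of Dreep_poisson.pl. The function and parameter names are hypothetical, and frequencies are assumed to be fractions between 0 and 1.

```python
def passes_screen_thresholds(maf, dqs, other_freq,
                             min_maf=0.02, min_dqs=10, max_other=0.20):
    """True if a site meets all three on-screen reporting thresholds.

    maf        -- minor (secondary) allele frequency
    dqs        -- DQS value reported for the site
    other_freq -- combined frequency of alleles other than the major
                  and secondary alleles
    """
    return maf >= min_maf and dqs >= min_dqs and other_freq <= max_other
```

A site with MAF 0.05, DQS 30 and other-allele frequency 0.01 passes, while the same site with MAF 0.01 does not.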