My questions are:
1. Is the above workflow reasonable/correct for what I'm trying to do?
2. Is there any difference running samples one pair at a time, or running them all together? (I have 57 pairs. Should I do 57 runs of normal-tumor pairs, or 1 run of all 57 pairs?)

Answers

"We (GATK docs team) are working on some docs for the somatic variant calling use case. In a nutshell, you'll need to do an additional pre-processing step called co-cleaning where you perform indel realignment on the tumor and normal in a pair together, use ContEst to estimate cross-sample contamination, use MuTect to call variants (not HC, which is not able to call low-AF variants like MuTect), do some manual filtering and processing to eliminate artifacts (VQSR is not appropriate for somatic calls) and finally annotate with Oncotator." -vdauwera

But I'm unsure what I can use to do the co-cleaning, and where the other steps go in my workflow. Does this mean I don't need to use Picard/Samtools, indel realignment, etc.? Do I need to use ONLY the workflow here?
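
For what it's worth, here is a rough sketch of what co-cleaning one tumor-normal pair might look like, assuming the GATK3-era RealignerTargetCreator and IndelRealigner tools are what perform the joint realignment described in the quote. It only builds the command lines (it does not run anything). The jar path, file names, the `Pair` type and the `cocleaning_commands` helper are all made up for illustration, and the flags are the GATK3 ones as I understand them, so check them against the documentation for your version. Known-indel resources and the later steps (ContEst, MuTect, filtering, Oncotator) are left out.

```python
import shlex
from typing import List, NamedTuple


class Pair(NamedTuple):
    """One tumor-normal pair (hypothetical naming scheme)."""
    name: str
    tumor_bam: str
    normal_bam: str


# Assumption: a GATK3 jar in the working directory.
GATK = ["java", "-jar", "GenomeAnalysisTK.jar"]


def cocleaning_commands(pair: Pair, reference: str) -> List[List[str]]:
    """Build the two realignment commands for a single pair.

    Both BAMs are passed to each tool so that the tumor and normal
    are realigned together, as described in the quoted answer.
    """
    intervals = f"{pair.name}.realign_targets.intervals"
    inputs = ["-I", pair.tumor_bam, "-I", pair.normal_bam]

    # Step 1: find intervals that need realignment, using both BAMs.
    create_targets = (
        GATK + ["-T", "RealignerTargetCreator", "-R", reference]
        + inputs + ["-o", intervals]
    )
    # Step 2: realign both BAMs over those intervals; -nWayOut asks
    # for one output BAM per input, named with the given suffix.
    realign = (
        GATK + ["-T", "IndelRealigner", "-R", reference]
        + inputs
        + ["-targetIntervals", intervals, "-nWayOut", ".realigned.bam"]
    )
    return [create_targets, realign]


if __name__ == "__main__":
    # Hypothetical pairs; each pair gets its own co-cleaning run.
    pairs = [
        Pair("p01", "p01_tumor.bam", "p01_normal.bam"),
        Pair("p02", "p02_tumor.bam", "p02_normal.bam"),
    ]
    for pair in pairs:
        for cmd in cocleaning_commands(pair, "reference.fasta"):
            print(shlex.join(cmd))
```

The loop at the bottom treats each pair as its own unit, which matches the quote's wording of realigning "the tumor and normal in a pair together"; whether pooling all 57 pairs into one run changes anything is not something this sketch addresses.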

As an update (since it was pointed out to us that this post comes up when searching for GATK + cancer), we have some new workflows coming out with GATK4 enabling somatic analysis of SNPs and indels (Mutect2) and CNVs (GATK4-CNV). We'll post more details in the Best Practices section in the near future.