Meet EvE, the first universal genetic adaptor that aligns, calls, annotates, converts and interprets almost any genetic data file to EvErything. EvE includes a straightforward user interface and powerful, dynamic multithreaded processing so that using your custom EvE pipeline is effortless and fast.

EvE Premium now offers ultra-fast BWA and GATK powered by the Microsoft Genomics service

With EvE Premium, you can create your own ultra-fast custom pipeline within seconds. For example, you can select BWA for alignment and Samtools for variant calling. Or you can select Cutadapt for pre-processing, Bowtie2 for alignment, GATK for variant calling, SnpEff for annotation and ClinVar for interpretation, with the output written as a TXT file in 23andMe format.

Once you select your pipeline, you can use the defaults or easily modify almost any parameter. EvE Premium will then run your pipeline using powerful multithreaded, cluster-based cloud processing.

You no longer have to worry about whether your computer hardware and servers are powerful enough to process your data. With EvE, you also don't have to worry about bandwidth charges or obscure cloud computing fees. All you need is an internet connection (such as from a laptop or mobile device) and EvE will process your genetic data.

New in EvE Premium v4: ClinVar!

Turn on the ClinVar option (located under 'Interpretation') to receive on-screen and downloadable ClinVar results.

Samtools

mpileup:

-b, --bam-list FILE: List of input BAM files, one file per line [null]

-B, --no-BAQ: Disable probabilistic realignment for the computation of base alignment quality (BAQ). BAQ is the Phred-scaled probability of a read base being misaligned. Leaving BAQ enabled greatly helps to reduce false SNPs caused by misalignments.

-C, --adjust-MQ INT: Coefficient for downgrading mapping quality for reads containing excessive mismatches. Given a read with a phred-scaled probability q of being generated from the mapped position, the new mapping quality is about sqrt((INT-q)/INT)*INT. A zero value disables this functionality; if enabled, the recommended value for BWA is 50. [0]

-o, --output FILE: Write pileup or VCF/BCF output to FILE, rather than the default of standard output.

-g, --BCF: Compute genotype likelihoods and output them in the binary call format (BCF). As of v1.0, this is BCF2 which is incompatible with the BCF1 format produced by previous (0.1.x) versions of samtools.

-p, --per-sample-mF: Apply -m and -F thresholds per sample to increase sensitivity of calling. By default both options are applied to reads pooled from all samples.

-P, --platforms STR: Comma-delimited list of platforms (determined by @RG-PL) from which indel candidates are obtained. It is recommended to collect indel candidates from sequencing technologies that have low indel error rate such as ILLUMINA. [all]
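
Two of the options above rest on simple arithmetic: BAQ is a Phred-scaled probability, and -C rescales mapping quality with the formula quoted in its description. The following is a minimal Python sketch of both calculations. The function names are illustrative, and clamping q to the range 0..INT is an assumption of this sketch rather than documented samtools behavior.

```python
import math


def phred_to_probability(quality: float) -> float:
    """Convert a Phred-scaled quality Q to a probability: p = 10^(-Q/10)."""
    return 10.0 ** (-quality / 10.0)


def probability_to_phred(probability: float) -> float:
    """Convert a probability p to a Phred-scaled quality: Q = -10 * log10(p)."""
    return -10.0 * math.log10(probability)


def adjusted_mapping_quality(coefficient: int, q: float) -> float:
    """Approximate the -C adjustment: sqrt((INT - q) / INT) * INT.

    `coefficient` is the INT passed to -C. A value of 0 disables the
    adjustment, so this sketch only accepts positive coefficients.
    """
    if coefficient <= 0:
        raise ValueError("a coefficient of 0 disables the adjustment")
    # Assumption of this sketch: keep q inside 0..INT so the square root is defined.
    q = min(max(q, 0.0), float(coefficient))
    return math.sqrt((coefficient - q) / coefficient) * coefficient
```

With the recommended BWA coefficient of 50, a value of q = 32 gives sqrt(18/50) * 50 = 30.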

GATK

Unified Genotyper:

-R: Reference File Name

-I: BAM file name

--alleles: The set of alleles at which to genotype when --genotyping_mode is GENOTYPE_GIVEN_ALLELES

--comp: Comparison VCF file

--dbsnp, -D: dbSNP file

--annotation, -A: One or more specific annotations to apply to variant calls

--genotype_likelihoods_model, -glm: Genotype likelihoods calculation model to employ. SNP is the default option, INDEL is available for calling indels, and BOTH is available for calling both together

--genotyping_mode, -gt_mode: Specifies how to determine the alternate alleles to use for genotyping

--group, -G: One or more classes/groups of annotations to apply to variant calls. The single value 'none' removes the default group

--heterozygosity, -hets: Heterozygosity value used to compute prior likelihoods for any locus.

--min_indel_count_for_genotyping, -minIndelCnt: Minimum number of consensus indels required to trigger genotyping run

--min_indel_fraction_per_sample, -minIndelFrac: Minimum fraction of all reads at a locus that must contain an indel (of any allele) for that sample to contribute to the indel count for alleles

--output_mode, -out_mode: Specifies which type of calls we should output

--pair_hmm_implementation, -pairHMM: The PairHMM implementation to use for -glm INDEL genotype likelihood calculations

--pcr_error_rate, -pcr_error: The PCR error rate to be used for computing fragment-based likelihoods

--sample_ploidy

--standard_min_confidence_threshold_for_calling, -stand_call_conf: The minimum phred-scaled confidence threshold at which variants should be called.

--standard_min_confidence_threshold_for_emitting, -stand_emit_conf: The minimum phred-scaled confidence threshold at which variants should be emitted (and filtered with LowQual if less than the calling threshold)

--annotateNDA, -nda: If provided, we will annotate records with the number of alternate alleles that were discovered (but not necessarily genotyped) at a given site

--computeSLOD, -slod: If provided, we will calculate the SLOD (SB annotation)
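
The two confidence thresholds above interact: a call below the emit threshold is dropped, a call between the two thresholds is emitted but filtered as LowQual, and a call at or above the calling threshold is emitted unfiltered. The Python sketch below illustrates that decision and assembles a Unified Genotyper argument list from flags described above. The function names, file names and threshold values are illustrative assumptions. The `-T UnifiedGenotyper` tool selector and the `-o` output flag come from GATK 3 command-line usage rather than from this page, and nothing here describes how EvE invokes GATK internally.

```python
from typing import List

GLM_MODELS = ("SNP", "INDEL", "BOTH")


def classify_call(confidence: float, emit_conf: float, call_conf: float) -> str:
    """Apply the emit and call thresholds to one phred-scaled confidence."""
    if confidence < emit_conf:
        return "not emitted"
    if confidence < call_conf:
        return "LowQual"  # emitted, but below the calling threshold
    return "PASS"


def unified_genotyper_args(reference: str, bam: str, output: str,
                           glm: str, call_conf: float,
                           emit_conf: float) -> List[str]:
    """Build the argument list that would follow the GATK jar on a command line."""
    if glm not in GLM_MODELS:
        raise ValueError(f"glm must be one of {GLM_MODELS}")
    return [
        "-T", "UnifiedGenotyper",
        "-R", reference,
        "-I", bam,
        "-glm", glm,
        "-stand_call_conf", str(call_conf),
        "-stand_emit_conf", str(emit_conf),
        "-o", output,
    ]
```

For example, with an emit threshold of 10 and a calling threshold of 30, `classify_call(20, 10, 30)` returns `"LowQual"`.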

When you use EvE Premium, your data files and result file(s) will be stored in your Sequencing.com account, which provides free, unlimited storage of genetic data.

This simple yet secure approach means you no longer have to use your own storage or computing power to conduct genetic analysis and you also no longer have to use USB drives or FTP sites to move and share genetic data.

EvE includes an integration of the following:

Samtools

A suite of programs for interacting with high-throughput sequencing data. It consists of three separate repositories: Samtools, BCFtools and HTSlib.

GATK

GATK analyzes high-throughput sequencing data. The toolkit, which focuses on providing high-quality data, offers a wide variety of tools, including variant discovery and genotyping.

Isaac Variant Caller

Isaac Variant Caller (IVC) is an analysis package designed to detect SNVs and small indels from the aligned sequencing reads of a single diploid sample.

SnpEff

SnpEff predicts the effects of genetic variations and also provides annotations. Annotations can be simple, such as the variant name, or more complex, such as the site of the variation (for example, exon or intron) and its effect on gene expression.
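
As an illustration of what such annotations look like downstream, the sketch below splits SnpEff annotation entries into named fields. It assumes the pipe-delimited ANN INFO field layout used by recent SnpEff releases, reads only the first four subfields (allele, annotation, impact, gene name), and uses a made-up example value. The function name is hypothetical.

```python
from typing import Dict, List

# Assumed order of the leading ANN subfields; later subfields are ignored.
ANN_FIELDS = ("allele", "annotation", "impact", "gene_name")


def parse_ann(info: str) -> List[Dict[str, str]]:
    """Extract the leading ANN subfields from a VCF INFO string."""
    for item in info.split(";"):
        if item.startswith("ANN="):
            entries = item[len("ANN="):].split(",")
            # zip stops after the four named fields, dropping the rest
            return [dict(zip(ANN_FIELDS, entry.split("|"))) for entry in entries]
    return []


# Made-up INFO value with two annotation entries for one variant.
example = "DP=42;ANN=A|missense_variant|MODERATE|GENE1,A|intron_variant|MODIFIER|GENE2"
for annotation in parse_ann(example):
    print(annotation["gene_name"], annotation["annotation"], annotation["impact"])
```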

Sequencing.com Scripts

Since converting between different file formats with UGENE is at times a multistep process, we use our own scripts to pipe data and create some file types. Specifically, the Clinical plus VCF files and the gVCF to VCF conversions are executed by Sequencing.com custom scripts.

Wormtable (WT)

Wormtable is a format for storing and interacting with large-scale tabular data. It also generates an index file that can be used repeatedly. Wormtable files are considered very Python-friendly and can be used in downstream Python-based analysis.

GVF

GVF (Genome Variation Format) is gaining popularity as a standard for sequence ontology based datasets. It is a successor of the GFF3 format and includes pragmas for defining sequence alterations at genomic locations as compared to the reference genome.

gVCF

Human clinical applications require sequencing information for both variant and non-variant positions, yet there is currently no common exchange format for such data. Genomic VCF (gVCF) addresses this issue. gVCF is a set of conventions applied to the standard variant call format (VCF) that include genotype, annotation and other information across all sites in the genome in a reasonably compact format.
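
One common gVCF convention for keeping the file compact is to collapse a run of non-variant positions into a single record whose INFO column carries an END tag marking the last position of the block. The sketch below assumes that END convention and uses made-up records to compute the reference span each record covers. The function name is illustrative.

```python
from typing import Tuple


def block_span(vcf_line: str) -> Tuple[str, int, int]:
    """Return (chrom, start, end) covered by one VCF or gVCF data line."""
    fields = vcf_line.rstrip("\n").split("\t")
    chrom, pos, ref, info = fields[0], int(fields[1]), fields[3], fields[7]
    # Without an END tag, a record spans the length of its REF allele.
    end = pos + len(ref) - 1
    for item in info.split(";"):
        if item.startswith("END="):
            end = int(item[len("END="):])
    return chrom, pos, end


# Made-up records: one non-variant block and one ordinary deletion call.
block = "chr1\t10001\t.\tT\t.\t.\tPASS\tEND=10050\tGT:DP\t0/0:35"
variant = "chr1\t200\t.\tAC\tA\t50\tPASS\tDP=10\tGT\t0/1"

chrom, start, end = block_span(block)
print(f"{chrom}:{start}-{end} covers {end - start + 1} positions")
```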

A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin). 2012 Apr-Jun;6(2):80-92. doi: 10.4161/fly.19695. [PubMed]

This app is designed for researchers, bioinformatics experts and genomics professionals. The genetic analysis and statements that appear in this app have not been evaluated by the United States Food and Drug Administration. The Sequencing.com website and all software applications (Apps) that use Sequencing.com's website, as well as Sequencing.com's open Application Programming Interface (API), are not intended to diagnose, treat, cure, or prevent any disease.