VCF Bulk Export

This form filters existing VCF files and exports them into common formats. Most of the filter criteria and many of the formats are provided by VCFtools.

Choose your VCF File.

The following table lists all the available VCF files. Choose the one you would like to filter and export by selecting the radio button at the beginning of the appropriate row.

Name | Assembly | Number of SNPs
No VCF Files available.

Specify filter criteria.

Only include Bi-allelic SNPs

If you check this checkbox, only SNPs with 2 alleles across all individuals will be kept. For example, in the example table below, SNP Chr3p34567 would be removed because it has three alleles (A, C and T).
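The bi-allelic check can be sketched in a few lines of Python. This is only an illustration, not the code behind this form: the function name `is_biallelic` is hypothetical, and it assumes "2 alleles" means exactly two distinct alleles among the non-missing calls.

```python
def is_biallelic(calls):
    """Return True if exactly two distinct alleles appear across all calls.

    Each call is a string such as "AC:7" (call, colon, read depth);
    None stands for a missing call. Assumption: "bi-allelic" means
    exactly two distinct alleles.
    """
    alleles = set()
    for call in calls:
        if call is None:  # a missing call contributes no alleles
            continue
        genotype = call.split(":")[0]  # "AC:7" -> "AC"
        alleles.update(genotype)       # adds each letter of the genotype
    return len(alleles) == 2


# Rows taken from the example table.
chr2 = ["GG:7", "GG:13", "GG:5", "TT:2", "GG:22", "GT:24"]
chr3 = ["AA:5", "CC:12", "AC:7", "TT:15", "CC:19", "TC:23"]

print(is_biallelic(chr2))  # True: only G and T occur
print(is_biallelic(chr3))  # False: A, C and T occur
```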

Minimum SNP Call Read Depth

Only include SNP calls that have at least the specified number of reads to support the call. For example, if you specify 5 for this filter then for SNP Chr2p25678 in the example table below, only the call for Germ4 (read depth 2) will be set to missing data.
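As a rough sketch of this filter, the following Python sets any call below the chosen depth to missing. The function name `apply_min_depth` and the use of `None` for missing data are assumptions made for illustration.

```python
def apply_min_depth(calls, min_depth):
    """Replace calls whose read depth is below min_depth with None (missing)."""
    filtered = []
    for call in calls:
        if call is None:  # already missing, keep as missing
            filtered.append(None)
            continue
        depth = int(call.split(":")[1])  # "TT:2" -> 2
        filtered.append(call if depth >= min_depth else None)
    return filtered


# SNP Chr2p25678 from the example table, Germ1 to Germ6.
chr2 = ["GG:7", "GG:13", "GG:5", "TT:2", "GG:22", "GT:24"]

print(apply_min_depth(chr2, 5))
# ['GG:7', 'GG:13', 'GG:5', None, 'GG:22', 'GT:24']
```

Note that the call for Germ3 (read depth 5) is kept, because the filter keeps calls with at least the specified depth.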

Minor Allele Frequency

Only include SNP positions with a minor allele frequency greater than or equal to this value. Allele frequency is defined as the number of times an allele appears over all individuals at that site, divided by the total number of non-missing alleles at that site. For example, consider Chr1p12344 in the example table below: allele A appears 3 times among the 10 non-missing alleles, so the minor allele frequency is 3/10=30%; thus if you enter 50% this SNP position will be removed.
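The allele-count definition above can be illustrated with a short Python sketch. The function name is hypothetical, and treating the least frequent allele as the minor allele is an assumption that only makes sense for bi-allelic sites. Counting alleles rather than individuals, A appears 3 times among the 10 non-missing alleles of Chr1p12344.

```python
from collections import Counter


def minor_allele_frequency(calls):
    """Frequency of the least common allele among non-missing alleles.

    Assumption: the site has at least two alleles; otherwise 0.0 is returned.
    """
    counts = Counter()
    for call in calls:
        if call is None:  # missing calls are excluded from the total
            continue
        counts.update(call.split(":")[0])  # count each allele letter
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0
    return min(counts.values()) / total


# SNP Chr1p12344 from the example table. Which germplasm is missing is an
# assumption here; the position does not change the result.
chr1 = ["AA:5", "TT:12", "TT:15", "AT:19", "TT:15", None]

maf = minor_allele_frequency(chr1)
print(f"{maf:.0%}")  # 30%
print(maf >= 0.50)   # False: the SNP would be removed at a 50% threshold
```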

Maximum Missing Count

Exclude SNPs with more than this number of missing genotypes over all individuals/germplasm. For example, if you enter 1 for this filter for the example data below, only SNP Chr4p48765 would be removed.
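A minimal Python sketch of this filter follows. The function names are hypothetical, `None` stands for a missing genotype, and which germplasm columns are missing for Chr1p12344 and Chr4p48765 is assumed, since only the number of missing genotypes matters here.

```python
def missing_count(calls):
    """Number of missing genotypes (None) for one SNP."""
    return sum(1 for call in calls if call is None)


def filter_by_missing_count(snps, max_missing):
    """Keep SNPs with at most max_missing missing genotypes."""
    return {
        name: calls
        for name, calls in snps.items()
        if missing_count(calls) <= max_missing
    }


# Example table; positions of the None entries are assumed.
snps = {
    "Chr1p12344": ["AA:5", "TT:12", "TT:15", "AT:19", "TT:15", None],
    "Chr2p25678": ["GG:7", "GG:13", "GG:5", "TT:2", "GG:22", "GT:24"],
    "Chr3p34567": ["AA:5", "CC:12", "AC:7", "TT:15", "CC:19", "TC:23"],
    "Chr4p48765": ["CC:12", "AC:7", "CC:19", "AA:23", None, None],
}

kept = filter_by_missing_count(snps, 1)
print(sorted(kept))  # ['Chr1p12344', 'Chr2p25678', 'Chr3p34567']
```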

Maximum Missing Frequency

Exclude SNPs based on the proportion of missing data. For example, if you enter 25% for this filter then for the example data below, only SNP Chr4p48765 would be removed since it has a missing data frequency of 2/6=33%.
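The frequency-based variant can be sketched the same way. The names below are hypothetical, and treating the threshold as inclusive (a SNP is kept when its missing frequency is less than or equal to the value entered) is an assumption.

```python
def missing_frequency(calls):
    """Proportion of genotypes that are missing (None) for one SNP."""
    return sum(call is None for call in calls) / len(calls)


def passes_max_missing_frequency(calls, max_frequency):
    """True if the SNP is kept; assumes an inclusive threshold."""
    return missing_frequency(calls) <= max_frequency


# Two rows of the example table; positions of the None entries are assumed.
chr1 = ["AA:5", "TT:12", "TT:15", "AT:19", "TT:15", None]
chr4 = ["CC:12", "AC:7", "CC:19", "AA:23", None, None]

print(passes_max_missing_frequency(chr1, 0.25))  # True: 1/6 is about 17%
print(passes_max_missing_frequency(chr4, 0.25))  # False: 2/6 is about 33%
```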

Example Table: Example Data for Filter Explanation.

SNP Name | SNP Backbone | SNP Position | Germ1 | Germ2 | Germ3 | Germ4 | Germ5 | Germ6
Chr1p12344 | Chr1 | 12344 | AA:5 | TT:12 | TT:15 | AT:19 | TT:15
Chr2p25678 | Chr2 | 25678 | GG:7 | GG:13 | GG:5 | TT:2 | GG:22 | GT:24
Chr3p34567 | Chr3 | 34567 | AA:5 | CC:12 | AC:7 | TT:15 | CC:19 | TC:23
Chr4p48765 | Chr4 | 48765 | CC:12 | AC:7 | CC:19 | AA:23

* The above example will be referred to in the description of each filter criterion to aid in explaining how it will affect your data. NOTE: the cell for each SNP by germplasm combination contains the call and the read depth separated by a colon (:). For example, AA:5 means a call of AA with a read depth of 5.

Pick your Export format.

Select one of the formats listed below and the filtered VCF will be converted accordingly. Keep in mind that if you choose a format with no quality information, you should apply stringent filter criteria to ensure you are working with good data.

Format | Has Quality Info? | Description
A/B Format | No | Alleles are coded as A/B based on the parents. This format is only suitable for biparental crosses.
Quality Matrix | Yes | Variant by Germplasm matrix of Read Depth per call.
Variant Call Format (VCF) | Yes | A variant by germplasm matrix with each cell containing a combination of SNP call and quality information. See the Specification for more information.
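To make the first two formats more concrete, here is a hedged Python sketch that derives a read-depth matrix and an A/B recoding from calls written as call:depth. Both function names are hypothetical. The A/B rule shown (an allele matching one parent becomes A, an allele matching the other becomes B) and the choice of G and T as the parental alleles are assumptions for illustration; the exact coding used by the export is not described here.

```python
def to_quality_matrix(snps):
    """Map each SNP to its read depths (None where the call is missing)."""
    return {
        name: [None if call is None else int(call.split(":")[1]) for call in calls]
        for name, calls in snps.items()
    }


def to_ab_call(call, parent_a_allele, parent_b_allele):
    """Recode one call as A/B given assumed homozygous parental alleles.

    Raises KeyError if the call contains an allele found in neither parent.
    """
    if call is None:
        return None
    coding = {parent_a_allele: "A", parent_b_allele: "B"}
    genotype = call.split(":")[0]  # "GT:24" -> "GT"
    return "".join(sorted(coding[allele] for allele in genotype))


# SNP Chr2p25678 from the example table.
snps = {"Chr2p25678": ["GG:7", "GG:13", "GG:5", "TT:2", "GG:22", "GT:24"]}

print(to_quality_matrix(snps)["Chr2p25678"])
# [7, 13, 5, 2, 22, 24]

# Hypothetical parents: one homozygous for G, the other homozygous for T.
print([to_ab_call(call, "G", "T") for call in snps["Chr2p25678"]])
# ['AA', 'AA', 'AA', 'BB', 'AA', 'AB']
```

The A/B output carries no read depth, which is why stringent filtering matters more when a format without quality information is chosen.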