The structural variant detection software can be used to find genomic structural
variants (SVs) including

insertions

deletions

inversions

tandem duplications

intra- and inter-chromosomal translocations

Detection is based on paired-end reads from a DNA-Seq experiment. The software identifies
structural variants from abnormally mapped read pairs, e.g. read pairs whose mates do not
lie within the expected distance on the genome or do not have the expected strand orientation.
To obtain multiple lines of evidence for the predicted structural variants, the algorithm
additionally integrates both a read-depth (coverage) approach and a split-mapping approach.

Identification of structural variant candidates:
The paired-end structural variant discovery pipeline consists of four main steps:

Filtering

Clustering

Including splice junction sequences (not yet included)

Scoring

Filtering

Read pairs are selected as candidate structural variants if both reads map uniquely to
the genome on the same or different chromosomes. Mates that align within the expected
distance on the same chromosome with the expected strand orientation are removed. In
addition, the program discards read pairs with mapping qualities lower than the minimum
required mapping quality. Artificial pileup reads are also removed.
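
The filtering rules above can be illustrated with a minimal sketch. It is not the tool's actual implementation: the `Mate` record, the `is_candidate_pair` function and the `max_distance` parameter are hypothetical names, and the sketch assumes that a concordant pair has mates on opposite strands and that a pair is discarded as soon as one of its mates falls below the minimum mapping quality. Pileup removal is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mate:
    chrom: str
    pos: int
    strand: str   # "+" or "-"
    mapq: int
    unique: bool  # True if the read maps to exactly one location


def is_candidate_pair(m1, m2, max_distance, min_mapq=20):
    """Return True if the read pair survives the filtering step (sketch)."""
    # Both reads must map uniquely to the genome.
    if not (m1.unique and m2.unique):
        return False
    # Assumption: the lower of the two mapping qualities decides.
    if min(m1.mapq, m2.mapq) < min_mapq:
        return False
    # Assumption: a concordant pair lies on one chromosome, on opposite
    # strands, and no farther apart than max_distance.
    concordant = (
        m1.chrom == m2.chrom
        and m1.strand != m2.strand
        and abs(m2.pos - m1.pos) <= max_distance
    )
    # Concordant pairs are removed; everything else is a candidate.
    return not concordant
```

With `max_distance=500`, a pair mapped 300 bp apart on opposite strands of the same chromosome would be removed, while a pair mapped 8000 bp apart, on two chromosomes, or on the same strand would be kept as a candidate.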

Clustering

For the identification of structural variant clusters, the following criteria are used:

minimum number of non-redundant read pairs

the distance between the outermost 5' and 3' mates of both partners
of the structural variant has to be smaller than the mean plus three
standard deviations of the distance distribution.
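
As a rough illustration of the two clustering criteria, the following sketch checks a group of read pairs that are assumed to support the same event. The function name, the representation of a read pair as a `(left_position, right_position)` tuple, and the reading of "both partners" as the left and the right side of the variant are assumptions made for this example, not details taken from the software.

```python
def is_valid_cluster(pairs, mean_dist, sd_dist, min_cluster_size=2, sigma=3.0):
    """Check the two clustering criteria for a list of (left, right) positions."""
    # Redundant (identical) read pairs count only once.
    unique_pairs = set(pairs)
    if len(unique_pairs) < min_cluster_size:
        return False
    # Maximum allowed spread: mean plus sigma standard deviations.
    max_span = mean_dist + sigma * sd_dist
    left = [p[0] for p in unique_pairs]
    right = [p[1] for p in unique_pairs]
    # Assumption: the spread is checked separately on each side of the variant.
    return (max(left) - min(left)) < max_span and (max(right) - min(right)) < max_span
```

For a distance distribution with mean 300 and standard deviation 30, the allowed spread is 390, so the mates on each side of the variant have to lie within 390 bp of each other.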

Scoring

In order to estimate the quality of a candidate deletion, two breakpoint scores and
average coverage values for each deletion event are calculated.

The breakpoint scores describe how strongly the sequence coverage drops inside the
predicted deletion relative to its left and right flanking regions. A higher breakpoint
score indicates a better quality of the deletion event.

To calculate the breakpoint scores for each deletion, the coverage values of the deletion
and of its flanking regions (about 100 bp) are calculated first.
Then, both breakpoint scores (ranging from 0 to 100) are calculated as

100 * (c1 - delC) / c1
100 * (c2 - delC) / c2

where delC is the coverage value of the deletion and c1, c2 are the coverage values of
the left and right flanking regions of the deletion.
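
The two formulas above translate directly into code. The sketch below is only an illustration: the function name is hypothetical, and clamping the result to the range 0 to 100 as well as returning 0 for a flank without coverage are assumptions, since the text only states the range of the scores.

```python
def breakpoint_scores(c1, c2, del_c):
    """Return the left and right breakpoint scores of a deletion."""

    def score(flank_cov):
        # Assumption: a flank without coverage gives no evidence (score 0).
        if flank_cov <= 0:
            return 0.0
        raw = 100.0 * (flank_cov - del_c) / flank_cov
        # Assumption: clamp to the documented range of 0 to 100.
        return max(0.0, min(100.0, raw))

    return score(c1), score(c2)


# Flanking coverage of 30 and 40, coverage of 3 inside the deletion.
print(breakpoint_scores(30, 40, 3))  # (90.0, 92.5)
```

A deletion whose internal coverage is as high as the coverage of its flanks gets a score of 0 on that side, while a region without any coverage gets a score of 100.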

Sigma Threshold
This is the number of standard deviations used for the detection of insertions and deletions.
The recommended value is 3. A lower value yields more structural variants but also more
false positives; a higher value yields fewer structural variants. If it is set to -1, this
value as well as the mean and standard deviation of the distance distribution will
be calculated automatically.
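
To make the role of the sigma threshold concrete, here is a small sketch of how a read pair could be classified from its mate distance. It rests on the common paired-end convention that mates mapping farther apart than expected suggest a deletion and mates mapping closer together suggest an insertion; the function name and the exact comparison are assumptions, not the tool's documented behavior.

```python
def classify_pair(distance, mean_dist, sd_dist, sigma=3.0):
    """Classify a read pair by its mate distance (illustrative only)."""
    # Mates much farther apart than expected: sequence is missing in the sample.
    if distance > mean_dist + sigma * sd_dist:
        return "deletion"
    # Mates much closer than expected: extra sequence in the sample.
    if distance < mean_dist - sigma * sd_dist:
        return "insertion"
    return "normal"
```

With a mean distance of 300 and a standard deviation of 30, a pair at distance 320 is classified as normal for sigma = 3 but as a deletion candidate for sigma = 0.5, which shows why lower values produce more calls and more false positives.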

Filters

Minimum Cluster Size (2-1000)
This parameter determines the minimum number of non-redundant read pairs
in one 'structural variant' cluster.

Maximum PileUp Size (1-10000)
This parameter determines the maximum allowed pileup size. A maximum pileup size
of 1 means that all redundant sequences except one will be discarded for structural
variant discovery.
A value of n > 1 means that all sequences that occur more than n times in the data
will be discarded (except for pileups where the neighbor sequences are also redundant).
If the "ignore / no max." option is checked, no sequences will be removed.
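
The following sketch follows a literal reading of the description above. The function name is hypothetical, reads are represented by any hashable key (for example the sequence itself), `None` stands for the "ignore / no max." option, and the exception for pileups with redundant neighbor sequences is deliberately not modeled.

```python
from collections import Counter


def filter_pileups(reads, max_pileup=1):
    """Remove pileup reads according to the maximum pileup size (sketch)."""
    # None stands for the "ignore / no max." option: keep everything.
    if max_pileup is None:
        return list(reads)
    counts = Counter(reads)
    kept = []
    seen = set()
    for read in reads:
        if max_pileup == 1:
            # Keep exactly one copy of every redundant sequence.
            if read not in seen:
                seen.add(read)
                kept.append(read)
        elif counts[read] <= max_pileup:
            # Sequences occurring more than max_pileup times are dropped entirely.
            kept.append(read)
    return kept


reads = ["a", "a", "a", "b", "c", "c"]
print(filter_pileups(reads, 1))  # ['a', 'b', 'c']
print(filter_pileups(reads, 2))  # ['b', 'c', 'c']
```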

Minimum Mapping Quality (mapQ, 0-255)
This parameter determines the minimum required mapping quality (mapQ). Mapping quality scores, introduced by
Heng Li and Richard Durbin in 2008, quantify the probability that a read is misplaced. They are related to uniqueness:
the greater the difference in alignment quality between the best hit and the second-best hit,
the more unique the best alignment, and the higher its mapping quality.
Mapping qualities usually lie between 0 and 60.
For example, a mapping quality of 10 or less indicates that there is at least a 1 in 10 chance that
the read truly originated elsewhere. A value of 255 indicates that the mapping quality is not available.
For paired-end alignments, the pairing information (distance and strand orientation of the mates) is also taken into account.
The default setting is 20.
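
The relation between a mapping quality and the probability of a misplaced read follows the Phred scale, mapQ = -10 * log10(p). The helper below is a small sketch of that conversion; the function name is hypothetical, and returning `None` for the value 255 is a choice made for this example.

```python
def mapq_to_error_probability(mapq):
    """Convert a Phred-scaled mapping quality into a misplacement probability."""
    # 255 means that no mapping quality is available.
    if mapq == 255:
        return None
    return 10 ** (-mapq / 10)
```

A mapping quality of 10 corresponds to a probability of 0.1 (1 in 10), and the default threshold of 20 corresponds to a probability of 0.01 (1 in 100) that the read truly originated elsewhere.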

Coverage Filter for Deletions
If this parameter is set, the algorithm reports only deletions whose coverage value
is less than the average chromosomal coverage.
If it is not set, all deletions will be reported.

Breakpoint Score Threshold for Deletions (0-100%)
This parameter determines the minimum required breakpoint score of deletions to be
reported. Both breakpoint scores need to exceed this value for a deletion to be reported.
The default setting is 20%.
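
The two deletion filters described above can be combined as in the following sketch. The function and parameter names are hypothetical. "Exceed" is read here as a strict comparison, which is an assumption; the software may treat a score equal to the threshold differently.

```python
def report_deletion(score1, score2, del_cov, chrom_cov,
                    min_score=20.0, coverage_filter=False):
    """Decide whether a candidate deletion is reported (illustrative sketch)."""
    # Both breakpoint scores have to exceed the threshold.
    if not (score1 > min_score and score2 > min_score):
        return False
    # Optional coverage filter: the coverage inside the deletion must be
    # below the average coverage of the chromosome.
    if coverage_filter and not del_cov < chrom_cov:
        return False
    return True
```

For example, a deletion with breakpoint scores of 90 and 15 is not reported with the default threshold of 20%, because only one of the two scores exceeds it.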

Circos plots

Include Circos plots for each chromosome
If set, the detected structural variants will be graphically visualized with the open source
drawing software CIRCOS.
For each chromosome, deletions
(blue), insertions (orange), inversions (green) and the average nucleotide coverage
(black) will be shown. For the whole genome, all detected inter-chromosomal
translocations will be visualized in purple. The darker the color of a link, the more
read pairs support the predicted structural variant.

Note that this is a time-consuming option; therefore, the Circos plots are not included by default.

When the analysis is finished, an email with the URL of the results will be sent
to the user-provided email address.

The results will be available for a limited time on our server.
For details on how long your results will be kept, please see the result email.
After that period they will be deleted unless they are protected in the project management.

Graphical visualization of the structural variants with the open source drawing software CIRCOS.

The Circos plots show the inter-chromosomal translocations in the whole genome. Chromosome identifiers are shown around the outer ring and run clockwise. The other tracks (from outside to inside) contain the logarithmic average nucleotide coverage (black, bin size 2000000) and the inter-chromosomal translocations (purple, with the color intensity reflecting the number of read pairs that support the link).