Appendix 4. Clinical Grade Scoring Rules

Clinical Grade Methodology

The Clinical Grade is a measure of a variant file’s overall quality and fitness for clinical interpretation. The current scoring methodology applies to sequences run through Opal Annotation Engine 4.1 and later. For sequences annotated with pipeline versions prior to 4.1, see the earlier methodology described at the end of this appendix.

The Clinical Grade is based on four metrics:

Median read coverage per coding variant

Percent of coding SNVs ≥ Q40

Transition/Transversion (Ti/Tv) ratio for coding variants

Homozygous/Heterozygous ratio for coding variants

The metrics give an overall indication of the quality of sequencing, assembly, and variant calling. These metrics were selected and optimized based on the biological and statistical structure of nucleotide content in exomes and genomes, and are applicable to human germline DNA. These metrics were calculated from a dataset of 1154 exomes and 59 genomes.

A low clinical grade score signals irregularities. Such irregularities might be due to:

Assigning the Clinical Grade score

The Clinical Grade score is represented by a signal-strength bar icon. One bar is earned for each of the following criteria that is met, up to a maximum of four bars.

The median coverage of coding variants exceeds 40 (Genomes) or 50 (Exomes or Other; see the definition of Other in the next section)

At least 95% of the coding SNVs have a quality score of at least 40 (≥ Q40)

The Homozygous/Heterozygous ratio for the coding variants is within the green range

The Transition/Transversion ratio for the coding variants is within the green range

For the Homozygous/Heterozygous and Transition/Transversion ratios, a color is assigned as follows:

Green: values that fall between the 10th and 90th percentiles of the Opal reference dataset

Yellow: values outside the green range that fall between the 5th and 95th percentiles

Red: all other values
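The scoring rules above can be sketched in Python as follows. This is only an illustration, not Opal's implementation: the names PercentileBands, ratio_color, and clinical_grade_bars are hypothetical, the percentile cut-offs must come from the reference dataset (the numbers in the usage lines are invented placeholders), and treating the percentile boundaries as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PercentileBands:
    """Reference-dataset percentiles for one ratio metric (hypothetical container)."""
    p5: float
    p10: float
    p90: float
    p95: float


def ratio_color(value: float, bands: PercentileBands) -> str:
    """Assign green/yellow/red to a Hom/Het or Ti/Tv ratio.

    Boundaries are treated as inclusive, which is an assumption.
    """
    if bands.p10 <= value <= bands.p90:
        return "green"
    if bands.p5 <= value <= bands.p95:
        return "yellow"
    return "red"


def clinical_grade_bars(median_coverage: float, pct_q40: float,
                        hom_het: float, ti_tv: float, sample_class: str,
                        hom_het_bands: PercentileBands,
                        ti_tv_bands: PercentileBands) -> int:
    """Count the bars earned: one per criterion met, four at most."""
    # Coverage must exceed 40 for genomes ("G") or 50 for exomes/other ("E"/"O").
    coverage_threshold = 40 if sample_class == "G" else 50
    criteria = [
        median_coverage > coverage_threshold,
        pct_q40 >= 95.0,
        ratio_color(hom_het, hom_het_bands) == "green",
        ratio_color(ti_tv, ti_tv_bands) == "green",
    ]
    return sum(criteria)


# Placeholder percentiles for illustration only; these are NOT Opal's values.
example_hom_het = PercentileBands(p5=0.50, p10=0.55, p90=0.70, p95=0.75)
example_ti_tv = PercentileBands(p5=2.8, p10=2.9, p90=3.2, p95=3.3)
bars = clinical_grade_bars(60, 97.0, 0.6, 3.0, "E", example_hom_het, example_ti_tv)
```

With these placeholder bands, an exome with median coverage 60, 97% of SNVs at Q40 or better, and both ratios in the green range would earn all four bars.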

Exomes and Whole Genomes

Opal categorizes genome variant files into three classes:

Full genomes: genomes that have between 2.5M and 6M SNVs (noted as a “G” to the right of the signal strength icon)

Exomes: genomes that have between 10K and 110K SNVs (noted as an “E” to the right of the signal strength icon)

Other: genomes that fall outside the two previous ranges (noted as an “O” to the right of the signal strength icon)
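As a minimal sketch, the three classes can be expressed as a lookup on the SNV count. The function name classify_variant_file is hypothetical, and treating the range endpoints as inclusive is an assumption, since the text does not say how exact boundary values are handled.

```python
def classify_variant_file(snv_count: int) -> str:
    """Return "G", "E", or "O" from the SNV count, using the ranges listed above."""
    if 2_500_000 <= snv_count <= 6_000_000:
        return "G"  # full genome
    if 10_000 <= snv_count <= 110_000:
        return "E"  # exome
    return "O"      # anything outside both ranges


labels = [classify_variant_file(n) for n in (3_000_000, 50_000, 500_000)]
```

A file with 500,000 SNVs, for example, falls between the two ranges and is therefore labeled "O".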

The Clinical Grade is calculated using coding variants only. However, for genomes, metrics for all variants are provided in the second tab in the Clinical Grade dialog box.

Coverage

This metric calculates the median number of reads per variant in your variant file. Low values indicate poor coverage across the genome. Opal displays the distribution of coverage values in your sample as a box plot: the interquartile range is shown as a rectangle, with the median marked as a vertical bar inside it, and the whiskers show the minimum and maximum of your data, excluding outliers. The median is compared to the expected read depth of a clinical-quality whole genome or exome. Passing values are based on an analysis of the literature and current practices.
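The box plot statistics described above could be computed along the following lines. This is a sketch under stated assumptions: the text does not say how Opal defines an outlier, so the common 1.5 × IQR rule is used here as a stand-in, and the function name coverage_box_plot is hypothetical.

```python
import statistics


def coverage_box_plot(depths: list[float]) -> dict[str, float]:
    """Summarize per-variant read depths as box plot statistics.

    Outliers are defined with the 1.5 * IQR rule, which is an assumption;
    the source does not state which rule Opal applies.
    """
    q1, median, q3 = statistics.quantiles(depths, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers extend to the most extreme values that are not outliers.
    inliers = [d for d in depths if low_fence <= d <= high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inliers),
        "whisker_high": max(inliers),
    }


summary = coverage_box_plot([30, 40, 45, 50, 55, 60, 65, 70, 400])
```

In this toy input the depth of 400 lies beyond the upper fence, so it is excluded from the whiskers while still contributing to the quartiles.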

Quality

This metric calculates the percentage of single nucleotide variants (SNVs) present in your variant file that have a quality score greater than or equal to 40 (“Q40”). A quality score of Q40 translates into an error rate of 1 in 10,000. Low scores indicate issues with sequencing and/or variant calling.
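The link between Q40 and an error rate of 1 in 10,000 follows from the standard Phred scale, where the error probability is 10 raised to the power of (-Q/10). The sketch below shows that conversion together with the percentage calculation; both function names are hypothetical, and the input is assumed to be a plain list of per-SNV quality scores.

```python
def phred_to_error_probability(q: float) -> float:
    """Convert a Phred-scaled quality score to an error probability."""
    return 10 ** (-q / 10)


def percent_snvs_at_or_above(qualities: list[float], threshold: float = 40.0) -> float:
    """Percentage of SNVs whose quality score is >= threshold."""
    if not qualities:
        raise ValueError("no SNV quality scores supplied")
    passing = sum(1 for q in qualities if q >= threshold)
    return 100.0 * passing / len(qualities)


error_rate_q40 = phred_to_error_probability(40)          # about 0.0001
pct = percent_snvs_at_or_above([50, 40, 39.9, 10])       # two of four pass
```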

Opal displays information about the distribution of quality values in your sample using a box plot (see above). Passing values are set by an analysis of literature and current practices.

Homozygous/Heterozygous Ratio

This metric calculates the ratio of homozygous variants to heterozygous variants in your variant file. The ratio is compared to the distribution of homozygous-to-heterozygous variant ratios in the Opal reference dataset.
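A minimal sketch of how such a ratio could be derived from genotype calls is shown below. It assumes VCF-style diploid genotype strings such as "0/1" or "1|1", skips missing calls, and does not count homozygous-reference calls as variants; these conventions and the function name hom_het_ratio are assumptions, not a description of Opal's internals.

```python
import re


def hom_het_ratio(genotypes: list[str]) -> float:
    """Ratio of homozygous variant calls to heterozygous calls."""
    hom = het = 0
    for gt in genotypes:
        alleles = re.split(r"[/|]", gt)
        if len(alleles) != 2 or "." in alleles:
            continue  # skip missing or non-diploid calls
        if alleles[0] == "0" and alleles[1] == "0":
            continue  # homozygous reference is not a variant
        if alleles[0] == alleles[1]:
            hom += 1
        else:
            het += 1
    if het == 0:
        raise ValueError("no heterozygous calls; ratio is undefined")
    return hom / het


ratio = hom_het_ratio(["1/1", "0/1", "0|1", "1|2", "0/0", "./.", "2/2"])
```

In the usage line, two calls are homozygous variant and three are heterozygous, so the ratio is 2/3.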

Transition/Transversion (Ti/Tv) Ratio

This metric calculates the transition-to-transversion ratio (“Ti/Tv” ratio) for single nucleotide variants (SNVs) present in your variant file. A transition occurs when a purine (A, G) is substituted with another purine or a pyrimidine (C, T) is substituted with another pyrimidine, while a transversion occurs when a purine is substituted with a pyrimidine, or vice versa. Abnormal ratios indicate issues with sequencing and/or variant calling.

The ratio is compared to the distribution of Ti/Tv ratios in the Opal reference dataset.
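The transition and transversion definitions above translate directly into code. The following sketch assumes each SNV is given as a (reference base, alternate base) pair; the function names are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or not {ref, alt} <= (PURINES | PYRIMIDINES):
        raise ValueError(f"not a valid SNV: {ref}>{alt}")
    return {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES


def ti_tv_ratio(snvs: list[tuple[str, str]]) -> float:
    """Number of transitions divided by number of transversions."""
    transitions = sum(1 for ref, alt in snvs if is_transition(ref, alt))
    transversions = len(snvs) - transitions
    if transversions == 0:
        raise ValueError("no transversions; ratio is undefined")
    return transitions / transversions


ratio = ti_tv_ratio([("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("T", "G")])
```

The usage line contains three transitions (A>G, C>T, G>A) and two transversions (A>C, T>G), giving a ratio of 1.5.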

Clinical Grade Methodology (Opal Annotation Engine versions prior to 4.1)

The Clinical Grade is a measure of a variant file’s overall quality and fitness for clinical interpretation. The score is based on four metrics:

Median read coverage per variant

Percent of SNVs ≥ Q40

Transition/Transversion (Ti/Tv) ratio

Homozygous/Heterozygous ratio

The metrics give an overall indication of the quality of sequencing, assembly and variant calling. These metrics were selected and optimized based on the biological and statistical structure of nucleotide content in exomes and genomes, and are applicable to human germline DNA. These metrics were calculated from a dataset of 296 exomes and 214 genomes.

A low clinical grade score signals irregularities. Such irregularities might be due to:

Assigning the Clinical Grade Score

“Moderate”: values that fall within two standard deviations of the median

“Poor”: values that fall outside two standard deviations of the median

The score is represented by a signal-strength bar icon in the interface. Each metric that falls within the “good” range adds one bar to the icon. Four bars mean that all metrics fall in the “good” range.
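The banding used by this earlier methodology can be sketched as follows for the metrics that are compared against the reference distribution. The “good” band is taken here to be within one standard deviation of the median, following the passing-value statements in the metric descriptions later in this appendix; the function names, the inclusive boundaries, and the median and standard deviation in the usage line are assumptions for illustration only.

```python
def legacy_band(value: float, ref_median: float, ref_sd: float) -> str:
    """Band a metric by its distance from the reference median, in SD units."""
    distance = abs(value - ref_median)
    if distance <= ref_sd:
        return "good"
    if distance <= 2 * ref_sd:
        return "moderate"
    return "poor"


def legacy_bars(bands: list[str]) -> int:
    """One bar per metric in the "good" band."""
    return sum(1 for band in bands if band == "good")


# Placeholder reference statistics; these are NOT Opal's values.
band = legacy_band(3.15, ref_median=3.0, ref_sd=0.1)
```

With the placeholder statistics, a value of 3.15 sits between one and two standard deviations from the median and is therefore banded as "moderate".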

Exomes and Whole Genomes

Exomes and whole genomes are assigned scores based on different ranges for each metric. Exomes are typically sequenced with higher read depth, so higher quality is expected. In addition, exons are highly conserved, and the Ti/Tv and hom/het ratios follow different distributions.

Opal categorizes genome variant files into three classes:

Full genomes: genomes that have between 2.5M and 5M SNVs (noted as a “G” to the right of the signal strength icon)

Exomes: genomes that have between 10K and 87K SNVs (noted as an “E” to the right of the signal strength icon)

Other: genomes that fall outside the two previous ranges (noted as an “O” to the right of the signal strength icon)

If a genome falls outside the Full Genome or Exome SNV ranges, it is not assigned a clinical grade score.

Percent of SNVs ≥ Q40

This metric calculates the percentage of single nucleotide variants (SNVs) present in your variant file that have a quality score greater than or equal to 40 (“Q40”). A quality score of Q40 translates into an error rate of 1 in 10,000. Low scores indicate issues with sequencing and/or variant calling.

The value is compared to the distribution of variant quality across the Opal reference dataset. Passing values (in green band) are those within one standard deviation of the median.

Homozygous/Heterozygous Ratio

The ratio is compared to the distribution of homozygous-to-heterozygous variant ratios in the Opal reference dataset. Passing values (in green band) are those within one standard deviation of the median.

Median Read Coverage

This metric calculates the median number of reads per variant in your variant file. Low values indicate poor coverage across the genome.

The median is compared to the expected read depth of a clinical quality whole genome or exome. Passing values (in green band) are set by an analysis of literature and current practices.

Transition/Transversion (Ti/Tv) Ratio

This metric calculates the transition-to-transversion ratio (“Ti/Tv” ratio) for single nucleotide variants (SNVs) present in your variant file. A transition occurs when a purine (A, G) is substituted with another purine or a pyrimidine (C, T) is substituted with another pyrimidine, while a transversion occurs when a purine is substituted with a pyrimidine, or vice versa. Abnormal ratios indicate issues with sequencing and/or variant calling.

The ratio is compared to the distribution of Ti/Tv ratios in the Opal reference dataset. Passing values (in green band) are those within one standard deviation of the median.