Popular questions

How are the overlap metrics different from one another? Do differences in sample size affect these metrics?

The overlap metrics differ in how much of the data they use in the
calculation. Which metric is appropriate depends on the data source and on
differences in sample size or sampling depth between the samples.

Jaccard's index is simply a count of unique rearrangements, shared versus
private. In immune sequencing data the number of rearrangements is very
sensitive to sampling depth, so Jaccard's index is not appropriate unless you
are using non-quantitative data such as cDNA, or the overlap is so small that
you are interested in differences among the few shared sequences. Percent
overlap is much more robust to sampling depth because it includes the fraction
of the sample that is shared in the calculation. Morisita's index is similar
but also includes a scaling factor for each repertoire, so that low-level
sharing, which is highly sensitive to differences in sample size or sampling
depth, affects the metric less. Another way to address sample size differences
is to down-sample every sample to the size of the smallest one.
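
To make the contrast between a presence/absence metric and an abundance-weighted one concrete, here is a minimal Python sketch. It assumes each sample is a mapping from rearrangement to template count, and it uses the Morisita-Horn form of Morisita's index. The exact formulas used by the overlap tool are not stated here, so the function names, the sample data, and the formula choice are illustrative assumptions.

```python
def jaccard_index(a, b):
    """Shared unique rearrangements divided by all unique rearrangements.

    Template counts are ignored, which is why this metric tracks sampling depth.
    """
    keys_a, keys_b = set(a), set(b)
    union = keys_a | keys_b
    return len(keys_a & keys_b) / len(union) if union else 0.0


def morisita_horn(a, b):
    """Morisita-Horn overlap (assumed variant), weighted by clone frequency."""
    total_a, total_b = sum(a.values()), sum(b.values())
    if total_a == 0 or total_b == 0:
        return 0.0
    freq_a = {k: v / total_a for k, v in a.items()}
    freq_b = {k: v / total_b for k, v in b.items()}
    # Cross term: only shared rearrangements contribute.
    cross = sum(f * freq_b.get(k, 0.0) for k, f in freq_a.items())
    # Per-repertoire scaling factors (Simpson-style sums of squared frequencies).
    scale_a = sum(f * f for f in freq_a.values())
    scale_b = sum(f * f for f in freq_b.values())
    return 2 * cross / (scale_a + scale_b)


# Hypothetical samples: two dominant shared rearrangements, one rare private one each.
sample_a = {"rearr_1": 50, "rearr_2": 30, "rearr_3": 1}
sample_b = {"rearr_1": 40, "rearr_2": 35, "rearr_4": 1}

print("Jaccard:", jaccard_index(sample_a, sample_b))
print("Morisita-Horn:", morisita_horn(sample_a, sample_b))
```

In this toy data the rare private rearrangements pull Jaccard's index down as much as a dominant clone would, while the abundance-weighted metric is driven almost entirely by the two large shared rearrangements.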

Correlation, which is also an optional output of the overlap tool, is an
orthogonal measure. It describes how similar or dissimilar the frequencies of
shared clones are. One way to think about it is, "How similar are the
frequencies of clones in one sample to the frequencies of those same clones in
the other sample?"
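
The idea can be sketched in a few lines of Python. This is an illustration only: it assumes a Pearson correlation computed on the frequencies of the rearrangements found in both samples, which may not match the coefficient the overlap tool reports. The function name and the data are hypothetical.

```python
import math


def shared_frequency_correlation(a, b):
    """Pearson correlation of frequencies over rearrangements present in both samples.

    `a` and `b` map rearrangement -> template count.
    Returns None when fewer than two rearrangements are shared or when the
    shared frequencies do not vary in one of the samples.
    """
    shared = sorted(set(a) & set(b))
    if len(shared) < 2:
        return None
    total_a, total_b = sum(a.values()), sum(b.values())
    xs = [a[k] / total_a for k in shared]
    ys = [b[k] / total_b for k in shared]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)


# Hypothetical samples: three shared rearrangements keep the same rank and
# proportions, while sample_b is dominated by a private rearrangement.
sample_a = {"rearr_1": 10, "rearr_2": 20, "rearr_3": 30, "rearr_4": 5}
sample_b = {"rearr_1": 1, "rearr_2": 2, "rearr_3": 3, "rearr_5": 100}

print(shared_frequency_correlation(sample_a, sample_b))
```

Because only shared rearrangements enter the calculation, the correlation here is high even though most of sample_b is not shared at all, which is why correlation and overlap answer different questions.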

What are clonality and entropy and how are they different?

Clonality and Entropy are sample diversity metrics. Each is a single number that describes the characteristics of a sample repertoire.

Sample diversity includes two components:

Richness – how many unique receptor types are present in the sample repertoire.

Evenness – the extent to which one or a few receptor sequences dominate the sample repertoire.

Clonality and Entropy assess diversity in different ways, and both are provided in the immunoSEQ Analyzer. The Shannon entropy-based clonality index used in the Analyzer is an evenness metric ranging from 0 to 1. Values approaching 1 indicate a highly clonal repertoire in which a small number of rearrangements make up a large portion of all immune cells. Conversely, values approaching 0 indicate a repertoire in which every rearrangement is present at an identical frequency.
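
A minimal Python sketch of a Shannon entropy-based clonality index follows. It assumes the common definition of clonality as 1 minus entropy divided by the log of the number of unique rearrangements, which matches the 0-to-1 behavior described above; the Analyzer's exact implementation is not shown here, and the function names are illustrative.

```python
import math


def shannon_entropy(template_counts):
    """Shannon entropy in bits of a list of per-rearrangement template counts."""
    counts = [c for c in template_counts if c > 0]
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts)


def clonality(template_counts):
    """Assumed definition: 1 - H / log2(N), where N is the number of unique rearrangements.

    Undefined for fewer than two rearrangements; this sketch returns None.
    """
    n_unique = sum(1 for c in template_counts if c > 0)
    if n_unique < 2:
        return None
    return 1.0 - shannon_entropy(template_counts) / math.log2(n_unique)


even_small = [10, 10, 10, 10]   # 4 rearrangements, identical frequencies
even_large = [1] * 1024         # 1,024 rearrangements, identical frequencies
dominated = [97, 1, 1, 1]       # one rearrangement dominates

for name, counts in [("even_small", even_small),
                     ("even_large", even_large),
                     ("dominated", dominated)]:
    print(name, "entropy:", shannon_entropy(counts), "clonality:", clonality(counts))
```

The two even repertoires both have a clonality of 0, yet their entropies are 2 bits and 10 bits, because entropy grows with the number of unique rearrangements while clonality is normalized for it.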

One interpretation of clonality is as a measure of how focused the immune repertoire is on a set of antigens, where values closer to 1 indicate a more focused immune repertoire. The median clonality for TCRB in healthy adult PBMCs is 0.07. Entropy, by contrast, is not normalized for richness: it rises with the number of unique rearrangements as well as with how evenly they are distributed, so it does not report the evenness of the sample on its own.

Shannon’s entropy and clonality tend to be inversely correlated. Clonality is a normalized metric that is robust to sampling depth. Entropy, however, is not normalized for the number of unique rearrangements sequenced and is therefore very sensitive to sampling depth. Given this, entropy should only be used in situations where sampling depth is controlled. Here are two examples illustrating how entropy can be confounded by sampling depth:

1. Groups of samples being compared have different amounts of DNA input. This affects the number of TCRs sequenced and causes a batch effect that will confound analyses if the entropy metric is used.

2. Groups of tumors being compared have different amounts of T-cell infiltrate. This also affects the number of TCRs sequenced. Unlike the example above, the difference in the number of TCRs sequenced in this case is due to a real biological phenomenon, but it is still a confounding factor when using the entropy metric.

Because it is difficult to tell whether differences in entropy are driven by batch effects or by a real biological phenomenon, we recommend using clonality to quantify repertoire diversity.

From within your Analyzer workspace navigate to your
projects page and click the “New Project” button. Complete the project details
information, select the samples you’d like to include, and then follow the
instructions on the “immuneACCESS” tab. Click the “Save Project” button to
submit your project.

We will review your submission and then publish it on
immuneACCESS so it’s available to everyone.

We identify a rearrangement as productive if it is in-frame and does not contain a stop codon. For the CDR3, the rearrangement must be in-frame relative to the conserved cysteine and phenylalanine, even if changes to the N-D-N region mean the sequence no longer codes for these amino acids.
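
The rule above can be sketched in Python as follows. This is a simplified illustration, not the Analyzer's pipeline: it assumes the input is the CDR3 nucleotide sequence running from the first base of the conserved cysteine codon position through the last base of the conserved phenylalanine codon position, and the labels it returns are hypothetical names chosen for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def classify_frame(cdr3_nt):
    """Classify a CDR3 nucleotide sequence by reading frame.

    Assumes `cdr3_nt` spans the conserved cysteine position through the
    conserved phenylalanine position, so a length that is a multiple of 3
    means the two anchors are in frame with each other. The residues at the
    anchor positions are deliberately not checked.
    """
    seq = cdr3_nt.upper()
    if len(seq) % 3 != 0:
        return "out_of_frame"      # hypothetical label
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if any(codon in STOP_CODONS for codon in codons):
        return "stop_codon"        # hypothetical label
    return "in_frame"              # hypothetical label


def is_productive(cdr3_nt):
    """Productive means in-frame with no stop codon."""
    return classify_frame(cdr3_nt) == "in_frame"


# Made-up sequences for illustration.
print(classify_frame("TGTGCCAGCAGCTTT"))  # TGT GCC AGC AGC TTT
print(classify_frame("TGTGCCAGCAGCTT"))   # length not divisible by 3
print(classify_frame("TGTGCCTGAAGCTTT"))  # contains TGA in frame
```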

In the Analyzer, the data column, "frame_type," identifies each rearrangement as:

What do you mean by “rearrangement” and “template”? Are these sequencing reads? Are they clones?

Rearrangements and template counts are the basic metrics of immunosequencing.

A "rearrangement" is a unique nucleotide sequence resulting from V, D, and J gene rearrangement.

A "template" refers to a single input molecule that acts as a template for PCR amplification and sequencing.

The total number of sequencing reads found for each rearrangement is used to
infer the number of input molecules or templates with that sequence. For data
from genomic DNA, each productive template corresponds to a single immune
cell, and each rearrangement from genomic DNA will have a template count of 1
or more, depending on how many immune cells carry that rearrangement. We avoid
the term "clone" as it is often ambiguous whether a clone refers to a unique
rearrangement or the number of cells carrying that rearrangement.
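
To illustrate how the two metrics relate, here is a small Python sketch that summarizes a sample given as (rearrangement, template count) rows. The row layout, function name, and sequences are assumptions made for this example and do not reflect the Analyzer's export format.

```python
from collections import Counter


def summarize_sample(rows):
    """Summarize (rearrangement_nt, templates) rows.

    Returns the number of unique rearrangements, the total template count,
    and each rearrangement's frequency (its templates / total templates).
    Rows that repeat the same nucleotide sequence are merged.
    """
    templates = Counter()
    for rearrangement_nt, count in rows:
        templates[rearrangement_nt] += count
    total = sum(templates.values())
    frequencies = {seq: n / total for seq, n in templates.items()} if total else {}
    return {
        "unique_rearrangements": len(templates),
        "total_templates": total,
        "frequencies": frequencies,
    }


# Made-up rows: two distinct rearrangements, one listed twice.
rows = [("TGTGCCAGCAGCTTT", 3), ("TGTGCCAGCTCATTT", 1), ("TGTGCCAGCAGCTTT", 2)]
summary = summarize_sample(rows)
print(summary["unique_rearrangements"], summary["total_templates"])
```

Here the sample has 2 rearrangements and 6 templates, which for genomic DNA would correspond to 6 cells carrying 2 distinct receptor sequences.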