Abstract

The decrease of TCR diversity with aging has never been studied by direct methods. In this study, we combined high-throughput Illumina sequencing with unique cDNA molecular identifier technology to achieve deep and precisely normalized profiling of TCR β repertoires in 39 healthy donors aged 6–90 y. We demonstrate that TCR β diversity per 10⁶ T cells decreases roughly linearly with age, with significant reduction already apparent by age 40. The percentage of naive T cells showed a strong correlation with measured TCR diversity and decreased linearly up to age 70. Remarkably, the oldest group (average age 82 y) was characterized by a higher percentage of naive CD4+ T cells, lower abundance of expanded clones, and increased TCR diversity compared with the previous age group (average age 62 y), suggesting the influence of age selection and association of these three related parameters with longevity. Interestingly, cross-analysis of individual TCR β repertoires revealed a set of >10,000 of the most representative public TCR β clonotypes, whose abundance among the top 100,000 clones correlated with TCR diversity and decreased with aging.

Introduction

In humans, aging is associated with prominent decline of the adaptive immune response, caused by both altered functionality of aged T cells (1, 2) and overall decrease in naive T cell abundance and TCR diversity. The latter results from thymus involution (3, 4); stochasticity, selectivity, and exhaustion of peripheral proliferation of naive T cells (5–9); and expansion of effector memory clones that take over the homeostatic space (6, 10–13).

It is well acknowledged that high TCR repertoire diversity is a necessary prerequisite for an effective adaptive immune response against new Ags (6). Loss of naive T cells and constriction of the TCR repertoire result in impaired immunity to viral and bacterial infections (6, 14–16), poor response to vaccination (17, 18), poor immune system recovery after chemotherapy and hematopoietic stem cell transplantation (19–21), and poor control of autoreactive T clones and autoimmunity (22–24).

The phenomenon of aging-associated TCR repertoire constriction, although widely accepted, has never been studied by direct methods. Oligonucleotide hybridization assays (25) and spectratyping analysis (26) have suggested that early and midadulthood are not associated with significant contraction of the TCR repertoire and that this occurs only after the age of 75 (25, 27).

In this study, we have used an advanced deep TCR β sequencing approach coupled with a sample normalization strategy that allowed us to directly quantify and compare T cell repertoire diversity across samples obtained from numerous individuals of different ages. We achieved deep and accurate profiling of individual TCR β CDR3 repertoires by performing unique barcoding of cDNA molecules (28), followed by paired-end Illumina sequencing and rational bioinformatic analysis of the output data for precise sample normalization (29, 30). This approach enabled us to track age-related changes in the human TCR repertoire with unprecedented accuracy, revealing key parameters of adaptive immunity aging and providing high-quality reference data for future studies of adaptive immunity in health and disease.

Materials and Methods

Sample collection

This study was approved by the local ethics committee and conducted in accordance with the Declaration of Helsinki. All donors were informed of the final use of their blood and signed an informed consent. Ten milliliters of peripheral blood was obtained from 39 systemically healthy Caucasian donors from Russia, aged 6–90 y old (Table I). Peripheral blood was collected into EDTA-treated Vacutainer tubes (BD Biosciences, Franklin Lakes, NJ). PBMCs (at least 7 × 10⁶/sample) were isolated by Ficoll-Paque (Paneco, Russia) density-gradient centrifugation. Total RNA (at least 6 μg/sample) was isolated using Trizol reagent (Invitrogen, Carlsbad, CA), according to the manufacturer’s protocol.

cDNA synthesis

First-strand cDNA was synthesized using the Mint cDNA synthesis kit (Evrogen, Russia), according to the manufacturer’s protocol (Supplemental Fig. 1). RNA (1.5 μg) was collected per 15-μl reaction volume, with a total of four to six tubes per sample, such that all RNA was used for each blood sample. Synthesis was primed with the BC_R4_short oligonucleotide (5′-GTATCTGGAGTCATTGA-3′), which is specific to both gene variants of the human TRBC segment. To denature RNA and anneal the priming oligonucleotide, the RNA was incubated at 70°C for 2 min and then at 42°C for 2 min. The SmartNNNNa 5′-adaptor (5′-AAGCAGUGGTAUCAACGCAGAGUNNNNUNNNNUNNNNUCTTrGrGrGrG-3′), which carries a molecular identifier (12 random “N” nucleotides) and dU nucleotides (U), was added for the template switch. This reaction was carried out at 42°C for 130 min, with 5 μl IP solution added after the first 40 min. Products of cDNA synthesis were treated with fresh uracil-DNA glycosylase (New England Biolabs, Ipswich, MA) to degrade the SmartNNNNa adaptor (5 U/15 μl reaction) at 37°C for 30 min. To capture the maximum number of input cDNA molecules, the whole volume of the treated cDNA synthesis reaction was used for the first PCR.

First PCR

The first PCR amplification was performed using Encyclo PCR mix (Evrogen) with a pair of universal primers: M1SS (5′-AAGCAGTGGTATCAACGCA-3′) and BC2_uni_R (5′-TGCTTCTGATGGCTCAAACAC-3′). The PCR mixture contained 1× Encyclo polymerase buffer, 0.125 mM of each 2′-deoxynucleoside 5′-triphosphate, 10 pmol of each primer, 1 μl Encyclo polymerase mix, and 3 μl of undiluted first-strand cDNA/50 μl reaction volume. The reaction was performed on an ABI 9700 Thermal Cycler with a gold block in multiple 0.2 ml PCR tubes, such that all cDNA was used (up to 32 tubes/sample), for 18 cycles with the following temperature regimen: 94°C for 10 s, 62°C for 20 s, and 72°C for 30 s.

Second PCR

Sample products from the first PCR amplification were combined into one tube and mixed. A 100-μl aliquot of undiluted PCR product was purified by QIAquick PCR purification kit (Qiagen) and eluted in 20 μl TE buffer. For the second PCR amplification, we used the M1S primer ((N)2–4(XXXXX)CAGTGGTATCAACGCAGAG; XXXXX is a sample barcode, and (N)2–4 are random nucleotides that generate diversity for better cluster identification on the Illumina sequencer) on the 5′-end of the library and J-β primer mix (31) on the 3′-end. One microliter of purified product from the first PCR was used per 50 μl PCR, with four tubes for each sample, for 7–11 cycles with the following temperature regimen: 94°C for 10 s, 62°C for 20 s, and 72°C for 30 s.
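As an illustration of how the 12-nt molecular identifier could be recovered from a sequencing read, the sketch below searches for the constant adaptor sequence shared by the SmartNNNNa adaptor and the M1S primer and reads out the random positions. It assumes that the dU positions of the adaptor appear as T in the amplified product and that the read is given in the adaptor's orientation. The function name and the example read are hypothetical and are not part of the published pipeline.

```python
import re

# Constant adaptor sequence (3' part of the M1S primer plus one T), followed by the
# NNNN-U-NNNN-U-NNNN-U-CTT layout of SmartNNNNa. Assumption: each dU reads as T.
UMI_PATTERN = re.compile(
    r"CAGTGGTATCAACGCAGAGT([ACGT]{4})T([ACGT]{4})T([ACGT]{4})TCTT"
)


def extract_umi(read):
    """Return the 12-nt molecular identifier found in `read`, or None if absent."""
    match = UMI_PATTERN.search(read)
    if match is None:
        return None
    return "".join(match.groups())


# Hypothetical read: 2 random nt, 5-nt sample barcode, adaptor, identifier, insert.
example_read = (
    "GT" + "ACGTA" + "CAGTGGTATCAACGCAGAGT"
    + "AACC" + "T" + "GGTT" + "T" + "ACGT" + "T" + "CTTGGGG" + "ATCGATCG"
)
print(extract_umi(example_read))
```

Reads in which any identifier position is an uncalled base (N) do not match the pattern and are returned as None, which is one simple way to discard them.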

Sequencing

The PCR product concentration in each library was determined using a QuBit fluorometer (Invitrogen). PCR products of 9–20 samples from different donors were mixed together in an equal ratio. To avoid experimental bias, donors of different age were randomly distributed across the three lanes. Illumina adapters were ligated according to the manufacturer’s protocol. The libraries were analyzed using three Illumina HiSeq 2000 lanes, with 100 + 100-nt paired-end sequencing with Illumina primers.

Raw sequencing data analysis

Preliminary analysis showed that at least 3 × 10⁶ CDR3-containing paired-end sequencing reads were obtained for each sample. From those, we randomly selected 1 × 10⁶ high-quality sequencing reads so that only one read was present with a given unique molecular identifier. No binning of CDR3 sequences with identical molecular identifiers was performed, and reads with duplicate molecular identifiers were filtered out. Thus, for each donor we obtained sequences of 1 × 10⁶ unique TCR β cDNA molecules. To check for possible effects of this random selection, we performed a repetitive sampling of molecular identifiers that yielded no change in our overall results, including sample diversity (data not shown). Further analysis (final CDR3 identification, clonotype clusterization, and correction for reverse transcription, PCR, and sequencing errors) was performed as previously described (29) using our MiTCR software (30) (http://mitcr.milaboratory.com/). The threshold on sequencing quality for each nucleotide within the CDR3 region was set as >Q25 (phred). To eliminate a maximal number of accumulated errors, we employed the strictest “eliminate these errors” correction algorithm, which eliminates 98% of artificial CDR3 clonotypes, including variants that arise from rare errors in minor clonotypes. At the same time, this algorithm loses <2% of input TCR β diversity, according to control experiments and in silico modeling.
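The selection of one read per molecular identifier, followed by sampling a fixed number of identifiers, can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: reads are assumed to be already quality-filtered and given as hypothetical (identifier, CDR3) pairs, and the function name and seed parameter are introduced here for illustration only.

```python
import random


def normalize_by_umi(reads, n_molecules, seed=0):
    """Keep one randomly chosen read per identifier, then sample n_molecules of them.

    reads: iterable of (umi, cdr3) pairs; returns a list of n_molecules CDR3 strings.
    """
    rng = random.Random(seed)
    shuffled = list(reads)
    rng.shuffle(shuffled)  # so the read kept for each identifier is a random one

    one_per_umi = {}
    for umi, cdr3 in shuffled:
        # Reads sharing an identifier are not merged; only the first one seen is kept.
        one_per_umi.setdefault(umi, cdr3)

    if len(one_per_umi) < n_molecules:
        raise ValueError("fewer unique molecular identifiers than requested")

    chosen = rng.sample(sorted(one_per_umi), n_molecules)
    return [one_per_umi[umi] for umi in chosen]


reads = [("U1", "CASSA"), ("U1", "CASSA"), ("U2", "CASSB"),
         ("U3", "CASSA"), ("U3", "CASSC")]
print(normalize_by_umi(reads, 2, seed=1))
```

Repeating the call with different seeds mimics the repetitive sampling of molecular identifiers used to check that the random selection does not affect the results.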

Lower-bound estimate of total TCR β CDR3 diversity in blood

The total number of unique TCR β clonotypes was estimated as in Ref. 32, using a nonparametric unseen species model as described previously (33). Briefly, the Poisson process assumption, which relates the expected number of times a species is going to appear in the sample (x_s) and the sampling depth (t), leads to an estimator for total number of species in the form of an infinite series (written here in the standard Efron–Thisted form):

$$S(t) = S_{\mathrm{obs}} + \sum_{x=1}^{\infty} (-1)^{x+1}\, t^{x}\, n_{x}$$

where S_obs is the number of species observed in the sample and n_x is the number of species found exactly x times. Unfortunately, these series begin to severely oscillate with increasing sampling depth, which renders it impossible to estimate total diversity. This problem can be dealt with using Euler transformation, which forces series to converge rapidly. We therefore calculated the first x_0 terms in modified series, where x_0 is the largest number that yields a coefficient of variation for estimating S(t) that is <0.1. This provides a tradeoff between bias and variance, as suggested by Efron et al. (33).
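For orientation, the Euler-transformed series can be sketched in a few lines. The sketch uses a commonly quoted form of the Efron–Thisted estimator, in which the term for species seen x times is weighted by the probability that a binomial variable with x_0 trials and success probability 1/(1 + t) is at least x. Several points are assumptions made for illustration: t is taken as the additional sampling effort in units of the original sample size, x_0 is passed in directly rather than chosen from the coefficient of variation, and the function names are hypothetical. It is not the implementation used in this study.

```python
from math import comb


def binomial_tail(k, p, i):
    """P(Y >= i) for Y ~ Binomial(k, p)."""
    return sum(comb(k, j) * p**j * (1 - p) ** (k - j) for j in range(i, k + 1))


def expected_new_species(freq_of_freq, t, x0):
    """Smoothed estimate of the number of species not yet observed.

    freq_of_freq: dict mapping x to n_x (number of species seen exactly x times)
    t: additional sampling effort relative to the original sample (assumption)
    x0: number of series terms retained
    """
    p = 1.0 / (1.0 + t)
    total = 0.0
    for x in range(1, x0 + 1):
        n_x = freq_of_freq.get(x, 0)
        # Alternating term (-1)^(x+1) * t^x * n_x, damped by the binomial tail weight.
        total += -((-t) ** x) * binomial_tail(x0, p, x) * n_x
    return total


# Toy spectrum: 10 clonotypes seen once, 4 seen twice.
print(expected_new_species({1: 10, 2: 4}, t=1.0, x0=2))
```

Adding the returned value to the number of clonotypes actually observed gives an estimate of the diversity expected at the greater sampling depth.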

Abs and flow cytometry

An aliquot of PBMC from each sample was stained with Abs for flow cytometry analysis. The following anti-human Abs were used: CD3-PC7 (clone UCHT1; eBioscience), CD27-PC5 (clone 1A4CD27; Invitrogen), CD4-PE (clone 13B8.2; Beckman Coulter), CD45RA FITC (clone JS-B3; eBioscience). Cells were incubated with Abs for 20 min at room temperature and washed twice with PBS. Flow cytometry was performed with the Cytomics FC 500 (Beckman Coulter), with data analysis carried out using the Cytomics RXP Analysis program (Beckman Coulter).

Results

Normalization of samples using unique molecular identifiers

Blood samples from different donors, even sequential blood samples from the same donor, may contain different numbers of T cells. Furthermore, none of the TCR gene library preparation and sequencing procedures can be made absolutely uniform, regardless of whether a genomic DNA-based (32, 34) or cDNA-based (29, 31, 35) technique is used. Therefore, one can never be sure of the quantity of T cells that are efficiently covered by the output of sequencing analysis of large TCR libraries. This makes it impossible to perform accurate and reproducible deep sequencing-based comparison of relative TCR diversity in blood samples obtained from different donors or at different times.

To overcome this basic limitation, we combined a cDNA-based protocol for preparing quantitative TCR β libraries (31, 35) with a unique molecular barcoding technique similar to that reported in Ref. 28. In this technique, the template-switching effect (36, 37) is used to introduce a 5′-adaptor that carries 12 random nucleotides. As a result, each synthesized cDNA molecule is specifically labeled with one of 4¹² (>16.7 million) unique identifier variants. Such molecular identifiers allow robust estimation of the number of cDNA templates in a deeply sequenced library, which cannot be accurately deduced from read data alone (38). The whole TCR β library was further amplified using universal primers (see Materials and Methods and Refs. 29 and 31) and analyzed using paired-end Illumina sequencing, which provided information about both the TCR β CDR3 sequence and the unique molecular identifier of the starting cDNA molecule (Supplemental Fig. 1).

This method allows normalization of any two or more libraries, even if they were obtained from different numbers of T cells and with different sequencing coverage, by analyzing equal numbers of unique cDNA molecules labeled with different molecular identifiers. Because each read represents a distinct cDNA molecule, such normalization eliminates systematic library preparation biases, resulting in a prominent reduction of variance between samples and a notable increase in detectable TCR β diversity per fixed number of analyzed sequencing reads (Fig. 1). This normalization also reduces biases within the sample; because of the stochastic and biased nature of PCR and sequencing, cDNA variants with the same abundance may have sequencing read counts that differ by severalfold. Sampling equal numbers of reads will therefore lead to some variants being overrepresented and some lost, especially the low-abundance variants that primarily contribute to sample diversity. In contrast, extraction of sequences with unique cDNA identifiers allows uniform sampling and accurate comparison of the samples.

Model experiments demonstrated that the average efficiency of the template-switch technique that we used is about one molecular event per two T cells. For example, deep sequencing analysis of libraries prepared from ∼2 × 10⁴ or 1 × 10⁶ T cells produced output data comprising ∼1 × 10⁴ or 5 × 10⁵ uniquely barcoded variants, respectively. Therefore, each analyzed TCR β cDNA labeled with a unique molecular identifier is equivalent to a single randomly captured T cell, and 1 million such molecular events is equivalent to 1 million T cells randomly captured from a 10 ml peripheral blood sample.

This powerful tool for normalizing analyzed sequencing datasets enabled us to perform deep and accurate comparison of TCR β CDR3 diversity between samples with sufficient but not necessarily equal numbers of T cells or levels of sequencing coverage. Furthermore, this approach minimized the impact of individual donor blood characteristics or unavoidable bottlenecks and biases during blood sampling, library preparation, and sequencing.

For each donor, a molecular-barcoded cDNA-based TCR β library was generated and analyzed by deep paired-end Illumina sequencing. We processed the output for normalization via unique molecular identifiers, with further clonotype clustering and error correction using MiTCR software (30) as described in Materials and Methods. The resulting normalized datasets, which are available at http://mitcr.milaboratory.com/datasets/, contained full quantitative information on the clonal composition and diversity of TCR β CDR3 repertoires per 10⁶ T cells from each donor.

The observed diversity per 10⁶ T cells decreased almost linearly with age (R = −0.75, p = 4 × 10⁻⁸), ranging from 7.8 × 10⁵ clonotypes for the youngest down to 1.3 × 10⁵ clonotypes for one of the aged donors. We observed a significant decrease in the number of unique TCR β clonotypes between young and middle-age (p < 0.01, 1.3-fold) and between middle-age and aged (p < 0.01, 1.6-fold) groups. However, we observed a nonsignificant increase in TCR diversity between the aged and long-lived groups (Fig. 2A, Table I).

FIGURE 2. Age-dependent trends in TCR repertoire diversity. (A) The number of unique TCR β CDR3 clonotypes per 10⁶ T cells is shown as a function of donor age. Observed diversity declines roughly linearly with age (R = −0.75). The trend is significant (p < 0.0001; Kruskal–Wallis test). **p < 0.01; two-tailed t test. (B) Species accumulation curves for TCR β clonotypes for all four age groups. We analyzed subsets of a sample of 10⁶ cDNA molecules in increments of 1 × 10⁵. Plots show the average number of unique clonotypes obtained from random samplings (n = 5) of cDNA molecules from each donor. (C) Estimates of the lower bound of total TCR β diversity in human donors. See Materials and Methods for details.

Notably, analysis of 1 × 10⁶ randomly chosen CDR3-containing sequencing reads without using molecular identifiers distinguished poorly between different age cohorts (p > 0.05, two-tailed t test for age groups 1 and 2). This is most likely because group 1 is characterized by the most diverse repertoire, whereas read-based data generally underestimate sample diversity (see Fig. 1, Supplemental Fig. 2A and below).

We analyzed absolute T cell counts in blood for selected donors, and observed no significant age-related changes (data not shown), in agreement with previous data showing either no significant change or only a marginal decrease in either absolute count or proportion of CD3+ cells with age (39–41). Thus, we believe that the observed changes in the TCR β repertoire generally refer both to the diversity per 1 million T cells and to the absolute T cell counts, albeit with a possible bias in the latter due to individual variance in T cell counts in the blood.

Estimating the lower bound for total individual TCR β CDR3 diversity

Analysis of species accumulation curves (Fig. 2B) showed that the number of TCR β clonotypes detected in each donor sample increased proportionally with the number of cDNA molecules analyzed, with an almost direct relation. The species accumulation curves were similar within age-groups but differed between them. For young individuals, each additional 100,000 cDNA molecules analyzed consistently yielded ∼60,000 additional TCR β CDR3 clonotype variants. In general, most curves remained far from saturation, confounding estimation of total TCR β diversity.
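A species accumulation curve of this kind can be computed by repeatedly shuffling the normalized list of cDNA molecules and counting the distinct clonotypes seen at fixed depth increments. The sketch below is a minimal illustration: the input format (one clonotype label per unique cDNA molecule), the function name, and the seed parameter are assumptions introduced here rather than details of the published analysis.

```python
import random


def accumulation_curve(molecules, step, n_repeats=5, seed=0):
    """Mean number of unique clonotypes at depths step, 2*step, ... over random orders.

    molecules: list with one clonotype label per unique cDNA molecule.
    Returns a list of (depth, mean_unique_clonotypes) pairs.
    """
    rng = random.Random(seed)
    depths = list(range(step, len(molecules) + 1, step))
    totals = [0] * len(depths)

    for _ in range(n_repeats):
        order = list(molecules)
        rng.shuffle(order)
        seen = set()
        idx = 0
        for position, clonotype in enumerate(order, start=1):
            seen.add(clonotype)
            if idx < len(depths) and position == depths[idx]:
                totals[idx] += len(seen)  # unique clonotypes at this depth
                idx += 1

    return [(d, total / n_repeats) for d, total in zip(depths, totals)]


# Toy sample: 10 molecules, all from different clonotypes, read out every 5 molecules.
print(accumulation_curve(list(range(10)), step=5))
```

With 10⁶ molecules per donor, a step of 10⁵ and five repeats, this corresponds to the sampling scheme described for Fig. 2B.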

To estimate the lower bound of total TCR β diversity for the studied donors, we applied an unseen species model (see Materials and Methods). For the young cohort (group 1), the estimated lower bound on total TCR β diversity was ∼7 × 10⁶ different clonotypes, and this estimate declined to ∼4 × 10⁶ for middle-aged individuals. In group 3, the lower bound was ∼2.4 × 10⁶, significantly lower than that of group 2 (1.6-fold, p = 0.001) (Fig. 2C).

It should be noted that this is only the lower bound estimate, which would be predicted to increase with greater sampling depth. To verify this, we further analyzed two independent replicas consisting of 1 × 10⁶ unique cDNA reads each from two donors from our study, aged 25 and 87, whose blood samples contained 44 and 15% naive T cells, respectively (starting from the same blood draw, cells separated at the level of purified PBMC). These additional replicas, which were added to the full analysis pipeline starting from joint reads, increased the directly observed diversity 1.91- and 1.74-fold, and increased the lower-bound estimate for total TCR β diversity 2.3 ± 0.4- and 2.4 ± 0.1-fold for young and long-lived donors, respectively, confirming that determination of the lower bound of total TCR β diversity in an individual based on a sample of 1 × 10⁶ sequenced T cells is still essentially an underestimate.

Expanding clones occupy homeostatic space

Expanded TCR β clonotypes, defined as those that constituted >0.001, >0.01, or >1% of 10⁶ analyzed T cells, consistently occupied significantly greater homeostatic space in the peripheral blood in an age-dependent fashion when we compared young, middle-aged and aged donors (groups 1, 2, and 3). The percentage of T cells represented by single-cell clonotypes decreased from ∼55% down to 24% in these cohorts, with significant inverse correlation with age. Notably, these trends did not extend to group 4, which was characterized by a lower abundance of expanded clonotypes and higher prevalence of low-frequency clonotypes compared with group 3 (Fig. 3).

FIGURE 3. Clonal space homeostasis. The percent of clonal space occupied by clones of a given type (classified by size) is provided for the four age groups. The correlation of the percentage occupied by clones of a given type with age is indicated. ***p < 0.0001, **p = 0.001.
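The share of clonal space occupied by clones of different sizes can be computed directly from the clone sizes in a normalized sample. In the sketch below, the input is assumed to be a list of clone sizes (unique cDNA molecules per clonotype); the function name and the way thresholds are passed (as percentages of all analyzed molecules) are hypothetical choices for illustration.

```python
def clonal_space(clone_sizes, thresholds=(0.001, 0.01, 1.0)):
    """Percentage of all analyzed T cells that belong to clones of a given type.

    clone_sizes: number of unique cDNA molecules observed for each clonotype.
    thresholds: clone frequency cut-offs, in percent of all analyzed molecules.
    """
    total = sum(clone_sizes)
    # Clonotypes represented by exactly one molecule ("single-cell" clonotypes).
    result = {"single_cell": 100.0 * sum(1 for s in clone_sizes if s == 1) / total}
    for thr in thresholds:
        occupied = sum(s for s in clone_sizes if 100.0 * s / total > thr)
        result[f">{thr}%"] = 100.0 * occupied / total
    return result


# Toy sample: 50 single-cell clonotypes and one clone of 50 cells.
print(clonal_space([1] * 50 + [50]))
```

In this toy sample, half of the space is held by single-cell clonotypes and half by a single expanded clone, which is the kind of shift toward expanded clones described for the older groups.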

To estimate the correlation between relative abundance of naive T cells, donor age, and observed diversity of the TCR β repertoire, we measured percentages of naive CD45RAhigh/CD27high T cells (42) in blood samples from the same 39 donors using flow cytometry (Fig. 4A).

FIGURE 4. Aging and naive T cells. (A) Flow cytometric gating of naive T cells. PBMCs were analyzed by multicolor flow cytometry. T cells, as defined by CD3 positivity and scatter characteristics, were further stained for CD45RA and CD27. Representative flow cytometric analyses of CD3-positive gates for donors from groups 1–4 are shown. Numbers indicate percentages of naive CD45RAhigh/CD27high T cells of all CD3-positive T cells. (B and C) Age-dependent trends in TCR repertoire diversity and naive T cell abundance. (B) Percentage of naive CD45RAhigh/CD27high T cells among total CD3+ cells plotted versus age. The trend is linear and statistically significant (R = −0.85, p = 1 × 10⁻⁸) for groups 1–3 (●) but does not extend to group 4 (○). **p = 0.0012. (C) The percentage of naive T cells in CD8+ (○, dashed line) and CD4+ (●, solid lines for groups 1–3 and for the average of group 4) subsets. The percentage of naive cells in the CD8+ subset declines linearly with age (R = −0.88). However, such a decline is observed only in groups 1–3 for the CD4+ subset (R = −0.75). A significantly greater percentage of naive T cells is observed in the CD4+ subset relative to the CD8+ subset in group 4. **p = 0.004; paired t test.

As expected, the percentage of naive T cells generally decreased with age. Performing linear regression over all four age groups yielded a correlation coefficient of R = −0.68. This, however, was greatly improved (to R = −0.85) by dropping group 4, which was characterized by a markedly higher percentage of naive T cells compared with group 3. Group 3 was also characterized by a significantly lower percentage of naive T cells compared with group 2 (p = 0.0012), whereas the difference between group 2 and group 4 was insignificant (p = 0.13) (Fig. 4B).

Notably, the observed TCR β diversity per 1 × 10⁶ T cells reflects age-dependent changes better than the percentage of naive T cells. When we computed the partial correlation (43) between age and observed sample diversity while controlling for naive T cell percentage, we obtained a significant R = −0.47 (p = 0.003). In contrast, we obtained a nonsignificant R = −0.03 (p = 0.86) for the partial correlation between age and naive T cell percentage while controlling for observed sample diversity.
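For readers unfamiliar with partial correlation, the sketch below computes the first-order partial correlation of two variables while controlling for a third, using the standard formula based on the three pairwise Pearson coefficients. Whether the analysis in this study used exactly this computation is an assumption; the function names and toy data are illustrative only.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def partial_correlation(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))


# Toy data: z is uncorrelated with x and y, so controlling for it changes nothing.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
z = [1, -1, -1, 1]
print(partial_correlation(x, y, z))
```

In the setting described above, x would be donor age, y the observed diversity, and z the percentage of naive T cells (or y and z swapped for the second comparison).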

Long-lived donors have high percentages of naive T cells in their CD4+ subset

Separate flow cytometry analysis of CD4+ and CD8+ subpopulations to determine percentages of naive CD45RAhigh/CD27high T cells (gating identical to Fig. 4A) revealed that the percentage of naive CD8+ T cells decreased linearly (R = −0.88, p < 1 × 10⁻⁸) with age. The percentages of naive CD4+ T cells also decreased linearly (R = −0.75) and with similar kinetics up to the age of 70 y. However, the long-lived donors of group 4 exhibited a 1.7-fold increased percentage of naive T cells in the CD4+ subset compared with group 3, a difference that was close to significant (p = 0.054). Moreover, in contrast to other age groups, we observed a significantly higher percentage of naive T cells in the CD4+ subset compared with the CD8+ subset in group 4 (3.6-fold, p = 0.004) (Fig. 4C). Higher percentages of naive CD4+ T cells correlate with the prevalence of low-frequency clonotypes (Fig. 3) and relatively high TCR β diversity (Fig. 2A, Table I) in group 4.

Age-dependent changes in relative TRBV and TRBJ usage frequency

We observed only subtle age-related trends for usage of most V and J gene segments. Still, there were several gene segments for which the frequency among clonotypes showed clear correlation with age (Supplemental Fig. 3). This tendency could be partially explained by the changes in the CD4/CD8 T cell ratio. In agreement with previous studies (44), we observed that the CD4/CD8 ratio gradually increases with age (data not shown). Because TRBV7-9 and TRBV13 were recently shown to be more characteristic of CD8+ T cells (45), a decrease in relative abundance of CD8+ T cells should result in lower usage of these TCR V β gene segments, and this is in accordance with our observations. Conversely, TRBV18 is more characteristic of CD4+ T cells (45), and we likewise observed increasing TRBV18 usage frequency with age (Supplemental Fig. 3). We did not detect differences between age groups in any other bulk parameters, such as average CDR3 length or number of added nucleotides.

Public clonotypes and aging

In recent years, it has become clear that public TCR sequences that are identical or near-identical across different individuals are prominently represented in the blood, as a result of convergent recombination and recombinatorial biases (46–51). In agreement with these observations, cross-analysis of normalized TCR β datasets for any two donors from our study revealed essential overlaps that constituted between 8,000 and 55,000 CDR3 variants with identical amino acid sequences and between 500 and 7,000 CDR3 variants with identical nucleotide sequences.

To obtain a list of the most common public TCR β clonotypes (representative public clonotypes, repPC), we cross-analyzed the top 100,000 clonotypes from each donor sample and selected those amino acid CDR3 variants that were detected in at least 6 of 39 donors (15% of donors). This yielded a repPC list of >10,000 clonotypes, which is available as Supplemental Table I. As expected (47, 52), repPC featured extremely short CDR3 sequences with a median length of 39 nt, compared with the median of 45 nt from all analyzed clonotypes from all of our donors, as well as a smaller number of added nucleotides (Fig. 6A, 6B). Although the J segment usage of repPC was indistinguishable from the average, there were several notable changes in V segment usage; TRBV12-4, TRBV5-1, TRBV7-2, and TRBV7-9 segments were overrepresented, whereas TRBV20-1 and TRBV29-1 segments were less represented in repPCs compared with the average usage among the top 100,000 clonotypes from our 39 donors (Fig. 6C). Interestingly, the number of repPC clonotypes observed among the top 100,000 clonotypes decreased with age across groups 1–3 (R = 0.58, p = 1 × 10⁻³; Fig. 6D) and correlated with the observed diversity of TCR β repertoire (Fig. 6E).
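The selection of representative public clonotypes amounts to counting, for each CDR3 amino acid sequence, the number of donors in whose top clonotypes it appears. A minimal sketch follows, assuming each donor repertoire is supplied as a list of hypothetical (CDR3 amino acid sequence, count) pairs; the function name and the toy sequences are illustrative and do not come from the study.

```python
from collections import Counter


def representative_public_clonotypes(donor_repertoires, top_n=100_000, min_donors=6):
    """CDR3 amino acid variants found among the top_n clonotypes of >= min_donors donors.

    donor_repertoires: one list of (cdr3_aa, count) pairs per donor.
    """
    donors_with_cdr3 = Counter()
    for clonotypes in donor_repertoires:
        top = sorted(clonotypes, key=lambda c: c[1], reverse=True)[:top_n]
        # A set, so that each donor contributes at most once per CDR3 variant.
        donors_with_cdr3.update({cdr3 for cdr3, _ in top})
    return {cdr3 for cdr3, n in donors_with_cdr3.items() if n >= min_donors}


# Toy example with three donors, top 2 clonotypes each, shared by at least 2 donors.
donor_1 = [("CASSL", 10), ("CASSP", 5), ("CASSQ", 1)]
donor_2 = [("CASSL", 3), ("CASSQ", 2), ("CASSX", 1)]
donor_3 = [("CASSQ", 9), ("CASSP", 8)]
print(representative_public_clonotypes([donor_1, donor_2, donor_3], top_n=2, min_donors=2))
```

The number of repPC clonotypes per donor is then the size of the intersection between this set and that donor's own top clonotypes.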

FIGURE 6. Characteristics of representative public clonotypes. (A and B) Distribution of CDR3 length (A) and added nucleotides (B) for the top 100,000 clonotypes from all 39 donors (gray line) compared with that of the 10,692 repPC clonotypes (black line). (C) Bias in average VJ pairing frequencies in public clonotypes was computed by subtracting the average pairing frequencies for the top 100,000 clonotypes in each of our 39 donors from that of the repPC clonotypes. (D) Age-dependent trends in the number of repPC clonotypes found among the top 100,000 clonotypes of each donor. (E) Correlation of the number of repPC clonotypes among the top 100,000 clonotypes with the observed TCR β diversity. R (age-independent) is the partial correlation of diversity and number of repPC clonotypes calculated while controlling for age variable. ***p < 0.001, **p < 0.01.

In silico CDR3 spectratyping vividly demonstrates that expanded clonotypes can skew the spectratype to various extents, such that they are in some cases essentially hidden (Fig. 7). As such, extrapolating spectratype data to whole repertoires, although a powerful and widely used approach (27, 53, 54), can lead to inaccurate estimations. In contrast, the direct identification of hundreds of thousands or millions of unique TCR β clonotypes using deep normalized profiling followed by strict correction of PCR and sequencing errors provides full information on clonal composition and relative sample diversity. This combination of technological advances takes our ability to analyze and understand adaptive immunity to a new level.

FIGURE 7. TRBV7 CDR3 in silico spectratyping. Representative spectratypes for donors from groups 1 and 4 are shown. Those clonotypes that occupy >0.1% of all T cells in a sample are shown in color.
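An in silico spectratype is a histogram of CDR3 lengths weighted by clonotype abundance, with large clones tracked separately so that their contribution to each peak can be seen. The sketch below is illustrative only: the input format (hypothetical (CDR3 nucleotide sequence, count) pairs for one V segment), the function name, and the optional total_cells parameter are assumptions.

```python
from collections import defaultdict


def in_silico_spectratype(clonotypes, total_cells=None, expanded_threshold=0.001):
    """Length profile of CDR3 sequences plus the list of expanded clonotypes.

    clonotypes: list of (cdr3_nt, count) pairs, e.g. all clonotypes of one V segment.
    total_cells: number of T cells in the whole sample; defaults to the sum of counts.
    expanded_threshold: frequency above which a clonotype is reported (0.001 = 0.1%).
    """
    if total_cells is None:
        total_cells = sum(count for _, count in clonotypes)

    profile = defaultdict(float)  # CDR3 length (nt) -> share of all T cells
    expanded = []
    for cdr3_nt, count in clonotypes:
        freq = count / total_cells
        profile[len(cdr3_nt)] += freq
        if freq > expanded_threshold:
            expanded.append((cdr3_nt, len(cdr3_nt), freq))
    return dict(profile), expanded


# Toy example: one dominant 39-nt clonotype and two small 45-nt clonotypes.
toy = [("A" * 39, 90), ("C" * 45, 5), ("G" * 45, 5)]
profile, expanded = in_silico_spectratype(toy, expanded_threshold=0.5)
print(profile, expanded)
```

In the toy example, the 39-nt peak consists entirely of one expanded clonotype, the kind of skewing that a length profile alone cannot distinguish from a diverse peak of the same height.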

We summarize and interpret our results as follows:

1. In human peripheral blood, the percentage of naive T cells in both CD4+ and CD8+ subsets decreases linearly with age, starting from ∼50 to 80% in childhood, with kinetics approximately equal to 0.75% of total homeostatic space lost per year. The percentage of naive T cells out of all T cells is decreased 4-fold by the age of 70 (Fig. 4B, 4C).

2. Observed TCR β diversity per 10⁶ randomly-captured human peripheral blood T cells correlates with the percentage of naive T cells (Fig. 5) and decreases roughly linearly with age, starting from ∼6 × 10⁵ clonotypes detected per 10⁶ T cells in childhood, with kinetics of ∼5 × 10³ fewer observed clonotypes per year. The decrease in observed TCR β diversity is significant by the age of 40 (Fig. 2A), and generally reflects age-dependent changes even better than the percentage of naive T cells. Given that the blood count of T cells remains generally stable with age (39–41), the observed diversity of TCR β variants per 10⁶ T cells can be considered as a measure of “efficient T cell diversity” of an individual, i.e., the probability of a successful meeting between a (naive) T cell and its target Ag.

3. It is very challenging to estimate or measure total TCR β diversity correctly due to the high heterogeneity in the frequency of naive T cells. During homeostatic proliferation, some naive T cells divide more efficiently, and some are gradually lost (5, 7, 8). Thus, at a given moment, a naive T cell clone can be ultimately represented by a single cell. Formally, such single-cell clones still add to the total TCR diversity, although this is probably physiologically nonsignificant. Here we have achieved a record depth of individual TCR β sequencing, but still it only represents up to two million T cells from a pool of almost 10¹² T cells. Thus it is mathematically impossible to extrapolate sequencing data to accurately measure true total TCR β diversity, and here we have only discussed the lower-bound estimate at the achieved depth of profiling, obtained using the method by Efron et al. (33). At a depth of 1 million analyzed T cells (unique cDNA), this estimate constitutes at least 7 × 10⁶ TCR β variants in childhood, decreasing linearly to an average of ∼2.5 × 10⁶ variants by old age (Fig. 2C). However, this essentially remains an underestimation, as the analysis of an additional one million T cell replicas from two donors of different age increased the lower-bound estimate for total TCR β diversity ∼2.3- to 2.4-fold, reaching at least 9 million variants in the 25-y-old donor. This estimate will probably increase further with greater sequencing depth. The degree to which the lower-bound estimate increased was consistent between the two donors of different ages, suggesting that such an estimate could be potentially employed for a rough comparative analysis of relative T cell diversity in individuals.
However, only extradeep sequencing of sorted naive T cells will help to accurately answer the question of total naive TCR β diversity and its changes over the course of the aging process, since bulk approaches imply that different numbers of naive T cells are being analyzed from donors of different ages, which largely determines the observed diversity.

4. Expanded clones occupy increasing amounts of homeostatic space with age (Fig. 3), as described previously (10). This is detrimental to the adaptive immune response, as less space remains for naive T cells, given that the total blood count of T cells is generally stable. Overall diversity is mostly determined by the number of rare clonotypes, and we did not observe significant age-related changes in the diversity of non-rare clonotypes (those with more than one T cell per million, R = 0.22, p = 0.2). Cumulatively, these data may indicate that naive T cell clones are being lost with age, and that existing Ag-experienced T cell clones become expanded to a greater extent, whereas relatively minor numbers of new Ag-experienced clones emerge. However, in the absence of analysis of purified, phenotypically defined T cell subsets, it is difficult to conclusively compare homeostatic changes within and between compartments. It should also be noted that the CMV infection status of the individuals in our study is not known, which could be important due to the known influence of CMV upon absolute numbers of highly-differentiated T cells (55, 56). Moreover, because Abs to CMV are detected in a majority of donors over the age of 40 (57), we expect that we would not be able to perform accurate comparative analysis even with these data, as only a small number of elderly donors would be CMV-negative. Further studies using deep profiling of TCR repertoires with sorted naive and Ag-experienced populations of T cells from sufficient cohorts of donors of different ages with known CMV status should reveal the exact changes in the Ag-experienced and naive subsets associated with aging. We believe that the normalization approach described in this study, based on the use of unique molecular identifiers, should greatly facilitate such studies.

5. Increased percentages of naive T cells in the CD4+ T cell subset, higher prevalence of low-frequency clonotypes, and slightly higher TCR β diversity are observed in donors of group 4 (average age 82 y) compared with group 3 (Figs. 2–4, Table I). A plausible explanation is that because the “long-lived” individuals in group 4 have passed a certain threshold of age selection, their physiological parameters might differ noticeably from those of the general population. This implies that we are observing the effects of selection bias and a possible association of increased percentages of naive T cells in the CD4+ subset with longevity. Such an association could be explained by the improved immunoregulatory function of a more diverse CD4+ T cell subset in these individuals, which restrains the generalized inflammation that typically increases with age (58) and provides a better-balanced immune response. Our explanation differs from that of Ferrando-Martinez and colleagues (59), who suggested that the dysfunctional aged thymus could be generally biased toward naive CD4 T cell production. It is clear from Fig. 4C that, before the age of 70, percentages of naive T cells decrease with similar kinetics in CD4+ and CD8+ subsets. Therefore, we believe that it is age selection rather than general trends that determines the relatively high percentage of naive T cells within the CD4+ subset that we observed in our oldest donors.

6. Interindividual TCR sharing in humans is a well-documented phenomenon and is thought to play an important role in the efficacy of Ag responses (49–51, 60). In this study, we have extracted a list of the 10,692 most representative public TCR β clonotypes that are widely shared (present among the top 100,000 expanded clonotypes) among at least 6 of our 39 donors (i.e., ≥15%). These repPC clonotypes display CDR3 properties suggesting closeness to germline: short CDR3 length (median length of 39 nt, compared with 45 nt for all clonotypes) and low numbers of added nucleotides (Fig. 6A, 6B, Supplemental Table I). Notably, the number of repPC clonotypes decreased with age, displaying a pattern similar to that observed for the percentage of naive T cells and sample diversity (Fig. 6E, 6F). One possible explanation is that a sizeable portion of high-frequency public clonotypes is represented by naive T cells whose TCR β CDR3 sequences are characterized by low complexity (i.e., small numbers of added nucleotides) and are therefore repetitively produced and highly represented due to convergent recombination (46, 50, 52).
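The selection criterion for representative public clonotypes described above (presence among the top 100,000 clonotypes of at least 6 donors) can be sketched as follows. This is a minimal illustration, not the code used in this study: the function name, the input layout (a mapping from donor to per-clonotype counts), and the use of a plain string as the clonotype key are all assumptions made for the example.

```python
from collections import Counter


def representative_public_clonotypes(repertoires, top_n=100_000, min_donors=6):
    """Return clonotypes found in the top_n clonotypes of >= min_donors donors.

    repertoires: dict mapping a donor identifier to a dict of
    clonotype -> count (hypothetical layout chosen for this sketch).
    """
    donor_hits = Counter()
    for counts in repertoires.values():
        # Rank clonotypes by abundance and keep the top_n most abundant.
        # Ties at the cutoff are broken by insertion order (sorted is stable).
        top = sorted(counts, key=counts.get, reverse=True)[:top_n]
        # Each donor contributes at most one hit per clonotype.
        donor_hits.update(top)
    return {clonotype for clonotype, k in donor_hits.items() if k >= min_donors}


if __name__ == "__main__":
    # Toy data with small thresholds (illustrative only).
    toy = {
        "donor1": {"A": 10, "B": 5, "C": 1},
        "donor2": {"A": 7, "C": 6, "B": 1},
        "donor3": {"B": 9, "D": 4, "A": 1},
    }
    print(sorted(representative_public_clonotypes(toy, top_n=2, min_donors=2)))
```

With the default parameters, the function applies the thresholds stated in the text. The toy call uses much smaller thresholds so that the behavior can be checked by hand: clonotypes A and B each appear in the top two clonotypes of two donors, whereas C and D each appear in only one.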

Disclosures

The authors have no financial conflicts of interest.

Footnotes

O.V.B., E.V.P., M.A.T., D.B.S., E.M.M., E.A.B., and I.Z.M. isolated PBMC, generated TCR β libraries, and performed flow cytometry analysis; M.S. and D.A.B. performed data analysis; S.L. and Y.B.L. cosupervised the project; and D.M.C. supervised the project, designed and interpreted all experiments, and wrote the paper.

This work was supported by the Molecular and Cell Biology Program of the Russian Academy of Sciences, Russian Foundation for Basic Research Grants 12-04-33139, 12-04-00229 (to D.M.C.), and 14-04-01247 (to E.M.M.), and European Regional Development Fund CZ.1.05/1.1.00/02.0068.