Analysis of HLA Frequencies in Population Cohorts for Design of HLA-Based HIV Vaccines

Frances E. Ward, Shiangtai Tuan, and Barton F. Haynes

Departments of Immunology, Information Technology (OIT), and Medicine, and the Duke Center for AIDS Research, Duke University Medical Center, Durham, North Carolina 27710

As HIV infection and AIDS continue to spread around the world, the need to develop protective HIV
vaccines is urgent. However, the most effective approaches to be taken remain uncertain. Recently,
evidence for protective immunity to HIV infection has been suggested by the observation of cellular and humoral anti-HIV immune responses in HIV-infected individuals who have not developed AIDS after about 10 years [1]. While the immune mechanisms that lead to protection from developing AIDS are incompletely understood, recent findings suggest that both cytotoxic T (CTL) and helper T (Th) cells may play important roles in anti-HIV protective immunity [2, 3]. Moreover, HLA genes, or genes linked to or associated with the HLA complex, have been suggested to predispose to either rapid-progressor status (AIDS within approximately 3 years of infection) or non-progressor status (no progression to AIDS after 10 years) [rev in 4].

HLA molecules present peptides to cytotoxic and helper T cells in a restricted manner; HLA-A, -B, and -C class I molecules generally present immunogenic peptides to CTL, and HLA class II molecules present immunogenic peptides to Th cells [rev in 5]. These data have given rise to the notion that the nature of the HIV peptides presented by host HLA molecules to CTL and Th cells may determine both the quality and the quantity of host anti-HIV cellular responses. Thus, one approach to the design of preventive HIV immunogens that would induce salutary anti-HIV immune responses would be to design HLA-based HIV vaccines following analysis of the HLA alleles present in the cohort to be immunized and of the most common HIV variants present in the geographic location of the cohort [6]. This article discusses some of the HLA-related issues relevant to the design of HLA-based HIV vaccines and suggests an analytic approach that might be taken if suitable HLA and HIV databases were established.

The human major histocompatibility complex (MHC), of which the HLA class I and class II genes are part, is characterized by the presence of several multigene families, extensive polymorphism at many loci, and significant linkage disequilibrium between alleles at particular loci. The extent of genetic polymorphism is still being determined, but currently at least 59, 126, and 36 different alleles are known for the HLA-A, -B, and -C loci, respectively [7]. Similarly, 135, 25, and 65 different alleles have been recognized for the respective HLA class II genes, HLA-DRB1, -DQB1, and -DPB1 [8]. In most populations, a few alleles are frequent (gene frequency >10%), but most occur at low frequency (<10%), and a number of the latter may be rare (gene frequency <1%). As is the case for other genetic polymorphisms, the frequency of HLA alleles differs among populations. An allele that is common in one population may be rare in another. Some alleles are limited to particular ethnic populations, while others are widely shared among ethnically distinct populations.

For example, the allele HLA-A36 is found only among individuals having African ancestry [9]; on the other hand, serologically defined HLA-A2 occurs rather frequently in most populations studied worldwide to date [10]. It should be noted, however, that 15 different subtypes of A2 have been identified by molecular methods [7], but the distribution of these variants in different populations is unknown. Using DNA sequencing analysis, recent population studies among South American Indian tribes have revealed new allelic variations, possibly the result of relatively recent mutations and positive selection pressures, that are limited to, or unique to, a particular tribal group [11, 12].

Linkage disequilibrium, a significant association between two alleles at different loci, has been well documented within the MHC [13]. The association of HLA-A1, -B8, and -DR3 in Caucasian populations is one of the strongest and most widely cited examples of this genetic phenomenon [14], and this HLA haplotype is associated with rapid progression to AIDS in HIV-infected subjects [15]. Many other examples of HLA linkage disequilibrium have been observed in a variety of populations, but the genetic mechanism(s) that maintain HLA linkage disequilibrium remain unknown.

Given that extensive polymorphism is known to exist within the HLA system and that additional genetic variation is likely to be revealed as more populations are studied using molecular methods, is it feasible to develop an HLA-based HIV vaccine? Must one develop a unique vaccine for each ethnically homogeneous population? Or will it be possible to design one that is effective in a heterogeneous population or in several distinct populations? The answers to these questions await appropriate field testing; in this review, we explore the feasibility of developing HLA-based HIV vaccines based on currently available data.

In this edition of the Los Alamos database, Korber et al. [16] have compiled an extensive HIV CTL and Th epitope database and summarized relevant HLA restriction elements. Together with the distribution of the HLA alleles in target populations, these data provide information essential to the design of HLA-based HIV vaccines. There are, however, obstacles to HIV vaccine development in general and to the design of HLA-based HIV vaccines in particular [rev in 17]. The obstacles include HIV variability, mutability, and the disparate geographic distribution of HIV variants. The HLA system provides an additional hurdle to the design of HLA-based HIV vaccines. Detailed HLA gene frequency distributions, incorporating the incidence of recently defined alleles, do not yet exist for most human populations. Nor have HLA restriction elements for HIV CTL epitopes been defined for recently identified and rare HLA alleles.

Results obtained in eleven International Histocompatibility Testing Workshops, particularly the most recent ones [18, 19, 20, 21], provide a valuable, although incomplete, HLA database for preliminary analyses for HLA-based vaccine immunogen design. Although data from several laboratories were pooled for the International Workshops, generally no attempt was made to obtain frequencies representative of the different ethnic groups; that is, individuals were not sampled randomly. However, more appropriate data may be available shortly for populations in the United States. Currently, federally funded projects to better define histocompatibility antigens in African Americans and Native Americans are underway, and a large amount of data collected by the National Bone Marrow Registry is being analyzed for the distribution of HLA gene frequencies in different populations. We hope that more complete information will be obtained for other populations in the near future.

Recognizing these imperfections in available data, we have used the International Histocompatibility Workshop data [19, 21], supplemented with published data from selected laboratories [22, 23], as the best currently available estimate of the frequencies of HLA alleles that have been shown to serve as restriction elements for HIV CTL epitopes [16]. Table 1 summarizes these frequencies for the four populations (African Americans, North American Indians, USA Caucasians, and Thais) that we have considered in our analysis of HLA-based peptide vaccines. Section II of the Los Alamos HIV epitope database of Korber et al. [16] lists the CTL epitopes by HLA restriction element. Using these two sets of data and the Hardy-Weinberg theorem [24], we have estimated the proportion of each of the four populations that would be predicted to present peptides to the immune system if a limited number of HIV epitopes were included in a vaccine designed specifically for that population. We have also made similar calculations for a vaccine designed to be immunogenic for all four populations. These results are presented in Table 2.
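
The kind of Hardy-Weinberg estimate described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not necessarily the exact calculation behind Table 2: it assumes Hardy-Weinberg proportions at each locus and independence between loci, so that an individual lacks every listed allele at a locus with probability (1 - summed gene frequency) squared. The function name and the example frequencies are hypothetical.

```python
def fraction_carrying(freqs_by_locus):
    """Expected fraction of individuals carrying at least one listed allele.

    freqs_by_locus maps a locus name to the gene frequencies of the listed
    alleles at that locus. Assumes Hardy-Weinberg proportions at each locus
    and independence between loci (no linkage disequilibrium).
    """
    lacking_all = 1.0
    for locus_name, freqs in freqs_by_locus.items():
        total = sum(freqs)
        if any(f < 0 for f in freqs) or total > 1.0:
            raise ValueError(f"invalid gene frequencies at locus {locus_name}")
        # Both chromosomes must lack every listed allele at this locus.
        lacking_all *= (1.0 - total) ** 2
    return 1.0 - lacking_all


# Hypothetical frequencies, for illustration only (not taken from Table 1).
example = {"A": [0.25], "B": [0.10, 0.05]}
print(round(fraction_carrying(example), 4))  # 0.5936
```

In this made-up example, one HLA-A allele at 25% and two HLA-B alleles summing to 15% would be carried, singly or together, by roughly 59% of individuals.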

The strategy we have used in this analysis is first to identify the most frequent restriction elements in the population under consideration for vaccination (or common to the 4 populations), then to identify peptides that are presented by more than one HLA allele, and finally to seek commonality between these two lists. Probability calculations then use the frequencies of the alleles common to both lists, supplemented by those of additional high-frequency alleles in the population. Alleles are added until the proportion of individuals in the population carrying one or more of the listed alleles reaches an acceptable level, greater than 90% in our examples. The aim is to maximize the summed gene frequency of the HLA alleles covered while minimizing the number of different HIV peptides to be included in an HIV immunogen. The next step is to choose the peptides associated with each restricting allele. In some instances only one peptide is associated with an allele, while in others multiple peptides are presented by the same allele.
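
One way to mechanize this strategy is a greedy selection, sketched below in Python. It is a swapped-in approximation rather than the procedure actually used for Table 2: at each step it adds the peptide whose restricting alleles raise the predicted coverage the most, and it stops once the target is reached or no peptide helps. Coverage is computed under Hardy-Weinberg proportions with independent loci. All peptide names, allele labels, and frequencies are hypothetical placeholders.

```python
def coverage(alleles, freq, locus):
    """Fraction carrying at least one allele in `alleles` (Hardy-Weinberg,
    loci treated as independent)."""
    per_locus = {}
    for allele in alleles:
        per_locus[locus[allele]] = per_locus.get(locus[allele], 0.0) + freq[allele]
    lacking = 1.0
    for total in per_locus.values():
        lacking *= (1.0 - total) ** 2
    return 1.0 - lacking


def pick_peptides(presented_by, freq, locus, target=0.90):
    """Greedily choose peptides until predicted coverage reaches `target`.

    presented_by maps a peptide name to the set of alleles that present it.
    Returns the chosen peptides and the coverage they achieve.
    """
    chosen, covered = [], set()
    while coverage(covered, freq, locus) < target:
        best = max(
            (p for p in presented_by if p not in chosen),
            key=lambda p: coverage(covered | presented_by[p], freq, locus),
            default=None,
        )
        current = coverage(covered, freq, locus)
        if best is None or coverage(covered | presented_by[best], freq, locus) <= current:
            break  # nothing left that improves coverage
        chosen.append(best)
        covered |= presented_by[best]
    return chosen, coverage(covered, freq, locus)


# Hypothetical data, for illustration only.
freq = {"A_x": 0.30, "A_y": 0.20, "B_x": 0.25, "B_y": 0.05}
locus = {"A_x": "A", "A_y": "A", "B_x": "B", "B_y": "B"}
presented_by = {"pep1": {"A_x", "B_x"}, "pep2": {"A_y"}, "pep3": {"B_y"}}
print(pick_peptides(presented_by, freq, locus, target=0.85))
```

A peptide presented by several alleles (here `pep1`) is naturally favored, which mirrors the preference for peptides presented by more than one HLA allele. A greedy choice is not guaranteed to give the smallest possible peptide set.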

The appropriate criteria for choosing which immunogenic epitopes should be included in a preventive HIV immunogen are not yet known, but possible criteria for consideration are listed below.

Peptides reported to be immunogenic in situations thought to reflect protection from retroviral infection or protection from retroviral-induced immunodeficiency disease (i.e., in non-progressors to AIDS).

Peptides presented to the immune system by HLA restricting elements reported to be associated with non-progression to AIDS [rev in 4].

Peptides reported to be presented by several disparate HLA class I allotypes.

For the four population cohorts considered in this analysis, as few as 2 and as many as 5 epitopes are required to achieve a theoretical protection level of at least 90% (Table 2). The different numbers of required epitopes reflect the relative amounts of HLA class I polymorphism observed in the different ethnic groups and the presentation of a peptide by multiple HLA class I molecules. Another factor is the limited number of HLA restriction element-HIV CTL epitope pairs that have been described. To date, HIV peptides have been associated only with HLA restriction elements that are infrequent in some populations. As more data are accumulated for other epitopes, some that are associated with higher-frequency restriction elements may be identified.

A comparison between the individual and combined populations (Table 2) demonstrates that one gains relatively little by including epitopes that are associated with low-frequency alleles. The proportion of individuals protected approaches 100% asymptotically, so that even adding epitopes associated with high-frequency alleles adds little to the proportion as this level is approached. This is illustrated by the North American Indians, for whom including 6 more epitopes, associated with 5 very-low-frequency alleles and one intermediate-frequency allele, in the combined theoretical vaccine added only 3.0% protection.
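
The diminishing return can be made concrete with a short Python sketch. It assumes, for simplicity, a single locus in Hardy-Weinberg proportions, where the fraction of individuals carrying at least one listed allele is 1 - (1 - S)^2 for summed gene frequency S. The function name and the numbers are hypothetical.

```python
def marginal_gain(summed_freq, new_freq):
    """Increase in the fraction of carriers when an allele of gene frequency
    `new_freq` is added to a list whose alleles already sum to `summed_freq`.

    Assumes one locus in Hardy-Weinberg proportions.
    """
    if summed_freq < 0 or new_freq < 0 or summed_freq + new_freq > 1.0:
        raise ValueError("gene frequencies must be non-negative and sum to at most 1")
    before = 1.0 - (1.0 - summed_freq) ** 2
    after = 1.0 - (1.0 - summed_freq - new_freq) ** 2
    return after - before


# The same hypothetical 5% allele, added early and added late.
print(marginal_gain(0.10, 0.05))
print(marginal_gain(0.60, 0.05))
```

Under these assumptions the same 5% allele adds about 8.75 percentage points when the list covers a summed frequency of 10%, but only about 3.75 points when the list already covers 60%.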

HLA-based HIV vaccine design is presently hampered by incomplete information regarding distributions of HLA alleles in different populations. Therefore, the vaccine design data presented here must be viewed with certain reservations. The HLA allele frequencies used in the calculations do not account for the many allelic variants that have been discovered recently, and although it is unknown at present whether all related variants can present the same HIV peptide, it seems unlikely. This is not a trivial problem, as illustrated in Table 3, where the number of currently recognized molecular variants is tabulated for those HLA class I alleles now known to present at least one HIV CTL epitope. Some of the more frequent alleles (e.g., HLA-A2, -B35, -B62) are known to include more than a dozen variants. The distribution of allelic variants in different populations is only beginning to be determined [26].

Estimates of the proportion of the population able to respond to an HLA-based vaccine may be further complicated by linkage disequilibrium between certain HLA-A, -B, and -C locus alleles and by departures from Hardy-Weinberg equilibrium. Use of HIV epitopes restricted by HLA alleles that are in strong linkage disequilibrium would reduce the proportion of individuals responding to the vaccine, because such alleles tend to be carried by the same individuals. In spite of deficiencies in our knowledge of HLA immunogenetics and of HIV epitope presentation to the immune system, a limited number of peptides appears to be sufficient for presentation to the majority of a population cohort.
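
The effect of linkage disequilibrium on such estimates can be illustrated with a two-locus Python sketch. It assumes one target allele at each of two loci, with gene frequencies p and q and disequilibrium coefficient D (the excess of the two-allele haplotype frequency over pq), and it assumes that haplotypes pair at random. The function name and numbers are hypothetical.

```python
def two_locus_coverage(p, q, d=0.0):
    """Fraction of individuals carrying at least one of two target alleles
    at different loci, given gene frequencies p and q and linkage
    disequilibrium coefficient d. Haplotypes are assumed to pair at random.
    """
    haplotypes = {
        "both": p * q + d,
        "first_only": p * (1.0 - q) - d,
        "second_only": (1.0 - p) * q - d,
        "neither": (1.0 - p) * (1.0 - q) + d,
    }
    if any(h < 0 for h in haplotypes.values()):
        raise ValueError("d is outside the range allowed by p and q")
    # An individual lacks both alleles only if both haplotypes carry neither.
    return 1.0 - haplotypes["neither"] ** 2


# Hypothetical frequencies: same alleles, without and with positive LD.
print(two_locus_coverage(0.20, 0.10, d=0.0))
print(two_locus_coverage(0.20, 0.10, d=0.05))
```

With these made-up values, positive disequilibrium lowers the predicted coverage from about 48% to about 41%, since the two alleles travel together on the same haplotypes more often than chance would predict.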

These analyses represent the initial considerations for the design of HLA-based HIV vaccines that might induce salutary anti-HIV Th and CTL responses. It is important to point out that the correlates of protective immunity to HIV are not yet known and, specifically, that the need for anti-HIV neutralizing antibodies for protective anti-HIV immune responses is controversial [rev in 4]. Nevertheless, the database of Korber et al. [16] provides a new and powerful tool with which to begin to address these complex issues.

Acknowledgments

This work was supported by grants AI135351 from the NIH and DAM D17-94-4467 from the Department of Defense.