Abstract

Our knowledge of pharmacogenetic variability in diverse populations is scarce, especially
in sub-Saharan Africa. To bridge this gap in knowledge, we characterised population
frequencies of clinically relevant pharmacogenetic traits in two distinct South African
population groups. We genotyped 211 tagging single nucleotide polymorphisms (tagSNPs)
in 12 genes that influence antiretroviral drug disposition, in 176 South African individuals
belonging to two distinct population groups residing in the Western Cape: the Xhosa
(n = 109) and Cape Mixed Ancestry (CMA) (n = 67) groups. The minor allele frequencies (MAFs) of eight tagSNPs in six genes (those
encoding the ATP binding cassette sub-family B, member 1 [ABCB1], four members of the cytochrome P450 family [CYP2A7P1, CYP2C18, CYP3A4, CYP3A5] and UDP-glucuronosyltransferase 1 [UGT1A1]) were significantly different between the Xhosa and CMA populations (Bonferroni
p < 0.05). Twenty-seven haplotypes were inferred in four genes (CYP2C18, CYP3A4, the gene encoding solute carrier family 22 member 6 [SLC22A6] and UGT1A1) between the two South African populations. Characterising the Xhosa and CMA population
frequencies of variant alleles important for drug transport and metabolism can help
to establish the clinical relevance of pharmacogenetic testing in these populations.

Introduction

The field of pharmacogenomics aims to utilise the genetic composition of an individual
to personalise therapeutic regimens and improve treatment outcomes. Most of the initial
examples of the clinical utility of pharmacogenomics were elucidated for cancer treatments.
Currently, however, there are more than 15 drugs used in the treatment of a variety
of chronic diseases, such as cardiovascular disease, HIV/AIDS and seizures, for which
the US Food and Drug Administration (FDA) recommends or requires pharmacogenomic testing
to prevent drug-related toxicity or improve drug efficacy [1]. The increase in the number and breadth of drugs for which pharmacogenetic tests
are recommended or required by the FDA is an indication of the important role that
genetics plays in predicting treatment outcomes.

In order for pharmacogenetic testing to have the most impact in as many people as possible,
it is important to understand which genetic variants are predictive of treatment outcomes
in diverse populations. Most pharmacogenetics studies to date have been conducted
in a limited number of population groups, most frequently in Western European and
North American Caucasians. As a result of these limitations, genotype-to-phenotype
correlates of drug response or toxicity for a number of drugs are clinically applicable
in relatively few treated individuals. Furthermore, pharmacogenetic profiles characterised
in Caucasians are often extrapolated for use and interpretation in other populations,
in spite of at least two major problems with this method. First, it is clear that
the population frequency of variants can differ markedly between Caucasians and
other populations. These differences in the population frequencies of variant alleles
affect the clinical utility of pharmacogenetic testing, which is more useful in
populations with a higher frequency of the variant allele than in populations in which
the variant allele is rare. Secondly, ethnically specific variants exist in non-Caucasian
populations which are more predictive of treatment outcomes than those identified
in Caucasians. For example, although the UGT1A1*28 polymorphism is predictive of toxicity to the anticancer drug, irinotecan, in Caucasians,
the UGT1A1*6 polymorphism is more predictive of irinotecan toxicity in Asians [2]. Such ethnically specific variation is not currently taken into account in most commercially
available pharmacogenetic tests or on FDA drug labels.

African populations are among the most genetically diverse in the world [3]. In spite of this diversity, very few pharmacogenetics studies have been conducted
in African populations. In fact, it is documented that there is inter-ethnic variability
in pharmacogenetic traits between African populations [4]. Although the International HapMap Project has included three African populations,
the Yoruba of Nigeria and the Maasai and the Luhya of Kenya, the population genetics
of these three groups cannot represent the remaining populations in West and East
Africa, or other populations living in Southern Africa. In order to achieve the goal
of personalised medicine and individualisation of therapy in Africa, it is important
to study pharmacogenetic traits carefully and systematically in as many distinct African
population groups as possible.

To bridge gaps in pharmacogenetic mapping in African populations, especially those
residing in Southern Africa, this study prioritises the genotyping of 211 single nucleotide
polymorphisms (SNPs) in 12 genes known to affect drug absorption, transport and metabolism
in the Xhosa and Cape Mixed Ancestry (CMA) populations living in the Western Cape,
South Africa. These genes are relevant for the pharmacokinetic disposition of a number
of medications, including those used for the treatment of HIV infection, which is
having a devastating impact in the region. The Xhosa population is indigenous to the
Eastern Cape of South Africa, is the second largest ethnic group in South Africa and
comprises approximately 17.6 per cent (~8 million) of the South African population
[5]. The CMA population is known to have the highest rate of admixture worldwide, including
mixes of European, African, South Asian and Indonesian ancestry, and comprises 8.9
per cent (~4 million) of the South African population [5].

In this study, we first characterised and examined differences in the minor allele frequency
(MAF) estimates of pharmacogenetic alleles between the Xhosa and CMA populations.
Secondly, we characterised haplotypes of the pharmacogenetic genes in both the Xhosa
and CMA groups. Finally, we compared the MAF estimates obtained for the Xhosa and CMA
with the HapMap estimates for other African, US and Asian populations.
Taken together, these data should help lay the foundation for future pharmacogenetics
studies in other South African populations, as well as the eventual use of pharmacogenetic
testing, where clinically relevant, for the South African population.

Materials and methods

Study design

Written informed consent for the collection, storage and extraction of genomic DNA
was obtained in English, Afrikaans or Xhosa from 176 unrelated HIV-positive South
Africans [6]. The DNA belonged to 109 Xhosa and 67 CMA individuals [7]. Ethnicity was determined by self-report. This study was approved by the individual
Committees on Human Research at Stellenbosch University, South Africa and the University
of California, San Francisco, USA.

Measurements

Two hundred and eleven tagging SNPs (tagSNPs) in 12 genes (Supplementary Table S1
(Table 4)) that are important for drug absorption, transport and metabolism of antiretrovirals
and other medications were selected using Snagger software [8]. This software takes into account different population frequencies of SNPs, as reported
in HapMap, to generate a representative list of SNPs [8]. Because little is known about South African population genetic substructure and
polymorphisms, HapMap Phase I, build 36 was used to select tagSNPs informative across
all four population samples (ie Caucasian, Yoruba, Japanese and Han Chinese). In this
manner, the likelihood of selecting for markers that may be informative in the Xhosa
and the CMA populations is increased. Other SNPs were force-included based on their
clinical pharmacogenetic relevance, as reported in the literature. All SNPs were included
on a custom SNP genotyping array and DNA samples genotyped using the Illumina GoldenGate
Assay kit (Illumina, Inc., San Diego, CA, USA).

Statistical analyses

Call rates, MAF and Hardy-Weinberg disequilibrium test p values were calculated using the R package. Chi-squared tests were used to test for
Hardy-Weinberg disequilibrium. When small observed numbers were present for one or
more genotype groups, Fisher's exact test was applied. Association analyses were performed
using the co-dominant genetic model to report on SNPs with significantly different
frequencies between the Xhosa and CMA groups. The significance criterion was set at
a Bonferroni-corrected p value ≤ 0.05. In order to improve the quality of the genotype data, the SNP call rate
was required to meet or exceed 90 per cent. The MAF of SNPs retained for association
tests was required to meet or exceed 5 per cent in the Xhosa.
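
The quality-control and testing steps described above can be illustrated with a minimal Python sketch. It is not the authors' R code: the function names, the example genotype counts and the number of tests used for the Bonferroni correction are all hypothetical, and the chi-squared test shown is the standard one-degree-of-freedom goodness-of-fit test for a biallelic SNP.

```python
import math


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """MAF from genotype counts (AA, AB, BB) at a biallelic SNP."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1 - freq_a)


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """Chi-squared (1 df) p value for departure from Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-squared variable with 1 df is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def bonferroni(p_value, n_tests):
    """Bonferroni-corrected p value, capped at 1 (n_tests is an assumption here)."""
    return min(1.0, p_value * n_tests)


def passes_qc(call_rate, maf, min_call_rate=0.90, min_maf=0.05):
    """Apply the call-rate (>= 90 per cent) and MAF (>= 5 per cent) thresholds."""
    return call_rate >= min_call_rate and maf >= min_maf


# Hypothetical genotype counts, for illustration only (not study data).
counts = (50, 40, 10)
maf = minor_allele_frequency(*counts)
hwe_p = hwe_chi2_pvalue(*counts)
keep = passes_qc(call_rate=0.95, maf=maf)
```

When one or more expected genotype counts are small, the chi-squared approximation is unreliable, which is why an exact test is substituted in that case.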

The R package, haplo.stats, was used to infer haplotypes. A sliding window haplotype
association test was performed for each SNP represented in a given gene. This tests
for association between haplotypes and an outcome. Given an ordered (by chromosomal
locations) set of markers (1, 2, 3, ..., n), sliding windows of overlapping haplotypes are tested in sequence (ie for window
size = 3, markers 1-2-3 are treated as a single haplotype, then markers 2-3-4 are
treated as a single haplotype, then markers 3-4-5, etc.). Haplotypes of varying sizes
(2-, 3- and 4-SNP haplotypes) were assessed within each gene for this dataset. This haplotype
test also assessed the association between identified haplotypes and outcome (in our
case, ethnicity), as previously described by another group [9].
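
The windowing scheme described above can be sketched in a few lines of Python. This only illustrates how overlapping marker windows are enumerated; it does not reproduce the haplotype inference or the association test performed by haplo.stats, and the function names are hypothetical.

```python
def sliding_windows(markers, window_size):
    """Return overlapping windows of consecutive markers in chromosomal order."""
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    last_start = len(markers) - window_size
    return [tuple(markers[i:i + window_size]) for i in range(last_start + 1)]


def windows_by_size(markers, sizes=(2, 3, 4)):
    """Enumerate windows for each haplotype size assessed within a gene."""
    return {size: sliding_windows(markers, size) for size in sizes}


# Five ordered markers in one gene, labelled 1..5 as in the text.
markers = [1, 2, 3, 4, 5]
three_snp_windows = sliding_windows(markers, 3)
# three_snp_windows == [(1, 2, 3), (2, 3, 4), (3, 4, 5)]
```

Each window would then be treated as a single haplotype and tested for association with the outcome (here, ethnicity). A gene with fewer markers than the window size yields no windows for that size.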

Results

Comparison of SNPs between the Xhosa and CMA populations

Six of the 12 genes studied (those encoding the ATP binding cassette [ABC] sub-family B, member 1 [ABCB1], cytochrome P450 [CYP] 3A5 [CYP3A5], UDP-glucuronosyltransferase 1 [UGT1A1], CYP2C18, CYP3A4 and CYP2A7P1) contained at least one tagSNP whose MAF differed
significantly between the Xhosa and the CMA populations. The tagSNP results are presented in descending order of statistical significance
(Table 1). Among the six genes, the MAFs of eight of the 211 genotyped SNPs were significantly
different between the Xhosa and the CMA, with the greatest difference found for the
ABCB1 SNP rs13233308 (p = 1.77E-05; Table 1) and the least difference found for the ABCB1 SNP rs1202184 (p = 0.0459).

Table 1. Significantly different TagSNPs in the Xhosa and CMA populations

The CYP3A5 SNP (rs4646450) occurred at a frequency of 0.03 in the Xhosa and 0.17 in the CMA (p = 0.00393). Two SNPs in UGT1A1 were found to be significantly different between the Xhosa and the CMA.
The first SNP in UGT1A1 (rs7572563) occurred at a frequency of 0.14 in the Xhosa and 0.32 in the CMA (p = 0.0108) and the second SNP in UGT1A1 (rs4148329) occurred at a frequency of 0.21 in the Xhosa and 0.41 in the CMA population
(p = 0.0445). One SNP in CYP2C18 (rs2860840) was undetected in the Xhosa but occurred at a frequency of 0.09 in the
CMA (p = 0.0148). A single SNP in CYP3A4 (rs2738258) occurred at a frequency of 0.35 in the Xhosa and 0.17 in the CMA (p = 0.0263). A SNP in CYP2A7P1 (rs11666982) occurred at a frequency of 0.22 in the Xhosa and 0.43 in the CMA (p = 0.0279).

Haplotype analysis

Based on the genotyped tagSNPs, haplotypes were constructed for each of the 12 genes.
There were no identifiable haplotypes in the Xhosa and the CMA in the following genes:
ABCB1, ABCC2, CYP2A7P1, CYP2B6, CYP2C19, CYP2D6, CYP3A5 or CYP3A7; however, haplotypes were identified in the gene encoding solute carrier family 22
member 6 (SLC22A6), CYP2C18, CYP3A4 and UGT1A1 (Table 2).

A total of four haplotypes were identified in the SLC22A6 gene. The four-SNP TAGG haplotype of SLC22A6 was found to occur at a significantly different frequency in the Xhosa (0.19) and
in the CMA (0.05) (p = 2.7E-04; Table 2). In CYP2C18, a total of three haplotypes were identified. The four-SNP AAGC haplotype of CYP2C18 occurred at a frequency of 0.40 in the Xhosa and 0.25 in the CMA (p = 3.0E-03). A total of seven haplotypes were identified in CYP3A4, which included the *1B SNP (rs2740574; Table 2). Six of the seven CYP3A4 haplotypes were significantly different in terms of the population frequency between
the Xhosa and the CMA. The four-SNP haplotype of CYP3A4 which differed the most between the groups was the GCAG haplotype, which occurred
at a frequency of 0.04 in the Xhosa, compared with 0.22 in the CMA population (p = 3.3E-05; Table 2).

A total of ten haplotypes in UGT1A1 were identified. Unlike the other genes, two different haplotype blocks were identified
in UGT1A1. One of the UGT1A1 haplotype blocks consisted of four SNPs and comprised five haplotypes. Two of the
four-SNP haplotypes were significantly different in frequency between the Xhosa and CMA
South African populations (Table 2).

The second UGT1A1 haplotype block consisted of three SNPs and comprised five haplotypes. Two of the
three-SNP haplotypes were found to be significantly different in frequency between the Xhosa
and the CMA (Table 2). The most significant haplotype difference was for the GGA haplotype of UGT1A1, which occurred at a frequency of 0.12 in the Xhosa and 0.33 in the CMA (p = 5.2E-06).

Comparison of SNPs between the South African and the HapMap African, US and Asian populations

A comparison of the MAF of 35 pharmacogenetic SNPs with known functional or clinical
associations in 10 genes (ABCB1, ABCC2, CYP2B6, CYP2C18, CYP2C19, CYP2D6, CYP3A4, CYP3A5, CYP3A7 and UGT1A1) in the Xhosa and CMA populations is presented in Table 3. The MAFs of these SNPs did not differ significantly between the Xhosa and CMA populations,
except for CYP3A5 rs4646450 (p = 0.00393) and CYP2C18 rs2860840 (p = 0.0148).

Table 3 also shows a comparison between the allele frequencies obtained in the two distinct
South African population groups in our study and available reports for other African
populations, of which most data are known for the Yoruba from Nigeria and most recently
the Maasai and the Luhya tribes of Kenya. In addition, the table displays a comparison
of the allele frequencies in the African populations with other diverse populations
in the USA and Asia.

Discussion

In this study, we analysed the allelic variation of 211 tag SNPs in 12 genes that
are important in drug disposition and treatment outcome in two South African population
groups: the Xhosa and the CMA. We identified both single SNPs and haplotypes which
occurred at significantly different frequencies in the two populations.

In most sub-Saharan African countries, HIV/AIDS comprises one of the top socioeconomic
and health burdens. It is estimated that 25 per cent of the adult population living
in Southern Africa is infected with HIV, with a prevalence of approximately 18 per
cent in South Africa alone [10]. Given the high prevalence of HIV/AIDS in South Africa, the greatest impact of pharmacogenetics
may initially be made by improving treatment outcomes on antiretroviral therapy (ART).

In terms of the pharmacogenetic relevance of the ABC family of transporter genes, the evidence of their role in predicting HIV treatment-related
toxicity is inconclusive. The presence of the ABCB1 3435T allele is associated with a decreased risk of hepatotoxicity in HIV patients
treated with either efavirenz or nevirapine [11,12]. There is, however, no conclusive evidence of the clinical significance of the ABCB1 1236C > T allele, although it appears to have a minimal effect on the kinetics of
the immunosuppressant drug cyclosporine [13]. Both efavirenz and nevirapine, which are non-nucleoside reverse transcriptase inhibitors
(NNRTIs), are currently used in first-line treatment regimens of HIV-infected individuals
in South Africa [14]. Therefore, it would be important to assess the contribution of the
ABCB1 variant alleles to drug-related toxicity with these NNRTIs. Parathyras et al. studied the association between a number of variants of ABCB1 and immune recovery in South Africans treated with ART and found no association between
the well-known 3435T allele and immune recovery; however, an association was found between the ABCB1 G2677A SNP and immune recovery in that study [7]. Based on the results of the present study, it would be interesting to investigate
whether there is an association between the two ABCB1 tagSNPs (rs13233308 and rs1202184) found to be significantly different between the
Xhosa and CMA populations and immune recovery.

According to the South African Department of Health, the current second-line ART regimen
should include the anchoring agent lopinavir and ritonavir [14]. Therefore, pharmacogenetic traits of CYP3A4 and CYP3A5 may have an impact on the treatment outcome of second-line therapy. As protease inhibitors
are both substrates and inhibitors of CYP3A, however, the influence of CYP3A gene variation on ART treatment outcomes is difficult to discern [15], although the CYP3A4*1B variant allele is associated with variability in the pharmacokinetics of the
protease inhibitor indinavir [16]. In fact, homozygotes for the *1B variant have a lower bioavailability of indinavir
than heterozygotes and homozygotes for the *1A common allele [16]. Similarly, the common allele of CYP3A5 A6986G is associated with increased clearance of indinavir [17]. Similar association studies should be carried out to assess the contribution of
CYP3A variants to response or exposure to lopinavir in the South African population. It
would be interesting to assess the influence of the CYP3A5*6 variant, which results in loss of function of the CYP3A5 enzyme, on lopinavir exposure
and treatment outcome in South Africans, as this allele is more common in people of
African descent than in Caucasians and Asians [18,19]. The variant occurred at a frequency of 0.2 in the Xhosa and 0.17 in the CMA populations
in the present study. In addition, the influence of the CYP3A4 SNP (rs2738258) and CYP3A5 SNP (rs4646450), both found to be statistically significantly different between the
Xhosa and CMA in the present study, on lopinavir exposure and treatment outcome should
be investigated.

The single SNPs and haplotype structures inferred for the Xhosa and the CMA populations
in the UGT1A1 gene could be used to stratify the two populations more accurately in order to perform
pharmacogenetic association studies. The South African HIV treatment guidelines changed
in April 2010, and first-line ART now includes tenofovir in addition to either nevirapine
or efavirenz and lamivudine [14]. Although both lamivudine and tenofovir are only nominally affected by CYP enzymes,
they are glucuronidated in the liver and excreted unchanged through the kidneys [15]. Therefore, studies could be designed to assess the impact of UGT1A1 SNPs (rs7572563 and rs4148329) and haplotypes on treatment outcomes of antiretroviral
drugs that may undergo glucuronidation prior to excretion, such as tenofovir. It would
make sense to begin by genotyping SNPs of the UGT1A1 haplotype with known functional alleles, such as UGT1A1*93 or the rs887829 SNP, both of which have been associated with hyperbilirubinaemia [20,21], and to assess their impact on the response to tenofovir.

It is clear that there are differences in the MAF of key pharmacogenetic alleles in
South African populations compared with other African populations (Table 3). Of particular interest, the loss-of-function CYP2B6*18 variant allele is thought to occur most frequently in West African populations,
with a reported MAF of 0.04 [22]. In the present study, however, we find that it occurs at a frequency of 0.17 in
the Xhosa and 0.09 in the CMA populations, compared with a reported frequency of 0.07
in the Luhya and 0.02 in the Maasai populations (Table 3). The CYP2B6 gene plays an important role in the metabolism of two of the first-line ART drugs
used in South Africa: efavirenz and nevirapine. The CYP2B6*18 SNP is the only coding SNP in CYP2B6 [23]. The variant is associated with elevated plasma concentrations of efavirenz and nevirapine
and hepatotoxicity in HIV patients from Mozambique treated with either drug [24-26]. To our knowledge, the present study is the first report on the MAF of the CYP2B6*18 variant in the Xhosa and the CMA populations. Given that efavirenz and nevirapine
are both first-line treatment agents in this region, further investigation of the
association between CYP2B6 null variant alleles and adverse reactions in South African populations is warranted.
Such findings have important implications for the incidence of adverse reactions to
efavirenz and nevirapine in different African populations.

The current study is the first of its kind systematically to characterise tagging
and clinically relevant pharmacogenetic SNPs in the two South African populations;
however, there are inherent limitations to our analyses. First, as this was a purely
descriptive study, there are no associations made with any disease (eg HIV) or treatment
outcomes; however, this work lays the foundation for the future study of such associations.
Secondly, whereas the sample size of the Xhosa sample is adequate, the sample size
of the CMA population is modest and it is possible that lower frequency alleles could
not be detected in this group. Thirdly, the SNPs typed in this study
were previously characterised in other populations and therefore we cannot rule out
the presence of novel SNPs in the Xhosa and CMA populations for which we did not test,
such as those recently reported in CYP2C19 and CYP2D6 in these populations [5,27]. Fourthly, the haplotypes inferred are limited by the sample size of our population
and there may be others that remain to be identified. Lastly, 12 genes that are known
to be associated with treatment outcomes in HIV infection were characterised but there
are likely to be more genes that remain to be studied.
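
The sample-size limitation noted above can be made concrete with a simple calculation. The sketch below assumes that the 2n chromosomes in a sample are drawn independently from the population and that genotyping is error-free; both are simplifying assumptions, and the function name is hypothetical. Under these assumptions, the probability of observing at least one copy of an allele with population frequency f is 1 - (1 - f)^(2n).

```python
def prob_allele_observed(allele_freq, n_individuals):
    """Probability of sampling at least one copy of an allele.

    Assumes 2 * n_individuals independent chromosomes and perfect genotyping.
    """
    n_chromosomes = 2 * n_individuals
    return 1 - (1 - allele_freq) ** n_chromosomes


# Group sizes from this study; the 1 per cent allele frequency is illustrative.
p_cma = prob_allele_observed(0.01, 67)
p_xhosa = prob_allele_observed(0.01, 109)
```

For an allele with a population frequency of 1 per cent, this gives a detection probability of roughly 0.74 in a sample of 67 individuals and roughly 0.89 in a sample of 109, which illustrates why rarer alleles are more likely to be missed in the smaller group.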

Conclusion

To our knowledge, this is the largest pharmacogenetics study of two distinct South
African population groups. Our work shows that there are significant differences in
the frequencies of variant alleles in several genes (ABCB1, CYP2A7P1, CYP2C18, CYP3A4, CYP3A5 and UGT1A1) associated with treatment outcome in the Xhosa and the CMA populations of South
Africa. It also shows that for the majority of SNPs analysed, there is great similarity
in allele frequency between the two groups. Such work is of great importance for laying
the foundation for ethnicity-specific genotype-to-phenotype correlates of treatment
outcome for these various enzyme polymorphisms and their drug substrates. Importantly,
we also identified novel haplotype structures in four genes (CYP2C18, CYP3A4, SLC22A6 and UGT1A1) in the two distinct South African populations. The haplotypes could be used, in
addition to single SNPs, to more accurately stratify patient groups according to ethnicity
and to aid in identifying associations between causative variants and drug response.

It is clear from this work and that of others that not all African groups share the
same allele frequencies of key pharmacogenetic genes [4,5,7]. Therefore, it is important that studies such as this are performed in as many populations
as possible, to generate the most useful information on the clinical application of
pharmacogenetics for these specific populations. Caution is advised in using a single
African population in pharmacogenetics studies, since it cannot be representative of
all Africans.

Acknowledgements

Dr Ikediobi is funded through an NIH research supplement to parent grant R01AI065233.
Dr Gandhi is funded through grant NIH K23 AI067065. Dr Aouizerat is funded through
the NIH Roadmap for Medical Research Grant (K12RR023262). Dr Warnich is funded through
the National Research Foundation (GUN 2054289) and the Medical Research Council.