Background

CpG islands in the hepatitis B virus (HBV) genome are potential targets for methylation-mediated gene silencing and may be involved in the pathogenesis of HBV infection. To date, their characteristics in HBV quasispecies (QS) remain largely unknown. The purpose of this study was to investigate the characteristics of CpG islands in HBV QS.

Results

CpG island II (CGII) in the acute hepatitis B (AHB) group was shorter and its QS heterogeneity was lower than in the chronic infection group. Among the chronic infection subgroups, CGII and CpG island III (CGIII) in the immune-tolerant (IT) group were longer and their heterogeneity was lower than in the chronic hepatitis B (CHB) and acute-on-chronic liver failure (ACLF) groups. The length of CGII correlated positively with HBV DNA levels, while the complexity and diversity of CGII correlated negatively with HBV DNA levels. Moreover, CGII and CGIII were shorter in genotype B than in genotype C, while the QS complexity and diversity of CGII and CGIII did not differ significantly between genotypes B and C.

Conclusions

Overall, our results suggest that the distribution, length and QS heterogeneity of CpG islands in the full-length HBV genome differ across clinical phases of infection. The underlying mechanism warrants further study.

Hepatitis B virus (HBV) infection is a challenging health problem and a leading cause of liver disease in the Asia–Pacific region. An estimated 240 million people worldwide are chronically infected and are at risk of developing end-stage liver disease (Ott et al. 2012). Although the molecular mechanisms that determine persistent infection are not fully elucidated, it is generally accepted that both the viral genome and the host immune system contribute to the outcome of infection.

CpG islands, which are CpG-rich regions in the HBV genome, are potential targets for methylation-mediated gene silencing and are related to viral replication (Vivekanandan et al. 2010). There are three conventional CpG islands, termed CpG island I (CGI), CpG island II (CGII) and CpG island III (CGIII) (Zhang et al. 2013). CGI (nt67 ~ nt212) is located at the start site of the S region, CGII (nt1170 ~ nt1671) overlaps enhancer I and the promoter of the X region, and CGIII (nt2280 ~ nt2455) covers part of the C gene and encompasses the start site of the P region. Differences in the distribution of CpG islands, which can affect their methylation status and HBV gene expression, might in turn lead to different clinical outcomes of HBV infection. Previous studies have shown that the distribution of CpG islands differs across HBV genotypes (Hou et al. 2015; Zhang et al. 2013; Zhong et al. 2015). However, all of the nucleotide sequences analyzed in those studies were retrieved from GenBank at the National Center for Biotechnology Information. To date, no data from real-life studies have described the characteristics of CpG islands in HBV genomes from different phases of infection.
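To make the notion of a "CpG-rich region" concrete, the following is a minimal sketch of a sliding-window CpG island scan. It is an illustration only: this section does not state which criteria or software were used to call CGI, CGII and CGIII, so the thresholds below (window of 100 nt, GC content of at least 50 %, observed/expected CpG ratio of at least 0.6) are assumed defaults, and the function names are hypothetical. The sketch also treats the sequence as linear and ignores the circular nature of the HBV genome.

```python
def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def find_cpg_islands(seq, window=100, min_gc=0.5, min_oe=0.6):
    """Return merged (start, end) spans, 1-based inclusive, of windows
    that satisfy both the GC-content and the obs/exp CpG thresholds.
    Thresholds are assumptions, not the criteria used in the study."""
    islands = []
    for i in range(len(seq) - window + 1):
        w = seq[i:i + window]
        if gc_fraction(w) >= min_gc and cpg_obs_exp(w) >= min_oe:
            start, end = i + 1, i + window
            if islands and start <= islands[-1][1] + 1:
                # Overlapping or adjacent window: extend the previous island
                islands[-1] = (islands[-1][0], end)
            else:
                islands.append((start, end))
    return islands
```

Applied to each cloned full-length genome, a scan of this kind yields the presence and the length (end minus start plus one) of each island, which are the per-clone quantities compared between patient groups in this study.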

It is worth noting that HBV exists as a spectrum of strains. Owing to a high replication rate and the lack of proofreading activity during reverse transcription, HBV exists as quasispecies (QS), a population of variants that are genetically distinct but closely related (Ngui and Teo 1997). Because the variants differ in adaptability, QS are related to the outcome of HBV infection (Cao et al. 2014; Yang et al. 2015) and to antiviral response (Liu et al. 2011; Chen et al. 2009; Cheng et al. 2013; Peveling-Oberhag et al. 2013; Tong et al. 2013). Nevertheless, the characteristics of CpG islands in HBV QS isolated from real-life patients remain largely unknown.

In the present study, 599 clones were isolated from forty patients, and full-length HBV genomes were sequenced. The characteristics of CpG islands in HBV QS, including their distribution, length and heterogeneity, were investigated to better understand the role of HBV in pathogenesis.

Patients

Forty treatment-naïve patients with HBV infection from Shanghai Ruijin Hospital were retrospectively enrolled in the present study between September 2009 and January 2014. Of the 40 patients, 10 were diagnosed with acute hepatitis B (AHB), 9 were in the high replicative, low inflammatory phase (previously termed “immune-tolerant HBV carriers”, IT) (Gish et al. 2015), 11 had chronic hepatitis B (CHB), and 10 had acute-on-chronic liver failure (ACLF). AHB was defined as a transient presence of HBsAg within 6 months without a previous history of chronic hepatitis B. IT, CHB and ACLF were diagnosed according to the criteria recommended by the Asian Pacific Association for the Study of the Liver (APASL) (Sarin et al. 2009; Liaw et al. 2012). IT was defined as HBeAg positivity with high levels of HBV DNA (>2,000,000 IU/ml) but normal serum alanine aminotransferase (ALT) (Liaw et al. 2012), while the inclusion criteria for the CHB group in the present study were HBsAg positivity for more than 6 months and an HBV DNA level >20,000 IU/ml with a serum ALT level more than twice the upper limit of normal (ULN) (Yang et al. 2015). ACLF was defined as an acute hepatic insult manifesting as jaundice and coagulopathy, complicated within 4 weeks by ascites and/or encephalopathy, in patients with previously diagnosed chronic HBV infection (Sarin et al. 2009). Sera of AHB and ACLF patients were collected during the first week of clinical onset. Patients were excluded if they were diagnosed with autoimmune liver disease, alcoholic liver disease, or co-infection with other viruses, such as human immunodeficiency virus, hepatitis C virus, hepatitis D virus, Epstein-Barr virus, or cytomegalovirus. Sera remaining after the completion of clinical tests were collected from the clinical laboratory and stored frozen at −80 °C.

Compliance with ethical standards

This was a non-invasive, non-interventional retrospective study. The study was fully anonymous and therefore could not harm the patients. It was approved by the Ethics Committee of Ruijin Hospital in accordance with the Declaration of Helsinki.

Statistical analysis

The length and QS complexity of CpG islands were expressed as mean ± SE, while the mean genetic distance was expressed as median with range. Continuous variables were compared between the acute and chronic infection groups by unpaired t test or Mann–Whitney test, and among the chronic subgroups by one-way analysis of variance (ANOVA) or Kruskal–Wallis test, as appropriate. Proportions were compared using the chi-square test. Correlations were analyzed using Pearson correlation analysis. All analyses were performed using SPSS 19.0 software (Chicago, IL, USA). Differences were considered significant at a P value <0.05.
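As a small illustration of two of the summary statistics named above, the sketch below computes a mean with its standard error and a Pearson correlation coefficient using only the Python standard library. The study itself used SPSS; the function names are hypothetical, and the input values in the usage example are invented placeholders, not patient data from this study.

```python
import math
import statistics


def mean_se(values):
    """Return (mean, standard error), where SE = sample SD / sqrt(n)."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical placeholder values (NOT data from the study):
# per-patient HBV DNA (log10 IU/ml) and mean CGII length (nt).
hbv_dna = [6.1, 6.8, 7.4, 8.0, 8.3]
cgii_length = [420.0, 455.0, 470.0, 498.0, 501.0]
r = pearson_r(hbv_dna, cgii_length)
mean_len, se_len = mean_se(cgii_length)
print(f"r = {r:.3f}; CGII length = {mean_len:.1f} ± {se_len:.1f}")
```

A positive r for data shaped like these placeholders corresponds to the direction of association reported for CGII length and HBV DNA levels; a significance test for r is not included in this sketch.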

Clinical and laboratory data of patients

The demographic, clinical and laboratory data are shown in Table 1. All AHB patients had spontaneous HBsAg/anti-HBs seroconversion, and five of the ten ACLF patients died within 3 months after the clinical onset of infection. The proportions of patients harboring HBV genotypes B and C did not differ significantly among the four groups (P > 0.05).

Table 1

Demographic and clinical features of patients (mean ± SE)

Variable             | AHB (n = 10)    | IT (n = 9)       | CHB (n = 11)        | ACLF (n = 10)      | P value
Sex (M/F)            | 9/1             | 5/4              | 6/5                 | 10/0               | 0.032
Age (year)           | 40.70 ± 2.62    | 28.56 ± 1.84 a,b | 35.55 ± 3.89        | 43.80 ± 2.82       | 0.008
ALT (U/L)            | 1608.5 ± 199.29 | 29.67 ± 3.76 a,b | 263.45 ± 131.04 c,d | 1171.57 ± 184.49   | 0.001
TBIL (μmol/L)        | 327.62 ± 130.01 | 16.01 ± 1.40 a,b | 32.95 ± 9.36 c,d    | 301.79 ± 29.83     | 0.002
PTA (%)              | 77.89 ± 5.45    | 103.33 ± 3.33    | 98.79 ± 1.21        | 30.58 ± 2.65 b,d,e | 0.000
HBeAg (+/−)          | 10/0            | 9/0              | 10/1                | 2/8 b,d,e          | 0.000
HBV DNA (log10IU/ml) | 6.55 ± 0.45     | 8.02 ± 0.19 a,b  | 7.09 ± 0.38         | 6.34 ± 0.26        | 0.011
Genotype (B/C)       | 4/6             | 3/6              | 6/5                 | 5/5                | 0.777

The proportions of patients harboring HBV genotype B and C did not differ significantly among the four groups (P > 0.05)

Distribution of CpG islands in patients with different phases of infection

Molecular cloning and sequencing were performed in our previous study (Yang et al. 2015). In the present study, the characteristics of CpG islands in the full-length HBV genomes of 599 clones were analyzed and new data were generated (GenBank submission numbers: KU963799–KU964397). The distribution of CpG islands in HBV-related liver diseases is shown in Fig. 1. All clones contained CGII and CGIII. In addition to the three conventional CpG islands, CpG island IV (CGIV, nt332 ~ nt632) was identified. The frequency of strains containing CGIV in the AHB, IT, CHB and ACLF groups was 0 % (0/146), 11.19 % (15/134), 9.09 % (15/165) and 28.57 % (44/154), respectively. Strikingly, CGIV was much more common in the ACLF group than in the other three groups (χ2 = 59.76, P < 0.01) (Fig. 1b).

Fig. 1

Distribution of CpG islands in the HBV genome. a The vertical axes indicate the GC percentage, and the horizontal axes represent the HBV nucleotide sequence. The blue areas mark the CpG islands, while the vertical red lines under the horizontal axes represent CpG dinucleotides. The horizontal arrows represent the open reading frames of the preS1/preS2/S, X, preC/C and P genes, while the vertical arrow indicates the split CGII. CGI, CGII and CGIII coexist in the first graph, CGI is absent in the second graph, and CGIV is present in the third graph. b The frequency of strains containing CGIV in the AHB, IT, CHB and ACLF groups was compared by chi-square test. CGIV was much more common in the ACLF group than in the other three groups (χ2 = 59.76, P < 0.01)
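The CGIV frequency comparison can be re-derived from the clone counts given in the text. The sketch below computes the Pearson chi-square statistic for the 4 × 2 table of clones with and without CGIV in each group. It assumes that all four groups were tested together in a single table; the "without CGIV" counts are obtained by subtracting from the group totals, and the function name is hypothetical.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of group and CGIV status
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Clone counts from the text: (with CGIV, without CGIV) per group.
cgiv_counts = {
    "AHB": (0, 146 - 0),
    "IT": (15, 134 - 15),
    "CHB": (15, 165 - 15),
    "ACLF": (44, 154 - 44),
}
table = [list(counts) for counts in cgiv_counts.values()]
chi2 = chi_square_statistic(table)
dof = (len(table) - 1) * (len(table[0]) - 1)
print(f"chi2 = {chi2:.2f}, degrees of freedom = {dof}")
```

Under this assumption the statistic comes out at roughly 59.8 with 3 degrees of freedom, within rounding distance of the reported χ2 = 59.76.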

QS characteristics of CpG islands in patients with acute and chronic infection

To investigate the characteristics of CpG islands in different phases of infection, the lengths and QS heterogeneity of CpG islands were compared (data shown in Additional file 1: Table S1). As shown in Fig. 2a, CGII in the AHB group was significantly shorter than that in the chronic infection group (P < 0.01). The QS complexity and diversity of CGII in the AHB group were lower than those in the chronic infection group (P < 0.01).

Fig. 2

QS characteristics of CpG islands in patients with acute and chronic infection. a CGII in the AHB group was significantly shorter, and its QS complexity and diversity were lower, than in the chronic infection group. b QS complexity and diversity of CGIII were lower in the AHB group than in the chronic infection group

For CGIII, there was no significant difference in length between the acute and chronic infection groups. As with CGII, the complexity and diversity of CGIII in the AHB group were lower than those in the chronic infection group (P < 0.01) (Fig. 2b).
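The two heterogeneity measures compared above can be illustrated with a short sketch. The definitions used here are assumptions, since this section does not spell them out: QS complexity is taken as the normalized Shannon entropy of distinct sequences among a patient's clones, and QS diversity as the mean pairwise genetic distance, computed here as a simple uncorrected p-distance (the study may have used a model-based distance). Function names are hypothetical, and the input is assumed to be a list of aligned, equal-length CpG island sequences from one patient.

```python
import math
from collections import Counter
from itertools import combinations


def qs_complexity(clones):
    """Normalized Shannon entropy: -sum(p_i * ln p_i) / ln N,
    where p_i is the frequency of each distinct sequence among N clones."""
    n = len(clones)
    if n < 2:
        return 0.0
    counts = Counter(clones)
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return entropy / math.log(n)


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def qs_diversity(clones):
    """Mean pairwise p-distance over all pairs of clones."""
    pairs = list(combinations(clones, 2))
    if not pairs:
        return 0.0
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)
```

With these definitions, a patient whose clones are all identical has complexity and diversity of zero, while a patient in whom every clone is unique has a complexity of one. Lower values in the AHB group would then correspond to a more homogeneous viral population.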

QS characteristics of CpG islands in patients with chronic infection

A comparison of QS characteristics among the chronic infection subgroups is shown in Fig. 3. Among the subgroups, IT patients had longer CGII than CHB and ACLF patients. The QS complexity of CGII was lower in the IT subgroup than in the CHB and ACLF subgroups, and QS diversity was lower in the IT subgroup than in the ACLF subgroup.

Fig. 3

QS characteristics of CpG islands in patients with chronic infection. CGII (a) and CGIII (b) were longer and their QS complexity was lower in the IT subgroup than in the CHB and ACLF subgroups

As with CGII, CGIII was longest and its QS heterogeneity was lowest in the IT subgroup, while there was no significant difference between the CHB and ACLF subgroups.

HBV DNA levels correlate with the length and QS heterogeneity of CpG island II. HBV DNA levels correlated positively with the length of CGII, and negatively with QS complexity and diversity of CGII

QS characteristics of CpG islands in HBV genotype B and C

Consistent with previous studies (Zhang et al. 2013; Zhong et al. 2015; Hou et al. 2015), CGI and split CGII were much more common in HBV genotype B than in genotype C. Nevertheless, 30 genotype C clones from two patients (one CHB patient and one ACLF patient) contained CGI, while among the 16 genotype B clones from one CHB patient, only three contained CGI.

As shown in Fig. 5, both CGII and CGIII in HBV genotype B were shorter than those in genotype C (P < 0.001 for both). The QS complexity and diversity of CGII and CGIII did not differ significantly between genotypes B and C (P > 0.05). Pearson correlation analysis showed that QS complexity did not correlate with the length of CpG islands (P > 0.05).

Fig. 5

QS characteristics of CpG islands in HBV genotypes B and C. CGII (a) and CGIII (b) in genotype B were significantly shorter than those in genotype C, while the QS complexity and diversity of CGII and CGIII did not differ significantly between genotypes B and C

In the present study, the characteristics of CpG islands in HBV QS were investigated in a real-life setting for the first time. The results indicated that CGII in the AHB group was shorter and its QS heterogeneity was lower than in the chronic infection group. Among the chronic infection subgroups, CGII and CGIII in the IT subgroup were longer and their heterogeneity was lower than in the CHB and ACLF subgroups. Moreover, HBV DNA levels correlated positively with the length of CGII and negatively with the heterogeneity of CGII.

A full-length HBV sequence is needed for the analysis of CpG islands. Because HBV exists as QS, a sequence assembled by splicing separately PCR-amplified fragments may not correspond to a full-length genome that actually exists. The classic method reported by Gunther et al. (1995) is still the gold standard for studying the complete HBV genome. In accordance with Gunther, all HBV strains in the present study were obtained by full-length cloning and sequencing rather than by splicing PCR-amplified fragments.

For the first time, we compared the QS characteristics of CpG islands from patients with different outcomes of infection. It is generally considered that patients with acute self-limited infection mount an effective immune response that eliminates the virus, while immune-tolerant patients have high levels of HBV replication and lack inflammation in the liver. The mechanisms underlying the different outcomes of HBV infection are considered to be multi-factorial, including immune suppression and viral factors (Cao et al. 2014). In the present study, the QS complexity of CGII in the CHB and ACLF groups was higher than that in the IT group. The evolution of HBV QS from the immune-tolerant to the immune-active phase may result from immune selection.

CGII, which overlaps enhancer I and the X gene promoter, is an important region for the regulation of HBV transcription and replication (Guo et al. 2011). It has been reported that HBV core protein (HBc) binds to HBV cccDNA preferentially at the CGII region (Guo et al. 2011). The frequency of HBc binding to CGII is positively correlated with the ratio of relaxed circular DNA to cccDNA and with the levels of serum HBV DNA (Guo et al. 2011). Another study reported that promoters with long CpG islands encoded more RNA polymerase II binding sites than those with short ones (Elango and Yi 2011). In our study, CGII in the AHB group was shorter than that in the chronically infected group, and was longer in the IT subgroup than in the CHB and ACLF subgroups. An interesting finding is that HBV DNA levels correlated positively with the length of CGII and negatively with the complexity of CGII. CGII, whose length is related to viral replication, may therefore play a role in the clinical outcomes of HBV infection. Our findings may partly explain the mechanisms underlying the different outcomes of HBV infection.

Genotypes B and C, the two major prevalent HBV genotypes in China (Zeng et al. 2005; Chu and Liaw 2005), diverge by more than 8 % in the complete genome nucleotide sequence and have different clinical characteristics. In general, infection with HBV genotype B is more likely to lead to HBV e antigen seroconversion, while infection with genotype C is associated with a higher risk of developing cirrhosis and HCC (Chu and Liaw 2005; Kong et al. 2014; Tseng and Kao 2008; Malmstrom et al. 2012). To date, the mechanism underlying these differences remains unclear. Our present study showed that CGI was more common and CGII was shorter in genotype B than in genotype C. Since CGI overlaps the forepart of the S gene, the different distribution of CGI between genotypes B and C may affect the methylation of CpG islands and influence the regulation of HBV gene expression, especially of the S gene. Although the function of CGI remains unclear, its absence might result in less methylation of the first CpG-rich region in HBV genotype C (Zhang et al. 2013), which is likely to be associated with the progressive liver diseases caused by HBV genotype C. Of note, as shown in the present study, 30 genotype C clones from two patients (one CHB patient and one ACLF patient) contained CGI, while among the 16 genotype B clones from one CHB patient, only three contained CGI. In contrast to previous studies (Hou et al. 2015; Zhang et al. 2013; Zhong et al. 2015), these data from HBV quasispecies show that the distribution of CpG islands does not strictly follow genotype.

Moreover, little is known about whether HBV genotype affects QS heterogeneity. Data from this real-life study showed that the QS complexity and diversity of CGII and CGIII did not differ significantly between genotypes B and C. Therefore, QS heterogeneity may not account for the different clinical outcomes between genotypes B and C.

To our knowledge, this is the first real-life study to explore the QS characteristics of CpG islands in the full-length HBV genome. The data demonstrate that the lengths and QS heterogeneity of CpG islands differ across clinical phases of infection. These data may partially explain the different clinical characteristics of the clinical phases of infection, although the underlying mechanism warrants further study.

Authors’ contributions

XXZ and QMG made substantial contributions to conception and design. SYH, YX, FL, and DMY performed the experiments. YX, MJW, ZTY, SYH, YH, DH, MYZ and DHZ participated in data collection and confirmation. YX and MJW analyzed the data and drafted the manuscript. XXZ and QMG revised the manuscript critically. All authors read and approved the final manuscript.

Availability of data and materials

All sequences in the present study have been submitted to GenBank, and the submission numbers are KU963799 to KU964397.

Competing interests

The authors declare that they have no conflict of interest.

Ethics approval and consent to participate

The study was approved by the Ethics Committee of Ruijin Hospital in accordance with the Declaration of Helsinki (Ethics committee’s reference number 201617). The study was fully anonymous and therefore could not harm the patients. Because it was a non-invasive, non-interventional retrospective study, consent to participate could be waived according to the ethics approval.

Funding

This work was supported by grants from the National Natural Science Foundation of China (81171616, 81371860), the Major National S&T Projects for Infectious Diseases (2012ZX10002007, 2013ZX10002001), and the State Major Basic Research Program (973) of China (2012CB 519000).

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.