Abstract

A retrospective, international, multicenter study was undertaken to assess: (i) the prognostic role of ‘interim’ positron emission tomography performed during treatment with doxorubicin, bleomycin, vinblastine and dacarbazine in patients with Hodgkin lymphoma; and (ii) the reproducibility of the Deauville five-point scale for the interpretation of interim positron emission tomography scan. Two hundred and sixty patients with newly diagnosed Hodgkin lymphoma were enrolled. Fifty-three patients with early unfavorable and 207 with advanced-stage disease were treated with doxorubicin, bleomycin, vinblastine and dacarbazine ± involved-field or consolidation radiotherapy. Positron emission tomography scan was performed at baseline and after two cycles of chemotherapy. Treatment was not changed according to the results of the interim scan. An international panel of six expert reviewers independently reported the scans using the Deauville five-point scale, blinded to treatment outcome. Forty-five scans were scored as positive (17.3%) and 215 (82.7%) as negative. After a median follow up of 37.0 (2–110) months, 252 patients are alive and eight have died. The 3-year progression-free survival rate was 83% for the whole study population, 28% for patients with interim positive scans and 95% for patients with interim negative scans (P<0.0001). The sensitivity, specificity, and negative and positive predictive values of interim positron emission tomography scans for predicting treatment outcome were 0.73, 0.94, 0.94 and 0.73, respectively. Binary concordance amongst reviewers was good (Cohen’s kappa 0.69–0.84). In conclusion, the prognostic role and validity of the Deauville five-point scale for interpretation of interim positron emission tomography scans have been confirmed by the present study.

Introduction

2-[18F] fluoro-2-deoxy-D-glucose (FDG) positron emission tomography (PET) is a standard procedure for staging and restaging in Hodgkin lymphoma and in several subtypes of non-Hodgkin lymphoma.1,2 PET, combined with helical multi-detector computed tomography (PET-CT), performed during and after therapy has a high prognostic value for predicting first-line treatment outcome in Hodgkin lymphoma and diffuse large B-cell lymphoma.3–9 Interim PET was reported to be a better predictor of prognosis than the International Prognostic Score (IPS) when carried out after two cycles of treatment with doxorubicin, bleomycin, vinblastine, and dacarbazine (ABVD) in a joint Italian-Danish study (JID).8 In a recent meta-analysis, overall sensitivity for the prediction of treatment outcome in Hodgkin lymphoma treated with ABVD ranged from 43% to 100% and specificity from 67% to 100%.10 However, the studies included were not directly comparable, with different timing of the interim PET during the course of treatment and differing PET methodologies. Most studies used stand-alone PET, which has now been replaced by PET-CT. Reporting methods were not consistent, making it difficult to judge how these results should be applied in clinical practice.

In 2009 an international meeting attended by hematologists and nuclear medicine specialists was held in Deauville, France, with the intention of defining simple and reproducible criteria for interim-PET reporting in lymphoma.11 A five-point scale (5-PS) developed at Guy’s and St. Thomas’ Hospital in London was adopted12 as the “Deauville criteria”. An international study was launched to compare previous reports on the accuracy of interim PET in predicting treatment outcome in Hodgkin lymphoma with an international cohort of patients scanned using PET-CT after two cycles of ABVD and to evaluate the reproducibility of the 5-PS among reporters. The criteria for enrollment, the breakdown of patients according to stage (early unfavorable and advanced-stage) and the endpoints were the same as in the JID.

Methods

Retrieval of patients’ data

Consecutive patients affected by Hodgkin lymphoma from participating centers worldwide diagnosed between January 2002 and December 2009 were retrospectively enrolled with the following inclusion criteria: (i) stage IIB to stage IVB or stage IIA Hodgkin lymphoma with adverse prognostic factors (at least three nodal sites involved, sub-diaphragmatic presentation, bulky disease, and erythrocyte sedimentation rate > 40 mm/h); (ii) treatment with four to eight cycles of ABVD with or without involved-field radiotherapy or consolidation radiotherapy; (iii) staging with PET/CT at baseline and after two courses of ABVD (PET-0 and PET-2, respectively); (iv) no change to treatment based on interim-PET results; and (v) a minimum follow-up of 1 year after completion of first-line treatment. Patients escalated to salvage treatment during ABVD chemotherapy were eligible only if the treatment change was based on clinical and/or radiological evidence of disease progression/resistance.

The study was approved by the ethical committee of the coordinating center in Cuneo (Italy) and conducted according to the Helsinki declaration. Specific informed written consent was not required as all data were retrospectively collected in an anonymized format, in agreement with specific institutional and national requirements. AG, SC and ER analyzed the data and all co-authors had access to the primary data.

Clinical data on 400 patients were collected; however only 335 paired scans (baseline and interim) were available for review. Of these, 75 were then excluded because there were no CT data (n=21), no baseline PET (n=25), no interim PET (n=1), missing CT slices (n=3), missing PET slices (n=10), poor quality PET images (n=6) or miscellaneous reasons (n=9). Complete data from 260 patients were available for analysis from 17 international academic institutions.
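The exclusion counts above can be checked with a short tally. The following is only an illustrative sketch: the counts are those reported in the text, while the variable names and dictionary labels are our own.

```python
# Exclusion counts as reported in the text; the labels are illustrative only.
paired_scans_available = 335

exclusions = {
    "no CT data": 21,
    "no baseline PET": 25,
    "no interim PET": 1,
    "missing CT slices": 3,
    "missing PET slices": 10,
    "poor quality PET images": 6,
    "miscellaneous reasons": 9,
}

# Total number of excluded paired scans and the number left for analysis.
total_excluded = sum(exclusions.values())
analysable = paired_scans_available - total_excluded

print(f"excluded: {total_excluded}")    # 75
print(f"analysable: {analysable}")      # 260
```

The seven exclusion categories sum to the 75 excluded cases, leaving the 260 patients analyzed.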

The Cotswold criteria were used to assign baseline stage and define bulky disease.13 The IPS was calculated for advanced-stage Hodgkin lymphoma as previously reported.14 A PET/CT scan was performed in all patients at the end of treatment and response was assessed according to International Working Group criteria.2 Disease progression was defined as new disease within 6 months of first-line treatment and relapse as disease occurring 6 months or longer after achieving complete remission. The primary endpoints of the study were the accuracy of interim-PET scans performed after two courses of ABVD (PET-2) at predicting treatment outcome and progression-free survival. The secondary endpoint was inter-observer agreement using the 5-PS for PET-2 interpretation. The procedures for PET scan review and statistical considerations are reported in the Online Supplement. In brief, six reporters reviewed baseline and PET-2 scans independently and scored them according to the Deauville criteria, with scores 1–3 regarded as ‘negative’ and scores 4 and 5 as ‘positive’.
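The binary reading rule described above (scores 1–3 ‘negative’, scores 4 and 5 ‘positive’) can be written as a small helper. This is a minimal sketch for illustration; the function name `classify_pet2` is hypothetical and is not part of the study's review software.

```python
def classify_pet2(deauville_score: int) -> str:
    """Dichotomize a Deauville five-point scale score.

    Scores 1-3 are treated as 'negative' and scores 4-5 as 'positive',
    following the cutoff stated in the text.
    """
    if deauville_score not in (1, 2, 3, 4, 5):
        raise ValueError("Deauville score must be an integer from 1 to 5")
    return "positive" if deauville_score >= 4 else "negative"


# Example: map each possible score to its binary category.
for score in range(1, 6):
    print(score, classify_pet2(score))
```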

Results

Clinical results

The clinical characteristics of the enrolled patients, the results of interim PET and the treatment delivered are presented in Table 1. PET-2 was performed 12.3 ± 4.9 days (range, 7–22) after the day 15 administration of the second ABVD cycle. Fifty-three patients had unfavorable stage IIA disease. For 32 of these patients, the treatment consisted of four cycles of ABVD plus involved-field radiotherapy, while for the other 21 the treatment was six cycles of ABVD without radiotherapy, according to the preference of the treating physicians. Two hundred and seven patients with stage IIB–IVB were treated with six to eight cycles of ABVD with or without consolidation radiotherapy. Consolidation radiotherapy was given as an integral part of combined modality treatment after ABVD for advanced-stage (IIB–IVB) Hodgkin lymphoma as originally described,23 with a high dose of radiation delivered to nodal sites of initial bulky lymphoma (30–36 Gy), and an optional additional boost (6 Gy) if there was residual disease at the end of chemotherapy. Consolidation radiotherapy was given to 68 patients (41 with stage IIB, 17 with stage III and 10 with stage IV disease).

Two hundred and thirteen patients (81.9%) achieved complete remission and two reached a partial remission, which subsequently converted in both cases to complete remission. Treatment failed in 45 patients (17.3%), of whom 33 (73.3%) had a positive PET-2 and 12 (26.7%) had a negative PET-2; the treatment failure in these 45 patients consisted of disease progression in 34 cases and relapse in the other 11.

After a median follow-up of 37.0 months (range, 2–110), the 3-year progression-free survival of the entire population was 83%. For PET-2 positive and PET-2 negative patients the 3-year progression-free survival rates were 28% and 95%, respectively (P<0.0001) (Figure 1). At the last follow-up, 252 patients were alive and eight had died. Death was due to disease progression (n=5) and to toxicity related to second-line treatment (n=1) in six patients with positive PET-2 scans and to disease progression or death while in complete remission in two patients with negative PET-2 scans. The 3-year overall survival rate for the entire population was 97% and for patients with PET-2 positive and PET-2 negative scans was 87% and 99%, respectively. The 3-year progression-free survival rates for PET positive and PET negative patients were not significantly different from the 3-year progression-free survival rate in the JID study (Mantel-Haenszel test, P=0.22; log-rank test, P=0.26; Wilcoxon test, P=0.66; Tarone-Ware test, P=0.45). The median follow-up for patients alive at last follow-up was 37.4 months (range, 1.8–109.9).

Treatment failed in 45 patients, 33 of whom (73%) had a positive PET-2; second-line treatment for these patients consisted of chemotherapy (n=43), followed by autologous stem cell transplantation in 25. Two patients received only involved-field radiotherapy. Among the 45 patients experiencing treatment failure the median time from PET-2 to second-line treatment was 3.39 months (range, 0–37). For the 33 patients with a positive PET-2 scan, treatment failure was due to progression in 27 and relapse in six. At last follow-up, 22 of these 33 patients were in continuous complete remission following salvage treatment and 11 had progressed. Evidence of treatment failure was based on clinical evidence alone in three patients and on clinical and CT evidence in 30. Two patients had biopsy confirmation of relapse of the Hodgkin lymphoma. In the group with a negative PET-2 scan, 12/215 (5.6%) had treatment failure (7 had treatment intensification for disease progression and 5 for relapse), while 203/215 patients (94.4%) remain in continuous complete remission. To summarize, 33 patients had true positive PET-2 results, 12 had false positive results, 203 had true negative results and 12 had false negative results for PET-2 (Table 2). The percentage of patients with a positive PET-2 was higher (31%) among patients with a high IPS score (≥3) than among patients with a lower IPS of 0–2 (13%) (Figure 2). In 40 of 45 patients in whom treatment failed, information was available regarding the site of relapse: 30 relapsed in a site involved by disease at baseline, seven relapsed at a new site and three relapsed in initially involved and new sites. In 14/40 patients who received radiotherapy as part of their primary treatment, five relapsed in an area outside the irradiation field and for nine patients the site of relapse was not known.
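The accuracy measures quoted in this report follow directly from the four cell counts summarized above. The sketch below recomputes them using the standard definitions; the function name `diagnostic_metrics` is our own and is shown only for illustration.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard accuracy measures from a 2x2 table of test result vs outcome."""
    return {
        "sensitivity": tp / (tp + fn),            # failures detected by PET-2
        "specificity": tn / (tn + fp),            # non-failures with negative PET-2
        "ppv": tp / (tp + fp),                    # positive scans followed by failure
        "npv": tn / (tn + fn),                    # negative scans without failure
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Cell counts as reported in the text (Table 2).
metrics = diagnostic_metrics(tp=33, fp=12, tn=203, fn=12)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these counts the sensitivity and positive predictive value are both 33/45 and the specificity and negative predictive value are both 203/215, because the numbers of false positive and false negative scans happen to be equal.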

The association between clinical variables and progression-free survival using univariate and multivariate analyses is reported in Table 3. The predictive value of PET-2 for treatment outcome was independent of the IPS, as previously reported in the JID study8 (low IPS/PET-2 positive, log-rank 0.81; low IPS/PET-2 negative, log-rank 0.66; high IPS/PET-2 positive log-rank 0.97; high IPS/PET-2 negative log-rank 0.67) (Figure 1).

Positron emission tomography scanning and review of results

The details of the PET-2 interpretation are the subject of a companion report that was published recently.24 Patients were scanned according to the routine protocol at participating PET centers. PET-2 scans were performed 12.3 ± 4.9 days (range, 7–22) after the administration of the day 15 chemotherapy during the second ABVD cycle. Administered FDG activity was 359 ± 85 MBq (range, 85–699) for PET-0 and PET-2. The uptake time (the interval between the FDG injection and PET acquisition) was 79±24 minutes.

Flow chart showing the clinical outcome of patients according to IPS group and PET results after two cycles of ABVD. CR, continued complete remission; PRO, primary refractory to chemotherapy/progression within 6 months after completion of therapy; REL, late relapses after initial remission.

All six reviewers agreed independently as to whether the scan was ‘positive’ or ‘negative’ in 212/260 cases (82%). There was agreement between at least five reviewers in 240 cases (92%) and between at least four reviewers in 252 (97%). Disagreement among reviewers, when opinion was equally divided as to whether the scan was ‘positive’ or ‘negative’, was observed for only eight patients (3%). Discordant cases included problems in interpretation of residual marrow uptake (n=2), misinterpretation of uptake in a pleomorphic adenoma (n=1), missed sites of disease (n=2) and differentiating disease from physiological uptake related to the gut (n=1), brown fat (n=1) and blood vessels (n=1). Agreement between pairs of reviewers was good, with Cohen’s kappa ranging from 0.69 to 0.84. A dedicated session was held with all the reviewers to reach consensus upon the eight discordant cases. At the end of the review process for all 260 cases, 45 (17.3%) were defined as positive (score 4 or 5) and 215 (82.7%) as negative (score 1 to 3). The PET scan review results are shown in Table 3. Data are also available for PET-2 interpretation by local PET sites: the percentage of PET-2 positive and negative cases was 27.3% and 72.7%, respectively, and the 3-year progression-free survival rate was 54% and 94%, respectively.24
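For readers unfamiliar with the statistic, pairwise Cohen's kappa for binary readings can be computed as sketched below. This is an illustrative sketch only: the reviewer labels and the ten example readings are hypothetical and are not study data, and the statistical methods actually used are those described in the Online Supplement.

```python
from itertools import combinations


def cohens_kappa(a: list, b: list) -> float:
    """Cohen's kappa for two raters scoring the same cases."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(a)
    # Observed agreement: fraction of cases with identical readings.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    labels = set(a) | set(b)
    p_e = sum((a.count(lab) / n) * (b.count(lab) / n) for lab in labels)
    if p_e == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical binary readings (1 = positive, 0 = negative) for ten scans.
readings = {
    "reviewer_A": [1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    "reviewer_B": [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "reviewer_C": [1, 1, 0, 0, 0, 0, 0, 0, 0, 1],
}

# Kappa for every pair of reviewers, as in a pairwise agreement analysis.
for (name_1, r1), (name_2, r2) in combinations(readings.items(), 2):
    print(f"{name_1} vs {name_2}: kappa = {cohens_kappa(r1, r2):.2f}")
```

Kappa corrects the raw percentage agreement for the agreement expected by chance, which matters here because most scans are negative and two reviewers would often agree by chance alone.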

Discussion

The present study confirms the ability of interim PET scans to predict treatment response accurately in Hodgkin lymphoma, as reported in earlier studies using varied criteria for reporting scans.4–5,8 This study involved a multi-center, international cohort of patients, representative of current clinical practice. PET/CT scans were retrospectively reviewed by a panel of nuclear medicine experts using predefined criteria, the 5-PS. The 3-year progression-free survival for patients with PET-2 negative scans was significantly superior to that of patients with PET-2 positive scans (95% and 28%, respectively, P<0.0001). The 3-year progression-free survival was not statistically different from that reported in the JID study (95% and 13% for PET-2 negative and positive cases, respectively; P<0.0001). Importantly, the current study demonstrated a very high NPV (94%) and overall accuracy (91%) for predicting treatment outcome using interim PET. The 215 out of 260 enrolled patients (82.7%) with a negative PET-2 scan had very good long-term disease control and only 12/215 (5.6%) of them failed to respond to ABVD induction therapy. These results confirm that ABVD with or without consolidation radiotherapy, whatever the IPS, is an adequate treatment for approximately 80% of patients with advanced-stage Hodgkin lymphoma.4,5,8–10,25,26 Currently, it remains uncertain whether optimizing the timing of interim PET imaging during treatment could further improve the NPV of the interim-PET scan27 and prospective studies to answer this question are underway.28–39

The proportion of patients in whom ABVD treatment was unsuccessful in controlling disease (17.3%) is in line with that in earlier studies.23 Treatment failed in 45 patients: 34 patients progressed within 6 months of primary treatment, 11 patients relapsed later. After a median follow-up of 37 months (range, 2–110), 18% of patients with early progression and 9% of relapsed patients have died, suggesting that primary refractory disease has a worse prognosis. The IPS did not reliably predict outcome for patients with progressive disease: only 15/34 (44%) patients with early progression had IPS ≥3, whereas 27 (79.4%) had interim PET scans that were scored as positive. Among the relapsed patients, the proportion of patients with IPS ≥ 3 and ‘positive’ interim scans was the same (36.4%). Only a positive interim PET result and the presence of bone marrow involvement were independent predictors of overall patient outcome after ABVD treatment in a multivariate analysis.

In the present study, the PPV, NPV, sensitivity and specificity of interim PET were 73%, 94%, 73% and 94%, respectively, in keeping with the reported literature10 and similar to the JID study, albeit with slightly lower PPV and sensitivity. Possible reasons for the lower PPV are that the reviewers were blinded to clinical information in the present study whereas in the JID study reviewers had access to clinical information that might affect interpretation e.g. the presence of co-existing infection, other unrelated diseases and the use of granulocyte colony-stimulating factors. Details of reasons for false positive uptake have been reported separately24 and included parotid adenoma, focal FDG uptake in a lung hilum previously involved by disease in a patient with concomitant respiratory infection (i.e. reactive lymph node), and bony fractures. The false negative rate might have been influenced by the choice of cutoff for a positive scan using the 5-PS, scores 1–3 being considered negative and scores 4 and 5 as positive. However, this cutoff was intended to achieve the optimal trade-off between sensitivity and specificity. Twelve patients had a false-negative interim PET scan; the score was 3 in seven patients, 2 in one patient, and 1 in four patients. The negative predictive value was 87% for scans scored as 3 and 97% for scans scored as 1 or 2.

There was also considerable variation in PET acquisition protocols because of the retrospective nature of the current study, which might have rendered the scan interpretation suboptimal. The JID study was prospective with consistency in the planning and acquisition of patients’ imaging data. This point underlines the importance of standardization of PET methods in future prospective studies.40,41

Several ongoing prospective trials are now exploring PET response-adapted treatment in patients with early and advanced-stage Hodgkin lymphoma.28–39 Standardized criteria for the interpretation of interim PET scans have been proposed11 and warrant validation. The 5-PS used in this study as a reporting tool for graded visual assessment has the advantage that it can be adapted to take into account the timing of the scan in relation to chemotherapy, the clinical context and the research question/endpoint.12 The good interobserver agreement observed in preliminary reports12,24 has been confirmed in our international cohort of patients with advanced Hodgkin lymphoma. Overall and binary concordance rates among six reviewers were good (kappa = 0.69–0.84), suggesting that the 5-PS is a robust and reproducible reporting tool. Preliminary studies in Hodgkin lymphoma reported a small percentage of scans (around 5–10%) with “minimal residual uptake” that was regarded as equivocal for the presence of disease.9,42 However, the treatment outcome for these patients was similar to that of patients with a negative PET-2 scan. Using the newly proposed 5-PS score, the central review in the present study regarded PET scans with score 3, equivalent to “minimal residual uptake”, as ‘negative’12,16,43 and this may account for the increased specificity and PPV compared to local interpretation.24

This is the first study reporting treatment outcome according to interim PET using the 5-PS in Hodgkin lymphoma. Three clinical trials are underway at the time of writing based on a PET-2 response-adapted strategy using the 5-PS criteria for reporting. The percentages of patients with PET-2 positive scans in the RATHL study led by the UK National Cancer Research Institute (NCRI), the HD0607 trial by the Italian Lymphoma Foundation (GITIL/FIL), and the S0816 study by the U.S. Southwest Oncology Group and Cancer and Leukemia Group B (SWOG/CALGB)44–46 were 19%, 20% and 18%, respectively, in keeping with the JID study8 and other published data.4,5,42 These observations and the good interobserver reproducibility reported for the 5-PS criteria in Hodgkin lymphoma by others12,24 support the recommendation of the use of these criteria in routine clinical practice.

In conclusion, this study has confirmed the prognostic power of interim PET for predicting treatment response in ABVD-treated patients with Hodgkin lymphoma and the feasibility and good interobserver agreement of the 5-PS for the interpretation of interim PET scans. These results are of significant importance for the development of response-adapted strategies using interim PET in Hodgkin lymphoma. Ongoing prospective clinical trials will help to determine whether PET-guided therapy will improve patients’ outcomes compared to that achieved with conventional ABVD-based treatment.

Acknowledgments

The authors would like to thank: Anna Cavallo, from the Hematology Department in Cuneo, for her kind assistance with data editing and text formatting; Fabrizio Bergesio and Emanuele Roberto for technical assistance with figures and statistical analysis editing; Piergiorgio Cerello from the National Institute of Nuclear Physics (INFN) in Turin, Italy for providing the technical support for the imaging exchange tool developed by Alexandru Stancu; and Keosys company for providing the Positoscope® network to distribute images to reviewers.

The predictive value of positron emission tomography scanning performed after two courses of standard therapy on treatment outcome in advanced stage Hodgkin’s disease. Haematologica 2006;91(4):475–81.