The dose calculation algorithm is one of the main sources of uncertainty in the radiotherapy process. The aim of this study was to compare the accuracy of different inhomogeneity correction algorithms for external photon beam dose calculations. The methodology was based on International Atomic Energy Agency TEC-DOC 1583. The phantom was scanned in every center using computed tomography, and seven tests were planned on three-dimensional treatment planning systems (TPSs). The doses were measured with ion chambers, and the deviation between the measured and TPS-calculated dose was reported. This methodology was tested in five different hospitals, which were using six different algorithms/inhomogeneity correction methods implemented in different TPSs. The algorithms in this study were divided into two groups: measurement-based algorithms (type (a)) and model-based algorithms (type (b)). For type (a) algorithms, 7.6% and 11.3% of the measurements fell outside the agreement criteria for low- and high-energy photons, respectively, whereas for type (b) algorithms these values were 4.3% and 5.1%. As a general trend, the number of measurements with results outside the agreement criteria increases with beam energy and decreases with the advancement of TPS algorithms. More advanced algorithms are preferable and should therefore be implemented in clinical practice, especially for calculations in inhomogeneous media such as lung and bone and for high-energy beam calculations at shallow depths.

In radiation therapy, verifying the precision and accuracy of treatment planning dose calculations is of great importance for achieving tumor eradication while sparing healthy tissue from unnecessary radiation dose. [1],[2] Reducing errors and uncertainties plays an important role in the outcome of radiotherapy treatment. Based on clinical dose-response curves, the overall uncertainty in dose delivery should be less than 5%. [3] One step in achieving this degree of accuracy is ensuring that treatment planning systems (TPSs) accurately calculate the dose delivered to the patient, both at the dose specification point and in the surrounding tissue. [4]

Commercial clinical TPSs currently use a range of dose calculation algorithms. [5] Dose calculation in tissue inhomogeneities is very different from calculation in water, and this is one of the main problems faced in TPS design. [6] Although the influence of inhomogeneities on the primary photon fluence is generally well predicted, the dose delivered by scattered radiation is often approximated in a crude way. Most inhomogeneity correction algorithms are accurate for only a limited set of simplified geometries. Several authors have studied these errors. [7],[8],[9],[10] These studies show that large dosimetric errors may occur in clinically relevant situations. [7] Moreover, dose calculations with these kinds of algorithms, applied to sites such as the lung, might be clinically unacceptable. [6]

In past years, the most important activities in TPS quality assurance were limited to dose calculation verification, and some researchers, such as Westermann et al. in 1984, [11] Rosenow in 1987, [12] and Wittkamper et al. in 1987, [13] presented comparisons of measurements and calculations for a limited set of geometries.

Over the years, experimental studies have shown that the presence of low-density inhomogeneities in areas such as the lungs can change the dose by more than 30% relative to the water dose data. The effects of high-density inhomogeneities in areas such as bone are not well studied, but significant local effects are expected. [14]

Dose calculation algorithms in TPSs can be broadly classified into measurement-based and model-based approaches. Measurement-based models, such as the Clarkson algorithm, compute dose based on measurements in water. These models usually correct the homogeneous water distributions to account for treatment aids, patient contours, and tissue inhomogeneities. Model-based approaches, such as the convolution/superposition (CS) algorithm, [15] compute the dose in water or in the patient from physics principles. [16]
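As a rough illustration of how a measurement-based correction can rescale water data for tissue inhomogeneities, the sketch below computes the radiological (water-equivalent) depth along a ray, which is the quantity that equivalent-path-length-type methods use to look up dose in water. The function name and the ray geometry are hypothetical, the densities are the relative electron densities quoted for the phantom materials in this study, and the sketch is not the implementation of any of the TPSs investigated here.

```python
def radiological_depth(segments):
    """Water-equivalent depth (cm) for a ray crossing a stack of slabs.

    segments: iterable of (thickness_cm, relative_electron_density) pairs,
    ordered from the entrance surface to the calculation point.
    """
    return sum(thickness * density for thickness, density in segments)


# Hypothetical ray: 3 cm tissue-equivalent, 5 cm lung-equivalent,
# 2 cm tissue-equivalent material (densities relative to water).
ray = [(3.0, 1.003), (5.0, 0.207), (2.0, 1.003)]

geometric_depth = sum(thickness for thickness, _ in ray)
effective_depth = radiological_depth(ray)

print(f"geometric depth: {geometric_depth:.2f} cm")
print(f"radiological depth: {effective_depth:.2f} cm")
```

Because such a correction only rescales depth along the ray, it cannot by itself represent changes in lateral scatter or electron transport, which is the kind of limitation the comparison in this study is designed to expose.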

Several international documents address this subject. The International Atomic Energy Agency Technical Reports Series 430 (IAEA TRS 430) [17] recommends dividing the verifications into benchmark, generic beam, and user's beam data verifications. Documents from other organizations include the American Association of Physicists in Medicine (AAPM) Report 85 [18] and the European Society for Therapeutic Radiology and Oncology (ESTRO) booklet no. 7. [19] Based on its TRS 430, the IAEA has developed a set of practical clinical tests for TPSs, published in TEC-DOC 1583, to help users verify the dosimetric accuracy of their systems. [20],[21]

The aim of the present study was to compare the calculation accuracy and reliability of several TPSs that use different algorithms for photon dose calculation in external beam radiotherapy, according to IAEA TEC-DOC 1583. All the included TPSs are commercially available and are currently in use in different radiotherapy centers in Iran.

Materials and Methods

Phantom

For clinical test measurements, the commercially available semianthropomorphic 002LFC CIRS Thorax phantom (CIRS Inc., Norfolk) was used. The phantom has a body made of Plastic Water™ with lung- and bone-equivalent sections; the electron densities of these three materials relative to water are 1.003, 0.207, and 1.506, respectively. Ten holes hold interchangeable rod inserts for an ionization chamber. The holes are identified as shown in [Figure 1]. The phantom was scanned in each hospital imaging center using computed tomography (CT). The scans were used to obtain the CT number to relative electron density conversion curve and to plan the clinical tests. The local scanning protocol with a slice spacing of 2 mm was used. CT images of the phantom were transferred in DICOM format to the TPS either through the local network or via digital media.

Figure 1: Position of measurement holes in the CIRS phantom. Plugs number 1, 2, 3, 4, and 5 are tissue-equivalent materials; plugs number 6, 7, 8, and 9 are lung substitute materials; and plug number 10 is bone substitute material [18]

The CT number to relative electron density conversion was checked and, if needed, adjusted in the institution prior to image transfer. An acceptance criterion of 20 Hounsfield units was applied to the difference between CT numbers for the same relative electron density. This minimized possible deviations due to differences in the CT conversion tables used in each TPS. A set of clinical tests recommended by IAEA TEC-DOC 1583 was used to verify a range of basic treatment techniques applied in clinical practice. The descriptions of the tests, reference points, and measurement points are given in [Table 1].
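The 20 Hounsfield unit acceptance check can be sketched as follows. This is only an illustration under stated assumptions: the conversion curve points, the measured CT numbers, and all function names are hypothetical, and the source does not describe how the comparison was actually carried out. The sketch compares the CT number measured for each insert with the CT number that a stored conversion curve assigns to the same relative electron density.

```python
from bisect import bisect_right

HU_TOLERANCE = 20.0  # acceptance criterion quoted in the text


def interpolate(x, xs, ys):
    """Piecewise-linear interpolation; xs must be strictly increasing."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def check_conversion(measured, curve_red, curve_hu, tolerance=HU_TOLERANCE):
    """Return {material: (difference_hu, passes)} for each measured insert.

    measured: {material: (relative_electron_density, measured_hu)}
    curve_red, curve_hu: points of the stored conversion curve.
    """
    report = {}
    for material, (red, hu) in measured.items():
        difference = hu - interpolate(red, curve_red, curve_hu)
        report[material] = (difference, abs(difference) <= tolerance)
    return report


# Hypothetical stored curve (relative electron density -> CT number).
curve_red = [0.0, 0.207, 1.003, 1.506]
curve_hu = [-1000.0, -800.0, 0.0, 800.0]

# Hypothetical CT numbers measured on the local scanner.
measured = {
    "lung": (0.207, -790.0),
    "plastic water": (1.003, 5.0),
    "bone": (1.506, 830.0),
}

report = check_conversion(measured, curve_red, curve_hu)
for material, (difference, passes) in report.items():
    print(f"{material}: {difference:+.1f} HU, {'ok' if passes else 'adjust'}")
```

With these made-up numbers, the bone insert would exceed the tolerance and the conversion table would need adjusting before the images are used for planning.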

The same set of tests was applied in all hospitals (test no. 5 was not conducted because most of the studied radiotherapy centers lacked a multileaf collimator). The tests were planned, and the number of monitor units (or the treatment time) needed to deliver the prescribed dose of 2 Gy to the reference point was calculated. The dose calculations were performed for each available algorithm using the grid size normally used in the institution's clinical practice.

Treatment planning system

Six different algorithms/inhomogeneity correction methods were investigated [Table 2]: the equivalent tissue-air ratio (ETAR) method from the CorePLAN TPS (Seoul C and J, Inc., South Korea), the equivalent path length (EPL) method from the RTDose TPS (Math Resolutions, Columbia), the Batho, CS, and collapsed cone (CC) methods from the ISOgray TPS (Dosisoft, France), and the full scatter convolution (FSC) method from the TiGRT TPS (LinaTech, USA). A full description of the calculation algorithms and inhomogeneity correction methods implemented in the studied TPSs is beyond the scope of this publication, and readers are referred to data published elsewhere. [2],[7],[18],[20]

Table 2: Algorithms/inhomogeneity correction methods used in this study

Measurement-based algorithms, which include the ETAR, EPL, and Batho methods

Model-based algorithms which include CS, CC, and FSC methods

Measurements

Measurements were performed in five hospitals using different accelerator units. Nominal photon energies of 6 and 18 MV from Varian Clinac series accelerators (Varian Medical Systems, Palo Alto), 6 and 15 MV from Siemens Primus accelerators (Siemens Medical Solutions, Erlangen), and 6 and 15 MV from Elekta Precise accelerators (Elekta Oncology Systems, Crawley) were used. The photon beams were divided into two groups according to their energy: low-energy X-ray (6 MV) and high-energy X-ray (15 and 18 MV) beams. To reduce operator errors, each measurement was performed three times.

A Farmer-type ionization chamber (PTW30010) with a UNIDOS electrometer (PTW, Freiburg) was employed for the phantom measurements. The chamber was positioned in the middle of the plug. The chamber and electrometer have a calibration traceable to the Iran National Secondary Standard Dosimetry Laboratory. Pressure and temperature were recorded for each measurement.
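Temperature and pressure are recorded because the reading of a vented ionization chamber depends on the air density in its cavity. The sketch below applies the standard temperature-pressure correction, k_TP = ((273.15 + T) / (273.15 + T0)) × (P0 / P), to the average of repeated readings. The reference conditions (T0 = 20 °C, P0 = 101.325 kPa), the function names, and the example readings are assumptions for illustration; the source does not state the reference conditions of the calibration certificate or the other correction factors applied.

```python
from statistics import mean


def k_tp(temperature_c, pressure_kpa,
         ref_temperature_c=20.0, ref_pressure_kpa=101.325):
    """Temperature-pressure correction for a vented ionization chamber.

    The reference conditions are assumed here; in practice they come from
    the chamber's calibration certificate.
    """
    return ((273.15 + temperature_c) / (273.15 + ref_temperature_c)
            * ref_pressure_kpa / pressure_kpa)


def corrected_reading(readings, temperature_c, pressure_kpa):
    """Average repeated electrometer readings and apply k_TP."""
    return mean(readings) * k_tp(temperature_c, pressure_kpa)


# Hypothetical example: three repeated readings (nC) at 25 degC and 100.0 kPa.
example = corrected_reading([10.0, 10.1, 9.9], 25.0, 100.0)
print(f"corrected reading: {example:.3f} nC")
```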

Analysis of the results

To evaluate the measured (D_meas) and TPS-calculated (D_cal) values, the same criteria as specified in IAEA TEC-DOC 1583 were employed. Following this protocol, the dose differences were normalized to the dose measured at the reference point for each test, according to the following equation:

\[
\text{Error}\,[\%] = 100 \times \frac{D_{\mathrm{cal}} - D_{\mathrm{meas}}}{D_{\mathrm{meas,ref}}}
\]

where D_meas,ref is the dose value measured at the reference point. The agreement criteria for each test are listed in [Table 1].
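A minimal sketch of this evaluation is given below. It assumes the TEC-DOC 1583 convention in which the deviation is 100 × (D_cal − D_meas) / D_meas,ref, so that a positive value means the TPS overestimates the dose; the function names and the example doses are hypothetical.

```python
def normalized_deviation(d_cal, d_meas, d_meas_ref):
    """Deviation (%) between calculated and measured dose, normalized to
    the dose measured at the reference point of the same test."""
    if d_meas_ref <= 0:
        raise ValueError("reference dose must be positive")
    return 100.0 * (d_cal - d_meas) / d_meas_ref


def within_criterion(deviation_percent, criterion_percent):
    """True if the absolute deviation does not exceed the agreement criterion."""
    return abs(deviation_percent) <= criterion_percent


# Hypothetical point: calculated 1.30 Gy, measured 1.25 Gy,
# with 2.00 Gy measured at the reference point.
deviation = normalized_deviation(1.30, 1.25, 2.00)
print(f"deviation: {deviation:+.1f}%")
print("within a 3% criterion:", within_criterion(deviation, 3.0))
```

Normalizing to the reference-point dose rather than to the local dose keeps deviations at low-dose points (for example, points outside the field) from being inflated by a small denominator.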

Results

The differences between measured and calculated doses for the different algorithms and tests are presented in [Figure 2] and [Figure 3]. The results are grouped according to the inhomogeneity correction algorithm implemented in the studied TPS. If the same TPS and algorithm were used in several institutions, the data were pooled and the mean errors are reported with error bars of two standard deviations. The agreement criterion for each measurement point is shown as a thick red line. The difference for the four-field box test (case 4) is given as the average value for the four fields, and for the three-field tests (cases 7 and 8) two values are reported: the anterior field (ant) and the average of the lateral fields (lat). For these fields, the deviations are shown with two tolerance error bars. This is done to limit the amount of data presented here.
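The pooling step can be illustrated with a short sketch that reduces the deviations obtained at several institutions for the same TPS, algorithm, test, and point to a mean and an error bar of two standard deviations. The function name and the example values are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, since the source does not say which estimator was used.

```python
from statistics import mean, stdev


def pool_deviations(deviations):
    """Return (mean, two_sd) for deviations (%) pooled over institutions.

    Uses the sample standard deviation; with a single value the error
    bar is reported as zero.
    """
    values = list(deviations)
    if not values:
        raise ValueError("at least one deviation is required")
    two_sd = 2.0 * stdev(values) if len(values) > 1 else 0.0
    return mean(values), two_sd


# Hypothetical deviations (%) at one measurement point from three institutions.
pooled_mean, pooled_two_sd = pool_deviations([1.0, 2.0, 3.0])
print(f"{pooled_mean:.1f}% +/- {pooled_two_sd:.1f}%")
```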

Figure 2: (a) Difference between measured and calculated point doses for the EPL algorithm at different photon energies. (b) Difference between measured and calculated point doses for the ETAR algorithm at different photon energies. (c) Difference between measured and calculated point doses for the Batho algorithm at different photon energies

Figure 3: (a) Difference between measured and calculated point doses for the CS algorithm at different photon energies. (b) Difference between measured and calculated point doses for the CCC algorithm at different photon energies. (c) Difference between measured and calculated point doses for the FSC algorithm at the studied energy

The results for the single square field test (case 1) complied with the agreement criteria (±2%) at points inside the plastic water for all of the TPSs. In lung outside the field, type (a) algorithms showed larger deviations, reflecting an underestimation of the dose that increased with beam energy. The mean differences for low and high energy were −2.7 ± 0.3% and −6.5 ± 0.6%, respectively, while the agreement criterion for this point is ±4%. Type (b) algorithms, which account for beam widening in the lung-equivalent material, met the agreement criteria. The results of this case in the bone-equivalent material (point 10) were similar to those at point 9: type (a) algorithms underestimated the dose by 1%-4%, and the underestimation increased with beam energy.

In the tangential fields (case 2), the differences between measured and calculated doses were within agreement criteria (±3%) for almost all tests and all studied TPSs. Also, the results of the blocked field test (case 3) were within ±3% for all systems.

The four-field box test (case 4) has three measurement points: (1) at the isocenter in plastic water (point 5), (2) in the lung-equivalent material on the central axis of the lateral beams (point 6), and (3) in the bone-equivalent material on the central axis of the vertical beams (point 10). Deviations outside the agreement criteria were found at points 6 and 10 for some of the type (a) and type (b) algorithms in all energy groups. According to [Table 3], the largest deviations in lung inside the field for the EPL, ETAR, Batho, CS, CCC, and FSC algorithms were overestimations of 8.7%, 7.8%, 10.7%, 6%, 5.6%, and 4.7%, respectively. This deviation decreased with increasing depth. As in test number 1, the differences in lung outside the field were outside the agreement criteria for type (a) algorithms. The largest deviations for the EPL, ETAR, and Batho algorithms were underestimations of 5.1%, 5.9%, and 6.2%, respectively.

Table 3: The maximum error out of agreement criteria for each algorithm at important points (according to tests number 1 and 4)

In addition, in the bone-equivalent material, the EPL, ETAR, Batho, and CCC algorithms showed deviations outside the agreement criteria, with maximum differences of −5.1%, −5.8%, 8.1%, and −6.5%, respectively. These deviations decreased with increasing depth and with decreasing energy.

All of the algorithms performed well in the irregular L-shaped field test (case 6) at prescription point 3; however, the EPL and ETAR algorithms failed inside the lung-equivalent material, with maximum differences of 11.2% and 11%, respectively. These differences increased with photon beam energy.

The largest deviation observed in this study was in case 6 at point 10 for the EPL algorithm; this point is located within the bone-equivalent material and below the shield. A difference of up to 13% was found. The other type (a) algorithms also showed deviations outside the agreement criteria at this point: −10.9% for ETAR and 8.7% for Batho.

In the asymmetrically wedged field test (case 7), the differences between measured and calculated doses were within the agreement criteria (±4%) for all systems. The calculated dose also met the agreement criteria for the noncoplanar field test (case 8), except that deviations outside the agreement criteria were observed for two of the studied algorithms (ETAR and CCC) for the anterior field (with couch rotation) in all energy groups. The maximum difference between measurement and calculation in this case was 8%.

In general, for type (b) algorithms, the observed deviations between measured and calculated doses were within the agreement criteria for almost all tests and all TPSs tested, while larger deviations were seen for type (a) algorithms.

[Table 3], [Table 4], and [Figure 4] summarize the results of different algorithm types and energy groups. As a general trend, the numbers of measurements with results outside the agreement criteria increase with the beam energy and decrease with the advancement of TPS algorithms.

[Table 3] presents the maximum error out of agreement criteria for each algorithm in tests number 1 and 4, because these tests show the basic behavior of the inhomogeneity correction algorithms. [Table 4] summarizes the results that are out of agreement criteria for every studied algorithm (for example, when the agreement criterion is 3% and the result is 4%, the out-of-criteria result is 1%). [Figure 4] shows the percentage of measurements with results outside the agreement criteria as a function of algorithm type and energy. According to this figure, for type (a) algorithms, 7.6% and 11.3% of the measurements fell outside the agreement criteria for low- and high-energy photons, respectively, whereas for type (b) algorithms these values were 4.3% and 5.1%.
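The two summary quantities described here can be sketched as follows: the excess of a deviation over its agreement criterion (a 4% result against a 3% criterion gives 1%), and the percentage of measurements that fall outside their criteria. The function names and the example data are hypothetical and are not taken from the study's tables.

```python
def excess_over_criterion(deviation_percent, criterion_percent):
    """Amount (%) by which a deviation exceeds its agreement criterion,
    or 0.0 if the deviation is within the criterion."""
    return max(0.0, abs(deviation_percent) - criterion_percent)


def percent_outside(results):
    """Percentage of measurements outside their agreement criteria.

    results: list of (deviation_percent, criterion_percent) pairs.
    """
    if not results:
        raise ValueError("no measurements given")
    outside = sum(1 for deviation, criterion in results
                  if abs(deviation) > criterion)
    return 100.0 * outside / len(results)


# The example from the text: criterion 3%, result 4% -> 1% out of criteria.
print(excess_over_criterion(4.0, 3.0))

# Hypothetical set of (deviation, criterion) pairs for one algorithm type.
sample = [(4.0, 3.0), (1.0, 2.0), (-5.0, 4.0), (2.0, 3.0)]
share_outside = percent_outside(sample)
print(f"{share_outside:.1f}% of measurements outside criteria")
```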

Table 4: Summary of results that are out of agreement criteria for every studied algorithm*

One of the most important aspects of a TPS is the accuracy of its dose calculation algorithm. It is therefore important to perform various tests to understand the algorithm's limitations. Such tests aim to identify problems and reduce errors in the overall patient treatment process. In this study, the dose calculation algorithms of TPSs commonly used in Iran were compared using the IAEA TEC-DOC 1583 protocol.

For the EPL and ETAR algorithms, we found up to 8.7% and 7.8% overestimation in lung and 5.1% and 5.9% underestimation in bone, respectively. These results agree with those of Engelsman et al., [7] El Khatib et al., [9] and Van Kleffens and Mijnheer. [22] These differences arise because these algorithms do not distinguish between the dose from primary and scattered radiation. The effect grows with photon energy because the calculations are based on electron equilibrium rather than on photon scattering.

For the Batho algorithm, we found up to 10.7% and 8.7% overestimation in the lung- and bone-equivalent materials, respectively. These results agree with the study of Wong and Henkelman. [23] The basic limitation of this method is its assumption of lateral electron equilibrium, whose effect increases with photon energy.

For type (b) algorithms, the results were better than for type (a) algorithms because these algorithms model the photon energy spectrum and lateral electron scattering. In model-based algorithms, the maximum differences were seen in the lung-equivalent material, where they differed meaningfully from type (a) algorithms. The results for these algorithms agree with the studies of Muralidhar et al., [24] Vanderestraeten et al., [25] and Asparadakis et al. [26]

The accuracy of TPS calculations for external beam photon therapy has been the subject of extensive studies. [2],[7],[8],[14],[16],[20],[27] From the present study, a few general conclusions can be drawn. Systematic dose overestimation by both type (a) and type (b) calculation algorithms was observed at all measurement points located inside the lung-equivalent material. Type (a) algorithms showed larger differences in absolute value than type (b) algorithms, and many of their measurement points gave results outside the agreement criteria. The magnitude of the error was related to the beam energy: larger deviations were observed for higher beam energies, in agreement with a study performed by Gershkevitsh et al. [3]

The results confirmed the inadequacy of type (a) algorithms for dose calculation near and inside low-density inhomogeneities, especially for high-energy beams. Type (b) algorithms performed better in the applied test cases, with most of the results being within the specified agreement criteria owing to their modeling of lateral transport.

The range of dose deviations observed here can be used by users of other TPSs with similar algorithms as a relative check of their systems, although it should be kept in mind that many factors (the dose calculation grid, inadequacies in input data, etc.) may affect the final results. It should also be noted that each of the studied algorithms was available on only one TPS. Therefore, these results may not be applicable to other TPSs.

Some aspects such as penumbra widening in low-density material at higher energy beams (test case 1, point 9) or secondary source effect of shields at points under the shield (test case 6, point 10) would be better explored using film dosimetry which was not used in this study.

In some cases, errors were caused by data entry problems, such as errors in wedge, tray, and shield transmission factors or uncertainty in CT conversion curves. We resolved these problems as far as possible by performing all tests before the main measurements. One exception was a system with the CS algorithm that overestimated the dose (by 4.5%) in test number 2 because of problems in the wedge commissioning data, which we could not resolve.

Conclusions

The methodology described in IAEA TEC-DOC 1583 [18] has been applied in five hospitals to evaluate six inhomogeneity correction algorithms. The differences between calculated and measured doses are presented and discussed. Large deviations exist for type (a) calculation algorithms, especially in lung- and bone-equivalent materials inside the field, for high-energy beams, and at shallow depths. Therefore, type (b) algorithms have been found to be preferable to simple models; they should be implemented in clinical practice and should gradually replace less accurate algorithms. This would allow better consistency between reported and delivered doses. The tests used in this study could help users appreciate the capabilities of their systems and understand their limitations.

This research has been supported by the National Radiation Protection Department (NRPD) of the Iran Nuclear Regulatory Authority (INRA) and Tehran University of Medical Sciences and Health Services Grant No. 11815-31-04-89.

American Association of Physicists in Medicine Report 85. Tissue inhomogeneity corrections for MV photon beams. Report of Task Group No. 65 of the Radiation Therapy Committee of the American Association of Physicists in Medicine. Madison: Medical Physics Publishing; 2004.