Tolerance limits and methodologies for IMRT measurement-based verification QA: Recommendations of AAPM Task Group No. 218
Todd Pawlicki
Department of Radiation Oncology, University of California San Diego, La Jolla, CA, USA
Andrea Molineu
Radiological Physics Center, UT MD Anderson Cancer Center, Houston, TX, USA
Harold Li
Department of Radiation Oncology, Washington University, St. Louis, MO, USA
Krishni Wijesooriya
Department of Radiation Oncology, University of Virginia, Charlottesville, VA, USA
Jie Shi
Sun Nuclear Corporation, Melbourne, FL, USA
Ping Xia
Department of Radiation Oncology, The Cleveland Clinic, Cleveland, OH, USA
Nikos Papanikolaou
Department of Medical Physics, University of Texas Health Sciences Center, San Antonio, TX, USA
Daniel A. Low
Department of Radiation Oncology, University of California Los Angeles, Los Angeles, CA, USA
(Received 7 July 2017; revised 10 December 2017; accepted for publication 11 January 2018;
published 23 March 2018)
Purpose: Patient-specific IMRT QA measurements are important components of processes designed
to identify discrepancies between calculated and delivered radiation doses. Discrepancy tolerance
limits are neither well defined nor consistently applied across centers. The AAPM TG-218 report
provides a comprehensive review aimed at improving the understanding and consistency of these
processes as well as recommendations for methodologies and tolerance limits in patient-specific
IMRT QA.
Methods: The performance of the dose difference/distance-to-agreement (DTA) and γ dose distribu-
tion comparison metrics are investigated. Measurement methods are reviewed and followed by a dis-
cussion of the pros and cons of each. Methodologies for absolute dose verification are discussed and
new IMRT QA verification tools are presented. Literature on the expected or achievable agreement
between measurements and calculations for different types of planning and delivery systems are
reviewed and analyzed. Tests of vendor implementations of the γ verification algorithm employing
benchmark cases are presented.
Results: Operational shortcomings that can reduce the γ tool accuracy and subsequent effec-
tiveness for IMRT QA are described. Practical considerations including spatial resolution, nor-
malization, dose threshold, and data interpretation are discussed. Published data on IMRT QA
and the clinical experience of the group members are used to develop guidelines and recom-
mendations on tolerance and action limits for IMRT QA. Steps to check failed IMRT QA
plans are outlined.
Conclusion: Recommendations on delivery methods, data interpretation, dose normalization, the
use of γ analysis routines and choice of tolerance limits for IMRT QA are made with focus on detect-
ing differences between calculated and measured doses via the use of robust analysis methods and an
in-depth understanding of IMRT verification metrics. The recommendations are intended to improve
the IMRT QA process and establish consistent, and comparable IMRT QA criteria among institu-
tions. © 2018 American Association of Physicists in Medicine [https://doi.org/10.1002/mp.12810]
Physicists in Medicine (AAPM) report on IMRT clinical implementation described delivery systems and pretreatment QA.15 In 2009, additional details concerning IMRT commissioning were addressed, including tests and sample accuracy results for different IMRT planning and delivery systems.16 In 2011, strengths and weaknesses of different dosimetric techniques and the acquisition of accurate data for commissioning patient-specific measurements were addressed.17 A comprehensive White Paper on safety considerations in IMRT was also published, which clearly specified that pretreatment validations were necessary18 for patient safety, but the goal of the White Paper was not to explicitly address how that validation should be done. Other possibilities besides measurements have been published, including independent computer calculations, check-sum approaches, and log file analysis.14,19–24

Several professional organizations [AAPM, American College of Radiology (ACR), American Society for Radiation Oncology (ASTRO)]15,16,18,25 have strongly recommended that patient-specific IMRT QA be employed as part of the clinical IMRT process. A series of New York Times articles highlighted to the general public the hazard to patients when patient-specific IMRT QA was not performed after a change to a patient's treatment plan was made.26,27

While the value of patient-specific IMRT QA has been debated among physicists,19,28–30 especially whether computational methods can replace physical measurements, measurement-based patient-specific IMRT QA methods are widely used and are the core element of most IMRT QA programs. In many centers, a QA measurement is routinely performed after a patient's IMRT plan is created and approved by the radiation oncologist. The treatment plan, consisting of MLC leaf sequence files (or compensators) as a function of gantry angle and MUs from the patient's plan, is computed on a homogeneous phantom to calculate dose in the QA measurement geometry. The physical phantom is irradiated under the same conditions to measure the dose. The calculations and measurements are compared and approved or rejected using the institution's criteria for agreement. If the agreement is deemed acceptable, then one infers that the delivered patient plan will be accurate within clinically acceptable tolerances. This phantom plan does not check the algorithm's management of heterogeneities, segmentation errors, or patient positioning errors. The details of the methods used to evaluate the agreement between measured and calculated dose distributions (e.g., how a γ evaluation has been implemented), however, are often poorly understood by medical physicists. For example, if the tolerance limits have not been thoroughly evaluated, it will be difficult to assess with any degree of confidence whether these limits are clinically appropriate. To this end, the AAPM Therapy Physics Committee (TPC) formed Task Group (TG)-218 with the following charges:

- To review literature and reports containing data on the achieved agreement between measurements and calculations for fixed-gantry IMRT, VMAT, and tomotherapy techniques.
- To review commonly used measurement methods: composite of all beams using the actual treatment parameters, perpendicular composite, and perpendicular field-by-field. Discuss pros and cons of each method.
- To review single-point (small-averaged volume), 1D, and 2D analysis methodologies for absolute dose verification with ion chamber and 2D detector arrays, mainly performed with dose difference comparison, distance-to-agreement (DTA) comparison between measured and calculated dose distributions, and a combination of these two metrics (the γ method).
- To investigate the dose difference/DTA and γ verification metrics, their use and vendor-implementation variability, including the choice of various parameters used to perform the IMRT QA analysis.

The objective of this report is to address these charges. The report provides recommendations on tolerance limits and measurement methods. Specifically, various measurement methods are reviewed and discussed. The dose difference/DTA and γ verification metrics are examined in depth. Data on the expected or achievable agreement between measurements and calculations for different types of planning and delivery systems are reviewed. Results from a test suite developed by TG-218 to evaluate vendors' dose comparison software under well-regulated conditions are presented. Recommendations on the use of γ analysis routines and choice of tolerance limits are made.

1.B. Uncertainties in the IMRT planning and delivery process

Acceptance criteria for initial machine and TPS commissioning are well established.31,32 The acceptance criteria for patient-specific IMRT QA, however, are more difficult to establish because of large variations among IMRT planning systems, delivery systems, and measurement tools.33–36 There are many sources of errors in IMRT planning and delivery. In terms of treatment planning, the error sources can include modeling of the MLC leaf ends, MLC tongue-and-groove effects, leaf/collimator transmission, collimator/MLC penumbra, compensator systems (scattering, beam hardening, alignment), output factors for small field sizes, head backscatter, and off-axis profiles. They can also include the selection of the dose calculation grid size and the use and modeling of heterogeneity corrections. Accurate IMRT TPS beam modeling is essential to reduce the uncertainties associated with the TPS planning process and consequently ensure good agreement between calculations and measurements when performing patient-specific verification QA.37,38

Spatial and dosimetric delivery system uncertainties also affect IMRT dose distribution delivery accuracy. These uncertainties include MLC leaf position errors (random and systematic), MLC leaf speed acceleration/deceleration, gantry rotational stability, table motion stability, and beam stability (flatness, symmetry, output, dose rate, segments with low MUs). In addition, differences and limitations in the design
of the MLC and accelerators, including the treatment head design, as well as the age of the accelerator/equipment, can have an impact on the accuracy of IMRT delivery techniques.37,38

Another source of uncertainty among clinics using measurement-based patient-specific IMRT QA programs is the measurement and analysis tools used to interpret the QA results.39–43 These software tools have several parameters that must be chosen to perform the analysis, and the results can vary significantly depending on those choices. One example is the selection of whether to use global or local dose normalization to compare measured and calculated dose distributions.

1.C. Tolerances and action limits

Quality measures are employed to validate system performance,44,45 such as IMRT QA. In this report, action limits are defined as the amount the quality measures are allowed to deviate without risking harm to the patient,35 as well as defining limit values for when clinical action is required. An example for IMRT QA is the decision not to treat the patient if the comparison between a point-dose measurement and the planned value exceeds a predefined acceptance criterion (e.g., ±5%). These limits will depend on whether one is using relative or absolute dose differences and/or explicitly excluding low-dose regions from the analysis. Action limits should be set based on clinical judgment regarding the acceptability of a specific quality measure deviation.

Tolerance limits are defined as the boundaries within which a process is considered to be operating normally, that is, subject to only random errors. Results outside of the tolerance limits (or trends moving rapidly toward these boundaries) provide an indication that a system is deviating from normal operation. Measurement results that lie outside the tolerance limits should be investigated to determine whether their cause can be identified and fixed. The intent of this approach is to fix issues before they reach clinically unacceptable thresholds or action limits. When using action and tolerance limits, it is assumed that a careful commissioning process was followed. During the commissioning process, systematic errors should be identified and eliminated to the degree possible. This approach can also inform the choice of action limits when ambiguity exists about the clinical impact of exceeding action limits. This report provides recommendations on one process-based method to choose these limits, accounting for both random errors and any residual systematic errors from the commissioning process.
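As an illustration of the process-based idea only (not the specific formulas recommended later in this report), the sketch below derives a tolerance band from an institution's own historical IMRT QA results, treating results within roughly three standard deviations of the historical mean as "normal operation." The function name, the choice of a simple mean ± 3σ band, and the example numbers are assumptions for this sketch.

```python
import numpy as np

def tolerance_band(historical_pass_rates, n_sigma=3.0):
    """Illustrative process-based tolerance band from historical QA results.

    historical_pass_rates: per-plan gamma passing rates (%) collected after
    commissioning, once the process is believed to be in control.
    Returns (lower, upper); rates drifting below `lower` suggest the process
    is deviating from normal operation and should be investigated.
    """
    rates = np.asarray(historical_pass_rates, dtype=float)
    mean, sd = rates.mean(), rates.std(ddof=1)
    return mean - n_sigma * sd, min(100.0, mean + n_sigma * sd)

# Example: 20 plans measured after commissioning (hypothetical numbers).
history = [99.2, 98.7, 99.5, 97.9, 99.0, 98.4, 99.1, 98.8, 99.6, 98.2,
           99.3, 98.9, 97.8, 99.4, 98.6, 99.0, 98.5, 99.2, 98.1, 98.9]
print(tolerance_band(history))  # roughly (97.2, 100.0) for these numbers
```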
2. DOSE DIFFERENCE, DTA, γ ANALYSIS, AND VERIFICATION METRICS

Dose distributions are almost always represented as arrays of points, each defined by a location and a dose value. The spacing between the points is the spatial resolution of the distribution and does not need to be the same in all spatial dimensions or locations. The spatial resolution of the dose distribution plays an important role in its display and evaluation. Coarse dose distributions may require some type of interpolation to display in an easily interpretable form, such as isodose lines or dose color washes. Dose distribution resolution also plays a role in dose distribution comparisons. Some comparison techniques are degraded by coarse resolution, so interpolation is employed.

This discussion of dose comparison techniques will assume that there are two distributions, termed a "reference" and an "evaluated" dose. The reference distribution is typically the one against which the evaluated distribution is being compared, although the specific mathematics and limitations of the comparison techniques may require these roles to be reversed. Some of the comparison techniques are invariant with respect to the selection of the reference and evaluated distributions and some are not.

The process of dose comparison is part of a clinical workflow in which the goal is to determine if the reference and evaluated dose distributions agree to within limits that are clinically relevant. The question of clinical relevance involves more than the dose itself; it also involves the dose gradients as well as dose errors resulting from spatial uncertainties. There is therefore a need to understand both the spatial and dosimetric uncertainties when conducting dose distribution comparisons. The spatial analog to the dose difference is the DTA, which in general refers to the distance between common features of the two dose distributions.

The positional accuracy specification of a steep dose gradient region should at least in part be based on the accuracy of patient positioning. Setting IMRT QA tolerances tighter than the ultimate clinical requirement will lead to unnecessary effort in attempting to reduce the respective errors. Finally, in some cases, the spatial uncertainty can be related simply to experimental error. Even if the user insists on having zero spatial error in a calculation, or if the calculation is being used for an extremely accurate dose delivery process, the dose distribution measurements have some spatial uncertainty. Therefore, the DTA criterion can also be partly defined based on the measurement error.

2.A. Challenges for comparing dose distributions

On the surface, comparing dose distributions would appear to be straightforward. The distributions are no more than arrays of numbers, and a straightforward method for comparing them is to calculate their numerical difference. However, in steep dose gradient regions, the dose difference is extremely sensitive to spatial misalignments. This sensitivity leads to large dose differences that easily exceed the dose difference criteria even for clinically irrelevant spatial misalignments.

A common method for comparing dose distributions is to overlay their contours. This technique provides a rapid and qualitative method for comparing the distributions. If the distributions agree exactly, the contours will overlay, and if not, they will separate. The separation distance will depend on two factors: the difference in the doses and the local dose gradients. When the gradients are steep, contours move only
slightly with changes in dose, so even large dose errors will correspond to small contour separations. Therefore, comparing contours in steep dose gradient regions provides little insight as to the dose differences because it takes very large differences to significantly move the lines. On the other hand, even small dose differences will move isodose lines far in low-dose gradient regions. The only places where contour plots easily provide quantitative information are where isodose lines cross or superimpose. If the isodose lines are the same values, the distributions agree exactly at those locations. If two different isodose lines cross, for example the 50% line from one distribution and the 60% line from the other distribution, the dose difference is known at the crossing point. Otherwise, superimposed isodose contours provide little quantitative information.

Figure 1 shows an example of superimposed isodoses from Brulla-Gonzalez.46 The isodoses represent measured dose distributions from radiochromic film and a 2D dosimeter. The correspondence between the two dose distributions is clear, and shows that there is no extensive disagreement, although disagreements of a few percent are difficult to determine. Figure 2 shows two dose distributions that greatly disagree.47 The fact that they disagree is instantly clear from the vastly different isodose lines. In this case, additional quantitative dose analysis may be unnecessary.

One of the more challenging aspects of phantom-based dose distribution comparisons is the difference between the phantom and patient doses due to their differing geometries.48 For measurements intended to evaluate planned dose accuracy, the comparison criteria would ideally be based on clinical organ-by-organ tolerances. For example, the tumor dose tolerance specification might be 3%, while a looser criterion of 10% might be acceptable for some muscle receiving 10 Gy, and similarly with spatial tolerances. The spatial accuracy requirement might be 2 mm at the edge of the spinal cord, but 5 mm or more in the muscle. Because the measurements are conducted in phantoms, the planned fluence distribution does not lead to the clinical dose distribution, even if there are no planning or delivery errors, simply because the dose deposited in the phantom has a different pattern than the dose deposited in the patient. Even if the spatial locations of the organs and tumor are superimposed on the phantom, the dose distributions will typically not conform to them because of the differences in the attenuation and scatter properties between the phantom and the patient. Therefore, for evaluations such as patient-specific QA in phantom, we have conventionally relied on more generic acceptance criteria, based on overall goals of dosimetric and spatial accuracy in the domain in which we are able to measure.

2.B. Dose difference test

The dose difference test is the most straightforward test to understand and interpret. The dose difference at location \vec{r} is the numerical difference δ between the evaluated dose D_e(\vec{r}) and the reference dose D_r(\vec{r}) at that location. Mathematically, the dose difference can be written as

\delta(\vec{r}) = D_e(\vec{r}) - D_r(\vec{r})

Note that the doses are sampled at the same positions. This analysis is straightforward when the dose distribution elements occupy the same locations (i.e., same grid resolution), but spatial interpolation is required when they do not. The dose difference test is invariant to within a sign with respect to the selection of the reference and evaluated distributions; all that happens if they are swapped is that the sign of the dose difference changes.

The dose difference test is excellent at providing the user insight as to the concordance between the two distributions in low-dose gradient regions. In these regions, the dose changes slowly with location, and the dose difference is indicative of

FIG. 1. Isodose overlay of two measurements, the solid line from radiochromic film and the dashed line from liquid-filled ionization chambers. From Brulla-Gonzalez.46 The 20%, 50%, 70%, 80%, 90%, and 95% dose levels are shown.

FIG. 2. Superimposed isodose distribution for two different dose distributions. The fact that the distributions disagree is clear from the intersecting isodose lines, but a quantitative evaluation of the discrepancy by eye is impossible using this type of display. Image is from Duan et al.47
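The dose difference test above maps directly to array arithmetic once both distributions are sampled on a common grid. The short sketch below is an illustrative implementation only; the function name, the (y, x) coordinate convention, and the use of SciPy interpolation are assumptions for this example and not part of the report.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

def dose_difference_map(ref_dose, ref_coords, eval_dose, eval_coords):
    """Dose difference delta(r) = D_e(r) - D_r(r), sampled on the reference grid.

    ref_dose, eval_dose : 2D dose arrays (e.g., Gy).
    ref_coords, eval_coords : (y, x) tuples of 1D coordinate vectors (mm).
    The evaluated distribution is interpolated onto the reference points, as
    required when the two grids do not share the same resolution.
    """
    interp = RegularGridInterpolator(eval_coords, eval_dose,
                                     bounds_error=False, fill_value=np.nan)
    yy, xx = np.meshgrid(*ref_coords, indexing="ij")
    points = np.column_stack([yy.ravel(), xx.ravel()])
    eval_on_ref = interp(points).reshape(ref_dose.shape)
    return eval_on_ref - ref_dose  # sign flips if the two roles are swapped
```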
FIG. 3. (a) Film and calculated dose distributions from Bogner et al.49 (b) Dose difference distribution (percent of prescription dose) showing that large dose differences can occur in steep dose gradient regions, even for dose distributions that are otherwise similar. (c) γ distribution based on a 3% dose difference and 3 mm DTA.

DTA distributions are difficult to interpret and by themselves are not very useful.

Because the DTA test involves a search, the DTA value is not invariant to the selection of which distribution is selected as the reference. The reference distribution can have any resolution and dimensionality because the DTA is calculated point by point in the reference distribution, but the evaluated distribution usually has at least the same or greater resolution and dimensionality than the reference distribution.

2.D. Composite test

Given that the dose difference and DTA tests were complementary in their sensitivity to low and steep dose gradient regions, respectively, it made sense to combine the two, so one could determine if a reference point had passed both the dose difference and DTA tests. Harms et al.50 termed this the composite test. A reference point was said to have passed if it passed either the dose difference or the DTA test. Only if it failed both tests was it determined that it had failed the comparison. While the composite test automatically managed both steep and low-dose gradient regions, it suffered from being strictly a pass-fail test. If a point failed, the test did not indicate by how much.

2.E. γ test

The lack of insight as to the magnitude of failure with the composite test led Low et al.43,51 to generalize the test. They treated the dose distribution comparison from a geometric perspective by evaluating the displacement between the reference and evaluated distributions. This evaluation was conducted independently for each reference dose point. Similar to the DTA test, the dimensionality of the reference distribution could be a single point, while the evaluated distribution needed to be at least one dimensional.

The question of the displacement between dose distributions was complicated by the fact that there were n+1 degrees of freedom, where n referred to the spatial dimensionality of the comparison (e.g., a film plane contains two spatial and one dose dimension). The dose distribution could be thought of as an n-dimensional sheet within the (n+1)-dimensional space. The problem with determining a displacement in that space was that one of the axes was dose, while the others were distance. A displacement measurement was meaningless in this multiple-quantity space.

In order to allow the measurement to be defined, the dose and displacement scales were renormalized to be unitless by dividing them by the dose (ΔD) and DTA (Δd) criteria, respectively. The displacement between two points, \vec{r}_r and \vec{r}_e in the reference and evaluated distributions, respectively, in the renormalized space was termed Γ,

\Gamma(\vec{r}_e, \vec{r}_r) = \sqrt{\frac{r^2(\vec{r}_e, \vec{r}_r)}{\Delta d^2} + \frac{\delta^2(\vec{r}_e, \vec{r}_r)}{\Delta D^2}}   (1)

where r(\vec{r}_e, \vec{r}_r) was the distance between the reference and evaluated points and δ(\vec{r}_e, \vec{r}_r) was the dose difference. The minimum displacement was defined as γ,

\gamma(\vec{r}_r) = \min\{\Gamma(\vec{r}_e, \vec{r}_r)\} \;\forall\; \{\vec{r}_e\}   (2)

Values of γ between 0 and 1 indicated that the comparison passed with respect to the dose and distance criteria. Values greater than 1 indicated failure. Because γ was the displacement between the two distributions, γ was essentially the radius between the reference point and the evaluated distribution, so the pass/fail criteria were essentially the circle, sphere, or hypersphere in 1, 2, and 3 dimensional dose distribution comparisons, respectively. This was similar to the composite test, and in fact comparisons between the two tests showed little in the way of differences between points that passed and failed, although the γ test was shown to be slightly more forgiving than the composite test for clinical dose distributions.43

FIG. 4. Examples of one-dimensional γ dose comparison analyses in low (a) and steep (b) dose gradients. The closest approach that the evaluated distribution makes to the reference point is γ. For low-dose gradients, the γ test is essentially the dose difference test. For steep dose gradients, the γ test is essentially the DTA test.
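To make Eqs. (1) and (2) concrete, the sketch below is a minimal brute-force γ implementation for 2D planar doses, assuming both distributions have already been resampled onto the same coordinate grid; the function name, the global/local normalization switch, and the dose-threshold argument are illustrative choices, not a vendor- or report-specified interface. An exhaustive search like this is slow compared with optimized clinical tools, but it follows the definition directly.

```python
import numpy as np

def gamma_map(ref_dose, eval_dose, coords_mm, dose_crit=0.03, dta_mm=3.0,
              local_norm=False, threshold=0.10):
    """Brute-force gamma per Eqs. (1)-(2): gamma(r_r) = min over r_e of Gamma.

    ref_dose, eval_dose : 2D dose arrays on the same grid (Gy).
    coords_mm           : (y, x) 1D coordinate vectors shared by both grids.
    dose_crit           : fractional dose criterion (0.03 = 3%).
    dta_mm              : distance-to-agreement criterion (mm).
    local_norm          : False -> global normalization to the reference maximum;
                          True  -> criterion scales with the local reference dose.
    threshold           : reference points below this fraction of the maximum
                          dose are excluded (returned as NaN).
    """
    yy, xx = np.meshgrid(*coords_mm, indexing="ij")
    ref_max = ref_dose.max()
    gamma = np.full(ref_dose.shape, np.nan)
    for iy, ix in np.ndindex(ref_dose.shape):
        d_ref = ref_dose[iy, ix]
        if d_ref < threshold * ref_max:
            continue  # below the low-dose threshold
        dD = dose_crit * (d_ref if local_norm else ref_max)
        dist2 = (yy - yy[iy, ix]) ** 2 + (xx - xx[iy, ix]) ** 2  # mm^2
        delta2 = (eval_dose - d_ref) ** 2
        gamma[iy, ix] = np.sqrt(np.min(dist2 / dta_mm**2 + delta2 / dD**2))
    return gamma

# Passing rate: fraction of evaluated (non-NaN) reference points with gamma <= 1.
# pass_rate = 100.0 * np.mean(gamma[np.isfinite(gamma)] <= 1.0)
```

In practice the evaluated distribution is usually interpolated to a spacing much finer than the DTA criterion before the search; without that step, a direct search such as this overestimates γ near steep gradients, which is the effect illustrated in Fig. 5.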
While the γ test provided more than a pass/fail comparison, this did not itself allow a straightforward interpretation of the test's meaning. The most effective method of gaining an intuitive understanding of the test's performance was to examine how the test behaved in two extreme conditions, one of near zero dose gradient and one of steep dose gradients.

Figure 4 shows examples of 1D dose distributions with low and steep dose gradients. The γ calculation found the closest approach of the evaluated dose distribution to the reference dose distribution. With low-dose gradients, the vector connecting the reference point to the evaluated distribution lies nearly parallel to the dose axis [Fig. 4(a)]. The dose difference test could be interpreted as the distance between the two distributions along the dose axis, which is what the γ test defaulted to in the conditions of a zero dose gradient. Therefore, the γ test defaulted to the dose difference test (within the normalization of the dose difference criterion) in low-dose gradient regions, precisely where the dose difference test was most useful.

Figure 4(b) shows the γ test under conditions of steep dose gradients. In this case, the γ vector lies nearly parallel to the distance axis, or distance axes for 2D and 3D dose distributions. The DTA test could be interpreted as the closest point where the evaluated dose distribution crossed the distance axes (with the origin placed at the reference point). Therefore, as the gradient increased, the γ test became the DTA test as normalized by the DTA criterion.

The main benefit of the γ comparison tool was that it automatically reduced the sensitivity of dose distribution comparisons in steep dose gradient regions. Figure 3(c) shows an example of γ for the dose distribution comparisons shown in Fig. 3(b).

2.F. Other tools

A number of IMRT QA evaluation tools have been developed and reported in the literature.52–54 The gradient compensation method was developed by Moran et al.52 They computed the local dose gradient for each point in the dose distribution. A user-selected distance parameter was chosen to allow for geometric uncertainties due, for example, to experimental error or calculation grid resolution. The dose gradient at each point was multiplied by this distance parameter to yield a dose value corresponding to the resulting uncertainty in dose due to the spatial uncertainty. Dose differences in excess of this uncertainty would be displayed and analyzed. The gradient compensation method would remove dose differences that might be due to the spatial uncertainty. Presumably, the remaining differences would not be due to spatial errors, and the physicist could evaluate the magnitude and clinical relevance of those errors.

The normalized agreement test (NAT) values and NAT index were defined by Childress and Rosen.54 The NAT index represented the average deviation from the percent dose difference (ΔDm) and DTA criterion (Δdm) for every pixel calculated, ignoring measurement areas having errors less than a set criterion. They developed an algorithm that started with computing the dose difference and DTA values. If the dose difference for a point was less than the criterion, the NAT value was set to zero. If the DTA value was less than the criterion, the NAT was set to zero. If the calculated dose was less than 75% of the maximum calculated dose, the pixel was assumed to be outside the PTV, and if the measured dose was less than the computed dose, the error was assumed to have no biological significance, so the NAT was again set to zero. If, on the other hand, the measured dose was greater than the computed dose, or if the percent dose was greater than 75%, the NAT value was computed as Dscale × (δ − 1), where δ was the lesser of the ratios |ΔD/ΔDm| or Δd/Δdm, and Dscale was the greater of the computed or measured dose at the pixel of interest divided by the maximum computed dose.

Bakai et al.53 developed a dose distribution comparison tool based on gradient-dependent local acceptance thresholds. The method took into account the local dose gradients of a reference dose distribution for the evaluation of misalignment and collimation errors. These contributed to the maximum tolerable dose error at each evaluation point, to which the local dose differences between comparison and reference data were compared. They identified two weaknesses of the γ test that they were addressing: first, that with an exhaustive search, the γ tool would take a considerable amount of time to calculate, especially for 3D dose distributions, and second, that interpolation would be required if the dose distribution spacing was insufficiently fine. They concluded that the search process inherent in the γ evaluation should be avoided, and so they defined an alternative process. First, the dose axis was rescaled to units of distance by multiplying the dose by the ratio of the DTA to dose difference criteria. Without interpolation, for relatively large grid spacings and steep dose gradients, the tool overestimated the value of γ. The difference between the evaluated and reference doses was divided by a quantity called s0, which was related to the local dose gradient, resulting in the dose distribution comparison tool χ. They compared dose distributions calculated using the χ and γ tools and showed that both methods gave essentially the same results; however, the calculation of χ was more efficient than the calculation of γ.

2.G. Practical considerations

Users of the γ tool should understand its performance in some detail. While the mechanics of the calculation are relatively straightforward, there are operational details that can reduce its effectiveness and accuracy.

2.G.1. Normalization

Normalization plays a critical role in the interpretation of dose comparison results. The dose difference criterion is a case in point. The dose difference criterion is typically described as a percentage of the maximum dose for one or both of the dose distributions being compared (global normalization), or as a percentage of the prescription dose. It can also be described as a local dose percentage (local normalization). Specifically, in global normalization, the dose difference between any measured and calculated dose point pair is normalized using the same value for all point pairs, often the maximum dose of the calculated (reference) distribution or the prescription dose.
FIG. 5. Example of the γ calculation error when the evaluated dose distribution spatial resolution is relatively coarse with respect to the DTA criterion. (a) The calculation is correct. (b) The calculated value is greater than what would be calculated if interpolation was used. (c) Evaluated dose distributions with low-dose gradients can have the same error if the evaluated pixel locations differ from the reference pixel.

The DTA criterion can be used to allow for measurement positioning error, for example, positioning a phantom to the lasers, and the ability to position film within a phantom.

While the ideal method for setting the dose difference and DTA criteria would be organ and dose specific and based on heterogeneous phantoms mimicking patient geometries, this is not currently practical. One simple way of managing the current standard of practice is to apply dose thresholding for the γ analysis. Doses smaller than a user-selected value are not included in the γ or other analyses. This allows the user to focus on greater, clinically relevant doses.

Two-dimensional dose distributions can have thousands of points being evaluated (a typical 20 × 20 cm² film scanned with a 0.5 mm resolution yields 160,000 points). There are a few ways of reviewing the resulting comparison data. The dose difference and γ distributions can be presented as isomaps or colorwashes. The γ distribution can also be summarized by a γ histogram.

The pass/fail criteria are selected in advance of the γ calculation, but care should be taken when reviewing the γ results. In many cases, some points will not pass the γ test, and using the exact number that can pass as the sole determinant of quality is challenging and may not yield clinically useful results. A point that fails may not indicate a severe problem. Most of the time, the γ criteria are fairly strict. One of the advantages of the γ distribution is that it provides an indication not only that a point failed, but by how much. A γ value of 1.01 is indicative of a failure, but for a 3% dose difference and 3 mm DTA, a γ value of 1.01 could indicate a dose difference of 3.03% in a low-dose gradient region or a DTA of nearly 3.03 mm in a steep dose gradient region. Both of these are examples of failures, but failures that exceed tolerances by only 0.03% and 0.03 mm in the low and steep dose gradient regions, respectively. A point that fails the γ test by 0.03% or 0.03 mm needs to be considered differently than a point that fails by a substantially wider margin. Therefore, the user should look not only at the percentage of points that fail, but also make an analysis of the maximum γ value, the percentage of points that exceed a γ value of, say, 1.5, the γ histogram, and possibly other statistical values. Examining γ calculations with different dose difference/DTA criteria, e.g., 4%/3 mm, 3%/3 mm, 3%/2 mm, 2%/3 mm, 2%/2 mm, etc., can also help the user understand the sources of discrepancies and their impact.

Without dose comparison statistics specific to tumor or organ systems, one needs to remember that any histogram or statistical analysis neglects the spatial information. It should be noted that the γ test could underestimate the clinical consequences of certain dose delivery errors when the entire dose distribution is evaluated together. This was demonstrated in the work of Nelms et al.,48 where the specific dose delivery error they evaluated caused the high-dose regions to be delivered outside of the γ tolerance while the dose delivery accuracy for lower doses was within tolerance. This correlation of dose error and dose meant that the errors were clustered in the high-dose region, which corresponded to the tumor. A comparison of the dose-volume histograms clearly showed a systematic discrepancy between the dose distributions in the target, while the γ statistics (relative numbers of points with γ > 1) appeared to be clinically acceptable. Another reason for the acceptable γ statistics was that the ratio of tumor to irradiated volumes was very small, so even if the tumor is incorrectly irradiated, the fraction of points that fail the γ test might also be small. This highlights the fact that: (a) γ statistics should be provided on a structure-by-structure basis and (b) the γ distribution should be reviewed rather than relying only on distilled statistical evaluations such as γ histograms.
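Building on the cautions above, a QA report should include more than a single passing rate. The sketch below is an illustration only; the statistics chosen and the 1.5 cutoff follow the examples in the text, and `gamma_map` refers to the earlier illustrative function, not a specific commercial tool.

```python
import numpy as np

def summarize_gamma(gamma, high_cut=1.5):
    """Summary statistics for a gamma distribution (NaNs = below dose threshold)."""
    g = gamma[np.isfinite(gamma)]
    return {
        "n_points": int(g.size),
        "pass_rate_%": 100.0 * np.mean(g <= 1.0),
        "mean_gamma": float(g.mean()),
        "max_gamma": float(g.max()),
        f"frac_gt_{high_cut}_%": 100.0 * np.mean(g > high_cut),
    }

# Example workflow (names are illustrative):
# gamma = gamma_map(planned, measured, coords_mm, dose_crit=0.03, dta_mm=3.0,
#                   local_norm=False, threshold=0.10)
# print(summarize_gamma(gamma))
```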
3. REVIEW OF MEASUREMENT METHODS

Several methods can be used to perform pretreatment patient-specific IMRT verification QA measurements, as shown in Fig. 6. The most common methods in clinical practice are: (a) true composite (TC), (b) perpendicular field-by-field (PFF), and (c) perpendicular composite (PC). For each of these methods, the patient's plan is recomputed onto a phantom that exists both physically and within the planning system. The TPS calculates the dose in the phantom (hybrid plan) in the same geometry as for the subsequent measurements.56 In a survey on planar IMRT QA analysis with 2D diode array devices, Nelms and Simon57 found that 64% of respondents reported using the field-by-field method and 58% performed absolute dose analysis. The survey results showed that 76.3% of the clinics used 3%, 2.9% used 4%, and 15.1% used 5% dose difference limits. In addition, it was noted that 82.7% used 3 mm, 5.0% used 4 mm, and 2.6% used 5 mm DTA limits. They reported 34.5% used 0–5%, 36.7% used 5–10%, and 28.8% used ≥10% standard lower threshold dose limits. In the following sections, each method is described along with the types of data that are obtained. Further, the pros and cons of each method are discussed. For all methods described below, the recommendations of TG-120 can be utilized for dosimetric methods.17

3.A. True composite (TC)

The TC method simulates the treatment delivery to the patient. The radiation beams are delivered to a stationary measurement device or phantom on the couch using the actual treatment parameters for the patient, including MUs, gantry, collimator, and couch angles, jaws, and MLC leaf positions. The method has been used most often by physicists performing film dosimetry, although more recently diode and chamber array devices have been used. Typically with film dosimetry, an ion chamber (IC) is placed inside the phantom and irradiated along with one or more sheets of ReadyPack EDR2 (Eastman Kodak Company, Rochester, NY, USA) or Gafchromic EBT film (International Specialty Products, Wayne, NJ), providing simultaneous measurements of absolute IC dose and relative planar doses58–62 [Fig. 6(a)]. The measured IC reading can be converted to dose by taking the ratio of a reference field reading to a known dose (e.g., 10 × 10 cm² at reference depth) in phantom.

The film or detector array is usually positioned in a coronal orientation on the couch [Fig. 6(b)] but can be in a sagittal orientation [Fig. 6(c)] (or transverse plane for film) or a rotated plane. Because the recorded doses are from all the beams in the plan at their planned positions, the dose distribution mimics the dose distribution inside the patient, distorted and modified only by the difference between the patient and phantom external contours and a lack of heterogeneities in the phantom. Within the film or detector array, uniform high-dose regions will be present along with similar dose gradients and low-dose regions that occur in the patient's plan.

Detector arrays designed for perpendicular irradiation have been used to integrate the dose during the TC irradiation. Additional phantom material surrounding the array has been used to obtain at least 5 cm depth in all directions. Since the 2D arrays are designed for perpendicular measurements, the array detector's radiation response is angularly dependent. This angular dependence is caused by beam attenuation from internal electronics, device encapsulation materials, diode detector packaging materials, and air cavities.63–65 The diode array may have significant angular dependence within ±10° of the horizontal axis, primarily due to air cavities between detectors.64,65 The angular dependence may be smeared out when using beams from many angles, such as with VMAT delivery. However, caution should be taken when using 2D arrays for IMRT QA when more than 20% of the dose comes from the lateral direction.64 Another limitation for diode or ionization chamber arrays comes into play for non-coplanar beams, which can irradiate the active electronics of the device for certain field sizes and beam angles. Compared to film dosimetry, these ion chamber/diode arrays have much lower spatial resolution. This becomes a disadvantage in measuring doses for very small tumor volumes or with steep dose gradients, as well as in commissioning IMRT systems. It is important to note that with arrays, unlike films, an independent IC reading is not essential because the QA analysis can be performed in absolute dose mode.

The measurement plane for a film or array can be placed either inside the high-dose volume or in a plane that samples doses received by a particular critical structure. A common position for film is immediately anterior or posterior to the ion chamber. The mean dose to the ion chamber volume (the chamber volume is contoured in the planning system phantom) as well as the 2D dose distribution in the same plane(s) as the film or array is calculated. The percent relative differences between the measured and calculated chamber doses are then compared to the acceptance criteria. The film dose image is registered to the 2D TPS dose image using pin pricks on the film or other fiducial marks that relate the film to the linac isocenter, while the array dose image is aligned to the planning system dose image by deliberate positioning of the isocenter relative to the array origin. After registration, an isodose overlay and/or γ analysis is performed.8,66

FIG. 6. (a) True composite (TC) delivery on a phantom with an IC placed at a specific depth and a radiographic film in a coronal orientation. (b) TC delivery on a stationary 2D array device placed in the coronal direction on the treatment table. (c) TC delivery on a stationary 2D array device placed in the sagittal direction on the treatment table. (d) Perpendicular field-by-field (PFF) or perpendicular composite (PC) delivery on a stationary 2D array device placed in the coronal direction on the treatment table. (e) PFF or PC delivery on a 2D array device mounted on the treatment head.
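As a simple illustration of the ion chamber cross-calibration described in Section 3.A, the snippet below converts a chamber reading for the QA plan into dose using a reference-field reading of known dose; all numbers here are hypothetical.

```python
# Hypothetical IC cross-calibration against a 10 x 10 cm2 reference field
# delivered to the same phantom and chamber setup.
M_ref  = 12.50   # chamber reading for the reference field (nC)
D_ref  = 2.000   # known dose for the reference field at the chamber point (Gy)
M_plan = 10.95   # chamber reading for the patient QA plan delivery (nC)

D_plan = M_plan * (D_ref / M_ref)            # measured plan dose, ~1.75 Gy
D_tps  = 1.780                               # TPS mean dose to the chamber volume (Gy)
percent_diff = 100.0 * (D_plan - D_tps) / D_tps  # ~ -1.6% for these numbers
```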
If an IC is used, it is often placed at the isocenter when it lies in a uniform high-dose region. If the isocenter lies in a nonuniform dose region, the IC can be placed away from the isocenter in a more uniform dose region. Note that an ion chamber reading alone, without a 2D dose plane measurement, is not sufficient for detecting errors other than at that single point.39

3.A.1. Advantages and disadvantages

There are three main advantages to the TC method. The first is that the measurement includes inaccuracies of the gantry, collimator, and couch angles, and of the MLC leaf positions with gantry angle (gravity effects), as well as the attenuation of the couch top. The second advantage is that the resulting planar dose distribution is closely related to the dose that will be delivered to the patient, so that the relationship between the high-dose region and organs at risk lying in the same plane can be assessed. Third, there is only one dose image to analyze (per plane of interest). The main disadvantage is that portions of many beams will not traverse the film or detector plane. This is particularly true if the film/detector plane is transverse and irradiated by only one pair of MLC leaves. Thus, not every part of every beam is sampled. However, detector devices designed to measure VMAT beams, such as ArcCheck (Sun Nuclear Corporation, Melbourne, FL, USA) or Delta4 (ScandiDos, Uppsala, Sweden), generally sample the entire beam area.

3.B. Perpendicular field-by-field (PFF)

In this method, the gantry is fixed at 0 degrees (pointing down) for all beams and the collimator is fixed at the nominal angle [Fig. 6(d)]. PFF is used most often with diode or chamber arrays, although film and EPIDs have been used as well. Gantry mounting fixtures are available for some arrays, so that the actual gantry angle can be used during irradiation to include the gravity effects on the MLC leaves [Fig. 6(e)]. The TPS calculates the dose to the same plane as the measurement detector, and that dose plane is registered to the measured dose image using pin pricks or other fiducial marks in the case of film, or by aligning the array dose image to the planning system dose image by their common center in the case of 2D arrays. A comparison of the planned vs. measured dose is then performed for each field. These analyses can be performed in absolute dose mode, so that an independent IC reading is not needed. Isodose and profile overlays are also used to compare the dose distributions. When an IC is used, similar to TC delivery, the chamber is typically placed at the isocenter in a uniform high-dose region.

3.B.1. Advantages and disadvantages

The advantage of the PFF method is that it samples every part of every field as the dose from each of the IMRT fields is delivered and analyzed separately. Field-by-field analysis may reveal some subtle delivery errors. It prevents the dose washout that can occur in a composite measurement geometry, where underdosing in some areas of one beam may be compensated by an increased dose in the same region by another beam. PFF may be more stringent because the dose distribution from each beam is so highly modulated that small differences in dose and its location can cause large differences in the analysis result. To a greater extent than for TC, the agreement between the calculations and measurements is more dependent on the normalization values for relative dose analysis or on shifts from the initial registration (Table I). In addition, the significance of a summation of discrete dose errors in each beam image, which commonly occurs, is not generally known. In this situation, analysis results can be misleading, as suggested by a number of studies that have found a poor to moderate correlation between field-by-field 3%/3 mm DTA results and the actual measured to calculated 3D dose differences in the patient or phantom.48,67–70 While these findings may cast doubt on the value of these measurements, they emphasize that the method and results should be carefully interpreted. They also emphasize that clinical interpretation of QA failure results is a challenging process.

3.C. Perpendicular composite (PC)

The PC method is similar to the PFF method described above in (b), except that the dose is integrated for all the perpendicular fields, resulting in a single dose image for analysis [Fig. 6(d)] and making the method faster than PFF. The same measuring equipment and analysis methods are used.

3.C.1. Advantages and disadvantages

The advantage of the PC method is that all portions of each beam are incorporated in a single image. EPIDs can be used if each beam's dose image is acquired separately and then added together later. Using the EPID to obtain an integrated image for VMAT is considered PC. The disadvantage is that the method may mask some dose delivery errors, such as those in the scattered regions, and dose errors from any one beam, within the composite, may be obscured by the superposition of the other beam doses.71 Further, for VMAT delivery, the dose rate variation vs. gantry angle may be obscured using this method, and errors caused by using uniform dose rate delivery vs. nonuniform dose rate delivery were not caught using PC.

TABLE I. For field-by-field γ analysis based on relative dose, the passing rate is highly dependent on the location of the point of normalization. This table from a commercial software system shows passing rates based on either points that maximize the passing rate, the central axis point, or the maximum dose point.

Normalization point X,Y coordinates (mm) | Pass | Fail | % Passing
25,25 | 263 | 3 | 98.9
20,30 | 259 | 7 | 97.4
45,5 | 248 | 18 | 93.2
30,30 | 251 | 19 | 93.0
25,35 | 244 | 20 | 92.4
0,0 (CAX) | 221 | 79 | 73.7
25,35 (Max) | 244 | 20 | 92.4
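Table I illustrates how strongly a relative-dose comparison depends on the chosen normalization point. The sketch below is illustrative only; `gamma_map` is the earlier example function, and the candidate indices are hypothetical. It renormalizes a measured plane to several candidate points and tabulates the resulting passing rates, mimicking the kind of sweep summarized in Table I.

```python
import numpy as np

def pass_rate_vs_norm_point(planned, measured, coords_mm, candidate_idx,
                            dose_crit=0.03, dta_mm=3.0):
    """Relative-dose gamma passing rate for several normalization points.

    candidate_idx: list of (iy, ix) grid indices; the measured plane is scaled
    to match the planned dose at that point before each comparison.
    """
    results = {}
    for iy, ix in candidate_idx:
        scale = planned[iy, ix] / measured[iy, ix]  # renormalize the measurement
        gamma = gamma_map(planned, measured * scale, coords_mm,
                          dose_crit=dose_crit, dta_mm=dta_mm)
        g = gamma[np.isfinite(gamma)]
        results[(iy, ix)] = 100.0 * np.mean(g <= 1.0)
    return results
```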
cone,91,92 to reconstruct the 3D dose in the patient. The measurement data can be from an EPID,93 2D diode, or 2D chamber arrays.91,94,95 In essence, these implementations use the patient-specific beams that are then delivered to the detector, typically in the absence of a phantom or patient. The measured data are corrected for the response of the detector and subsequently used as the input fluence map to a forward dose calculation in the patient anatomy. This operation requires an independent dose calculation platform91,92 but can also be done using the TPS itself,41 thus removing any ambiguities in the analysis that may result from differences in dose calculation algorithms between the TPS and the independent calculation platform. However, using the TPS dose calculation algorithm for both calculations may mask beam modeling errors or algorithm limitations for IMRT beams.

measured in a high-dose region with a low-dose gradient,40 (b) absolute planar dose, either normalized to a locally calculated point dose, the global calculated maximum dose of the plane, or the maximum dose of the entire treatment plan,17 (c) a DTA measure,32,50 and (d) γ analysis.43,51

The reported IMRT QA agreement between measurements and calculations, including absolute dose agreement and γ passing rates for various tolerance limits, is summarized in Table II. Table II shows absolute dose measurements with ICs agreeing with expected values to within 5%, and 2D measurements having γ pass rates > 90% using 3%/3 mm DTA (global normalization).102,103 Furthermore, recent studies have reported nearly 100% passing rates for 3%/3 mm64,104–106 and > 95% for 3%/2 mm or 2%/2 mm105,106 in moderately/complex modulated plans.
treatment planning systems and measured with a 2D diode array using 6 MV photon beams. They found that, for relative doses, the average passing rate using 3%/3 mm with a 10% dose threshold criterion for prostate and other cases was 99.3% ± 1.41% and for HN cases was 96.22% ± 2.89%. For absolute point doses, they found that the average percentage dose error for prostate and other cases was 0.419% ± 0.42% and for HN cases was 1.41% ± 1.1%. The differences between the prostate/other cases and the HN cases were statistically significant.

5.A.2. Volumetric modulated arc radiotherapy

Partly due to its decreased delivery time compared with static gantry IMRT, VMAT is becoming the preferred technique for IMRT delivery. In Teke's116 study, ten clinically acceptable VMAT treatment plans were calculated in a phantom using a Monte Carlo (MC) dose calculation algorithm and actual delivery log files. Measurements using a Farmer ionization chamber (PTW-Freiburg, Freiburg, Germany) with an active volume of 0.6 cc agreed with both the MC and TPS calculations to within 2.1%. Analyzing the detailed machine log files, they also confirmed that leaf position errors were less than 1 mm for 94% of the time and that there were no leaf errors greater than 2.5 mm. The mean standard deviations (SDs) in MU and gantry angle were 0.052 MU and 0.355°, respectively, for the ten cases they analyzed. This study demonstrated that accurate VMAT delivery and stable machine performance were achievable. As a result, the expectation of good agreement between predicted and measured dose for a VMAT delivery is warranted.

Many investigators have reported on experimental validation of VMAT delivery using various 1D, 2D, and 3D
TABLE II. IMRT QA measurement results reported in the literature. Results include absolute point-dose agreement and γ passing rates for various tolerance limits.

Study | Delivery technique | Detector(s) | Cases/fields | Reported agreement
Dong 200375 | Fixed-gantry and serial tomotherapy | IC | 751 cases and 1591 measurements | 0.37% ± 1.7% (−4.5% to 9.5%)
Both 2007102 | Fixed-gantry | 2D diode array | 747 fields | 3%/3 mm relative: 96.22% ± 2.89% (HN), 99.30% ± 1.41% (prostate and other sites); absolute point dose error: 1.41% ± 1.10% (HN), 0.419% ± 0.420% (prostate and other sites)
Ibbott 200833 | Not specified | Film, TLDs | 250 (multi-institution) | 179 (72%) pass (7%/4 mm absolute/global)
Molineu 2013107 | Not specified | Film, TLD | 1139 irradiations, 763 institutions | 929 (81.6%) pass (7%/4 mm absolute/global)
Basran 2008108 | Fixed-gantry | 2D diode array | 115 plans | 3%/3 mm absolute/global: 95.5% ± 3.5% (HN), 98.8% ± 2.0% (GU), 97.3% ± 1.6% (lung)
Ezzell 200916 | Fixed-gantry and tomotherapy | Film, IC, 2D diode array | 10 institutions, 5 from-easy-to-difficult cases per institution | High-dose point: 0.2% ± 2.2%; low-dose point: 0.3% ± 2.2% (composite); per-field: 97.9% ± 2.5% (3%/3 mm absolute/global); composite film: 96.3% ± 4.4% (3%/3 mm absolute/global)
Geurts 2009109 | Tomotherapy | 3D diode array | 264 plans | 3%/3 mm: 97.5%, range 90.0–100%; absolute/relative or global/local not indicated
Langen 2010110 | Tomotherapy | IC, planar dosimeter | TG-148 member institutions | IC: 3%; planar: >90% (3%/3 mm absolute/global); range or SD not given
Masi 201164 | VMAT | IC, film, 2D diode array, 2D IC array | 50 plans | IC: 1.1% ± 1.0%; electronic planar: >97.4% (3%/3 mm or 3%/2 mm absolute/both global and local), range 92.0–100%; EDR2: 95.1%, range 83.0–100%; EBT2: 91.1%, range 80.0–98.5%
Bailey 2011103 | Fixed-gantry | 2D diode array, EPID | 25 prostate fields, 79 HN fields | 2%/2 mm absolute/global: 80.4% (prostate), 77.9% (HN); 2%/2 mm absolute/local: 66.3% (prostate), 50.5% (HN); 3%/3 mm absolute/global: 96.7% (prostate), 93.5% (HN); 3%/3 mm absolute/local: 90.8% (prostate), 70.6% (HN)
Lang 2012104 | Fixed-gantry or VMAT with FFF | IC, film, 3D diode array, 2D IC array | 224 plans (52 plans with IC) | 99.3% ± 1.1% (3%/3 mm absolute/global); point dose: 0.34% (2% for 88% of cases)
Mancuso 2012105 | Fixed-gantry and VMAT | IC, film or 2D diode array | TG-119 test cases | IC: 0.82% ± 0.48% (IMRT) and 1.89% ± 0.50% (VMAT); film: 97.6% ± 0.6% for IMRT, 97.5% ± 0.8% for VMAT (2%/2 mm composite, absolute/global); diode: 98.7% ± 0.3% for IMRT and 98.6% ± 0.4% for VMAT (3%/3 mm absolute/global)
Bresciani 2013106 | Tomotherapy | 3D diode array | 73 plans | Absolute global: 98% ± 2% (3%/3 mm), 92% ± 7% (2%/2 mm), 61% ± 11% (1%/1 mm); absolute local (2 cGy local threshold): 93% ± 6% (3%/3 mm), 84% ± 9% (2%/2 mm), 66% ± 12% (1%/1 mm)
the total number of segments limited to 50 and five complex devices performed very poorly in terms of identifying unac-
plans generated using conventional two-step optimization with ceptable plans.
≥ 100 segments. For a 1 mm systematic error, the average These studies highlighted the importance of adopting tigh-
changes in D95% were 4% for simple plans vs. 8% for com- ter tolerances, performing a thorough analysis, having pro-
plex plans. The average changes in D0.1cc of the spinal cord grams for routine QA of the accelerator and MLC, as well as
and brain stem were 4% in simple plans vs. 12% in complex developing new methods to supplement measurement-based
plans. They concluded that for induced systematic MLC leaf patient-specific QA.70,132 In addition, these studies high-
position errors of 1 mm, delivery accuracy of HN treatment lighted the challenges of using c test passing rates for evaluat-
plans could be affected, especially for highly modulated plans. ing treatment plan acceptability and showed that clinical
Kruse investigated the sensitivity of the fraction of points analysis of IMRT QA failure results is a challenging task. As
that failed the c test with 2%/2 mm and 3%/3 mm criteria discussed in Section 2.G.3., the c test could underestimate
using EPID and ionization chamber array measurements.67 He the clinical consequences of certain dose delivery errors
used three HN treatment plans and created a second set by when the c test is summarized in aggregate and when more
adjusting the dose constraints to create highly modulated deliv- detailed examination of the c distribution is not conducted.
ery sequences. The treatment plans were computed on a cylin- The c test passing rate summarization has no spatial sensitiv-
drical phantom, and EPID and ionization chamber array ity, similar to dose-volume histograms, and the location and
measurements were acquired and compared against calcula- clustering of the failed points is not considered along with
tion. They found that the highly modulated plans with aggres- the passing rate. Also, field-by-field evaluation and dosimet-
sive constraints had many points that differed from the ric comparison might obfuscate clinically relevant dose errors
calculations by more than 4%, with one point differing by and make correlating test results with clinical acceptability
10.6%. Using the ionization chamber array, the fraction of difficult. This is especially important because most compar-
points that passed the 2%/2mm criteria were between 92.4% to isons are unable to reach 100% passing and so clinical criteria
94.9% for the original plans and 86.8% to 98.3% for the highly allow a fraction of points to fail the c test. IMRT QA evalua-
modulated plans. Similar results were found using the 3%/ tion of plans that have large regions of low dose cause the
3 mm criteria with the same ionization chamber array. They fraction of failed points to appear small even when the area of
concluded that the fraction of pixels passing the c criteria from failed points is large compared to the high-dose regions, and
individually irradiated beams was a poor predictor of dosimet- thus resulting in the c test passing easily.
ric accuracy for the tested criteria and detector methods.
Nelms et al. created four types of beam modeling errors, including wrong MLC transmission factors and wrong beam penumbra.48 The error-free plans were compared with error-induced plans. Using γ criteria of 3%/3 mm, 2%/2 mm, and 1%/1 mm, they found only weak to moderate correlations between conventional IMRT QA performance metrics and clinically relevant dose-volume histogram differences. Several recent studies demonstrated that common phantom-based IMRT QA techniques are not highly sensitive to some MLC leaf position errors67,124,126–128 or to clinically meaningful errors.48,67–70,127,129

Kry et al. compared IROC Houston's IMRT head and neck phantom results with those of in-house IMRT QA for 855 irradiations performed between 2003 and 2013.130 The sensitivity and specificity of IMRT QA to detect unacceptable or acceptable plans were determined relative to the IROC Houston phantom results. Depending on how the IMRT QA results were interpreted, they showed IMRT QA results from institutions were poor in predicting a failing IROC Houston phantom result. The poor agreement between IMRT QA and the IROC Houston phantoms highlighted the inconsistency in the IMRT QA process across institutions.

McKenzie et al. investigated the performance of several IMRT QA devices in terms of their ability to correctly identify dosimetrically acceptable and unacceptable IMRT patient plans, as determined by an IROC-designed multiple ion chamber phantom used as the gold standard.131 Using common clinical acceptance thresholds (γ criteria of 2%/2 mm, 3%/3 mm, and 5%/3 mm), they found that most IMRT QA devices performed very poorly in terms of identifying unacceptable plans.

These studies highlighted the importance of adopting tighter tolerances, performing a thorough analysis, having programs for routine QA of the accelerator and MLC, as well as developing new methods to supplement measurement-based patient-specific QA.70,132 In addition, these studies highlighted the challenges of using γ test passing rates for evaluating treatment plan acceptability and showed that clinical analysis of IMRT QA failure results is a challenging task. As discussed in Section 2.G.3., the γ test could underestimate the clinical consequences of certain dose delivery errors when the γ test is summarized in aggregate and when a more detailed examination of the γ distribution is not conducted. The γ test passing rate summarization has no spatial sensitivity, similar to dose-volume histograms, and the location and clustering of the failed points are not considered along with the passing rate. Also, field-by-field evaluation and dosimetric comparison might obfuscate clinically relevant dose errors and make correlating test results with clinical acceptability difficult. This is especially important because most comparisons are unable to reach 100% passing, and so clinical criteria allow a fraction of points to fail the γ test. IMRT QA evaluation of plans that have large regions of low dose causes the fraction of failed points to appear small even when the area of failed points is large compared to the high-dose regions, thus resulting in the γ test passing easily.

5.C. Passing rates for given tolerances and corresponding action limits

A number of groups have suggested metrics to assess the clinical acceptability of IMRT QA verification plans. Table III shows confidence limits (CL), action limits (AL), tolerance limits (TL), and corresponding γ thresholds reported in the literature. Palta et al.35 proposed confidence limits and action levels for a range of dose regions for IMRT plan validation. The confidence limit was calculated as the sum of the absolute value of the mean difference and the SD of the differences multiplied by a factor of 1.96 (|mean deviation| + 1.96 SD). The mean difference used in the calculation of the confidence limit for all regions was expressed as a percentage of the prescribed dose according to the formula 100% × (D_calc − D_meas)/D_prescribed. The confidence limit formula was based on the statistics of a normal distribution, which expects that 95% of the measured points will fall within the confidence limit. The confidence limit values were derived from the results of an IMRT questionnaire from 30 institutions and reflected how the institutions judged the clinical significance of tolerance limits used for IMRT QA. The values were given as follows: (a) confidence limit of 10% or 2 mm DTA and action level of 15% or 3 mm DTA for the high dose, high gradient region, (b) confidence limit of 3% and action level of 5% for the high dose, low gradient region, and (c) confidence limit of 4% and action level of 7% for the low dose, low gradient region. Palta et al.35 suggested that IMRT treatment plans should not be used clinically if the measured and calculated doses differed by more than the action level values.
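For concreteness, the confidence limit formalism above can be written as a short calculation. The following Python sketch is illustrative only (the function name and the use of a sample standard deviation are assumptions, not part of the report), with per-point deviations expressed as a percentage of the prescribed dose.

```python
def confidence_limit(calc_doses, meas_doses, prescribed_dose):
    """|mean deviation| + 1.96*SD, with each per-point deviation defined as
    100% * (Dcalc - Dmeas) / Dprescribed (sample SD assumed)."""
    devs = [100.0 * (c - m) / prescribed_dose
            for c, m in zip(calc_doses, meas_doses)]
    n = len(devs)
    mean = sum(devs) / n
    sd = (sum((d - mean) ** 2 for d in devs) / (n - 1)) ** 0.5
    return abs(mean) + 1.96 * sd

# Hypothetical point doses (cGy) for a 200 cGy prescription
print(confidence_limit([198.0, 202.0, 205.0, 199.0],
                       [200.0, 200.0, 201.0, 197.0], 200.0))
```

Under the normal-distribution assumption, about 95% of measured points would then fall within the returned limit.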
Using the confidence limit formalism of Palta et al.,35 TG-119 reported confidence limits of 4.5% for a high-dose point in the PTV and 4.7% for a low-dose point in an avoidance structure, both measured using an IC. Confidence limits of 12.4% and 7%, respectively, were reported for 2D composite dose measurements made with film and arrays, corresponding to 87.6% and 93% γ passing rates (3%/3 mm), respectively. Basran and Woo133 examined the discrepancies between calculations and 2D diode array measurements for 115 IMRT cases. They reported acceptable tolerance limits of 3% overall, and 5% per-field, for absolute dose differences (independent of disease site). They recommended γ thresholds ≥ 95% for non-HN cases and ≥ 88% for HN cases using 3%/3 mm. The ESTRO38 report on Guidelines for the Verification of IMRT reported the experience of a number of European centers. For IC verification measurements, the report recommended a tolerance limit of 3% and an action limit of 5%.

Low and Dempsey43 in 2003 proposed the need for fairly broad tolerances. They reported that for typical clinical use at the time, the fraction of points that exceeded 3% and 3 mm was often extensive, so they used 5% and 2–3 mm as γ tolerance values for IMRT clinical evaluations. Childress et al.66 in 2005 analyzed 850 films resulting from IMRT plan verification and reported a "preferred" γ index tolerance criterion of 5% and 3 mm.

A number of groups suggested using a combination of the mean γ value, the maximum γ value exceeded by a given percentage of measurement points (e.g., 1%), and the fraction of γ values above one (P > 1) to analyze the γ distributions and make judgments on the agreement between measurements and calculation based on clinically driven criteria.66,134–136 For example, Stock et al.134 used a γ evaluation (3%/3 mm) relative to maximum dose for nine IMRT plans to decide the acceptability of IMRT verification QA. They considered a plan to meet their pass criteria if the average γ, maximum γ, and P > 1 were < 0.5, < 1.5, and 0–5%, respectively.

De Martin et al.137 analyzed the γ histograms (4%/3 mm) for 57 HN IMRT plans using γ mean values, γD (where γD was defined as γmean + 1.5 SD(γ)), and the percentage of points with γ < 1, γ < 1.5, and γ > 2. They accepted the IMRT verification QA depending on the confidence limit values. They reported γD < 1 and confidence limits of 95.3%, 98.9%, and 0.4% for the percentage of points with γ < 1, γ < 1.5, and γ > 2, respectively, for their newly installed linac.
for HN cases using 3%/3 mm. The ESTRO38 report on IMRT verification QA depending on the confidence limit val-
Guidelines for the Verification of IMRT reported the experi- ues. They reported cD < 1 and confidence limits of 95.3%,
ence of a number of European centers. For IC verification 98.9%, and 0.4% for the percentage of points with c < 1,
measurements, the report recommended a tolerance limit of c < 1.5, and c > 2, respectively, for their newly installed
3% and an action limit of 5%. linac.
Low and Dempsey43 in 2003 proposed the need for fairly Bailey et al.103 compared measured dose planes with cal-
broad tolerances. They reported that for typical clinical use at culations for 79 HN and 25 prostate IMRT fields. Passing
the time, the fraction of points that exceed 3% and 3 mm was rates were calculated using dose difference/DTA, c evalua-
often extensive, so they used 5% and 2–3 mm as c tolerance tion, and absolute dose comparison with both local and glo-
values for IMRT clinical evaluations. Childress et al.66 in bal normalization. They reported the passing rate spread for
2005 analyzed 850 films resulting from IMRT plan verifica- the individual prostate and HN fields with the greatest differ-
tion and reported a “preferred” c index tolerance criteria of ences observed between global and local normalization meth-
5% and 3 mm. ods. For 2%/2 mm and 3%/3 mm (10% dose threshold), the
TABLE III. IMRT verification QA confidence limits (CL), action limits (AL), tolerance limits (TL), and corresponding γ thresholds reported in the literature.

Author (year) | Delivery technique | Dosimeter | Number of irradiations | Reported/recommended tolerance levels
Palta 2003^35 | Fixed-gantry | Not specified | Results from an IMRT questionnaire of 30 institutions | CL and AL: 10%/2 mm and 15%/3 mm (high dose, steep gradient); CL and AL: 3% and 5% (high dose, low gradient); CL and AL: 4% and 7% (low dose, low gradient)
Low 2003^43 | Fixed-gantry | N/A | Simulated fields mimicking clinical fields | γ index tolerance criteria: 5%/2–3 mm
Childress 2005^66 | Fixed-gantry | Film | 858 fields | γ index tolerance criteria: 5%/3 mm
Stock 2005^134 | Fixed-gantry | Film, IC | 10 plans | γ index (3%/3 mm): γmean < 0.5, γmax < 1.5, and fraction of points with γ > 1: 0–5%
De Martin 2007^137 | Fixed-gantry | Film, IC | 57 HN plans | γ index (4%/3 mm): γD [γmean + 1.5 SD(γ)] < 1; γ threshold (4%/3 mm): percentage of points with γ < 1: > 95.3%; γ < 1.5: > 98.9%; γ > 2: < 0.4%
ESTRO 2008^38 | Fixed-gantry | IC | Not specified | TL: 3%; AL: 5%
Basran 2008^133 | Fixed-gantry | 2D diode array | 115 plans | TL: 3% overall, 5% per-field (independent of disease site); γ threshold (3%/3 mm): ≥95% (non-HN cases); γ threshold (3%/3 mm): ≥88% (HN cases)
Ezzell 2009^16 | Fixed-gantry and Tomotherapy | Film, IC, 2D diode array | 10 institutions, 5 from-easy-to-difficult cases per institution | CL: 4.5% (high-dose point in PTV); CL: 4.7% (low-dose point in OAR); CL: 12.4% (film composite), 87.6% passing (3%/3 mm); CL: 7% (per-field), 93.0% passing (3%/3 mm)
Carlone 2013^138 | Fixed-gantry | 2D diode array | 85 prostate plans (68 modified with random MLC errors) | γ threshold (2%/2 mm): 78.9% (σ = 3 mm), 84.6% (σ = 2 mm), 89.2% (σ = 1 mm); γ threshold (3%/3 mm): 92.9% (σ = 3 mm), 96.5% (σ = 2 mm), and 98.2% (σ = 1 mm)
For 2%/2 mm and 3%/3 mm (10% dose threshold), the prostate γ passing rates were 80.4% and 96.7% for global normalization and 66.3% and 90.8% for local normalization, respectively. On the other hand, the HN passing rates were 77.9% and 93.5% for global normalization and 50.5% and 70.6% for local normalization, respectively.
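The spread between the global and local results above follows from the normalization denominator alone. The sketch below is a simplified, dose-difference-only illustration (a hypothetical function; the DTA/γ search is deliberately omitted) of how the same data can yield different passing rates under global versus local normalization with a low-dose threshold.

```python
import numpy as np

def dose_diff_passing_rate(measured, calculated, percent=3.0,
                           global_norm=True, threshold=0.10):
    """Percentage of points whose dose difference is within `percent`.

    Global normalization divides every difference by the maximum calculated
    dose; local normalization divides by the calculated dose at each point.
    Points below `threshold` of the maximum calculated dose are excluded.
    """
    meas = np.asarray(measured, dtype=float)
    calc = np.asarray(calculated, dtype=float)
    keep = calc >= threshold * calc.max()            # low-dose threshold
    norm = calc.max() if global_norm else calc[keep]
    diff_pct = 100.0 * np.abs(meas[keep] - calc[keep]) / norm
    return 100.0 * np.mean(diff_pct <= percent)
```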
Carlone et al.138 investigated the use of receiver operating characteristic (ROC) methods in order to set tolerance limits for γ evaluations. They used a group of 17 prostate plans that was delivered as planned and a second group of 17 prostate plans that was modified by inducing random MLC position errors. The errors were normally distributed with σ = 0.5, 1.0, 2.0, and 3.0 mm. A total of 68 modified plans were created and evaluated using five different γ criteria (5%/5 mm, 4%/4 mm, 3%/3 mm, 2%/2 mm, 1%/1 mm). The dose threshold used during the γ evaluation process was not reported. All plans were delivered on a 2D detector array system with 7 mm detector spacing. The fraction of fields with a passing rate greater than a user-defined threshold ranging between 0% and 100% was plotted against the pass rate percentage. Plots were generated for each combination of the five γ criteria and four σ values. A total of 20 ROC curves were then generated by varying the pass rate threshold and computing, for each point, the fraction of failed modified plans and the fraction of passed unmodified plans. ROC evaluation was performed by quantifying the fraction of modified plans reported as "fail" and unmodified plans reported as "pass". Optimal tolerance limits were derived by determining which criteria maximized sensitivity and specificity. Specifically, an optimal threshold was identified as the point on the ROC curve closest to the point where sensitivity and specificity both equal 1.

While the γ criteria were able to achieve nearly 100% sensitivity/specificity in the detection of large random MLC errors (σ > 3 mm), sensitivity and specificity decreased for all γ criteria as the size of the error to be detected decreased below 2 mm. The optimal passing threshold values for 2%/2 mm were 78.9% (σ = 3 mm), 84.6% (σ = 2 mm), and 89.2% (σ = 1 mm). The optimal passing threshold values for 3%/3 mm were 92.9% (σ = 3 mm), 96.5% (σ = 2 mm), and 98.2% (σ = 1 mm). Based on the ROC analysis, Carlone et al. concluded that the predictive power of patient-specific QA was limited by the size of the error to be detected for the IMRT QA equipment used in their center.
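The threshold-selection step of such an ROC analysis can be sketched in a few lines, assuming per-plan γ passing rates are available for an unmodified (error-free) group and a modified (error-induced) group; the function name and the 0.1% scan step are illustrative assumptions.

```python
import numpy as np

def optimal_passing_threshold(unmodified_rates, modified_rates):
    """Choose the passing-rate threshold whose (sensitivity, specificity)
    point lies closest to (1, 1); plans below the threshold are called "fail"."""
    ok = np.asarray(unmodified_rates, dtype=float)
    bad = np.asarray(modified_rates, dtype=float)
    best_threshold, best_distance = 0.0, np.inf
    for t in np.arange(0.0, 100.1, 0.1):
        sensitivity = np.mean(bad < t)    # modified plans correctly reported as fail
        specificity = np.mean(ok >= t)    # unmodified plans correctly reported as pass
        distance = np.hypot(1.0 - sensitivity, 1.0 - specificity)
        if distance < best_distance:
            best_threshold, best_distance = t, distance
    return best_threshold
```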
Bresciani et al.106 evaluated the variability of local and global analysis for Tomotherapy plans using 3%/3 mm, 2%/2 mm, and 1%/1 mm criteria, each with both local and global normalization. They reported mean passing rates for local (global) normalization of 93% (98%) for 3%/3 mm, 84% (92%) for 2%/2 mm, and 66% (61%) for 1%/1 mm. They investigated the effect of excluding points below a 5% or 10% dose threshold and found that the choice between these thresholds did not affect the passing rate. They concluded that the variability in passing rates observed in their work showed the need to establish new agreement criteria that could be universal and comparable between institutions.

Pulliam et al. performed 2D and 3D γ analyses for 50 IMRT plans by comparing collapsed-cone convolution TPS (evaluated) and MC (reference) dose distributions.139 The analysis was performed using a variety of dose difference (5%, 3%, 2%, and 1%) and DTA (5, 3, 2, and 1 mm) acceptance criteria, low-dose thresholds (5%, 10%, and 15%), and grid sizes (1.0, 1.5, and 3.0 mm). A small difference between 2D and 3D γ passing rates of 0.8% for 3%/3 mm and 1.7% for 2%/2 mm was reported with no low-dose threshold and a 1 mm grid size. 3D γ analysis produced better agreement than the corresponding 2D analysis. The additional degree of searching increased the percentage of pixels passing γ by up to 2.9% in 3D analysis. The greatest difference between 2D and 3D γ results was caused by increasing the dose difference and DTA criteria.

6. VENDOR SURVEY AND ALGORITHM TESTING

In order to better understand the commercial implementation of IMRT QA γ analysis software, TG-218 contacted the vendors and provided them with a set of questions, displayed in Table IV, and test cases. The tests examined vendor implementations of the γ verification algorithm employing benchmark cases developed by TG-218.

TABLE IV. Vendor survey questionnaire on the implementation of IMRT QA γ analysis software.
1. Do you perform interpolation between points in the dose image? If so, to what resolution?
2. Do you resample one or both images for the γ analysis? If so, on what basis and to what resolution?
3. Which image is considered the reference image for the γ analysis, plan or measured? Is this user selectable?
4. Can you use an acquired and plan dose image that are each in standard DICOM RT format?
5. What search radius do you use? Is it user selectable?
6. Do you offer both relative and absolute dose modes?
7. Is the dose tolerance used in the γ analysis referred to the local dose, the maximum dose, or other? Is that user selectable?
8. Do you specify the dose threshold value above which the analysis will take place? If so, what is the dose threshold? Is this value user selectable?
9. Do you offer plan-to-acquired-dose image auto-registration? Manual registration? Do you assume the center of each image is the point in common?
10. For relative mode, how do you normalize the acquired and plan dose images:
• at maximum?
• to an area?
• to a user selectable point?
• other?
11. Do you perform % dose difference/DTA (Van Dyk analysis)?
12. If so, how do you normalize the acquired and plan dose images:
• at the maximum of the reference image?
• to an area?
• to a user selectable point?
• other?
TABLE V. Vendor^a responses to the questionnaire on IMRT QA γ analysis software listed in Table IV.^b

QA question | SNC Patient | 3DVH | Portal Dosimetry | RIT 113 | IMSure | Delta4 | VeriSoft | Compass | OmniPro-IMRT
1. Perform interpolation between points: Yes Yes No No Yes Yes No Yes Yes
2. Resample one or both images for γ analysis: No No Yes Yes Yes Yes Yes Yes Yes
3. Reference image is plan or measurement: Measured Plan Plan Both Plan Measured Measured Plan Both
   User selectable: No No No Yes No No No No Yes
4. Can use acquired and plan dose images in DICOM RT: Yes Yes Yes Yes No Yes Yes Yes Yes
5. DTA search radius (mm): 8 5 10 30 2.5×DTA 3
   User selectable: No No No Yes No No Yes No Yes
6. Offer both relative and absolute dose modes: Yes No Yes Yes No Yes Yes Yes Yes
7. Dose tolerance part of γ analysis user selectable: Yes Yes Yes Yes Yes Yes Yes Yes Yes
   Local dose: Yes Yes Yes Yes Yes Yes Yes Yes Yes
   Max dose: Yes Yes Yes Yes Yes Yes Yes Yes Yes
8. Dose threshold above which γ analysis occurs: 0–100% 0–100% 0–100% 0–100% 0–100% 0–30% 10 cGy 0–100%
   User selectable: Yes Yes Yes Yes Yes Yes Yes Yes Yes
9. Registration between plan and measurement: Yes Yes Yes Yes No Yes Yes Yes Yes
   Auto registration: Yes Yes Yes Yes Yes Yes Yes No
   Manual registration: Yes No Yes Yes Yes Yes No Yes
   Assume center of each image as common point: Yes Yes No No No Yes No
10. Relative mode, normalize plan/measurement to: Yes No Yes Yes Yes Yes Yes Yes Yes
    At maximum: Yes No Yes Yes Yes Yes Yes Yes Yes
    To an area: No No No Yes Yes Yes Yes No Yes
    User selectable point: Yes No Yes Yes Yes Yes Yes Yes
    Others: Yes
11. Perform % dose difference/DTA (Van Dyk analysis): Yes Yes No Yes No Yes No Yes Yes
    Normalize to maximum of reference image: Yes Yes Yes Yes Yes
    Normalize to an area: No No Yes Yes Yes
    Normalize to a user selectable point: Yes No Yes Yes Yes Yes Yes
    Others: Yes

^a Vendors: 3DVH and SNC Patient for MapCHECK and ArcCHECK (Sun Nuclear Corporation, Melbourne, FL, USA), Portal Dosimetry with EPID (Varian Medical Systems, Palo Alto, CA, USA), RIT 113 (Radiological Imaging Technology, Inc., Colorado Springs, CO, USA), IMSure (Standard Imaging Inc., Middleton, WI, USA), Delta4 (ScandiDos, Uppsala, Sweden), VeriSoft with Seven29 2D array (PTW, Freiburg, Germany), COMPASS and OmniPro-IMRT with MatriXX (IBA Dosimetry, Schwarzenbruck, Germany).
^b Survey was conducted in 2014.
Reference and evaluated dose distributions were given to test the basic functionality of their dose comparison tools, including the dose difference and γ tools.

There were two types of distributions provided: (a) mathematically defined distributions and (b) distributions based on a clinical treatment plan. In both cases, the distributions were 2D with 0.5 mm resolution. The mathematically defined dose distribution for a circular field contained three distinct regions: first, a high and homogeneous dose area in the central region set to 200 cGy with flat gradient; second, a linear dose gradient (50%/cm) next to the central region; and third, a homogeneous low-dose region set to 8 cGy surrounding the high dose and linear gradient regions. The evaluated dose distributions for the mathematically defined distribution were the reference distributions perturbed by modifying either the radiation dose gradient or the dose levels. Figure 7 shows examples of the reference and evaluated dose distributions. Also tested was a dose distribution acquired from a clinical IMRT treatment plan and the measured 2D film dose shown in Fig. 8.

The method by Ju et al.55 was used as the benchmark (gold standard) for the commercial calculation evaluation. The percentage of points passing dose difference and DTA tolerances of all combinations of 2%, 3%, 2 mm, and 3 mm was utilized. For the mathematically defined plans, two low-dose thresholds of 4% and 5.5% were also utilized. The dose threshold to exclude low-dose areas in the clinical plans was set to zero.
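For readers who want to reproduce this kind of benchmark check, the brute-force sketch below shows the essence of the dose-difference/DTA (γ) combination in 1D with global normalization and an exhaustive search; the function and its defaults are illustrative assumptions, and commercial implementations add interpolation, resampling, and 2D/3D searching.

```python
import numpy as np

def gamma_passing_rate_1d(ref_dose, eval_dose, spacing_mm,
                          dose_pct=3.0, dta_mm=2.0, low_dose_threshold=0.10):
    """Brute-force 1D gamma passing rate with global dose normalization."""
    ref = np.asarray(ref_dose, dtype=float)
    evl = np.asarray(eval_dose, dtype=float)
    dose_tol = dose_pct / 100.0 * ref.max()        # global normalization
    x = np.arange(ref.size) * spacing_mm
    passed, evaluated = 0, 0
    for xi, di in zip(x, ref):
        if di < low_dose_threshold * ref.max():    # exclude low-dose points
            continue
        evaluated += 1
        dose_term = (evl - di) / dose_tol
        dist_term = (x - xi) / dta_mm
        gamma = np.sqrt(dose_term**2 + dist_term**2).min()
        if gamma <= 1.0:
            passed += 1
    return 100.0 * passed / max(evaluated, 1)
```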
FIG. 7. Dose distributions for the mathematical case (circular field) sent to vendors to test γ calculations. (a) Reference dose distribution (resolution 0.5 mm). (b) Evaluated dose distribution (resolution 0.5 mm). (c) Overlaid dose profiles showing the differences between the two distributions.
FIG. 8. Dose distributions from a clinical IMRT plan sent to vendors to test γ calculations. (a) TPS-calculated dose distribution (resolution 0.5 mm). (b) Film measurement (resolution 0.5 mm).
Five vendors responded with results. Table VI shows the relative γ passing rates for these vendors for the mathematically defined plans. Table VII shows the relative passing rates for the clinical plan shown in Fig. 8. For the mathematical and clinical tests, some vendors used the reference image as evaluated and the evaluated image as reference due to their software design. As these data show, the passing rates for the mathematically defined tests were calculated within 0.1% for two vendors, and most other vendors were within 6%. Vendor E had consistently greater passing rates than the other vendors or the gold standard for the mathematical tests. The clinical plan showed more variation than the mathematically defined tests. Comparing the film dose image to the TPS dose, agreement with the gold standard was within 6% across all vendors.

These results indicate that the vendors are not using a standardized approach to implementing the dose comparison tests. Given that the mathematically defined tests showed excellent albeit not perfect agreement, the discrepancies in the clinical case are likely due to the methods used to align the doses or the handling of image resolutions. This highlights the fact that the user should understand how their vendor has implemented the algorithm and should run benchmark test cases against their algorithm to evaluate its accuracy.
TABLES VI AND VII. γ passing rates (%) for the benchmark cases: dose-difference criterion (%), DTA (mm), and low-dose threshold (%), followed by the benchmark (gold standard) passing rate and, for each of the five responding vendors (A–E), the vendor passing rate and its percent difference relative to the benchmark. Rows with 4% and 5.5% thresholds correspond to the mathematically defined case (Table VI); rows with a 0% threshold correspond to the clinical case (Table VII).

% diff | DTA (mm) | Threshold (%) | Benchmark | A | Diff. | B | Diff. | C | Diff. | D | Diff. | E | Diff.
3 | 3 | 4 | 95.1 | 95.1 | 0.0 | 95.3 | 0.2 | 95.1 | 0.0 | 95.1 | 0.0 | 96.4 | 1.4
3 | 2 | 4 | 90.2 | 90.2 | 0.0 | 89.8 | 0.4 | 90.2 | 0.0 | 90.2 | 0.0 | 91.4 | 1.3
2 | 3 | 4 | 93.4 | 93.4 | 0.0 | 93.7 | 0.3 | 93.4 | 0.0 | 93.4 | 0.0 | 94.7 | 1.4
2 | 2 | 4 | 88.3 | 88.3 | 0.0 | 88.2 | 0.1 | 88.3 | 0.0 | 88.3 | 0.0 | 89.6 | 1.5
3 | 3 | 5.5 | 81.9 | 81.9 | 0.0 | 81.6 | 0.4 | 80.4 | 1.8 | 81.9 | 0.0 | 85.8 | 4.8
3 | 2 | 5.5 | 63.1 | 63.1 | 0.0 | 59.8 | 5.2 | 61.0 | 3.3 | 63.1 | 0.1 | 65.8 | 4.3
2 | 3 | 5.5 | 75.3 | 75.3 | 0.0 | 75.1 | 0.3 | 73.9 | 1.9 | 75.2 | 0.1 | 79.0 | 4.9
2 | 2 | 5.5 | 55.4 | 55.4 | 0.0 | 53.6 | 3.3 | 53.4 | 3.6 | 55.4 | 0.1 | 58.5 | 5.6
3 | 3 | 0 | 98.7 | 98.5 | 0.2 | 96.7 | 2.0 | 98.5 | 0.2 | 98.5 | 0.2 | 97.0 | 1.7
3 | 2 | 0 | 97.1 | 96.5 | 0.6 | 94.5 | 2.7 | 96.5 | 0.6 | 96.5 | 0.6 | 95.4 | 1.8
2 | 3 | 0 | 96.6 | 96.1 | 0.5 | 91.6 | 5.2 | 96.1 | 0.5 | 96.1 | 0.5 | 92.3 | 4.5
2 | 2 | 0 | 93.0 | 92.1 | 1.0 | 87.2 | 6.2 | 92.1 | 1.0 | 92.1 | 1.0 | 88.7 | 4.6
7. PROCESS-BASED TOLERANCE AND ACTION LIMITS

Although not explicitly mentioned in Section 1.C., there is a human contribution to every IMRT QA measurement that is a source of variation. Another source of variation is the complexity of each IMRT case, for example, intensity modulation differences between head and neck and prostate IMRT cases. A process view of IMRT QA includes all sources of variation mentioned in Section 1.C. as well as the human and case-specific issues. Accounting for all aspects of variation in IMRT QA can be achieved by setting process-based tolerance and action limits.

Action limits should set a minimum level of process performance such that IMRT QA measurements outside the action limits could result in a negative clinical impact for the patient. Tolerance limits refer to the range within which the IMRT QA process is considered to be unchanging. An out-of-control process serves as a warning that the process might be changing. If an IMRT QA measurement is outside the tolerance limits but within the action limits, it is left up to the medical physicist to determine whether or not action should be taken.

Action limits come in two categories: (a) those that are universally defined and guided by outcomes data and expert consensus, and (b) those that are locally defined and guided by local experience. For any QA measurement, it is desirable to use universal action limits as these should be directly correlated with treatment outcome. This implies that there is some clinical evidence, or at least consensus agreement among experts perhaps guided by summary statistics of retrospective data, to inform the choice of those action limits. An example of universally defined action limits are those for treatment machine output because there is a direct correspondence between treatment outcome and output. Exceeding action limits that are locally defined does not necessarily result in harm to a patient, but in the interest of good patient care, it is deemed best to keep process performance within those limits. Patient-specific IMRT QA is an example of action limits being set in this fashion. Locally defined action limits may vary from institution to institution or case-type to case-type since those limits are based on local equipment, processes, and case types as well as the experience of the local physicist.

Using methods from statistical process control,140–142 IMRT QA measurements can be used to determine action limits when universal action limits are not appropriate. Action limits determined in this fashion can be procedure-, equipment-, and site-specific for each individual institution and are calculated using the following equation,45

$\Delta_A = \beta \sqrt{\sigma^2 + (\bar{x} - T)^2}$  (3)

where ΔA is the difference between the upper and lower action limits, typically written as ±ΔA/2. T is the process target value, and σ² and x̄ are the process variance and process mean, respectively. The constant β is a combination of two factors. One factor originates from the process capability metric, Cpm, as a cutoff for an acceptably performing process,143 and is combined with another factor that balances type I errors (rejecting the null hypothesis when it is true) and type II errors (not rejecting the null hypothesis when it is false) when using an IMRT QA measurement to make a decision about process performance. In using IMRT QA measurements to make decisions about process performance, the null hypothesis is that the process is unchanging. Current information suggests that β = 6.0 is an appropriate value to use,45 although this may be refined upon further research. Using Eq. (3) will likely result in action limits that are wider than what is currently accepted but should allow medical physicists to focus on problems with patient-specific IMRT QA that are likely to have identifiable causes. If the target, T, is known, as in the example of patient-specific IMRT QA point-dose difference (i.e., 0%) or γ passing rate (i.e., 100%), then the known target value should be used. If the target value is unknown or not defined, then the process average can be used as a best estimate of the target. This latter approach will have the effect of tightening the action limits compared to the former approach.
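As a minimal illustration of Eq. (3) for a γ passing-rate process, with the target T = 100% and β = 6.0 as suggested above (the function name, rounding, and example numbers are illustrative assumptions):

```python
import math

def passing_rate_action_limit(mean_pct, sd_pct, target=100.0, beta=6.0):
    """Eq. (3): Delta_A = beta * sqrt(sigma**2 + (mean - T)**2).

    For a passing rate the upper limit is capped at 100%, so the usable
    (lower) action limit is T - Delta_A / 2.
    """
    delta_a = beta * math.sqrt(sd_pct ** 2 + (mean_pct - target) ** 2)
    return target - delta_a / 2.0

# Hypothetical process with mean passing rate 97.0% and SD 2.0%
print(round(passing_rate_action_limit(97.0, 2.0), 1))   # about 89.2
```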
In this procedure, the process average, x̄, and variance, σ², are calculated from the IMRT QA measurements over a time period when the process does not display out-of-control behavior. If the process is out-of-control, then one must identify and remove the reason for the out-of-control process behavior and continue monitoring the process until it displays a degree of control for about an additional 20 IMRT QA measurements. Then, control chart limits from an I-chart of individual IMRT QA measurements are used as the tolerance limits. The I-chart is a statistical tool that helps identify any IMRT QA measurement that displays abnormal (out-of-control) process behavior. The I-chart has upper and lower limit lines (called control limits) and a center line that are calculated using the IMRT QA measurements.140–142 Out-of-control process behavior is indicated when any one IMRT QA measurement is outside the upper or lower control limits on the I-chart. The IMRT QA measurements should be somewhat equally distributed above and below the center line. The center line, upper control limit, and lower control limit for an I-chart are calculated using the following equations:

$\text{center line} = \frac{1}{n}\sum_{i=1}^{n} x_i$  (4)

$\text{upper control limit} = \text{center line} + 2.660\, m_R$  (5)

$\text{lower control limit} = \text{center line} - 2.660\, m_R$  (6)

where $x_i$ is an individual IMRT QA measurement, n is the total number of measurements, and $m_R = \frac{1}{n-1}\sum_{i=2}^{n} \lvert x_i - x_{i-1} \rvert$ is the moving range.

In this procedure, the control limits are used as the tolerance limits. Establishing process control is a key element of this procedure because a controlled process is an indication that the process is stable and suitable for the purpose of IMRT QA. Using the proposed procedure requires a different view of QA such that measurements provide a description of the entire process (people + equipment + procedures) and not just the hardware and software equipment by itself. It is important to note that tolerance limits will depend on plan complexity due to a greater case-to-case measurement variability. Therefore, it can be good practice to calculate tolerance limits separately for cases with high plan complexity and those with lower plan complexity, for example, for head and neck IMRT QA compared to prostate IMRT QA.

Two examples of process-based tolerance and action limits are provided here to illustrate the procedure described in this section. The processes are VMAT and fixed-gantry IMRT QA, and the γ passing rates with 3%/2 mm and a 10% dose threshold are used as the QA measure using an ArcCHECK and MapCHECK device, respectively. Considering the VMAT QA example with 20 γ passing rates with an average of 96.66%, standard deviation of 1.739%, and moving range of 1.905%, the action limit range following Eq. (3) is 22.6%, which translates to an action limit of 100% − 22.6%/2 = 88.7% (note that the γ passing rate is bounded at 100% for the upper limit). As long as the process is not out-of-control, the control chart limit will be used as the tolerance limit, calculated using Eq. (6) as 96.66% − 2.660 × 1.905% = 91.6% for this example. For the fixed-gantry IMRT QA example, with 20 γ passing rates with an average of 95.92%, standard deviation of 3.388%, and moving range of 3.642%, the action limit is 84.1% and the tolerance limit is 86.2%. In the case of γ passing rates, the upper tolerance (control) limit and action limit are bounded and equal to 100%.
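The sketch below (an illustrative helper, not a prescribed tool) computes the I-chart quantities of Eqs. (4)–(6) from a list of passing rates; with the VMAT summary statistics quoted above (center line 96.66%, moving range 1.905%), the lower control limit evaluates to 96.66 − 2.660 × 1.905 ≈ 91.6%, matching the tolerance limit in the example.

```python
def i_chart_limits(passing_rates):
    """Center line and control limits for an individuals (I) chart, Eqs. (4)-(6)."""
    n = len(passing_rates)
    center = sum(passing_rates) / n
    # Moving range: mean absolute difference between consecutive measurements
    m_r = sum(abs(passing_rates[i] - passing_rates[i - 1])
              for i in range(1, n)) / (n - 1)
    upper = min(center + 2.660 * m_r, 100.0)   # passing rates are capped at 100%
    lower = center - 2.660 * m_r
    return center, lower, upper
```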
The last step in the procedure is to compare the tolerance limits to the action limits. For example, if the γ tolerance limits are lower than the action limits, then either the process needs to be fixed or the action limits lowered (i.e., use a larger value of β in Eq. (3)). Fixing the process may require new or modified equipment or training of personnel performing the IMRT QA measurements and analysis. Using this standardized procedure for setting action and tolerance limits will allow medical physicists to compare IMRT QA processes across institutions. The full procedure is summarized in Fig. 9.
8. RECOMMENDATIONS

8.A. IMRT QA, tolerance limits, and action limits

Published data on IMRT QA results and the clinical experience of the group members were used to develop guidelines and provide recommendations on universal tolerance and action limits for IMRT verification QA using the γ method. This included an in-depth literature review of IMRT QA results, analysis of widely used IMRT QA delivery and evaluation methods, and operational details that can improve the effectiveness and accuracy of the γ method. End-to-end QA verification tests for the IMRT TPS and IMRT delivery equipment, along with patient-specific verification QA, are required to evaluate the accuracy of radiation delivery to patients. Tolerances, action limits, and pass/fail criteria should be defined to evaluate the acceptability of IMRT QA verification plans.
FIG. 9. Flow chart outlines the procedure for setting tolerance and action limits for IMRT QA.
We recommend the following terminology as it pertains to IMRT QA delivery methods (see Section 3).

• Perpendicular field-by-field (PFF): the radiation beam is perpendicular to the plane of the measurement device. The device can be placed on the couch or attached to the gantry head. The dose from each of the IMRT beams is delivered and analyzed.
• Perpendicular composite (PC): the radiation beam is always perpendicular to the measurement device detector plane. The device can be placed on the couch or attached to the gantry head. The doses from all IMRT radiation beams are delivered and subsequently summed.
• True composite (TC): all of the radiation beams are delivered to a stationary measurement device in a phantom placed on the couch using the actual treatment beam geometry for the patient, including MUs, gantry, collimator, couch angles, jaws, and MLC leaf positions. This method most closely simulates the treatment delivery to the patient.

We make the following recommendations for IMRT QA verification of dose distributions (fixed-gantry IMRT and rotational IMRT):

• IMRT QA measurements should be performed using the TC delivery method provided that the QA device has negligible angular dependence or the angular dependence is accurately accounted for in the vendor software.
• IMRT QA measurements should be performed using the PFF delivery method if the QA device is not suitable for TC measurements, or for TC verification error analysis.
• IMRT QA measurements should not be performed using the PC delivery method, which is prone to masking delivery errors.
• Analysis of IMRT QA measurements and the corresponding treatment plan should be performed in absolute dose mode, not relative dose (the user should not normalize the dose to a point or region, i.e., relative dose mode).
• A dose calibration measurement compared against a standard dose should be performed before each measurement session to factor the variation of the detector response and accelerator output into the IMRT QA measurement.
• Global normalization should be used. Global normalization is deemed more clinically relevant than local normalization. The global normalization point should be selected whenever possible in a low gradient region with a value that is ≥ 90% of the maximum dose in the plane of measurement. This will provide a more realistic measure of the comparison between the two dose distributions.
• Local normalization is more stringent than global normalization for routine IMRT QA. It can be used during the IMRT commissioning process and for troubleshooting IMRT QA.
• The dose threshold should be set to exclude low-dose areas that have no or little clinical relevance but can significantly bias the analysis. An example is setting the threshold to 10% in a case where the critical structure dose tolerance exceeds 10% of the prescription dose. This allows the γ passing rate analysis to ignore the large area or volume of dose points that lie in very low-dose regions which, if included, would tend to increase the passing rate when global normalization is used.

Tolerance and action limits (terms were defined in Section 1.C.) are the foundation for a robust IMRT QA verification process. We make the following recommendations regarding tolerance limits and action limits for evaluating the IMRT QA analysis, including measurements. The limits are the same for PFF and TC delivery methods, and assume the tolerance and action limits are coincident with the goals of the treatment plan. If they are not, for example in stereotactic radiosurgery (SRS) and stereotactic body radiotherapy (SBRT) cases, tighter tolerances should be considered. The following recommendations are for γ analysis using global normalization in absolute dose:

• Universal tolerance limits: the γ passing rate should be ≥ 95%, with 3%/2 mm and a 10% dose threshold.
• Universal action limits: the γ passing rate should be ≥ 90%, with 3%/2 mm and a 10% dose threshold (a minimal decision sketch follows this list).
  ○ If the plan fails this action limit, evaluate the γ failure distribution and determine if the failed points lie in regions where the dose differences are clinically irrelevant, in which case the plan may be clinically acceptable. If the γ failure points are distributed throughout the target or critical structures and are at dose levels that are clinically relevant, the plan should not be used and the medical physicist should follow the steps outlined in section (b) below. It may be necessary to review results with a different detector or different measurement geometry. For example, if the failure is seen with the TC delivery, a PFF analysis can be valuable to further explore the discrepancies between calculations and measurements.
• Equipment- and site-specific limits can be set following the method described in Section 7.
  ○ If action limits are determined that are significantly lower than the universal action limits recommended above, then action should be taken as outlined in section (b) below to improve the IMRT QA process. From a process perspective, strict adherence to standardized procedures and equipment as well as additional training may also be necessary.
• Tighter criteria should be used, such as 2%/1 mm or 1%/1 mm, to detect subtle regional errors and to discern if the errors are systematic for a specific treatment site or delivery machine.
• For IMRT QA performed with an IC and film, tolerance and action limits for the ion chamber measurement should be within ≤ 2% and ≤ 3%, respectively, and the film γ passing rate limits should be assessed as specified above. An IMRT treatment plan should not be used if the chamber measurement error or the γ passing rate exceeds the universal action limits.
• For any case with a γ passing rate less than 100%,
  ○ the γ distribution should be carefully reviewed rather than relying only on distilled statistical evaluations,
  ○ review of γ results should not be limited to only the percentage of points that fail, but should include other relevant γ values (maximum, mean, minimum, median), as well as a histogram analysis,
  ○ an analysis of the maximum γ value and the percentage of points that exceed a γ value of 1.5 should be performed. For 3%/2 mm criteria, a γ value of 1.5 could indicate a dose difference of 4.5% in a low dose gradient region or a DTA of ~3.0 mm in a steep dose gradient region. Both of these are examples of failures, but failures that exceed tolerances by 1.5% and 1 mm in the low and steep dose gradient regions, respectively. Such information should be used to deduce clinical relevance whenever possible (e.g., a cluster of failing points near or at the boundary of a tumor and critical structure).
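As a compact restatement of the universal limits above, the illustrative sketch below classifies a single QA result evaluated with 3%/2 mm, global normalization, absolute dose, and a 10% dose threshold; the function and return labels are hypothetical and are not part of the report's formal recommendations.

```python
def classify_qa_result(gamma_passing_rate_pct):
    """Apply the universal tolerance (95%) and action (90%) limits."""
    if gamma_passing_rate_pct >= 95.0:
        return "within tolerance"
    if gamma_passing_rate_pct >= 90.0:
        return "outside tolerance, within action limit: physicist judgment"
    return "below action limit: review gamma distribution before clinical use"
```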
The accuracy of IMRT delivery can be affected by differences and limitations in the design of the MLC and accelerators among the different manufacturers, including the treatment head design, as well as the age of the accelerator/equipment. In addition, IMRT dosimetry QA equipment design, tumor sites (e.g., HN vs. prostate), complexity of the IMRT plans, and uncertainties, inaccuracies, and tolerances in the planning, delivery, and measurement may affect the IMRT QA verification results. For centers with IMRT QA results that are unable to meet the tolerance and action limit values recommended in this report, the centers should perform a comprehensive analysis to determine the sources for these differences and remedy them. For example, the use of statistical process control methods can be useful in identifying the outlier cases failing to pass the tolerance limits for in-depth analysis.44,45 Also, it can be helpful to perform the TG-119 recommended tests and then compare to the published results or conduct independent tests using the IROC Houston IMRT phantoms. The recommendations and guidelines provided in this report can be applied for any modulated treatment fields regardless of the system used to generate them. Finally, future research efforts should be focused on further improving the correlation between IMRT QA evaluation metrics and underlying planning or delivery errors.
8. Wang X, Spirou S, LoSasso T, Stein J, Chui CS, Mohan R. Dosi- 33. Ibbott GS, Followill DS, Molineu HA, Lowenstein JR, Alvarez PE,
metric verification of intensity-modulated fields. Med Phys. Roll JE. Challenges in credentialing institutions and participants in
1996;23:317. advanced technology multi-institutional clinical trials. Int J Radiat
9. Chui CS, Spirou S, LoSasso T. Testing of dynamic multileaf collima- Oncol Biol Phys. 2008;71:S71–S75.
tion. Med Phys. 1996;23:635. 34. Palta JR, Deye JA, Ibbott GS, Purdy JA, Urie MM. Credentialing of
10. Low DA, Chao K, Mutic S, Gerber RL, Perez CA, Purdy JA. “Quality institutions for IMRT in clinical trials. Int J Radiat Oncol Biol Phys.
assurance of serial tomotherapy for head and neck patient treatments. 2004;59:1257–1259.
Int J Radiat Oncol Biol Phys. 1998;42:681–692. 35. Palta JR, Kim S, Li J, Liu C. Tolerance limits and action levels for
11. Ling CC, Zhang P, Archambault Y, Bocanek J, Tang G, Losasso T. planning and delivery of IMRT. In: Palta JR, Mackie TR, eds. Inten-
Commissioning and quality assurance of RapidArc radiotherapy deliv- sity-Modulated Radiation Therapy: The State of Art. Madison: Medical
ery system. Int J Radiat Oncol Biol Phys. 2008;72:575–581. Physics Publishing; 2003:593–612.
12. Kaurin DG, Sweeney LE, Marshall EI, Mahendra S. VMAT testing for 36. Das IJ, Cheng CW, Chopra KL, Mitra RK, Srivastava SP, Glatstein E.
an Elekta accelerator. J Appl Clin Med Phys. 2012;13:3725. Intensity-modulated radiation therapy dose prescription, recording, and
13. Otto K. Volumetric modulated arc therapy: IMRT in a single gantry delivery: patterns of variability among institutions and treatment plan-
arc. Med Phys. 2008;35:310–317. ning systems. J Natl Cancer Inst. 2008;100:300–307.
14. Xing L, Chen Y, Luxton G, Li J, Boyer A. Monitor unit calculation for 37. LoSasso T, Chui C-S, Ling CC. Comprehensive quality assurance for
an intensity modulated photon field by a simple scatter-summation the delivery of intensity modulated radiotherapy with a multileaf colli-
algorithm. Phys Med Biol. 2000;45:N1. mator used in the dynamic mode. Med Phys. 2001;28:2209–2219.
15. Ezzell GA, Galvin JM, Low D, et al. Guidance document on delivery, 38. Alber M, Broggi C, De Wagter C, et al. Guidelines for the verification
treatment planning, and clinical implementation of IMRT: report of the of IMRT. ESTRO booklet, 2008.
IMRT subcommittee of the AAPM radiation therapy committee. Med 39. Childress NL, Bloch C, White RA, Salehpour M, Rosen II. Detection
Phys. 2003;30:2089–2115. of IMRT delivery errors using a quantitative 2D dosimetric verification
16. Ezzell GA, Burmeister JW, Dogan N, et al. IMRT commissioning: mul- system. Med Phys. 2005;32:153–162.
tiple institution planning and dosimetry comparisons, a report from 40. Chuang CF, Verhey LJ, Xia P. Investigation of the use of MOSFET for
AAPM Task Group 119. Med Phys. 2009;36:5359–5373. clinical IMRT dosimetric verification. Med Phys. 2002;29:1109–1115.
17. Low DA, Moran JM, Dempsey JF, Dong L, Oldham M. Dosimetry 41. Godart J, Korevaar EW, Visser R, Wauben DJ, Van’t Veld AA. Recon-
tools and techniques for IMRT. Med Phys. 2011;38:1313–1338. struction of high-resolution 3D dose from matrix measurements: error
18. Moran JM, Dempsey M, Eisbruch A, et al. Safety considerations for detection capability of the COMPASS correction kernel method. Phys
IMRT: executive summary. Med Phys. 2011;38:5067. Med Biol 2011;56:5029–5043.
19. Pawlicki T, Yoo S, Court LE, et al. Moving from IMRT QA measure- 42. Han Z, Ng SK, Bhagwat MS, Lyatskaya Y, Zygmanski P. Evaluation of
ments toward independent computer calculations using control charts. MatriXX for IMRT and VMAT dose verifications in peripheral dose
Radiother Oncol. 2008;89:330–337. regions. Med Phys. 2010;37:3704–3714.
20. Fan J, Li J, Chen L, et al. A practical Monte Carlo MU verification tool 43. Low DA, Dempsey JF. Evaluation of the gamma dose distribution com-
for IMRT quality assurance. Phys Med Biol. 2006;51:2503–2514. parison method. Med Phys. 2003;30:2455–2464.
21. Leal A, Sanchez-Doblado F, Arrans R, Rosello J, Pavon EC, Lagares 44. Pawlicki T, Yoo S, Court LE, et al. Process control analysis of
JI. Routine IMRT verification by means of an automated Monte Carlo IMRT QA: implications for clinical trials. Phys Med Biol.
simulation system. Int J Radiat Oncol Biol Phys. 2003;56:58–68. 2008;53:5193–5205.
22. Agnew A, Agnew CE, Grattan MWD, Hounsell AR, McGarry CK. 45. Sanghangthum T, Suriyapee S, Kim G-Y, Pawlicki T. A method of set-
Monitoring daily MLC positional errors using trajectory log files and ting limits for the purpose of quality assurance. Phys Med Biol.
EPID measurements for IMRT and VMAT deliveries. Phys Med Biol. 2013;58:7025–7037.
2014;59:N49–N63. 46. Brualla-Gonzalez L, Gomez F, Vicedo A, et al. A two-dimensional liq-
23. Rangaraj D, Zhu M, Yang D, et al. Catching errors with patient-speci- uid-filled ionization chamber array prototype for small-field verifica-
fic pretreatment machine log file analysis. Pract Radiat Oncol. tion: characterization and first clinical tests. Phys Med Biol.
2013;3:80–90. 2012;57:5221–5234.
24. Stell AM, Li JG, Zeidan OA, Dempsey JF. An extensive log-file analy- 47. Duan J, Shen S, Fiveash JB, Brezovich IA, Popple RA, Pareek PN.
sis of step-and-shoot intensity modulated radiation therapy segment Dosimetric effect of respiration-gated beam on IMRT delivery. Med
delivery errors. Med Phys. 2004;31:1593–1602. Phys. 2003;30:2241–2252.
25. Hartford AC, Galvin JM, Beyer DC, et al. American College of Radiol- 48. Nelms BE, Zhen H, Tome WA. Per-beam, planar IMRT QA passing
ogy (ACR) and American Society for Radiation Oncology (ASTRO) rates do not predict clinically relevant patient dose errors. Med Phys.
practice guideline for intensity-modulated radiation therapy (IMRT). 2011;38:1037.
Am J Clin Oncol. 2012;35:612–617. 49. Bogner L, Scherer J, Treutwein M, Hartmann M, Gum F, Amediek A.
26. Bogdanich W. Radiation Offers New Cures, and Ways to Do Harm. Verification of IMRT: techniques and Problems. Strahlenther Onkol.
New York: The New York Times, 2010. 2004;180:340–350.
27. Bogdanich W. As Technology Surges, Radiation Safeguards Lag. New 50. Harms WB Sr, Low DA, Wong JW, Purdy JA. A software tool for the
York: The New York Times, 2010. quantitative evaluation of 3D dose calculation algorithms. Med Phys.
28. Smith JC, Dieterich S, Orton CG. It is STILL necessary to validate 1998;25:1830–1836.
each individual IMRT treatment plan with dosimetric measurements 51. Low DA, Harms WB, Mutic S, Purdy JA. A technique for the quantita-
before delivery. Med Phys. 2011;38:553–555. tive evaluation of dose distributions. Med Phys. 1998;25:656–661.
29. Siochi RAC, Molineu A, Orton CG. Patient-specific QA for IMRT 52. Moran JM, Radawski J, Fraass BA. A dose-gradient analysis tool for
should be performed using software rather than hardware methods. IMRT QA. J Appl Clin Med Phys. 2005;6:62–73.
Med Phys. 2013;40:0706011–0706013. 53. Bakai A, Alber M, Nusslin F. A revision of the gamma-evaluation con-
30. Kruse JJ, Mayo CS. Comment on “Catching errors with patient-speci- cept for the comparison of dose distributions. Phys Med Biol.
fic pretreatment machine log file analysis”. Pract Radiat Oncol. 2003;48:3543–3553.
2012;3:91–92. 54. Childress NL, Rosen II. The design and testing of novel clinical param-
31. Fraass B, Doppke K, Hunt M, et al. American Association of Physicists eters for dose comparison. Int J Radiat Oncol Biol Phys 2003;56:1464–
in Medicine Radiation Therapy Committee Task Group 53: quality 1479.
assurance for clinical radiotherapy treatment planning. Med Phys. 55. Ju T, Simpson T, Deasy JO, Low DA. Geometric interpretation of the
1998;25:1773–1829. gamma dose distribution comparison technique: interpolation-free cal-
32. Van Dyk J, Barnett RB, Cygler JE, Shragge PC. Commissioning and culation. Med Phys. 2008;35:879–887.
quality assurance of treatment planning computers. Int J Radiat Oncol 56. Palta JR, Liu C, Li JG. “Quality assurance of intensity-modulated radi-
Biol Phys. 1993;26:261–273. ation therapy. Int J Radiat Oncol Biol Phys. 2008;71:S108–S112.
57. Nelms BE, Simon JA. A survey on planar IMRT QA analysis. J Appl gel phantom and 3D magnetic resonance dosimetry for verification of
Clin Med Phys. 2007;8:2448. IMRT treatment plans. Phys Med Biol. 2002;47:N67–N77.
58. Childress NL, Salehpour M, Dong L, Bloch C, White RA, Rosen II. 81. Gorjiara T, Hill R, Kuncic Z, et al. Investigation of radiological proper-
Dosimetric accuracy of Kodak EDR2 film for IMRT verifications. Med ties and water equivalency of PRESAGEâ dosimeters. Med Phys.
Phys. 2005;32:539–548. 2011;38:2265–2274.
59. Olch AJ. Dosimetric performance of an enhanced dose range radio- 82. Sakhalkar H, Sterling D, Adamovics J, Ibbott G, Oldham M. Investiga-
graphic film for intensity-modulated radiation therapy quality assur- tion of the feasibility of relative 3D dosimetry in the Radiologic Physics
ance. Med Phys. 2002;29:2159. Center Head and Neck IMRT phantom using Presage/optical-CT. Med
60. Dogan N, Leybovich LB, Sethi A. Comparative evaluation of Kodak Phys. 2009;36:3371–3377.
EDR2 and XV2 films for verification of intensity modulated radiation 83. Wuu C-S, Xu Y. Three-dimensional dose verification for intensity mod-
therapy. Phys Med Biol. 2002;47:4121. ulated radiation therapy using optical CT based polymer gel dosimetry.
61. Childress NL, Dong L, Rosen II. Rapid radiographic film calibration for Med Phys. 2006;33:1412–1419.
IMRT verification using automated MLC fields. Med Phys. 2002;29:2384. 84. Oldham M, Gluckman G, Kim L. 3D verification of a prostate IMRT
62. Zhu X, Jursinic P, Grimm D, Lopez F, Rownd J, Gillin M. Evaluation treatment by polymer gel-dosimetry and optical-CT scanning. J Phys:
of Kodak EDR2 film for dose verification of intensity modulated radia- Conf Ser. 2004;3:293–296.
tion therapy delivered by a static multileaf collimator. Med Phys. 85. McJury M, Oldham M, Cosgrove VP, et al. Radiation dosimetry using
2002;29:1687. polymer gels: methods and applications. Br J Radiol. 2000;73:919–929.
63. Jursinic PA, Sharma R, Reuter J. MapCHECK used for rotational 86. Letourneau D, Publicover J, Kozelka J, Moseley DJ, Jaffray DA. Novel
IMRT measurements: step-and-shoot, TomoTherapy, RapidArc. Med dosimetric phantom for quality assurance of volumetric modulated arc
Phys. 2010;37:2837–2846. therapy. Med Phys. 2009;36:1813–1821.
64. Masi L, Casamassima F, Doro R, Francescon P. Quality assurance of 87. Feygelman V, Forster K, Opp D, Nilsson G. Evaluation of a biplanar
volumetric modulated arc therapy: evaluation and comparison of differ- diode array dosimeter for quality assurance of step-and-shoot IMRT. J
ent dosimetric systems. Med Phys. 2011;38:612–621. Appl Clin Med Phys. 2009;10:3080.
65. Van Esch A, Clermont C, Devillers M, Iori M, Huyskens DP. On-line 88. Yan G, Lu B, Kozelka J, Liu C, Li JG. Calibration of a novel four-
quality assurance of rotational radiotherapy treatment delivery by dimensional diode array. Med Phys. 2010;37:108–115.
means of a 2D ion chamber array and the Octavius phantom. Med Phys. 89. Feygelman V, Zhang G, Stevens C, Nelms BE. Evaluation of a new
2007;34:3825–3837. VMAT QA device, or the “X” and “O” array geometries. J Appl Clin
66. Childress NL, White RA, Bloch C, Salehpour M, Dong L, Rosen II. Med Phys. 2011;12:3346.
Retrospective analysis of 2D patient-specific IMRT verifications. Med 90. Kozelka J, Robinson J, Nelms B, Zhang G, Savitskij D, Feygelman V.
Phys. 2005;32:838–850. Optimizing the accuracy of a helical diode array dosimeter: a compre-
67. Kruse JJ. On the insensitivity of single field planar dosimetry to IMRT hensive calibration methodology coupled with a novel virtual incli-
inaccuracies. Med Phys. 2010;37:2516–2524. nometer. Med Phys. 2011;38:5021–5032.
68. Stasi M, Bresciani S, Miranti A, Maggio A, Sapino V, Gabriele P. Pre- 91. Boggula R, Lorenz F, Mueller L, et al. Experimental validation of a
treatment patient-specific IMRT quality assurance: a correlation study commercial 3D dose verification system for intensity-modulated arc
between gamma index and patient clinical dose volume histogram. Med therapies. Phys Med Biol. 2010;55:5619–5633.
Phys. 2012;39:7626–7634. 92. Renner WD, Norton KJ, Holmes TW. A method for deconvolution of
69. Carrasco P, Jornet N, Latorre A, Eudaldo T, Ruiz A, Ribas M. 3D integrated electronic portal images to obtain fluence for dose recon-
DVH-based metric analysis versus per-beam planar analysis in IMRT struction. J Appl Clin Med Phys 2005;6:22–39.
pretreatment verification. Med Phys. 2012;39:5040–5049. 93. Wu C, Hosier KE, Beck KE, et al. On using 3D c-analysis for
70. Zhen H, Nelms BE, Tome WA. Moving from gamma passing rates to IMRT and VMAT pretreatment plan QA. Med Phys. 2012;39:3051–
patient DVH-based QA metrics in pretreatment dose QA. Med Phys. 3059.
2011;38:5477–5489. 94. Van Esch A, Depuydt T, Huyskens DP. The use of an aSi-based EPID
71. Podesta M, Nijsten SM, Persoon LC, Scheib SG, Baltes C, Verhaegen for routine absolute dosimetric pre-treatment verification of dynamic
F. Time dependent pre-treatment EPID dosimetry for standard and FFF IMRT fields. Radiother Oncol. 2004;71:223–234.
VMAT. Phys Med Biol. 2014;59:4749–4768. 95. Nakaguchi Y, Araki F, Maruyama M, Saiga S. Dose verification of
72. Pai S, Das IJ, Dempsey JF, et al. TG-69: radiographic film for mega- IMRT by use of a COMPASS transmission detector. Radiol Phys Tech-
voltage beam dosimetry. Med Phys. 2007;34:2228. nol. 2012;5:63–70.
73. Leybovich LB, Sethi A, Dogan N. Comparison of ionization chambers 96. Olch AJ. Evaluation of the accuracy of 3DVH software estimates of
of various volumes for IMRT absolute dose verification. Med Phys. dose to virtual ion chamber and film in composite IMRT QA. Med
2003;30:119–123. Phys. 2012;39:81–86.
74. Fenoglietto P, Laliberte B, Ailleres N, Riou O, Dubois JB, Azria D. 97. Nelms BE, Opp D, Robinson J, et al. VMAT QA: measurement-guided
Eight years of IMRT quality assurance with ionization chambers and 4D dose reconstruction on a patient. Med Phys. 2012;39:4228–4238.
film dosimetry: experience of the montpellier comprehensive cancer 98. Opp D, Nelms BE, Zhang G, Stevens C, Feygelman V. Validation of
center. Radiat Oncol. 2011;6:85. measurement-guided 3D VMAT dose reconstruction on a heterogeneous
75. Dong L, Antolak J, Salehpour M, et al. Patient-specific point dose anthropomorphic phantom. J Appl Clin Med Phys. 2013;14:70–84.
measurement for IMRT monitor unit verification. Int J Radiat Oncol 99. Stathakis S, Myers P, Esquivel C, Mavroidis P, Papanikolaou N.
Biol Phys. 2003;56:867–877. Characterization of a novel 2D array dosimeter for patient-specific qual-
76. Spezi E, Angelini AL, Romani F, Ferri A. Characterization of a 2D ion ity assurance with volumetric arc therapy. Med Phys. 2013;40:0717311–
chamber array for the verification of radiotherapy treatments. Phys Med 0717315.
Biol. 2005;50:3361–3373. 100. Wendling M, Louwe RJW, McDermott LN, Sonke J-J, van Herk M,
77. Chair AN-R, Blackwell CR, Coursey BM, et al. Radiochromic film Mijnheer BJ. Accurate two-dimensional IMRT verification using a
dosimetry: recommendations of AAPM Radiation Therapy Committee back-projection EPID dosimetry method. Med Phys. 2006;33:259–273.