Abstract
The risk of breast cancer (BC) overdiagnosis attributed to mammography screening is an unresolved issue, complicated by heterogeneity in the methodology of quantifying its magnitude, and by both political and scientific elements surrounding interpretation of the evidence on this phenomenon. Evidence from randomized trials and also from observational studies shows that mammography screening reduces the risk of BC death; similarly, these studies provide sufficient evidence that overdiagnosis represents a serious harm from population breast screening. For both these outcomes of screening, BC mortality reduction and overdiagnosis, estimates of magnitude vary between studies, but overdiagnosis estimates in particular are associated with substantial uncertainty. The trade-off between the benefit and the collective harms of BC screening, including false-positives and overdiagnosis, is more finely balanced than initially recognized; however, the snapshot of evidence presented on overdiagnosis does not mean that breast screening is worthless. Future efforts should be directed towards (a) ensuring that any changes in the implementation of BC screening optimize the balance between benefit and harms, including assessing how planned or actual changes modify the risk of overdiagnosis; (b) informing women of all the outcomes that may affect them when they participate in screening using well-crafted and balanced information; and (c) investing in research that will help define and reduce the ensuing overtreatment of screen-detected BC.
Introduction
The history of population mammography screening for breast cancer (BC) spans roughly five decades. The efficacy of mammography screening has been demonstrated in randomized controlled trials (RCTs)1–5, and screening has subsequently been broadly implemented in many health systems for nearly three decades. Yet the past decade has witnessed accelerated debate on the ‘invisible’ risk of mammography screening6: overdiagnosis of BC attributed to population breast screening7–12. Overdiagnosis, or overdetection, refers to screen-detected malignancy that would not have progressed to clinical or symptomatic presentation during the individual’s lifetime, and that would not have been diagnosed or caused the individual any harm in the absence of screening. This somewhat contested harm of cancer screening, one that is inherently difficult to quantify, adds to the complexity of the outcomes associated with mammography screening. This review will draw on evidence to address the question forming the title of this paper, namely whether BC overdiagnosis attributed to mammography screening renders population screening worthless. A concise overview of the outcomes of mammography screening introduces relevant context to discuss and understand the implications of overdiagnosis for current and future breast screening practice.
Mammography screening benefit
RCTs of mammography screening
The efficacy of screening mammography, measured as a reduction in BC mortality, has been established in RCTs1–5. A meta-analysis of the RCTs (based on 13-year follow-up) reported by the UK’s Independent Panel showed a relative risk (RR) of 0.80 (95%CI 0.73–0.89) in those invited to screening compared to controls, representing a 20% reduction in BC mortality3. The most recent and comprehensive meta-analysis of the RCTs has been reported by Nelson and colleagues1,13 by age-strata to inform the US Preventive Services Task Force recommendations on breast screening. It showed that screening conferred significant reductions in the relative risk of BC death in women aged 50–59 years (RR 0.86; 95%CI 0.68–0.97) and 60–69 years (RR 0.67; 95%CI 0.54–0.83)1; however, screening did not significantly reduce the risk of BC death in women aged 40–49 years (RR 0.92; 95%CI 0.75–1.02) or in those aged 70–74 years (RR 0.80; 95%CI 0.51–1.28), although trial data were relatively sparse for the estimated effect in the 70–74 years age-group1. In absolute terms, these pooled estimates translate to prevention of 2.9 (40–49 years), 7.7 (50–59 years), 21.3 (60–69 years), and 12.5 (70–74 years) BC deaths per 10,000 women screened for 10 years1,13. The meta-analysis from Nelson also reported that screening reduced the risk of advanced-stage BC in women aged ≥50 years (RR 0.62; 95%CI 0.46–0.83), but not in those aged 39–49 years (RR 0.98; 95%CI 0.74–1.37), based on a subgroup of the screening RCTs1,13.
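The arithmetic linking these figures can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and the "women screened per BC death prevented" figure is a simple reciprocal derived here from the quoted absolute estimates, not a quantity reported in the cited meta-analyses.

```python
# Figures quoted in the paragraph above (pooled RCT estimates by age group).
POOLED_RR = {"40-49": 0.92, "50-59": 0.86, "60-69": 0.67, "70-74": 0.80}
# BC deaths prevented per 10,000 women screened for 10 years, as quoted above.
DEATHS_PREVENTED_PER_10000 = {"40-49": 2.9, "50-59": 7.7, "60-69": 21.3, "70-74": 12.5}


def relative_risk_reduction(rr: float) -> float:
    """Relative reduction in BC mortality implied by a relative risk (1 - RR)."""
    return 1.0 - rr


def women_screened_per_death_prevented(deaths_prevented: float, per: int = 10_000) -> float:
    """Reciprocal of the absolute benefit (derived here; an assumption, not a sourced figure)."""
    return per / deaths_prevented


# The UK Panel's pooled RR of 0.80 corresponds to the 20% reduction quoted in the text.
print(f"UK Panel: {relative_risk_reduction(0.80):.0%} relative reduction")

for age, rr in POOLED_RR.items():
    reduction = relative_risk_reduction(rr)
    per_death = women_screened_per_death_prevented(DEATHS_PREVENTED_PER_10000[age])
    print(f"{age} years: {reduction:.0%} relative reduction, "
          f"about {per_death:.0f} women screened for 10 years per BC death prevented")
```

Point estimates computed this way inherit the uncertainty of the underlying confidence intervals, so they should be read alongside the intervals quoted above rather than in place of them.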
Observational studies
Numerous non-randomized studies of various designs have been published to evaluate the effect of mammography screening, supplementing evidence from the RCTs, and potentially having more relevance to contemporary ‘real-world’ population screening. Evidence reviews of observational studies on BC screening14–16 generally arrive at similar ‘overall’ conclusions, including that: (1) although some studies did not show significant reduction in BC deaths in association with screening, the data from observational studies considered together provide evidence that population mammography screening confers benefit generally in keeping with that expected from the pivotal RCTs; (2) the estimated impact of population breast screening varies substantially, partly due to study methodology and partly reflecting true variability in magnitude of effect across countries and programs, but can be summed up as frequently within the range of a 12%–36% relative risk reduction in BC mortality (considering extreme estimates, from no effect to risk reduction exceeding 50%); and (3) studies varied in design, methods (including selection of comparison group), precision and analytic methods, and almost all studies suffered from limitations. Harris17 reported that observational studies quantifying the effect of breast screening generally did not adequately adjust for differences in BC risk, screening technology, or treatments, amongst compared groups. Given that around 50% of the observed reduction in BC mortality is attributed to screening with 50% attributed to therapy18, Harris17 suggests that the estimated effect of breast screening from observational studies is around a 10%–12.5% reduction in mortality.
Mammography screening harms
False-positive recall
False-positive recall, leading to unnecessary testing and biopsy, is the most frequent adverse outcome of mammography screening. Overall recall to assessment, and the frequency of false-positive recall, are highly variable across screening practice and influenced by many factors including the organization of screening delivery and screen-reader experience. False-positive recall is generally higher in younger (than older) women and in women with dense breasts, and is more frequent in annual (than biennial) screening, and in first (than subsequent) rounds of screening. Although false-positive recall is a major harm of screening and has been shown to cause undue anxiety and cancer-specific worry for some women12, it is considered a transient (short-term) psychological harm for falsely recalled women2. However, there may also be considerable financial costs to recalled women where assessment is not funded within organized screening programs. Each time a woman has a screening mammogram, she has roughly a 3%–12% chance of being recalled for further assessment (depending on the above-described factors); hence repeated regular screening confers a cumulative risk of experiencing a false-positive screen. Representative estimates are reported in Table 1.
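How a per-screen recall probability accumulates over repeated screens can be illustrated with a minimal Python sketch. It rests on two simplifying assumptions that are not made in the source: that the per-screen probability is constant and that screening rounds are independent. It also uses the 3%–12% overall recall range as a stand-in for false-positive recall, which overstates it slightly because some recalls are true positives. The function name is hypothetical, and the published estimates in Table 1 should be preferred to anything computed this way.

```python
def cumulative_false_positive_risk(per_screen_risk: float, n_screens: int) -> float:
    """Probability of at least one false-positive recall in n_screens screens.

    Assumes a constant per-screen risk and independent screening rounds
    (both simplifications; recall is typically higher at the first screen).
    """
    if not 0.0 <= per_screen_risk <= 1.0:
        raise ValueError("per_screen_risk must be between 0 and 1")
    if n_screens < 0:
        raise ValueError("n_screens must be non-negative")
    return 1.0 - (1.0 - per_screen_risk) ** n_screens


# Illustrative scenario: 10 screens, e.g. biennial screening over 20 years.
for per_screen in (0.03, 0.12):
    risk = cumulative_false_positive_risk(per_screen, n_screens=10)
    print(f"per-screen risk {per_screen:.0%}: cumulative risk over 10 screens {risk:.0%}")
```

The complement form, one minus the probability of never being recalled, is used because the chance of at least one event over several rounds is easier to compute that way than by summing over every combination of recalls.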
Interval cancers
Interval BCs are cancers that emerge subsequent to a ‘negative’ screening mammogram and before the next scheduled screen19,20 (and are usually diagnosed when the woman presents with symptoms). Although all interval cases are counted as false-negative screens when determining interval cancer rates, retrospective radiological audits classify only around 25%–40% of interval BCs as false-negative screens on imaging review19. Whereas some would consider interval cases a ‘harm’ of mammography screening (in that women are falsely reassured), it may be more appropriate to consider these false-negative screens a limitation inherent in any form of testing and not a harm unique to population BC screening. However, interval cancers represent a failure of mammography screening to detect biologically-relevant disease.
Radiation exposure
The risk of radiation-induced cancer from mammography is not negligible; however, the potential for mortality benefit is generally considered to outweigh the risk of death from radiation-induced BC attributed to mammography screening2. Modelling estimates that the number of deaths due to radiation-induced cancer ranges from 2/100,000 in women aged 50–59 receiving biennial screening to 11/100,000 in women aged 40–59 having annual screening12.
Overdiagnosis (overdetection) of BC from population screening
As the evidence on overdiagnosis has accumulated considerably, it is now recognized as the most serious down-side of population breast screening. Because screening effectiveness is realized through detecting cancers at a sufficiently early stage (including detection of in situ malignancy) to confer benefit, and given the well-established biological heterogeneity of BC, it is not surprising that screening yields malignancies that may not have progressed during the individual’s lifetime. The extent to which screening causes overdiagnosis is an ‘unresolved’ issue plagued by heterogeneity in many of the elements, both political and scientific, that define, measure and interpret the evidence on this harmful outcome of mammography screening. It may well be that at the present time quantifying the magnitude of BC overdiagnosis is secondary to establishing its implications for real-life health practice and how to address the consequences of overdiagnosis. For this reason, this review provides representative estimates of overdiagnosis from published reviews without attempting to dissect the epidemiological and methodological challenges inherent in estimating screening-related BC overdiagnosis, which have been detailed by others3,12,21–25.
Put simply, many factors contribute to the variability in reported estimates of BC overdiagnosis attributed to mammography screening3,12,21–25, including but not limited to: the definition of overdiagnosis (what exactly is the rate or proportion being measured) and in particular what constitutes the denominator (for example, whether measured in screened women in long-term follow-up or as a proportion of the cancers diagnosed during the screening phase); whether quantifying overdiagnosis of ductal carcinoma in situ (DCIS) or invasive cancer, or both; basic study methodology for measuring overdiagnosis, for example whether based on methods that directly measure the numerator and denominator, or whether based on models of disease progression; differences in study populations including demographics and differences in underlying BC risk (differences between studies; and differences between groups being compared within each study); timing of measuring overdiagnosis and duration of follow-up post-screening; real differences in screening practice such as screening technology, screening policy and frequency, population coverage and uptake; statistical methods and adjustments and assumptions relating to lead time and disease progression (the latter are not limited to modelling studies); and framing of the extent of overdiagnosis (relative or absolute estimates).
Magnitude of overdiagnosis
In one of the earliest systematic reviews of BC overdiagnosis, Biesheuvel and colleagues22 reported an extremely broad range of overdiagnosis estimates (from none to 62%), and also highlighted that source (primary) studies were prone to biases that may over- or under-estimate the magnitude of BC overdiagnosis. The International Agency for Research on Cancer (IARC) Working Group2 reported that sufficient evidence existed on overdiagnosis (‘BCs that would never have been diagnosed or never caused harm if women had not been screened’) and highlighted the Euroscreen Group’s summary estimate of overdiagnosis of 6.5% (range 1%–10%)2 based on a systematic review of European studies and incorporating adjustment for lead time26,27. The UK Independent Panel on BC screening considered the most reliable evidence on overdiagnosis to be derived from the screening RCTs in which women in the control arm were not offered screening at the end of the trial and where there was sufficient follow-up3; using that approach, the UK Panel noted that there were several definitions and methods to quantify overdiagnosis, and highlighted two useful approaches for quantifying overdiagnosis from breast screening:
Population perspective: the proportion of all BCs ever diagnosed in women invited to screening that are overdiagnosed (estimated as ranging between 9.7% and 12.4%)3
Woman’s perspective: the probability that a BC diagnosed during the screening period represents an overdiagnosed BC (estimated as ranging between 16.0% and 22.7%)3
In a commentary on BC screening guidelines, Keating and Pace28 noted that for a 40- or 50-year-old woman undergoing annual screening over 10 years, 19% of the BCs diagnosed during that period of screening would not have become clinically apparent in the absence of screening, and that the estimate was associated with uncertainty. In one of the most recent reviews on this topic, Nelson and colleagues12 reported that observational studies using different methods estimated overdiagnosis rates within the range of 0% to 54%, noting both the broad range of published estimates and also the lack of agreement on what constitutes the most appropriate methodology to quantify BC overdiagnosis12. Using a comprehensive overview of overdiagnosis from screening for several cancer types, Carter and colleagues21 provide key information on study quality and the reported estimates of overdiagnosis: estimates for studies in the breast screening context are summarized in Table 2. Importantly, Carter’s overview21 is a step forward in providing insightful interpretation of the evidence to inform development of standards for future studies quantifying and monitoring overdiagnosis.
A useful approach to framing the extent of overdiagnosis is to report it in absolute numbers in relation to the main benefit (prevention of BC death) of screening, and to also express that as a ratio indicative of the ‘trade-off’ between these outcomes. Mandelblatt and colleagues29 used collaborative modelling comprising 6 established simulation models to estimate the cumulative outcomes of screening, and reported the median value across models for each outcome per 1,000 women screened versus no screening: for biennial screening from age 50 to 74 years, 7 (range 4–9) BC deaths are averted and 19 (range 11–34) cases are overdiagnosed; for biennial screening from age 40 to 74 years, 8 (range 5–10) BC deaths are averted and 21 (range 11–34) cases are overdiagnosed29. Across various scenarios for screening frequency and start ages, the data from collaborative modelling consistently showed that the trade-off was that for each BC death averted by screening around 2.5 cases are overdiagnosed29.
A similar ‘trade-off’ was estimated by the UK’s Independent Panel on BC screening which reported that, having evaluated all the available evidence, for each BC death prevented by mammography screening about 3 cases will be overdiagnosed3. The ratio of 1 BC death averted to 3 overdiagnosed cases from the UK’s Panel was calculated by applying estimates for benefit and for overdiagnosis to 10,000 women invited to screening for 20 years from age 50 years: 43 BC deaths would be prevented and 129 cases (of invasive and non-invasive BC) would be overdiagnosed and treated in the UK screening context3. An Australian trial evaluating informed decision-making30 used a similar approach applied to published Australian data31 to estimate that for women having biennial screening over 20 years, for each BC death averted around 4 to 5 cases are overdiagnosed. The Canadian Task Force on Preventive Health Care has provided data for average-risk women aged 50–69 years who are screened biennially for 11 years, indicating an approximate ratio of one BC death prevented by mammography screening to 4 overdiagnosed (and over-treated) cases32. In contrast to the above-reported estimates, the Euroscreen Group derived numbers from European studies to develop a balance sheet for breast screening, reporting that for every 1,000 women screened biennially from age 50–51 (with follow-up to age 79), 7–9 BC deaths are avoided and 4 cases are overdiagnosed27, hence an approximate ratio of 2 BC deaths avoided to 1 overdiagnosed case.
Considering all the above data on the trade-off between the number of averted BC deaths and overdiagnosed cases (Table 3), it is reasonable to conclude that as many or more women appear to be overdiagnosed (and consequently over-treated) than BC deaths avoided through mammography screening for BC. However, there remains much uncertainty around these estimates of the trade-off and a need for more systematic evaluation of the extent of overdiagnosis relative to screening benefit.
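The trade-off ratios discussed above can be recomputed from the absolute numbers quoted in the text, as in the following Python sketch. It is illustrative only: the class and function names are hypothetical, only sources for which the text gives both counts are included, and point estimates are used without the reported ranges. Each ratio is unitless because deaths averted and overdiagnosed cases are expressed per the same population within each source.

```python
from typing import NamedTuple


class TradeOff(NamedTuple):
    source: str
    deaths_averted: float  # BC deaths averted, per the source's population base
    overdiagnosed: float   # overdiagnosed cases, per the same population base


def overdiagnosed_per_death_averted(deaths_averted: float, overdiagnosed: float) -> float:
    """Overdiagnosed cases for each BC death averted."""
    if deaths_averted <= 0:
        raise ValueError("deaths_averted must be positive")
    return overdiagnosed / deaths_averted


# Point estimates as quoted in the text above.
ESTIMATES = [
    TradeOff("Collaborative modelling, biennial 50-74 years (per 1,000 screened)", 7, 19),
    TradeOff("Collaborative modelling, biennial 40-74 years (per 1,000 screened)", 8, 21),
    TradeOff("UK Independent Panel (per 10,000 invited for 20 years)", 43, 129),
    TradeOff("Euroscreen, 7 deaths avoided (per 1,000 screened)", 7, 4),
    TradeOff("Euroscreen, 9 deaths avoided (per 1,000 screened)", 9, 4),
]

for estimate in ESTIMATES:
    ratio = overdiagnosed_per_death_averted(estimate.deaths_averted, estimate.overdiagnosed)
    print(f"{estimate.source}: {ratio:.1f} overdiagnosed per BC death averted")
```

Computed this way, the modelling and UK Panel figures give ratios above 1, whereas the Euroscreen figures give ratios below 1, which is the contrast the text draws between these sources.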
Implications of overdiagnosis for screening practice
Reduction or avoidance of BC death is very highly valued, both from the individual and the societal perspective; hence the snapshot of evidence presented on overdiagnosis does not mean that population breast screening is worthless. What it does mean, however, is that the benefit of BC screening does not necessarily outweigh the harms, which are more likely to be experienced by screening participants than avoidance of BC death. In other words, the balance of benefit (primarily mortality reduction) and the various harms from BC screening is a finer balance than initially thought. Therefore, the implications for population breast screening practice relate to three key themes that will underpin the provision of an effective and ethical cancer control strategy through mammography screening in the present and progressing into the future.
The first theme relates to the delicate balance between benefit and harms: efforts should be directed towards maximizing benefit and importantly towards controlling and reducing harms, particularly the harm from overdiagnosis. Figure 1 presents a conceptual framework for optimizing the balance between the benefit and harms of population breast screening; it highlights that potential changes to population breast screening practice, whether related to screening policy (for example, expansion of the age-groups in screened populations) or to screening practice (for example, introduction of new technologies), must carefully determine the extent to which any such modification will augment benefit or will add to the harms, and specifically whether potential changes will increase overdiagnosis.
The second theme entails that women be informed of all the outcomes that may affect them when they participate in population BC screening. Accurate and balanced age-group specific information on the outcomes of mammography screening, including that of overdiagnosis, must be provided to women to support informed decisions. It is generally recognized that individuals should be well informed of the pros and cons of healthcare interventions when making decisions on the best healthcare for them. However, traditionally, in the context of mammography screening, communication strategies and public health campaigns and messages have largely advocated the importance of having screening and have focused on promoting its benefits33. As outlined earlier in this review, alongside the potential benefit of BC screening there are harms, and both benefit and harms should be communicated to women. Because the issue of overdiagnosis from screening is complex and unfamiliar to most women, it is important to craft and evaluate rigorously developed information on mammography screening that also explains overdiagnosis to potential screening participants. Two Australian RCTs have examined mammography screening decision aids for women aged 40 and 70 and showed that these information aids improved knowledge and reduced the number of women who remained undecided about screening, with the majority of women favoring screening33–35.
More recently, Hersch and colleagues30 conducted an RCT of a decision aid containing balanced information on the outcomes of mammography screening, including an explanation of the risk of overdiagnosis, and showed that the decision aid increased both knowledge and informed choice in comparison to a control decision aid which omitted the overdiagnosis information. It is noteworthy that the decision aid also contained information explaining to women that once BC is found on screening, treatment is recommended because current knowledge cannot identify which BCs will be harmful and will progress if untreated, and which BCs may not be harmful30. Although that study also reported that significantly fewer women in the intervention arm intended to screen and some were undecided about whether they would screen, the majority of women in both arms of the RCT still intended to have BC screening. This approach, adapted to local screening contexts, may be a practical and appropriate means of supporting women to make an informed choice regarding whether or not to have mammography screening.
The third theme relates to overtreatment that is consequent to overdiagnosis36. Given that we cannot yet identify which cancers are overdiagnosed through screening, and given that a substantial proportion of BC patients will have screen-detected cancer, research efforts need to be directed towards defining and addressing the burden of overtreatment. Existing research that has deciphered tumor behavior through molecular profiles, complemented by gene expression testing for therapy selection, has already advanced the era of precision medicine in BC. Future efforts will need to be dedicated to investigating the extent that these advances can elucidate the biological behavior of early-stage screen-detected BC to minimize overtreatment in the future36. Consideration of overtreatment brings about research needs and opportunities that extend beyond screen-detected BC, recognizing the broader implications for treatment of early-stage disease due to enhanced BC awareness and use of adjunct technologies, all of which increasingly result in women receiving surgery and adjuvant therapies for very small or in situ cancer, hence the relevance of overtreatment is not limited to screening mammography-detected BC.
Conclusions
The magnitude of BC overdiagnosis attributed to mammography screening is uncertain and complicated by heterogeneity in many of the elements, political and scientific, that define and interpret the evidence on this screening harm; however, there is sufficient evidence to acknowledge overdiagnosis as a serious harm from population BC screening. Based on the available evidence, it is reasonable to conclude that mammography screening reduces the risk of BC death but that the trade-off between this highly-valued benefit and the harms, including false-positives and overdiagnosis, is finely balanced. The snapshot of evidence presented on overdiagnosis in this review, however, does not mean that population breast screening is worthless, given that screening reduces BC deaths. Hence efforts should be directed towards controlling and minimizing the harmful consequences associated with BC screening, including ensuring that any changes in breast screening implementation optimize the balance between benefit and harms (including assessing how changes impact the risk of overdiagnosis), and informing women of all the outcomes that may affect them when they participate in screening. Future investments in BC screening and treatment research will also be necessary to help define and reduce the ensuing overtreatment of early-stage BC.
Footnotes
Conflict of interest statement: No potential conflicts of interest are disclosed.
- Received June 16, 2016.
- Accepted July 22, 2016.
- Copyright: © 2017, Cancer Biology & Medicine
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY) 4.0, which permits unrestricted use, distribution and reproduction in any medium, provided the original author and source are credited.