Variation in Mammographic Breast Density Assessments Among Radiologists in Clinical Practice
Radiologist-assessed breast density varies considerably from one reader to another.
An increasing number of states in the USA have mandated informing women with dense breasts of this fact. The aim of this legislation is to make these women more aware that mammography may be of limited use to them in detecting breast cancer, because dense breast tissue can mask tumours. Additionally, dense tissue is an independent risk factor for breast cancer. As such, these women are encouraged to “talk to their doctor about breast density” and consider adjunctive forms of breast screening to supplement mammography. However, this legislative change has not come without its issues, one of which is the way that breast density is measured. The most common method of measuring density is the Breast Imaging Reporting and Data System (BI-RADS). However, because it relies on human classification, BI-RADS is known to be subjective and to show wide variation between and within readers.
Brian Sprague and a number of members of the PROSPR (Population-based Research Optimizing Screening through Personalized Regimens) Consortium examined data from three breast cancer screening research centres (comprising 30 radiology facilities) to review radiologist breast density assessment in the clinic. In total, 83 radiologists reviewed 216 783 mammograms from 145 123 women and gave each mammogram a BI-RADS 4th Edition density score. Of particular interest was the percentage of images that each radiologist deemed to be “dense” (that is, classified as BI-RADS 3 or 4), as these are the women targeted by the breast density laws.
The outcome strikingly illustrates the inherent subjectivity of BI-RADS. Across all 83 radiologists, the median percentage of mammograms classified as “dense” was 38.7%. However, the spread was very wide: one radiologist considered only 6.3% of mammograms dense, while another gave this classification to 84.5% of the images they viewed (see Figure 1). One possible explanation for this discrepancy is that the patients seen by each radiologist differ considerably in breast density. However, after adjusting for age, race/ethnicity and BMI (all major factors that affect density), the overall results barely changed, suggesting that patient mix is not a sufficient explanation for the variability in BI-RADS scores. Furthermore, longitudinal assessment of a subset of women who had serial images available suggested that inter-radiologist variation is at fault. Of the women whose density was re-assessed by the same radiologist after an average period of 1.2 years, 10% changed in major density classification (going either from “non-dense” to “dense”, or vice versa). However, when a different radiologist performed the assessment, 17.2% of women received a different classification, a 72% relative increase in discrepancy (7.2 percentage points), even though the period between mammograms remained the same.
Figure 1. The distribution of density assessment among 83 radiologists in the PROSPR Consortium. Displayed is the percentage of mammograms that each radiologist assessed as being “dense”. The percentage labels below the plot show the minimum, median and maximum percentage of mammograms assessed as “dense” across radiologists.
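The 72% figure quoted for the serial assessments is a relative increase, not a difference in percentage points. The short Python sketch below reproduces the arithmetic from the two reclassification rates reported in the text (10% for the same radiologist, 17.2% for a different radiologist). The function and variable names are illustrative only and do not come from the study.

```python
def relative_increase(baseline_pct: float, comparison_pct: float) -> float:
    """Return how much larger comparison_pct is than baseline_pct, in percent."""
    if baseline_pct <= 0:
        raise ValueError("baseline_pct must be positive")
    return (comparison_pct - baseline_pct) / baseline_pct * 100.0


# Reclassification rates reported in the text (dense <-> non-dense).
same_reader_pct = 10.0        # re-assessed by the same radiologist
different_reader_pct = 17.2   # re-assessed by a different radiologist

increase = relative_increase(same_reader_pct, different_reader_pct)
absolute_gap = different_reader_pct - same_reader_pct

print(f"Relative increase: {increase:.0f}%")
print(f"Absolute difference: {absolute_gap:.1f} percentage points")
```

Reporting both numbers avoids ambiguity: the gap is 7.2 percentage points in absolute terms, which corresponds to a relative increase of about 72% over the same-radiologist rate.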
Given the growing interest in breast density reporting, this discrepancy between radiologist readings could indicate a serious issue with the way that density is commonly measured. The authors point out that “a woman’s likelihood of being told she has dense breasts varies substantially on the basis of which radiologist interprets her mammogram”. An obvious way of avoiding this subjectivity is to use purely objective measures, such as computer algorithms that quantify density. Volpara, one such tool, has been tested in a number of peer-reviewed studies that reported its clinical utility and association with breast cancer risk. It thus constitutes a promising alternative that could give women more confidence in their breast density assessment.