In this study, we used a large, previously built database of 2,892 mammograms and 31,650 single mammogram radiologists’ assessments to simulate the impact of replacing one radiologist with an AI system in a double reading setting. The double human reading scenario and the double hybrid reading scenario (second reader replaced by an AI system) were simulated via bootstrapping using different combinations of mammograms and radiologists from the database. The main outcomes of each scenario were sensitivity, specificity and workload (number of necessary readings). The results showed that when using AI as a second reader, workload can be reduced by 44%, sensitivity remains similar (difference -0.1%; 95% CI: -4.1% to 3.9%), and specificity increases by 5.3% (P<0.001). Our results suggest that using AI as a second reader in a double reading setting, as in screening programs, could be a strategy to reduce workload and false positive recalls without affecting sensitivity.
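The bootstrapped comparison of the two reading scenarios can be sketched roughly as follows. This is a minimal illustration, not the study's actual procedure: the arbitration rule (a third human read, `arb`, for discordant pairs), the stratified resampling, the synthetic recall probabilities and all function names are assumptions introduced here.

```python
import random

def scenario(cases, second):
    """Return (sensitivity, specificity, human_readings) for one scenario.

    `second` names the second reader: "r2" (human) or "ai".
    """
    tp = fn = tn = fp = 0
    readings = 0
    for c in cases:
        first, other = c["r1"], c[second]
        readings += 1 + (second != "ai")  # AI reads are not counted as human workload
        if first == other:
            recall = first
        else:
            recall = c["arb"]  # assumed arbitration read for discordant pairs
            readings += 1
        if c["cancer"]:
            tp += recall
            fn += not recall
        else:
            fp += recall
            tn += not recall
    return tp / (tp + fn), tn / (tn + fp), readings

def bootstrap_sensitivity_difference(cases, n_boot=500, seed=0):
    """Percentile 95% interval for sensitivity(hybrid) - sensitivity(human)."""
    rng = random.Random(seed)
    cancers = [c for c in cases if c["cancer"]]
    normals = [c for c in cases if not c["cancer"]]
    diffs = []
    for _ in range(n_boot):
        # Stratified resampling (an assumption) keeps both classes non-empty.
        sample = (rng.choices(cancers, k=len(cancers))
                  + rng.choices(normals, k=len(normals)))
        diffs.append(scenario(sample, "ai")[0] - scenario(sample, "r2")[0])
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]

def make_synthetic_cases(n=400, seed=1):
    """Hypothetical data: every reader recalls cancers with p=0.8, normals with p=0.1."""
    rng = random.Random(seed)
    cases = []
    for _ in range(n):
        cancer = rng.random() < 0.25
        p = 0.8 if cancer else 0.1
        cases.append({"cancer": cancer,
                      **{k: rng.random() < p for k in ("r1", "r2", "ai", "arb")}})
    return cases

if __name__ == "__main__":
    data = make_synthetic_cases()
    print("human double reading:", scenario(data, "r2"))
    print("hybrid reading:      ", scenario(data, "ai"))
    print("95% interval for sensitivity difference:",
          bootstrap_sensitivity_difference(data))
```

Counting only human reads as workload is one possible convention; under it, the hybrid scenario saves one read per case but may add arbitration reads when the AI and the first reader disagree.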
The purpose of the study is to test the performance of the combination of digital breast tomosynthesis (DBT) and synthetic views in the detection of cancers presenting as calcifications, compared to the performance of planar mammography combined with DBT. A pilot study is presented. A set of 22 cases without cancer was collected from a Siemens Inspiration mammography system. Twenty-two simulated calcification clusters were inserted into the planar and DBT projections of 16 cases. For each case one breast and one view were used. The images were processed using Siemens proprietary software. Seven experienced mammography readers viewed the cases in three study arms: planar alone (ArmP), planar with DBT (ArmP&D) and synthetic 2D with DBT (ArmS&D). The observers marked the suspected location of the clusters and classified the likelihood of there being a suspicious calcification cluster for each case. A JAFROC figure of merit (FoM) was calculated for each study arm. The detection fractions for all cases were 46±16% (P and P&D) and 34±19% (S&D). For lesions marked for recall, the maximum detection rate was 19%. The FoMs were 0.48±0.15 (P) and 0.42±0.17 (P&D), but significantly lower (p≤0.003) for S&D (0.32±0.16). This pilot study demonstrated the feasibility of undertaking a larger study. The overall detection rates were lower (<50%) than optimal for a virtual clinical trial. We plan to increase the detection rate by using less subtle clusters in the final study. When using synthetic 2D images instead of planar images alongside DBT, the FoM was lower for subtle calcification clusters.
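For readers unfamiliar with the JAFROC figure of merit, one common formulation can be sketched as below: the FoM is the probability that a lesion's rating exceeds the highest-rated non-lesion mark on a normal case. The abstract does not state which JAFROC variant or software settings were used, so the function name, the data layout and the treatment of unmarked lesions and unmarked normal cases (both set to minus infinity, ties scored 0.5) are assumptions.

```python
def jafroc_fom(normal_case_marks, lesion_ratings):
    """Wilcoxon-type JAFROC figure of merit (one common formulation).

    normal_case_marks: one list per normal case holding the ratings of all
        marks made on that case (empty list if the case was left unmarked).
    lesion_ratings: one entry per lesion, the rating of the mark that
        localised it, or None if the lesion was not marked.
    """
    if not normal_case_marks or not lesion_ratings:
        raise ValueError("need at least one normal case and one lesion")
    neg_inf = float("-inf")
    # Highest-rated false-positive mark on each normal case.
    highest_fp = [max(marks) if marks else neg_inf for marks in normal_case_marks]
    lesions = [r if r is not None else neg_inf for r in lesion_ratings]
    total = 0.0
    for fp in highest_fp:
        for lesion in lesions:
            if lesion > fp:
                total += 1.0
            elif lesion == fp:
                total += 0.5  # ties count half (assumed convention)
    return total / (len(highest_fp) * len(lesions))

if __name__ == "__main__":
    # Hypothetical ratings on a 1-5 scale.
    print(jafroc_fom([[], [2], [3, 1]], [4, None, 2]))
```

A reader who marks every lesion and nothing on normal cases scores 1.0; a reader who misses every lesion and marks every normal case scores 0.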
Purpose: To measure changes in breast density in a screening population. Method: Unprocessed mammograms were collected for 8,268 women (6,034 and 2,234 women with two and three sequential screening rounds respectively) with normal breasts (routine recall), from the OPTIMAM image database. The volumetric breast density (VBD), fibroglandular volume (FGV) and breast volume (BV) were determined and the changes between screening rounds calculated. Linear regression determined if the rate of change in these breast density measures varied significantly with age at initial screen. The women were split into quartiles according to VBD in both screening rounds, and any changes in the quartile allocation of each woman determined. The VBD for these women was compared to our previously published data for women with screen detected and interval cancers. Results: Averaged over all women, the percentage change in VBD, FGV and BV over 6 years was -11.2% (95% CI: -12.2% to -10.2%), -5.3% (95% CI: -6.1% to -4.5%) and 11.5% (95% CI: 10.4% to 12.6%) respectively. The percentage change per month of VBD, FGV and BD decreased significantly with age (p<0.0001). The percentage change in FGV was more strongly associated with the FGV at initial screen than with age. For the least and most dense quartiles (who would be of interest for risk stratification), the majority of women did not change quartile between screening rounds. VBD was higher for women who developed interval cancers. Conclusions: The average VBD decreased by 11% over six years. The majority (~80%) of women do not change quartile of VBD in six years.
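The two summary quantities in this abstract, the average percentage change and the fraction of women who stay in the same VBD quartile, could be computed along the following lines. This is only a sketch: the function names are invented here, and it assumes that quartile cut points are derived separately for each screening round and that the average is a mean of per-woman percentage changes, neither of which the abstract spells out.

```python
from statistics import quantiles

def mean_percent_change(first, later):
    """Mean of the per-woman percentage change between two rounds."""
    return 100.0 * sum((b - a) / a for a, b in zip(first, later)) / len(first)

def quartile_index(value, cuts):
    """Quartile (0 = least dense, 3 = most dense) given three cut points."""
    return sum(value > c for c in cuts)

def fraction_same_quartile(first, later):
    """Fraction of women whose quartile allocation is unchanged.

    Cut points are computed separately for each round (an assumption).
    """
    cuts_first = quantiles(first, n=4)
    cuts_later = quantiles(later, n=4)
    same = sum(
        quartile_index(a, cuts_first) == quartile_index(b, cuts_later)
        for a, b in zip(first, later)
    )
    return same / len(first)

if __name__ == "__main__":
    # Hypothetical VBD values (%) for eight women at two rounds.
    round1 = [4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 18.0]
    round2 = [3.5, 5.6, 7.0, 9.4, 10.1, 13.0, 13.9, 16.5]
    print(mean_percent_change(round1, round2))
    print(fraction_same_quartile(round1, round2))
```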
The use of conventional clinical trials to optimise technology and techniques in breast cancer screening carries with it issues of dose, high cost and delay. This has motivated the development of Virtual Clinical Trials (VCTs) as an alternative in-silico assessment paradigm. However, such an approach requires a set of modelling tools that can realistically represent the key biological and technical components within the imaging chain. The OPTIMAM image simulation toolbox provides a complete validated end-to-end solution for VCTs, wherein commonly-found regular and irregular lesions can be successfully and realistically simulated. As spiculated lesions are the second most common form of solid mass, we report on our latest developments to produce realistic spiculated lesion models, with particular application in Alternative Forced Choice trials. We make use of sets of spicules drawn using manually annotated landmarks and interpolated by a fitted 3D spline for each spicule. Once combined with a solid core, these are inserted into 2D and tomosynthesis image segments and blended using a combination of elongation, rotational alignment with background, spicule twisting and core radial contraction effects. A mixture of real and simulated images (86 2D and 86 DBT images) with spiculated lesions was presented to an experienced radiologist in an observer study. The latest observer study results demonstrated that 88.4% of simulated images of lesions in 2D and 67.4% of simulated lesions in DBT were rated as definitely or probably real on a six-point scale. This represents a significant improvement on our previous work, which did not employ any background blending algorithms to simulate spiculated lesions in clinical images.
KEYWORDS: Cancer, Databases, Breast, Medical imaging, Image processing, Breast cancer, Inspection, Picture Archiving and Communication System, Mammography, Medical research
Reviewing interval cancers and prior screening mammograms is a key measure for monitoring screening performance. Radiological analysis of the imaging features in prior mammograms and retrospective classification are important educational tools for readers to improve individual performance.
The requirements of remote, collaborative image review sessions, such as those required to run a remote interval cancer review, are variable and demand a flexible and configurable software solution that is not currently available on commercial workstations. The wide range of requirements for both collection and remote review of interval cancers has precipitated the creation of extensible medical image viewers and accompanying systems.
In order to allow remote viewing, an application has been designed to allow workstation-independent, PACS-less viewing and interaction with medical images in a remote, collaborative manner, providing centralised reporting and web-based feedback. A semi-automated process, which allows the centralisation of interval cancer cases, has been developed. This stand-alone, flexible image collection toolkit provides the extremely important function of bespoke, ad-hoc image collection at sites where there is no dedicated hardware.
Web interfaces have been created which allow a national or regional administrator to organise, coordinate and administer interval cancer review sessions and deploy invites to session members to participate. The same interface allows feedback to be analysed and distributed.
The eICR provides a uniform process for classifying interval cancers across the NHSBSP, which facilitates rapid access to a robust 'external' review for patients and their relatives seeking answers about why their cancer was 'missed'.
Background: The vigilance decrement and prevalence effect both describe changes to speed and accuracy with time on task. Whilst there is much laboratory-based research on these effects, little is known about whether they occur in real-world mammography practice. Methods: The Changing Case Order to Optimise Patterns of Performance in Screening (CO-OPS) trial randomised 37,724 batches containing 1.2 million women attending breast screening to intervention or control (222,208 from the Midlands of England). In the control arm the batch was examined in the same order by both readers; in the intervention arm it was examined in a different order by each reader. Time taken, recall decision by both readers, and cancers detected were recorded for each case, and used to examine patterns of performance with time on task. Results: 49,575 women were recalled and 10,484 had cancer detected. Median time taken to examine each case was 35 seconds (out of cases where time taken was 10 minutes or less). The intervention did not affect overall cancer detection rates or recall rates. A more detailed analysis of the Midlands data indicates that cancer detection rate did not change when reading up to 60 cases in a batch, but recall rate reduced. Time taken per case reduced with time on task, from a median of 41 seconds when examining the second case in the batch to 28.5 seconds when examining the 60th case. Conclusion: Reader behavior and performance change systematically with time on task in breast screening.
It has previously been shown that 2D spectral mammography can be used to discriminate between (likely benign) cystic and (potentially malignant) solid lesions in order to reduce unnecessary recalls in mammography. One limitation of the technique is, however, that the composition of overlapping tissue needs to be interpolated from a region surrounding the lesion. The purpose of this investigation was to demonstrate that lesion characterization can be done with spectral tomosynthesis, and to investigate whether the 3D information available in tomosynthesis can reduce the uncertainty from the interpolation of surrounding tissue. A phantom experiment was designed to simulate a cyst and a tumor, where the tumor was overlaid with a structure that made it mimic a cyst. In 2D, the two targets appeared similar in composition, whereas spectral tomosynthesis revealed the exact compositional difference. However, the loss of discrimination signal due to spread from the plane of interest was of the same strength as the reduction of anatomical noise. Results from a preliminary investigation on clinical tomosynthesis images of solid lesions yielded results that were consistent with the phantom experiments, but were still to some extent inconclusive. We conclude that lesion characterization is feasible in spectral tomosynthesis, but more data, as well as refinement of the calibration and discrimination algorithms, are needed to draw final conclusions about the benefit compared to 2D.
The impact of image processing on cancer detection is still a concern to radiologists and physicists. This work aims to evaluate the effect of two types of image processing on cancer detection in mammography. An observer study was performed in which six radiologists inspected 349 cases (a mixture of normal cases, benign lesions and cancers) processed with two types of image processing. The observers marked areas they suspected were cancers. JAFROC analysis was performed to determine if there was a significant difference in cancer detection between the two types of image processing. Cancer detection was significantly better with the standard setting image processing (flavor A) compared with one that provides enhanced image contrast (flavor B), p = 0.036. The image processing was applied to images of the CDMAM test object, which were then analysed using CDCOM. The threshold gold thickness measured with the CDMAM test object was thinner using flavor A than flavor B image processing. Since flavor A was found to be superior in both the observer study and the measurements using the CDMAM phantom, this may indicate that measurements using the CDMAM correlate with change in cancer detection with different types of image processing.
The development of new x-ray imaging techniques often requires prior knowledge of tissue attenuation, but the sources of such information are sparse. We have measured the attenuation of adipose breast tissue using spectral imaging, in vitro and in vivo. For the in-vitro measurement, fixed samples of adipose breast tissue were imaged on a spectral mammography system, and the energy-dependent x-ray attenuation was measured in terms of equivalent thicknesses of aluminum and poly-methyl methacrylate (PMMA). For the in-vivo measurement, a similar procedure was applied on a number of spectral screening mammograms. The results of the two measurements agreed well and were consistent with published attenuation data and with measurements on tissue-equivalent material.
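Expressing a measured attenuation as equivalent thicknesses of aluminum and PMMA amounts, in the simplest idealisation, to solving a two-by-two linear system from two energy bins. The sketch below assumes monoenergetic bins and uses placeholder attenuation coefficients; a real photon-counting system would rely on a calibration against the reference materials over a polychromatic spectrum, and none of the names or values here come from the paper.

```python
import math

def equivalent_thicknesses(signal_low, signal_high, open_low, open_high, mu):
    """Solve for (t_al, t_pmma) in cm from low- and high-energy bin signals.

    Idealised model (assumption): each bin is monoenergetic, so
        -ln(signal / open) = mu_al * t_al + mu_pmma * t_pmma
    mu maps "al" and "pmma" to (mu_low_bin, mu_high_bin) in 1/cm.
    """
    a_low = -math.log(signal_low / open_low)
    a_high = -math.log(signal_high / open_high)
    al_lo, al_hi = mu["al"]
    pm_lo, pm_hi = mu["pmma"]
    det = al_lo * pm_hi - al_hi * pm_lo
    if abs(det) < 1e-12:
        raise ValueError("reference materials are not separable in these bins")
    # Cramer's rule for the 2x2 system.
    t_al = (a_low * pm_hi - a_high * pm_lo) / det
    t_pmma = (al_lo * a_high - al_hi * a_low) / det
    return t_al, t_pmma

if __name__ == "__main__":
    # Placeholder coefficients in 1/cm, NOT tabulated values.
    mu = {"al": (5.0, 2.0), "pmma": (0.9, 0.6)}
    # Forward-simulate a sample equivalent to 0.1 cm Al plus 3.0 cm PMMA.
    low = 1000.0 * math.exp(-(5.0 * 0.1 + 0.9 * 3.0))
    high = 1000.0 * math.exp(-(2.0 * 0.1 + 0.6 * 3.0))
    print(equivalent_thicknesses(low, high, 1000.0, 1000.0, mu))
```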
Introduction: The effect that the image quality associated with different image receptors has on cancer detection in mammography was measured using a novel method for changing the appearance of images. Method: A set of 270 mammography cases (one view, both breasts) was acquired using five Hologic Selenia and two Hologic Dimensions X-ray sets: 160 normal cases, 80 cases with subtle real non-calcification malignant lesions and 30 cases with biopsy proven benign lesions. Simulated calcification clusters were inserted into half of the normal cases. The 270 cases (Arm 1) were converted to appear as if they had been acquired on three other imaging systems: caesium iodide detector (Arm 2), needle image plate computed radiography (CR) (Arm 3) and powder phosphor CR (Arm 4). Five experienced mammography readers marked the location of suspected cancers in the images and classified the degree of visibility of the lesions. Statistical analysis was performed using JAFROC. Results: The differences in the visibility of calcification clusters between all pairs of arms were statistically significant (p<0.05), except between Arms 1 and 2. The difference in the visibility of non-calcification lesions was smaller than for calcification clusters, but the differences were still significant except between Arms 1 and 2 and between Arms 3 and 4. Conclusion: Detector type had a significant impact on the visibility of all types of subtle cancers, with the largest impact being on the visibility of calcification clusters.
Image processing (IP) is the last step in the digital mammography imaging chain before interpretation by a radiologist. Each manufacturer has their own IP algorithm(s) and the appearance of an image after IP can vary greatly depending upon the algorithm and version used. It is unclear whether these differences can affect cancer detection. This work investigates the effect of IP on the detection of non-calcification cancers by expert observers. Digital mammography images for 190 patients were collected from two screening sites using Hologic amorphous selenium detectors. Eighty of these cases contained non-calcification cancers. The images were processed using three versions of IP from Hologic – default (full enhancement), low contrast (intermediate enhancement) and pseudo screen-film (no enhancement). Seven experienced observers inspected the images and marked the location of regions suspected to be non-calcification cancers assigning a score for likelihood of malignancy. This data was analysed using JAFROC analysis. The observers also scored the clinical interpretation of the entire case using the BSBR classification scale. This was analysed using ROC analysis. The breast density in the region surrounding each cancer and the number of times each cancer was detected were calculated. IP did not have a significant effect on the radiologists’ judgment of the likelihood of malignancy of individual lesions or their clinical interpretation of the entire case. No correlation was found between number of times each cancer was detected and the density of breast tissue surrounding that cancer.
Knowledge of x-ray attenuation is essential for developing and evaluating x-ray imaging technologies. For instance,
techniques to better characterize cysts at mammography screening would be highly desirable to reduce recalls, but
the development is hampered by the lack of attenuation data for cysts. We have developed a method to measure x-ray
attenuation of tissue samples using a prototype photon-counting spectral mammography unit. Spectral (energy-resolved)
images were acquired and the image signal was mapped to two known reference materials, which were
used to derive the x-ray attenuation as a function of energy. We have measured the attenuation of 25 samples of
breast cyst fluid. Spectral measurements of water samples showed consistent results compared to published
attenuation values.
European Guidelines for quality control in digital mammography specify acceptable and achievable standards of image
quality (IQ) in terms of threshold gold thickness using the CDMAM test object. However, there is little evidence relating
such measurements to cancer detection. This work investigated the relationship between calcification detection and
threshold gold thickness. An observer study was performed using a set of 162 amorphous selenium direct digital (DR)
detector images (81 no cancer and 81 with 1-3 inserted calcification clusters). From these images four additional IQs
were simulated: different digital detectors (computed radiography (CR) and DR) and dose levels. Seven observers
marked and rated the locations of suspicious regions. DBM analysis of variance was performed on the JAFROC figure
of merit (FoM) yielding 95% confidence intervals for IQ pairs. Automated threshold gold thickness (Tg) analysis was
performed for the 0.25 mm gold disc diameter on CDMAM images at the same IQs (16 images per IQ). Tg was plotted
against FoM and a power law fitted to the data. There was a significant reduction in FoM for calcification detection for
CR images compared with DR; FoM decreased from 0.83 to 0.63 (p≤0.0001). Detection was also sensitive to dose.
There was a good correlation between FoM and Tg (R²=0.80, p<0.05); consequently, threshold gold thickness was a good
predictor of calcification detection at the same IQ. Since the majority of threshold gold thicknesses for the various IQs
were above the acceptable standard despite large variations in calcification detection by radiologists, current EU
guidelines may need revising.
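A power-law fit of threshold gold thickness against FoM can be sketched as an ordinary least-squares line in log-log space. The abstract does not say how the fit or its R² was computed, so this is one plausible approach; the function name and the example values are hypothetical and are not the study's data.

```python
import math

def fit_power_law(x, y):
    """Fit y = a * x**b by least squares on (ln x, ln y).

    Returns (a, b, r_squared), where r_squared refers to the log-log fit.
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    syy = sum((v - mean_y) ** 2 for v in ly)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    r_squared = (sxy * sxy) / (sxx * syy)
    return a, b, r_squared

if __name__ == "__main__":
    # Hypothetical (FoM, Tg in micrometres) pairs for five image qualities.
    fom = [0.63, 0.70, 0.76, 0.80, 0.83]
    tg = [0.41, 0.33, 0.27, 0.25, 0.22]
    a, b, r2 = fit_power_law(fom, tg)
    print(f"Tg ~ {a:.3f} * FoM^{b:.2f}, R^2 = {r2:.2f}")
```

Fitting in log space weights relative rather than absolute errors; a direct nonlinear fit would give slightly different parameters.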
KEYWORDS: Diagnostics, Mammography, Digital mammography, Breast, Statistical analysis, Cancer, Digital breast tomosynthesis, Roads, Medical imaging, Imaging systems
Purpose: To investigate whether readers' experience affects performance in a study comparing 2D digital
mammography (2D) with 2-view (CC and MLO) or 1-view (MLO) tomosynthesis.
Materials and Methods: One-hundred-thirty 2D cases were collected from screening assessment and referral clinics; 64
of the cases had verified abnormalities and the remaining were confirmed normal. Two-view tomosynthesis images were
obtained from the same patients. Ten accredited readers (5 with ≥ 10 years' experience in mammography and 5 with < 10
years) classified the cases in terms of malignancy (rate 0-5), and recall (yes/no), for both modalities. A second
experiment was performed with the same cases, with 10 other readers (again 5 experienced / 5 less experienced), but
using 2D and 1-view tomosynthesis as the two modalities. The multi-reader-multi-case ROC method was applied and the
significance of diagnostic accuracy difference of 2D vs tomosynthesis was calculated, as a function of experience and for
each experiment. Recall rate (RR) on malignant and benign cases was also calculated, along with reading time.
Results: No significant difference was reached between 2D and 2-view tomosynthesis for experienced readers (p-value =
0.25); for less experienced readers the p-value was significant (0.03). No significant difference was found
between 2D and 1-view tomosynthesis, independent of readers' experience. RR for benign cases decreased for
tomosynthesis (for both 1- and 2-view), independent of experience. Average reading time per case was 79 s (range 65-
91 s) and 134 s (range 119-158 s) for experienced readers; 56 s (range 46-67 s) and 115 s (range 97-142 s) for non-experienced,
for 2D and 2-view tomosynthesis respectively. Reading time was 74 s (range 43-98 s) and 99 s (range 73-
117 s) for experienced readers; 74 s (range 62-85 s) and 94 s (range 82-137 s) for non-experienced, for 2D and 1-view
tomosynthesis respectively.
Conclusions: For experienced readers, there is no evidence of improved diagnostic accuracy when using 2-view or 1-
view tomosynthesis, while less experienced readers perform better with 2-view tomosynthesis than 2D images.
Tomosynthesis reduces the number of recalls of benign cases, without hindering cancer detection.
The purpose of this study was to measure how mammography readers' performance varies with time of day and time
spent reading. This was investigated in screening practice and when reading an enriched case set. In screening practice,
records of the time and date that each case was read, along with the outcome (whether the woman was recalled for further tests,
and biopsy results where performed), were extracted from the records of one breast screening centre in the UK (4 readers).
Patterns of performance with time spent reading were also measured using an enriched test set (160 cases, 41% malignant,
read three times by eight radiologists). Recall rates varied with time of day, with different patterns for each reader. Recall
rates decreased as the reading session progressed both when reading the enriched test set and in screening practice.
Further work is needed to expand this work to a greater number of breast screening centres, and to determine whether
these patterns of performance over time can be used to optimize overall performance.
Receiver Operating Characteristic analysis provides a reliable and cost effective performance measurement tool, without
using full clinical trials. However, when ROC analysis shows that performance is statistically superior in one condition
than another it is difficult to relate this result to effects in practice, or even to determine whether it is clinically
significant. In this paper we present two concurrent analyses: using ROC methods alongside single threshold recall rate
data, and suggest that reporting both provides complementary data. Four mammographers read 160 difficult cases (41%
malignant) twice, with and without prior mammograms. Lesion location and probability of malignancy were reported for
each case and analyzed using JAFROC. Concurrently each participant chose recall or return to screen for each case.
JAFROC analysis showed that the presence of prior mammograms improved performance (p<.05). Single threshold data
showed a trend towards a 26% increase in the number of false positive recalls without prior mammograms (p=.056). If
this trend were present throughout the NHS Breast Screening Programme then discarding prior mammograms would
correspond to an increase in recall rate from 4.6% to 5.3%, and 12,414 extra women recalled annually for assessment.
Whilst ROC methods account for all possible thresholds of recall and have higher power, providing a single threshold
example of false positive, false negative, and recall rates when reporting results could be more influential for clinicians.
This paper discusses whether this is a useful additional method of presenting data, or whether it is misleading and
inaccurate.
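The contrast between a threshold-free measure and a single-threshold summary can be illustrated with a small sketch. It computes a plain empirical ROC area (the study itself used JAFROC, which also accounts for lesion location) alongside sensitivity, false positive rate and recall rate at one chosen threshold. The function names, the rating scale and the example ratings are assumptions made for illustration.

```python
def empirical_auc(normal_ratings, cancer_ratings):
    """Mann-Whitney estimate of the area under the ROC curve."""
    wins = 0.0
    for c in cancer_ratings:
        for n in normal_ratings:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cancer_ratings) * len(normal_ratings))

def single_threshold_rates(normal_ratings, cancer_ratings, threshold):
    """Recall a case when its rating is at or above `threshold`.

    Returns (sensitivity, false_positive_rate, recall_rate).
    """
    tp = sum(r >= threshold for r in cancer_ratings)
    fp = sum(r >= threshold for r in normal_ratings)
    total = len(cancer_ratings) + len(normal_ratings)
    return (tp / len(cancer_ratings),
            fp / len(normal_ratings),
            (tp + fp) / total)

if __name__ == "__main__":
    # Hypothetical case-level ratings on a 1-5 scale.
    normals = [1, 2, 2, 3]
    cancers = [2, 4, 5]
    print(empirical_auc(normals, cancers))
    print(single_threshold_rates(normals, cancers, threshold=3))
```

The area summarises every possible threshold at once, while the second function reports what happens at the one operating point a reader actually uses, which is the quantity that translates into numbers of women recalled.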
After the introduction of digital mammography the film mammograms from the previous screening round (the prior
mammograms) can be displayed in a variety of ways. This paper investigates the performance of radiologists reading
digital screening mammograms with the prior mammograms displayed either as film or in digitised format. A set of 162
cases was assembled, each with two-view digital mammograms and two-view film prior mammograms. Of these cases 66
were malignant as proven by biopsy, and the others were normal or benign. The film prior mammograms were digitised
at 75 μm. Eight participants, with four to seventeen years' experience of reading screening mammograms, each read the
mammograms twice; once with the digitised prior mammograms displayed on the digital workstation, and once with the
film prior mammograms displayed on an adjacent multi-viewer. The two viewings were at least one month apart.
Participants marked the location of abnormalities on a paper copy of the mammograms and rated the probability of
malignancy of each abnormality. Participants were video-taped whilst reading the cases to enable analysis of gross eye
movements for information regarding the level of use of the prior mammograms. JAFROC analysis showed no
difference in performance between the conditions.
In the UK Breast Screening Programme there is a growing transition from film to digital mammography, and
consequently a change in mammography workstation ergonomics. This paper investigates the effect of the change for
radiologists, including their comfort, likelihood of developing musculoskeletal disorders (MSDs), and work practices.
Three workstation types were investigated: one with all film mammograms; one with digital mammograms alongside
film mammograms from the previous screening round, and one with digital mammograms alongside digitised film
mammograms from the previous screening round. Mammographers were video-taped whilst conducting work sessions at
each of the workstations. Event based Rapid Upper Limb Assessment (RULA) postural analysis showed no overall
increase in MSD risk level in the switch from the film to digital workstation. Average number of visual glances at the
prior mammograms per case measured by analysis of recorded video footage showed an increase if the prior
mammograms were digitised, rather than displayed on a multi-viewer (p<.05). This finding has potential implications for
mammographer performance in the transition to digital mammography in the UK.