Open Access
22 November 2019
Evaluation of data augmentation via synthetic images for improved breast mass detection on mammograms using deep learning
Funded by: Food and Drug Administration
Abstract

We evaluated whether using synthetic mammograms for training data augmentation can reduce the effects of overfitting and increase the performance of a deep learning algorithm for breast mass detection. Synthetic mammograms were generated using in silico procedural analytic breast and breast mass modeling algorithms, followed by simulated x-ray projections of the breast models into mammographic images. In silico breast phantoms containing masses were modeled across the four BI-RADS breast density categories, and the masses were modeled with different sizes, shapes, and margins. A Monte Carlo-based x-ray transport simulation code, MC-GPU, was used to project the three-dimensional phantoms into realistic synthetic mammograms. We generated 2000 synthetic mammograms containing 2522 masses to augment a real data set during training. From the Curated Breast Imaging Subset of the Digital Database for Screening Mammography (CBIS-DDSM) data set, we used 1111 mammograms (1198 masses) for training, 120 mammograms (120 masses) for validation, and 361 mammograms (378 masses) for testing. Our deep learning network was Faster R-CNN with a ResNet-101 architecture pretrained on ImageNet. We compared the detection performance when the network was trained using different percentages of the real CBIS-DDSM training set (100%, 50%, and 25%), and when each of these subsets was augmented with 250, 500, 1000, and 2000 synthetic mammograms. Free-response receiver operating characteristic (FROC) analysis was performed to compare performance with and without the synthetic mammograms. We generally observed an improved test FROC curve when training with the synthetic images compared to training without them, and the amount of improvement depended on the number of real and synthetic images used in training. Our study shows that enlarging the training data with synthetic samples can increase the performance of deep learning systems.
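The training conditions described in the abstract form a small grid: three real-data subsets, each trained without augmentation and with four synthetic set sizes. The following Python sketch enumerates that grid for illustration only. The function name `training_configurations`, the dictionary keys, and the floor rounding of subset sizes are assumptions made here; the abstract does not state the exact subset sizes or how the subsets were drawn.

```python
from itertools import product

# Figures taken from the abstract.
N_REAL_TRAIN = 1111                            # CBIS-DDSM training mammograms
REAL_PERCENTAGES = (100, 50, 25)               # share of the real training set used
SYNTHETIC_COUNTS = (0, 250, 500, 1000, 2000)   # 0 = baseline without augmentation


def training_configurations(n_real=N_REAL_TRAIN):
    """Enumerate every (real subset, synthetic count) training condition.

    Hypothetical helper: subset sizes are rounded down, which is an
    assumption; the abstract gives only the percentages.
    """
    configs = []
    for pct, n_synth in product(REAL_PERCENTAGES, SYNTHETIC_COUNTS):
        n_subset = n_real * pct // 100  # assumed floor rounding
        total = n_subset + n_synth
        configs.append({
            "real_percent": pct,
            "n_real": n_subset,
            "n_synthetic": n_synth,
            "n_total": total,
            # Fraction of the training images that are synthetic.
            "synthetic_share": n_synth / total,
        })
    return configs


if __name__ == "__main__":
    for cfg in training_configurations():
        print(
            f"{cfg['real_percent']:>3}% real ({cfg['n_real']:>4}) "
            f"+ {cfg['n_synthetic']:>4} synthetic "
            f"= {cfg['n_total']:>4} images, "
            f"{cfg['synthetic_share']:.0%} synthetic"
        )
```

Listing the conditions this way makes the changing ratio of synthetic to real images explicit, which is relevant because the abstract reports that the improvement depended on the number of real and synthetic images used in training.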

© 2019 Society of Photo-Optical Instrumentation Engineers (SPIE) 2329-4302/2019/$28.00
Kenny H. Cha, Nicholas A. Petrick, Aria X. Pezeshk, Christian G. Graff, Diksha Sharma, Andreu Badal, and Berkman Sahiner, "Evaluation of data augmentation via synthetic images for improved breast mass detection on mammograms using deep learning," Journal of Medical Imaging 7(1), 012703 (22 November 2019). https://doi.org/10.1117/1.JMI.7.1.012703
Received: 26 June 2019; Accepted: 4 September 2019; Published: 22 November 2019
CITATIONS
Cited by 27 scholarly publications.
KEYWORDS
Mammography

Breast

Digital mammography

3D modeling

Data modeling

Monte Carlo methods

Detection and tracking algorithms
