Given the rapid rise in breast cancer incidence in China and the lack of calibrated breast cancer prediction models for the Chinese female population, a risk model developed specifically for Chinese women is needed. This study aimed to develop such a breast cancer risk prediction model. A total of 1079 women (85 cases with images contralateral to a cancer and 994 cases without breast cancer) were recruited from Fudan University Shanghai Cancer Centre. For each case, we collected sixteen demographic variables, including age, BMI, number of children, family history of breast cancer, and age at menarche. The dense tissue was segmented automatically by AutoDensity, and a set of quantitative features was extracted from the dense area. The segmented area was then thresholded again at the 80th percentile of the intensity values within the dense area, and a second set of computer-extracted features was calculated. The demographic variables and the texture features extracted from the mammographically dense areas of the image were fed into an ensemble of 250 decision trees, whose results were combined using RUSBoost. The classifier achieved an AUC of 0.88 (CI: 0.84-0.91) for identifying high-risk images. Adopting such a model might therefore improve the discriminatory power of currently used risk prediction models. However, the cancer cases were retrieved from a diagnostic (not screening) environment, and further validation on a dataset from a screening setting will be required.
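The second thresholding step can be illustrated with a minimal sketch. It is not the study's implementation: the function names `percentile` and `rethreshold_dense_area`, the linear-interpolation percentile convention, and the choice to keep pixels at or above the cut-off (rather than below it) are all assumptions made for illustration, and the dense-area mask is taken as given rather than produced by AutoDensity.

```python
import math


def percentile(values, pct):
    """Percentile with linear interpolation between ranks (an assumed convention)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("cannot take a percentile of an empty sequence")
    rank = (len(ordered) - 1) * pct / 100.0
    lo, hi = math.floor(rank), math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def rethreshold_dense_area(image, dense_mask, pct=80.0):
    """Refine a dense-tissue mask using a percentile of its own intensities.

    image      : 2-D list of pixel intensities
    dense_mask : 2-D list of truthy/falsy flags, same shape as image
                 (hypothetical stand-in for the first segmentation)
    Returns (refined_mask, cutoff). Keeping pixels >= cutoff is an assumption;
    the source only states that the area was thresholded at the 80th percentile.
    """
    # Collect intensities of pixels that lie inside the first segmentation.
    dense_values = [
        px
        for row, mask_row in zip(image, dense_mask)
        for px, inside in zip(row, mask_row)
        if inside
    ]
    cutoff = percentile(dense_values, pct)
    # A pixel survives only if it was dense before and is at least as bright as the cut-off.
    refined = [
        [bool(inside) and px >= cutoff for px, inside in zip(row, mask_row)]
        for row, mask_row in zip(image, dense_mask)
    ]
    return refined, cutoff
```

Calling `rethreshold_dense_area(image, dense_mask)` returns the refined mask together with the intensity cut-off, so the second set of features could be computed over the refined mask only.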