17 March 2008 Learning from imbalanced data: a comparative study for colon CAD
Abstract
Classification plays an important role in the reduction of false positives in many computer-aided detection and diagnosis (CAD) methods. The difficulty of classifying polyps lies in the variation of possible polyp shapes and sizes and in the imbalance between the number of polyp and non-polyp regions available in the training data. CAD schemes for medical applications demand high levels of sensitivity, even at the expense of keeping a certain number of false positives. In this paper, we investigate two state-of-the-art solutions to the imbalanced data problem: the Synthetic Minority Over-sampling Technique (SMOTE) and weighted Support Vector Machines (SVMs). We tested these methods on a diverse database of CT colonography cases, which included a wide spectrum of cases in which polyps are difficult to detect. We performed several experiments with different combinations of over-sampling techniques on the training data. The results demonstrate that SVMs achieved much better performance than C4.5 across the different over-sampling techniques. The results also show that a weighted SVM without over-sampling can achieve sensitivity and specificity comparable to those of a conventional SVM combined with over-sampling.
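The two remedies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no parameters, so the feature values, the neighbour count `k`, the random choice of base sample, and the inverse-frequency weighting rule below are all assumptions chosen for illustration. The function names `smote` and `balanced_class_weights` are hypothetical. The first function creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours, which is the core idea of SMOTE; the second computes per-class misclassification weights of the kind a weighted SVM could use in place of over-sampling.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Simplified SMOTE: interpolate between minority samples and
    their k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        # Assumption: the base sample is drawn at random; the original
        # algorithm loops over every minority sample instead.
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base sample (excluding itself).
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balanced_class_weights(labels):
    """Assumed weighting rule: n_samples / (n_classes * class_count)."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n = len(labels)
    return {y: n / (len(counts) * c) for y, c in counts.items()}


# Toy data with made-up 2-D features: 3 polyp vs 9 non-polyp candidates.
polyps = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1]]
non_polyps = [[4.0 + 0.1 * i, 5.0 - 0.1 * i] for i in range(9)]

# Option 1: over-sample the minority class until the classes are balanced.
new_polyps = smote(polyps, n_synthetic=6, k=2)

# Option 2: keep the data as-is and penalise minority errors more heavily.
labels = [1] * len(polyps) + [0] * len(non_polyps)
weights = balanced_class_weights(labels)

print(len(polyps) + len(new_polyps), len(non_polyps))
print(weights)
```

In practice the synthetic samples would be added only to the training set, and the class weights would be passed to an SVM implementation that supports per-class penalties; both choices are left open here because the abstract does not specify them.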
© (2008) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Xiaoyun Yang, Yalin Zheng, Musib Siddique, and Gareth Beddoe "Learning from imbalanced data: a comparative study for colon CAD", Proc. SPIE 6915, Medical Imaging 2008: Computer-Aided Diagnosis, 69150R (17 March 2008); https://doi.org/10.1117/12.770630
KEYWORDS
Colon
Data modeling
Computer aided diagnosis and therapy
Colorectal cancer
Computer aided design
Computed tomography
Image segmentation