Learning from unbalanced data: A cascade-based approach for detecting clustered microcalcifications

A. Bria, N. Karssemeijer and F. Tortorella

Medical Image Analysis 2013;18(2):241-252



Finding abnormalities in diagnostic images is a difficult task even for expert radiologists, because normal tissue locations largely outnumber those with suspicious signs, which may thus be missed or incorrectly interpreted. For the same reason, the design of a Computer-Aided Detection (CADe) system is very complex: the large predominance of normal samples in the training data may hamper the ability of the classifier to recognize abnormalities in the images. In this paper we present a novel approach for computer-aided detection that addresses the class imbalance with a cascade of boosting classifiers, where each node is trained by a learning algorithm based on ranking instead of classification error. This approach is used to design a system (CasCADe) for the automated detection of clustered microcalcifications (μCs), which is a severely unbalanced classification problem because no μC is present at the vast majority of image locations. The proposed approach was evaluated on a dataset of 1599 full-field digital mammograms from 560 cases and compared favorably with the Hologic R2CAD ImageChecker, one of the most widespread commercial CADe systems. In particular, at the same lesion sensitivity as R2CAD (90%) on biopsy-proven malignant cases, CasCADe and R2CAD detected 0.13 and 0.21 false positives per image (FPpi), respectively (p-value=0.09), whereas at the same FPpi as R2CAD (0.21), CasCADe and R2CAD detected 93% and 90% of true lesions, respectively (p-value=0.11), thus showing that CasCADe can compete with high-end commercial CADe systems.
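
The general idea of a classifier cascade can be illustrated with a minimal sketch. It is not the CasCADe implementation, and it does not reproduce the ranking-based boosting used to train each node. It only shows, under simplifying assumptions, how a cascade rejects a sample at the first node it fails, so that most normal locations are discarded early, and how a node threshold might be set so that a target fraction of positive samples is retained. The names `cascade_predict` and `threshold_for_sensitivity`, the toy score functions, and the thresholds are all hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

# A stage is a (score function, threshold) pair; both are illustrative assumptions.
Stage = Tuple[Callable[[Sequence[float]], float], float]


def cascade_predict(stages: List[Stage], x: Sequence[float]) -> bool:
    """Return True only if x passes every stage; reject at the first failure."""
    for score, threshold in stages:
        if score(x) < threshold:
            return False  # early rejection: later stages never see this sample
    return True


def threshold_for_sensitivity(pos_scores: Sequence[float], target: float) -> float:
    """Largest threshold that still keeps at least `target` of the positives."""
    ranked = sorted(pos_scores, reverse=True)
    keep = max(1, math.ceil(target * len(ranked)))
    return ranked[keep - 1]


# Toy two-stage cascade: each stage looks at one feature of the sample.
stages: List[Stage] = [
    (lambda x: x[0], 0.5),
    (lambda x: x[1], 0.5),
]

accepted = cascade_predict(stages, [0.9, 0.9])       # passes both stages
rejected_early = cascade_predict(stages, [0.1, 0.9])  # fails the first stage
rejected_late = cascade_predict(stages, [0.9, 0.1])   # fails the second stage

# Keep at least 75% of four positive samples with the scores below.
t = threshold_for_sensitivity([0.9, 0.8, 0.7, 0.2], 0.75)
```

In such a scheme, the negative samples that survive one stage would typically form the negative training set for the next, so each stage faces a less unbalanced problem than the one before it.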