Histopathology involves the analysis of tissue samples to diagnose several diseases, such as cancer. This analysis is a time-consuming procedure performed manually by medical experts, namely pathologists. Computational pathology aims to develop automatic methods to analyze Whole Slide Images (WSIs), which are digitized histopathology images, and such methods have shown accurate performance in image analysis. Although the number of available WSIs is increasing, the capacity of medical experts to analyze samples manually is not expanding proportionally. This paper presents a fully automatic pipeline to classify lung cancer WSIs into four classes: Small Cell Lung Cancer (SCLC), non-small cell lung cancer divided into LUng ADenocarcinoma (LUAD) and LUng Squamous cell Carcinoma (LUSC), and normal tissue. The pipeline includes a self-supervised algorithm for pre-training the model and Multiple Instance Learning (MIL) for WSI classification. The model is trained with 2,226 WSIs and obtains an AUC of 0.8558 ± 0.0051 and a weighted F1-score of 0.6537 ± 0.0237 for the 4-class classification on the test set. The capability of the model to generalize was evaluated by testing it on LUAD versus LUSC classification using the public dataset from The Cancer Genome Atlas (TCGA). In this task, the model obtained an AUC of 0.9433 ± 0.0198 and a weighted F1-score of 0.7726 ± 0.0438.
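To illustrate the MIL idea mentioned above, the following is a minimal sketch of how per-patch class scores can be pooled into a single slide-level prediction. It assumes simple mean pooling, which the abstract does not specify; the function names `mil_mean_pool` and `predict_slide` and the example scores are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

# The four classes named in the abstract.
CLASSES = ["SCLC", "LUAD", "LUSC", "normal"]


def mil_mean_pool(patch_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average per-patch class scores into one slide-level score vector.

    Mean pooling is an assumption made for this sketch; the paper may use a
    different aggregation (for example attention-based pooling).
    """
    if not patch_scores:
        raise ValueError("a bag must contain at least one patch")
    n_classes = len(patch_scores[0])
    totals = [0.0] * n_classes
    for scores in patch_scores:
        if len(scores) != n_classes:
            raise ValueError("all patches must have the same number of scores")
        for i, value in enumerate(scores):
            totals[i] += value
    return [total / len(patch_scores) for total in totals]


def predict_slide(patch_scores: Sequence[Sequence[float]]) -> str:
    """Return the class with the highest pooled score for the slide (bag)."""
    pooled = mil_mean_pool(patch_scores)
    best_index = max(range(len(pooled)), key=pooled.__getitem__)
    return CLASSES[best_index]


# Hypothetical scores for two patches of one slide.
bag = [[0.1, 0.7, 0.1, 0.1], [0.2, 0.5, 0.2, 0.1]]
print(predict_slide(bag))
```

In a MIL setting only the slide (the bag) carries a label, so the pooling step is what connects patch-level outputs to the slide-level target during training.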
With a prevalence of 1-2%, Celiac Disease (CD) is one of the most commonly known genetic and autoimmune diseases; it is induced by the intake of gluten in genetically predisposed persons. Diagnosing CD involves the analysis of duodenum biopsies to determine the condition of the small intestine. In this study, we propose a single-scale pipeline and the combination of two single-scale pipelines, forming a multi-scale approach, to accurately classify CD signs in histopathology whole slide images with automatically generated labels. The automatic classification of CD signs in histopathological images of these biopsies has not been extensively studied, resulting in the absence of standardized guidelines or best practices for this purpose. To fill this gap, we evaluated different magnifications and architectures, including a pre-trained MoCov2 model, for both single- and multi-scale approaches. Furthermore, for the multi-scale approach, methods for aggregating feature vectors from several magnifications are explored. For the single-scale pipeline we achieved an AUC of 0.9975 and a weighted F1-score of 0.9680, while for the multi-scale pipeline an AUC of 0.9966 and a weighted F1-score of 0.9250 were achieved. On large datasets, no significant differences were observed; however, with only 10% of the dataset, the multi-scale framework significantly outperforms the single-scale framework. Moreover, the multi-scale approach requires only half of the dataset and half of the time compared to the best single-scale result to identify the optimal model. In conclusion, the multi-scale framework emerges as an exceptionally efficient solution, capable of delivering superior results with minimal data and resource demands.
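As a rough illustration of what aggregating feature vectors from several magnifications can look like, the sketch below shows two common options: concatenation and element-wise averaging. The abstract does not state which aggregation methods were compared, so both functions and their names (`concat_features`, `average_features`) are assumptions made for this example.

```python
from typing import List, Sequence


def concat_features(low_mag: Sequence[float], high_mag: Sequence[float]) -> List[float]:
    """Join the two feature vectors end to end (output length is the sum)."""
    return list(low_mag) + list(high_mag)


def average_features(low_mag: Sequence[float], high_mag: Sequence[float]) -> List[float]:
    """Average the two feature vectors element by element (same length required)."""
    if len(low_mag) != len(high_mag):
        raise ValueError("feature vectors must have the same length to be averaged")
    return [(a + b) / 2 for a, b in zip(low_mag, high_mag)]


# Hypothetical 2-dimensional features of the same region at two magnifications.
low = [1.0, 2.0]
high = [3.0, 4.0]
print(concat_features(low, high))
print(average_features(low, high))
```

Concatenation keeps the information from each magnification separate at the cost of a larger input to the classifier, whereas averaging keeps the dimensionality fixed but requires both encoders to produce vectors of the same size.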