Classification of imbalanced bioassay data with features learned using stacked autoencoder

Jeni Shah; Manjunath Joshi

doi:10.1117/12.2679627

Published: 7 June 2023

Proceedings Volume 12701, Fifteenth International Conference on Machine Vision (ICMV 2022); 127011P (2023) https://doi.org/10.1117/12.2679627
Event: Fifteenth International Conference on Machine Vision (ICMV 2022), 2022, Rome, Italy

Abstract

Bioassay data classification is an important task in drug discovery. However, the data used for classification are highly imbalanced, which leads to poor classification accuracy on the minority class. We propose a novel classification approach in which separate models are trained on different features derived by training stacked autoencoders (SAE). Experiments are performed on 7 bioassay datasets, in which each data file contains feature descriptors for every compound along with a class label marking the compound as active or inactive. We first clean the data using the borderline synthetic minority oversampling technique (SMOTE) followed by removal of Tomek links, and then learn different features hierarchically from the cleaned data, i.e., the feature vectors. We then train separate cost-sensitive feed-forward neural network (FNN) classifiers on the hierarchical features to obtain the final classification. To increase the true positive rate (TPR), a test sample is labeled as active if at least one classifier predicts it as active. We demonstrate that data cleaning combined with separately trained classifiers improves the TPR and F1 score compared with other machine learning approaches. To the best of our knowledge, SAE and FNN have not previously been used to classify bioassay data.
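
The decision rule stated in the abstract (a test sample is labeled active if at least one classifier predicts it as active) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `or_vote` and `tpr_and_f1` are hypothetical, and the toy labels and predictions are invented for illustration, not taken from the paper's datasets or results.

```python
from typing import List, Sequence, Tuple


def or_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-classifier 0/1 predictions with a logical OR.

    predictions[k][i] is classifier k's label for sample i
    (1 = active, 0 = inactive). A sample is labeled active
    if at least one classifier predicts it as active.
    """
    return [int(any(votes)) for votes in zip(*predictions)]


def tpr_and_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (true positive rate, F1 score) for the active class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = 2 * precision * tpr / (precision + tpr) if (precision + tpr) else 0.0
    return tpr, f1


# Toy, made-up data: three actives followed by three inactives.
y_true = [1, 1, 1, 0, 0, 0]

# Made-up outputs of three classifiers, standing in for FNNs
# trained on different levels of SAE features.
clf_a = [1, 0, 0, 0, 0, 0]
clf_b = [0, 1, 0, 0, 1, 0]
clf_c = [0, 0, 0, 0, 0, 0]

combined = or_vote([clf_a, clf_b, clf_c])
tpr_single, f1_single = tpr_and_f1(y_true, clf_a)
tpr_combined, f1_combined = tpr_and_f1(y_true, combined)
```

In this toy case the OR rule recovers an active compound that `clf_a` alone misses, so the TPR rises. The rule can also admit false positives (here, the fifth sample), which is why the F1 score is worth tracking alongside the TPR.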

Citation

Jeni Shah and Manjunath Joshi "Classification of imbalanced bioassay data with features learned using stacked autoencoder", Proc. SPIE 12701, Fifteenth International Conference on Machine Vision (ICMV 2022), 127011P (7 June 2023); https://doi.org/10.1117/12.2679627
