7 June 2023 Classification of imbalanced bioassay data with features learned using stacked autoencoder
Jeni Shah, Manjunath Joshi
Proceedings Volume 12701, Fifteenth International Conference on Machine Vision (ICMV 2022); 127011P (2023) https://doi.org/10.1117/12.2679627
Event: Fifteenth International Conference on Machine Vision (ICMV 2022), 2022, Rome, Italy
Abstract
Bioassay data classification is an important task in drug discovery. However, the data used for classification are highly imbalanced, which leads to inaccurate classification of the minority class. We propose a novel classification approach in which separate models are trained on different features derived by training stacked autoencoders (SAE). Experiments are performed on 7 bioassay datasets, in which each data file contains feature descriptors for every compound along with a class label marking the compound as active or inactive. We first clean the data using the borderline synthetic minority oversampling technique (SMOTE) followed by removal of Tomek links, and then learn different features hierarchically, based on the cleaned data or feature vectors. We then train separate cost-sensitive feed-forward neural network (FNN) classifiers on the hierarchical features to obtain the final classification. To increase the True Positive Rate (TPR), a test sample is labeled as active if at least one classifier predicts it as active. In this paper, we demonstrate that data cleaning combined with learning separate classifiers improves the TPR and F1 score compared with other machine learning approaches. To the best of our knowledge, SAE and FNN have not previously been used to classify bioassay data.
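The decision rule described in the abstract (a sample is labeled active if at least one classifier predicts it as active) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `or_vote` and `tpr_and_f1`, the three classifiers, and all labels and predictions below are hypothetical and chosen only to show how the rule and the two reported metrics are computed.

```python
from typing import List, Sequence, Tuple


def or_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-classifier 0/1 predictions with a logical OR.

    A sample is labeled active (1) if at least one classifier
    predicts it as active; otherwise it is labeled inactive (0).
    """
    return [int(any(votes)) for votes in zip(*predictions)]


def tpr_and_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (true positive rate, F1 score) for the active class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = 2 * precision * tpr / (precision + tpr) if (precision + tpr) else 0.0
    return tpr, f1


# Hypothetical ground truth: 3 active compounds, 5 inactive ones.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]

# Hypothetical outputs of three classifiers, each imagined as trained
# on a different level of learned features.
clf_a = [1, 0, 0, 0, 0, 0, 0, 0]
clf_b = [0, 1, 0, 0, 1, 0, 0, 0]
clf_c = [0, 0, 0, 0, 0, 0, 0, 0]

combined = or_vote([clf_a, clf_b, clf_c])
single_tpr, single_f1 = tpr_and_f1(y_true, clf_a)
combined_tpr, combined_f1 = tpr_and_f1(y_true, combined)

print("combined labels:", combined)
print("classifier A alone: TPR=%.3f F1=%.3f" % (single_tpr, single_f1))
print("OR-combined:        TPR=%.3f F1=%.3f" % (combined_tpr, combined_f1))
```

Because the OR rule never turns an active prediction into an inactive one, the combined TPR cannot be lower than that of any individual classifier. The trade-off is that false positives from every classifier are also passed through, so precision can drop.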
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Jeni Shah and Manjunath Joshi "Classification of imbalanced bioassay data with features learned using stacked autoencoder", Proc. SPIE 12701, Fifteenth International Conference on Machine Vision (ICMV 2022), 127011P (7 June 2023); https://doi.org/10.1117/12.2679627
KEYWORDS
Machine learning
Feature extraction
Deep learning
Drug discovery
Neural networks