Data-centric optimization method to imbalanced datasets
10 April 2023
Vladislav Tanov, Ivan Ivanov
Proceedings Volume 12616, International Conference on Mathematical and Statistical Physics, Computational Science, Education, and Communication (ICMSCE 2022); 1261602 (2023) https://doi.org/10.1117/12.2674455
Event: International Conference on Mathematical and Statistical Physics, Computational Science, Education, and Communication (ICMSCE 2022), 2022, Istanbul, Turkey
Abstract
Imbalanced datasets are an important subject of investigation in data science and machine learning. This paper focuses on providing an effective algorithm for such cases, in which the observations are too few to construct a machine learning model that achieves high values of the estimated performance metrics. For example, most of the observations are labeled as one class (the majority class), and the rest as the other, which is commonly the class of interest (the minority class). We propose a methodology based on the properties of the data resampling approach to systematically enhance and optimize the training dataset. The paper reports numerical experiments that apply the data-centric optimization methodology and compares the results with those previously obtained by other authors.
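The abstract does not specify the resampling procedure, so the following is only a minimal sketch of one common resampling technique, random oversampling of the minority class, and not the authors' algorithm. The function name `random_oversample`, the toy data, and the fixed seed are assumptions made for illustration.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest one (illustrative helper,
    not taken from the paper)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())  # size of the majority class
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        # Rows of the original data that belong to this class.
        rows = [x for x, lab in zip(X, y) if lab == label]
        # Append sampled copies (with replacement) until the class is balanced.
        for _ in range(target - n):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


# Toy training set: 8 majority observations (0) and 2 minority observations (1).
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

In a sketch like this, resampling would normally be applied to the training split only, so that the evaluation data keeps the original class distribution.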
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Vladislav Tanov and Ivan Ivanov "Data-centric optimization method to imbalanced datasets", Proc. SPIE 12616, International Conference on Mathematical and Statistical Physics, Computational Science, Education, and Communication (ICMSCE 2022), 1261602 (10 April 2023); https://doi.org/10.1117/12.2674455
KEYWORDS
Data modeling
Feature selection
Mathematical optimization
Education and training
Heart
Machine learning
Random forests
