Evaluating the effectiveness of supervised learning models for antibiotic pollution detection from biochip data

Ruben Ng; Paul Craig

doi:10.1117/12.3017922

8 December 2023


Proceedings Volume 12943, International Workshop on Signal Processing and Machine Learning (WSPML 2023); 1294315 (2023) https://doi.org/10.1117/12.3017922
Event: International Workshop on Signal Processing and Machine Learning (WSPML 2023), 2023, Hangzhou, ZJ, China

Abstract

A biochip is an array of biosensor spots arranged on a durable substrate that can be used to detect and differentiate between biochemical analytes. This paper examines how effectively different supervised learning models detect analytes from biochip spot patterns, using antibiotic pollution detection as a case study, with models built from RGB values extracted from a chip with sixteen spots. We evaluate the performance and accuracy of four types of model, Decision Trees, Random Forest, Naïve Bayes and Neural Networks, by analysing metrics such as processing time, Root Mean Square Error (RMSE) and Mean Absolute Error (MAE). Our analysis shows that the models have different strengths and weaknesses for reading biochip data. Decision Trees and Naïve Bayes have the advantage of being explainable, so that biologists can understand which particular spot values lead to a given classification, although they are significantly less accurate than the other methods. Random Forest and Neural Networks have the advantage of high accuracy but act like a black box, so biologists have little indication of which spot patterns lead to a particular classification, or of how much a reading relies on a small change in value or a small number of spots. This information is important for assessing the reliability of a chip reading and determining whether further tests are required or subsequent action can be taken, and it helps chip designers determine whether the design of their chips needs to be improved. We also found that Random Forest classifiers have significantly better computational performance than Neural Networks, which makes them suitable for use in interfaces that allow users to re-run classifications to determine how changes in spot values change the classification.
Ultimately, the accuracy and computational performance of Random Forest classifiers would make them the preferred option (used with interfaces that can show, and allow us to test, different values) for biochips of the type described in this paper.
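
As a rough illustration of the kind of evaluation summarised in the abstract, the sketch below times a model, scores it with RMSE and MAE, and re-runs a classification after changing one spot value. Everything in it is an assumption made for illustration rather than the authors' method: the function names, the toy readings (16 spots x 3 RGB channels = 48 values), the 0/1 label encoding (clean/polluted), and the `NearestCentroid` class, which is a hypothetical stand-in and not one of the four model types studied in the paper.

```python
import math
import time


def rmse(y_true, y_pred):
    """Root Mean Square Error between numeric labels."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean Absolute Error between numeric labels."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


class NearestCentroid:
    """Hypothetical stand-in classifier (NOT one of the paper's four models)."""

    def fit(self, X, y):
        sums, counts = {}, {}
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for i, value in enumerate(row):
                acc[i] += value
            counts[label] = counts.get(label, 0) + 1
        # One mean RGB vector per class.
        self.centroids = {
            label: [v / counts[label] for v in acc] for label, acc in sums.items()
        }
        return self

    def predict(self, X):
        # Assign each reading to the class with the closest centroid.
        return [
            min(self.centroids, key=lambda lab: math.dist(row, self.centroids[lab]))
            for row in X
        ]


def evaluate(model, X_train, y_train, X_test, y_test):
    """Fit, predict, and report processing time plus RMSE and MAE."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    elapsed = time.perf_counter() - start
    return {
        "seconds": elapsed,
        "rmse": rmse(y_test, predictions),
        "mae": mae(y_test, predictions),
    }


def rerun_with_change(model, reading, index, new_value):
    """Classify a reading before and after changing one channel value."""
    changed = list(reading)
    changed[index] = new_value
    return model.predict([reading])[0], model.predict([changed])[0]


# Toy data (assumed): each reading is 16 spots x 3 RGB channels = 48 values.
X_train = [[200.0] * 48, [190.0] * 48, [90.0] * 48, [100.0] * 48]
y_train = [0, 0, 1, 1]  # 0 = clean, 1 = polluted (assumed encoding)
X_test = [[195.0] * 48, [95.0] * 48]
y_test = [0, 1]

model = NearestCentroid()
report = evaluate(model, X_train, y_train, X_test, y_test)
before, after = rerun_with_change(model, X_test[0], index=0, new_value=0.0)
print(report, before, after)
```

Any object exposing `fit` and `predict` in this shape could be passed to `evaluate`, so the same harness could in principle compare several model types on the same readings, and `rerun_with_change` hints at how a fast classifier supports interactive what-if checks on individual spot values.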

(2023) Published by SPIE. Downloading of the abstract is permitted for personal use only.

Citation

Ruben Ng and Paul Craig "Evaluating the effectiveness of supervised learning models for antibiotic pollution detection from biochip data", Proc. SPIE 12943, International Workshop on Signal Processing and Machine Learning (WSPML 2023), 1294315 (8 December 2023); https://doi.org/10.1117/12.3017922
