Presentation + Paper
7 June 2024
Semisupervised learning with data augmentation for raw network traffic detection
Robin C. Bhoo, Nathaniel D. Bastian
Abstract
Deep learning (DL) has revolutionized machine learning tasks in various domains, but conventional DL methods often demand substantial amounts of labeled data. Semi-supervised learning (SSL) provides an effective solution by incorporating unlabeled data, offering significant advantages in terms of cost and data accessibility. While DL has shown promise as a component of modern network intrusion detection systems (NIDS), the majority of research in this field focuses on fully supervised learning. Moreover, more recent SSL algorithms that leverage data augmentations do not perform optimally “out of the box” because suitable augmentation schemes for packet-level network traffic data are lacking. By introducing a novel data augmentation scheme tailored to packet-level network traffic datasets, this paper presents a comprehensive analysis of multiple SSL algorithms for multi-class network traffic detection in a few-shot learning scenario. We find that even relatively simple approaches like vanilla pseudo-labeling can achieve an F1-Score within 5% of fully supervised learning methods while using less than 2% of the labeled data.
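To make the idea of vanilla pseudo-labeling concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes a toy nearest-centroid classifier and two-dimensional synthetic features standing in for packet-derived features; the function names (`fit_centroids`, `predict`, `pseudo_label_fit`), the confidence threshold, and the number of rounds are all hypothetical. The paper's augmentation scheme is not reproduced here.

```python
import math
import random


def fit_centroids(xs, ys):
    """Toy stand-in classifier: one mean feature vector per class."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(model, x):
    """Return (label, confidence) from a softmax over negative distances."""
    weights = {y: math.exp(-math.dist(x, c)) for y, c in model.items()}
    label = max(weights, key=weights.get)
    return label, weights[label] / sum(weights.values())


def pseudo_label_fit(labeled_x, labeled_y, unlabeled_x, threshold=0.9, rounds=3):
    """Train on labeled data, then repeatedly adopt confident predictions
    on unlabeled data as extra training labels (assumed hyperparameters)."""
    train_x, train_y = list(labeled_x), list(labeled_y)
    pool = list(unlabeled_x)
    model = fit_centroids(train_x, train_y)
    for _ in range(rounds):
        remaining = []
        for x in pool:
            label, conf = predict(model, x)
            if conf >= threshold:
                # Confident prediction: treat it as a (pseudo) label.
                train_x.append(x)
                train_y.append(label)
            else:
                remaining.append(x)
        if len(remaining) == len(pool):
            break  # nothing new was pseudo-labeled
        pool = remaining
        model = fit_centroids(train_x, train_y)
    return model


# Few-shot toy setup: one labeled example per class, 100 unlabeled points.
rng = random.Random(0)
labeled_x = [(0.0, 0.3), (5.2, 4.9)]
labeled_y = [0, 1]
unlabeled_x = [
    (rng.gauss(c, 0.5), rng.gauss(c, 0.5)) for c in (0.0, 5.0) for _ in range(50)
]
model = pseudo_label_fit(labeled_x, labeled_y, unlabeled_x)
print(predict(model, (0.2, -0.1)))
```

In a realistic setting the classifier would be a deep network over raw packet data, and the threshold would need tuning, since pseudo-labels adopted too eagerly can reinforce early mistakes.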
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Robin C. Bhoo and Nathaniel D. Bastian "Semisupervised learning with data augmentation for raw network traffic detection", Proc. SPIE 13051, Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications VI, 130511E (7 June 2024); https://doi.org/10.1117/12.3013183
KEYWORDS
Machine learning
Data modeling
Deep learning
Image classification
Computer intrusion detection
Network security