Modern computer and communication infrastructures are highly vulnerable to malicious code and malicious activity. Malicious code such as viruses, worms, and Trojan horses can damage services, computers, financial structures, cyber infrastructure, and data privacy in many different ways. Signature-based detection is more prevalent than machine-learning-based detection in defending against these attacks. Anti-virus vendors face thousands of suspicious files every day. These files are collected from various sources, including dedicated honeypots, third-party providers, and files reported by customers either automatically or explicitly. The large number of files makes efficient and effective inspection of code particularly challenging. In this paper, we propose a hybrid detection system consisting of two parts: a misuse detection system and an anomaly detection system. Misuse detection is based on a random forest classifier, and anomaly detection is based on a one-class SVM with bagging. We depart from the usual approach by using the Correlation-based Feature Selection (CFS) algorithm for feature selection. Our experiment shows that our hybrid detection system outperforms existing hybrid systems built on other machine learning algorithms.