Creator:
Date:
Abstract:
Machine learning algorithms are known to help identify cyberattacks such as network intrusions. However, common network intrusion datasets are often imbalanced. We conduct a detailed analysis of the impact of different resampling techniques on different machine learning classifiers. Our study includes more advanced resampling techniques, such as conditional generative adversarial network (CGAN) oversampling, and compares their performance against other oversampling techniques. To further investigate the potential of CGAN-based oversampling, we examine the effect of CGAN with other standard machine learning classifiers on two different datasets. Based on our experimental results, we do not recommend using CGAN on a dataset with extremely few samples in its minority classes. Consequently, we also investigate the choice of minority class(es) to be oversampled in a dataset with few minority samples. Finally, we evaluate the impact of the number of synthetic samples generated on the detection rate, using two different network intrusion datasets.
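As a rough illustration of the two knobs mentioned above (which minority classes to oversample, and how many synthetic samples to add), the following is a minimal sketch of plain random oversampling. It is not the CGAN method studied here; it is a simple baseline swapped in for illustration. The function name random_oversample, its parameters, and the toy data are all hypothetical and do not come from the paper.

```python
import random
from collections import Counter


def random_oversample(X, y, classes, n_new, seed=0):
    """Append n_new randomly duplicated rows for each class in `classes`.

    Hypothetical helper for illustration only: it duplicates existing
    minority rows rather than generating new ones as a CGAN would.
    """
    rng = random.Random(seed)
    X_out, y_out = list(X), list(y)
    for c in classes:
        # Collect the existing rows that belong to the chosen class.
        pool = [x for x, label in zip(X, y) if label == c]
        if not pool:
            raise ValueError(f"no samples for class {c!r}")
        # Draw with replacement so n_new may exceed the pool size.
        for _ in range(n_new):
            X_out.append(rng.choice(pool))
            y_out.append(c)
    return X_out, y_out


# Toy data (made-up values): 8 "normal" rows and 2 "attack" rows.
X = [[i, i % 3] for i in range(10)]
y = ["normal"] * 8 + ["attack"] * 2

# Oversample only the "attack" class, adding 6 rows to balance the classes.
X_res, y_res = random_oversample(X, y, classes=["attack"], n_new=6)
counts = Counter(y_res)
```

Varying classes and n_new in such a sketch mirrors, in a simplified way, the questions of which minority class(es) to oversample and how many samples to generate.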