Adaptation of Network-based Data for Intrusion Detection Systems
Abstract
The increasing complexity of cyber threats calls for robust intrusion detection systems (IDS) that use network-based data to detect and mitigate malicious activity. As computer networks grow and more applications are deployed on them, the threat posed by such activity, and the damage it can cause, grows as well. This study reviews findings from scholarly journals to inform the adaptation and evaluation of network-based data for IDS. Drawing on a review of over 50 peer-reviewed works, it analyzes key methodologies, datasets (e.g., NSL-KDD, CICIDS2017, UNSW-NB15), and performance metrics (accuracy, precision, recall, F1-score), and highlights the dominance of machine learning and deep learning techniques in IDS development. The review also identifies several research gaps: the difficulty of finding comprehensive, valid datasets on which IDS can be tested and evaluated, limited attention to zero-day attacks, scalability in high-speed networks, and limited model explainability. A particularly critical gap is the scarcity of comprehensive, context-aware datasets, especially for regions where traffic is dominated by mobile technology and localized threats (e.g., phishing, mobile banking fraud), which hinders the accurate deployment and evaluation of IDS. To address this gap, the study proposes generating synthetic benign and attack-specific network flows in a network environment and evaluating their performance for IDS. Recommendations include developing lightweight preprocessing frameworks and incorporating specific attack patterns to enhance IDS effectiveness.
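For reference, the four performance metrics named above can be computed from the confusion-matrix counts of a binary (attack vs. benign) classifier. The following is a minimal sketch under that assumption; the function name `detection_metrics` and the example counts are hypothetical and are not taken from any of the reviewed studies.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts.

    tp: attack flows correctly flagged as attacks
    fp: benign flows incorrectly flagged as attacks
    tn: benign flows correctly passed as benign
    fn: attack flows missed by the detector
    """
    total = tp + fp + tn + fn
    # Guard each ratio against a zero denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    print(detection_metrics(tp=80, fp=10, tn=900, fn=10))
```

On imbalanced traffic, where benign flows far outnumber attacks, accuracy alone can look high even when many attacks are missed, which is one reason precision, recall, and F1-score are typically reported alongside it.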