Introduction

In a progressively interconnected world, where every facet of our lives seems to intertwine with digital technology, the importance of cybersecurity cannot be overstated. Our digital assets from confidential information to vital infrastructure are constantly vulnerable to a broad spectrum of cyber threats, from widespread malware to sophisticated state-sponsored attacks. As we continue to embrace the conveniences of the digital age, it becomes imperative to understand the fundamentals of cybersecurity and adopt proactive measures to safeguard our digital existence.

Cybersecurity incorporates a broad spectrum of procedures, technologies, and policies constructed to protect computer systems, networks, and data from unauthorized access, interference, or devastation1. At its core, cybersecurity aims to mitigate risks and vulnerabilities inherent in the digital landscape, ensuring confidentiality, integrity, and availability of information.

Threat detection and prevention are two pillars of cybersecurity. This involves employing robust antivirus software, firewalls, intrusion detection systems, and other advanced tools to identify and thwart malicious activities in real-time. However, given the evolving nature of cyber threats, proactive measures such as regular software updates, patch management, and user education are equally crucial in maintaining a resilient defense posture.

As organizations increasingly rely on cloud computing services to streamline operations and reduce costs, safeguarding cloud-based assets becomes paramount. Cloud security entails implementing robust authentication mechanisms, encryption protocols, and access controls to protect data stored and processed in remote servers. Additionally, ensuring compliance with industry standards and regulations such as GDPR and HIPAA is essential to avoid costly data breaches and regulatory penalties.

In an era marked by the proliferation of cyber threats, collaboration and information sharing play a pivotal role in strengthening cybersecurity resilience. Public–private partnerships, threat intelligence sharing initiatives, and cybersecurity awareness campaigns foster a collective defence posture, enabling stakeholders to anticipate and mitigate emerging threats effectively.

Cybersecurity is a continuous journey rather than a destination. As technology evolves and threat landscapes evolve, so too must our cybersecurity strategies. By embracing a proactive mindset, leveraging cutting-edge technologies, and fostering collaboration among stakeholders, we can traverse the digital frontier with confidence, ensuring a secure and resilient digital future for generations to come.

The landscape of cyber threats is multifaceted and constantly evolving, encompassing a wide spectrum of tactics and techniques employed by adversaries with diverse motives and capabilities. By remaining vigilant, adopting robust cybersecurity practices, and fostering a culture of security awareness, individuals and organizations can fortify their defences and mitigate the risks posed by cyber-attacks in an increasingly interconnected world.

In the digital realm, where connectivity reigns supreme, the landscape is rife with a diverse array of cyber threats that pose significant risks to personalities, organizations, and nations. Understanding the various types of cyber-attacks is paramount to bolstering defences and mitigating potential vulnerabilities. From common tactics employed by cybercriminals to sophisticated state-sponsored campaigns, each form of cyber-attack presents unique challenges and implications.

Several types of cyber-attacks that can pose a threat are shown in Fig. 1:

  • Phishing: When attackers trying to trick revealing sensitive information like password or credit card details by pretending to be a legitimate entity.

  • Malware: Malicious software that can infect devices and cause harm such as viruses, worms, or ransomware.

  • DDos Attacks: Distributed Denial Services attacks overwhelm a targets network or website with a flood of traffic, causing it to become unavailable.

  • Man-in-the-Middle attack: Hacker’s intercept and modify communication between two parties without their understanding, potentially stealing sensitive information.

  • SQL injection: Attackers exploits vulnerabilities in a website’s database to gain illegal access or manipulate information.

  • Social engineering: This involves manipulating individuals to gain access to confidential information, often through deception or psychological manipulation.

  • Spear Phishing: To maximize the probability of success, attackers tailor their communications to target certain people or organizations in the form of targeted phishing.

  • Insider Threats: When someone within an organisation missus their access privileges to steal restricted information.

  • Zero-day Exploits: The liabilities in software or system that are unspecified to the developers and can be exploited by attackers before a patch or fix is available.

  • Advanced Persistent threats (APTs): They are long term, sophisticated attacks where attackers gain unauthorised approach to a network and rename unobserved for an extended period, often with the purpose of stealing valuable information.

  • Password Attacks: Various approaches like brute force attacks, dictionary attacks and credential stuffing to gain unauthorised access to user accounts by exploiting weak or stolen password.

Figure 1
figure 1

Types of Cyber Attacks.

In an era where cyber threats approach large and organizations face an ever-expanding array of sophisticated criticism, the purpose of intrusion detection systems (IDS) has become increasingly indispensable. An intrusion detection system is an essential part of an organization’s cybersecurity weapons, designed to monitor network traffic, detect suspicious activities, and vigilant security personnel to hypothetical certainty violations in real-time. By providing timely insights into emerging threats and anomalous behaviours, IDS empowers organizations to fortify their defences and mitigate the risks posed by cyber-attacks effectively.

The capacity of intrusion detection systems to reliably distinguish between benign activity and malicious behaviour is a critical component of their success. Some organizations choose to integrate IDS functionality into their broader security infrastructure, utilizing security information and event management (SIEM) systems or unified threat management (UTM) platforms for centralized monitoring and management, others choose to deploy standalone IDS solutions as specialized hardware appliances or software applications. Intrusion detection systems are essential for incident response and forensic investigation in a dition to identifying and warning security staff about possible security breaches.IDS helps organizations to effectively investigate security breaches, analyze attack vectors, and execute remediation procedures to prevent future occurrences by providing extensive logs, alerts, and forensic data linked to security incident.

In recent years IDS has become progressively area of research. The goal of Intrusion detection systems is towards preserve the network or system from unauthorized approaches. Machine learning plays a crucial part in intrusion detection systems and cyber security. By analysing large volumes of data and detecting patterns, machine learning algorithms can detect abnormal behaviour and potential threats in real-time. This helps in identifying and preventing sensitive information and enhancing overall cybersecurity measures. It’s like having an extra eye and a powerful brain to keep digital world safe.

IDS can be classified as Network- based intrusion detection system (NIDS) and host-based intrusion detection system (HIDS)2. NIDS is effective at detecting threats that occur at the network level while HIDS provides more visibility into individuals hosts. NIDS analyses network traffic in real-time, looking for patterns and signatures of known attacks. They can detect threats like port scanning, denial of service attacks, and suspicious network behaviours. OSSEC is an open source HIDS which monitors individuals hosts or devices within a network, or unusual behaviour on specific machines. It can detect threats such as unauthorised access, file system modifications, suspicious processes running on a host.

IDS uses various techniques to detect intrusions. Some common techniques include:

  1. 1.

    Signature-based detection: IDS compare network traffic or system activities against a database of known attack signatures to identify malicious patterns.

  2. 2.

    Anomaly-based detection: IDS establish a standard of normal behaviour and flag any deviations from this baseline as potential intrusions.

  3. 3.

    Heuristic-based detection: IDS analyse network or system data to detect unusual patterns or trends that may indicate an intrusion.

Deep neural networks have been used in combination with hybrid optimization frameworks and nature-inspired algorithms to decrease execution times and increase accuracy3. The succeeding made it possible for energy optimization and improved prediction results.

To enhance the detection accuracy, reduce false positives, and improve the inclusive performance of the existing IDS, deep learning models such as artificial neural networks, convolutional neural network, recurrent neural network, etc. are used to optimize the layers of neural networks using nature inspired algorithms because of their broad capacity to detect both normal and malicious attacks.

In this research work Spider Monkey Optimization (SMO) is executed to detect intrusions and reduce false positive alarm using Artificial Neural Network. ANN is inspired by human brain, can learn, and make predictions based on input data. ANNs have witnessed notable triumph across various domains, from image and speech recognition to natural language processing and financial forecasting4. An ANN comprises consistent nodes organized into three layers which is input layer, hidden layers, and output layer. Each node, or neuron, receives input signals, processes them using activation functions, and produces an output signal.

The principal objective of this paper is to identify attacks and optimize the layers of interconnected neural networks using SMO. SMO is a population-based algorithm, Inspired with the social communications of spider monkeys. It is depended on the ingenious foraging habits of spider monkeys, which simulate the social structure of fission–fusion5. Individuals in FFSS temporarily create tiny groups whose members are part of a more stable or bigger community. Based on the scarcity and accessibility of the foods, monkeys split themselves into larger and smaller groups.6.

Related work

Parvathi Pothumani7 ensemble weighted voting classifier-based honeypot framework using network intrusion detection for the IOT security. Methods involves combining the results of various IDS based on decision tree, random forest and XGBoost using a weighted approach. The main goal of this paper is to give more weight to the predictions of IDS that have higher accuracy and lower false alarm rates. The suggested method can be used to reduce the impact of false alarms and increase the sensitivity and specificity of the IDS process.

Ch. Kodanda Ramu proposed an innovative meta-heuristic optimization and deep learning-based methodology for improving the performance of NIDS. Raw traffic data is captured during the initial phase to attain data standardization and data balancing then extended Pelican Optimization algorithm (EX-Pel) is employed to select the group of features. Finally Self-Attention Assisted Weighted Auto Encoder (SAttn_WAE) is implemented to detect the attacks. The proposed method achieved an accuracy of 99.23% which is comparatively better than the existing methods8.

Jayalatchumy9 interprets a unique intrusion system whose main objective is to increase the effectiveness of network intrusion detection. The enhanced Crow Search algorithm is used to determine the most significant features that help in more accurately categorizing intrusion attacks. In the last stage, the chosen features are fed into the ensemble classifier which classifies the invader and standard labels. NSL-KDD and UNSW-NB15 dataset is manipulated for the experimental outcome.

Vinayakumar10 investigate a deep neural network (DNN) to develop an adaptable and effective IDS to detect and classify unforeseen and unpredictable cyberattacks. To execute the DNN model, KDD-CUP99 dataset is applied to different datasets including NSL-KDD, UNSW-NB15, KYOTO, WSN-DS and CICIDS 2017. After experimenting with various machine learning classifiers10 proposed the scale-hybrid-IDS-AlertNet, a highly scalable and hybrid DNNs framework that can be utilized in real-time to efficiently monitor network traffic and host-level events in order to pre-emptively notify potential threats.

In the study of machine learning algorithms for intrusion detection systems in computer network, Serkan Keskin compares and analyse ML methods to create anomaly-based intrusion detection that can detect and identify network attacks with a highest accuracy11. CSE-CIC-IDS2018 dataset is using Decision Trees, Random Forest, Extra Trees, and Extreme Gradient Boosting machine learning techniques to get high accuracy.

Fatima Alwahedi12 come up with the comprehensive overview of the most recent developments, approaches, and difficulties in using machine learning to detect cyberthreats in Internet of Things environments. For future work, generative AI and massive language models is applied to improve IOT security.

Mohammad Sazzadul Hoque13 applied genetic algorithm to detect numerous types of network intrusions. KDD99 dataset is used in the suggested algorithm.

To detect threats in an IOT environment, Ethala Sandhya14 presented Spider Monkey Optimisation using Random Forest (SMO-RF). The NSL-KDD dataset was pre-processed. When compared to the current SVM parameter optimisation, SMO-RF yielded the best accuracy.

Vijayalakshmi15 presented, an integrated deep learning approach is developed to detect files that contain malware and software that were obtained illegally via the Internet of Things network. Hybrid Dual-Channel Convolution Neural Network (DCCNN-SMO) with Spider Monkey Optimization, is a hybrid neural network that uses stolen reference code to identify software piracy.

The author has developed a Hybrid model to detect anomalies in cloud by combining grey wolf optimization (GWO) and convolutional neural networks (CNN) in16.

Daniel Kwegyir suggests an optimal trainer for feedforward neural networks (FNNs) using spider monkey optimization (SMO) algorithms17.

Akhand developed an algorithm for Travelling Salesman Problem (TSP) using Spider monkey optimization which is named as discrete SMO (DSMO)18. Every spider monkey in DSMO represents a TSP solution when Swap Sequence (SS) and Swap Operator (SO) based operations are used. This allows the monkeys to communicate with one another to find the best TSP solution.

Ajay19 aims to investigate the selection of cluster heads for biomedical applications utilizing traditional SMO and sampling-based SMO. The measurements acquired from particle swarm optimization clustering protocol (PSO-C), low-energy adaptive clustering hierarchy centralized (LEACH-C), and SSMO enhanced routing protocol are compared to those obtained using analogous approaches in homogeneous and heterogeneous situations. SSMO boosts network stability periods and durability by an estimated 12.22%, 6.92%, 32.652%, and 1.22% in these implementations.

Sayar Singh Shekhawat20 proposes a mechanism for extracting the sentiments from the tweets posted on Twitter. To find the best cluster heads for the dataset, a hybrid method known as Hybrid Spider Monkey optimization using k-means clustering is presented using two datasets, namely, sender2 and twitter. A comparison study is conducted using a few significant nature-inspired algorithms, including Particle Swarm, Spider-Monkey, Genetic, and Diferential Evolution, to assess the validity of the suggested approach.

In 2022 Hybrid Mayfly Apriori-Intrusion Detection mechanism is proposed by Sarbani Dasgupta21 to detect the intrusion in big data applications. An apriori rule based on a frequently occurring itemset formed by processing the network data. The rare transactions or itemset are flagged as intrusions. Analysed the effectiveness of the proposed mechanism is done by comparing with various well-known deep learning algorithms.

Recent research highlights the emergence of dummy data injection attacks (DDIA) in power systems, posing significant detection challenges due to their minimal spatial separation from legitimate data. A novel approach outlined in recent studies introduces a mathematical model that incorporates incomplete topological information and alternating current (AC) state estimations. This model leverages temporal and spatial attention matrices to enhance the detection and localization of DDIAs, employing spatio-temporal graph neural networks to adapt to changing system topologies. This methodology, which significantly improves accuracy and computational efficiency, provides valuable insights for our SMO-ANN model’s development in addressing complex cybersecurity threats within network environments22.

Recent advancements in cybersecurity for electric power CPS have addressed challenges in quantifying cyber risk thresholds due to the non-uniformity and dynamic nature of these networks. A novel method detailed in recent studies utilizes percolation theory to quantitatively evaluate the risk propagation threshold in power CPS networks. This method models the network as a dual-layered, directed, unweighted graph, reflecting topology correlation and coupling logic using an asymmetrical balls-into-bins allocation. By integrating the directionality of links between the cyber and physical layers and introducing percolation flow probabilities, this approach develops dynamic equations for layer-specific coupling relationships. The effectiveness of this quantitative evaluation is demonstrated through applications to the IEEE 30-bus system and a 150-node Barabasi-Albert model, providing a robust framework that enhances our understanding of cyber risk management in power systems23.

Methodology

The methodology of the present work comprises of the following steps: -

Data collection

Binary and multiclass datasets are the two types of Cyber-attack dataset. NSL-KDD, CICI-IDS-2017, LuFlow and UNR-IDD datasets have been discussed in this paper. Each datasets contains normal or a malicious attack. The above Fig. 2 shows the IDS datasets which is a labelled and unlabelled dataset.

Figure 2
figure 2

Classification of binary and multiclass dataset.

Description of binary datasets

CIC-IDS 2017

The Cybersecurity Innovation for Cyber-Physical Systems Intrusion Detection System (CIC-IDS) 2017 is a significant milestone in the realm of cybersecurity research and development. Developed as part of the Canadian Institute for Cybersecurity’s efforts, CIC-IDS 2017 represents a cutting-edge intrusion detection system tailored specifically for cyber-physical systems24. This innovative solution integrates advanced machine learning algorithms, anomaly detection techniques, and real-time data analysis capabilities to detect and mitigate cyber threats targeting critical infrastructure, industrial control systems, and interconnected IoT devices25. With its focus on enhancing the resilience and security of cyber-physical systems, CIC-IDS 2017 underscores the importance of proactive defence measures in safeguarding against evolving cyber threats in today’s interconnected world.

Luflow 20

The LuFlow Cyber Security Dataset is an inclusive corpus of network traffic data meticulously curated for cybersecurity research and analysis. Developed by experts in the field of cybersecurity and data science, the LuFlow dataset encompasses a diverse range of network traffic patterns, including benign activities, common cyber-attacks, and advanced persistent threats (APTs). With its extensive coverage of network protocols, traffic types, and attack scenarios, the LuFlow dataset assists as a valuable source for estimating intrusion detection systems, developing machine learning models, and conducting empirical studies on cyber threat detection and mitigation strategies. By providing researchers, practitioners, and organizations with access to real-world network traffic data, the LuFlow Cyber Security Dataset facilitates collaborative research efforts, fosters innovation, and advances the state-of-the-art in cybersecurity defence mechanisms.

Description of multiclass datasets

NSL-KDD dataset

This dataset is the upgraded version of KDD-cup99 dataset which is a very famous dataset. This dataset is frequently used for gauging intrusion detection system in the field of cyber security. NSL-KDD dataset consist of training and test data which is a labelled as either normal or a specific type of attacks. It includes various types of attacks that are commonly encountered in computer network.

UNR-IDD

The University of Nevada, Reno (UNR) Intrusion Detection Datasets (IDD) are valuable resources for cybersecurity researchers and practitioners. These datasets provide a comprehensive collection of network traffic data, simplifying the development and estimation of intrusion detection systems (IDS) also cybersecurity algorithms. The UNR IDD Cyber Attack Dataset comprises network traffic data collected from a pretended network circumstance.

The UNR IDD Cyber Attack Dataset is organized into distinct sections according to the nature of the attack and the attributes of the network traffic. Each category contains multiple datasets, with varying degrees of complexity and severity. The UNR-IDD dataset includes various cyber-attacks in the following components in Table 1. Each attack scenario is meticulously crafted to mimic real-world cyber threats.

Table 1 Characteristics of attack type and network traffic.

Data preprocessing

The collected datasets were first pre-processed and cleaned for any redundancy. Figures 3 and 4 classifies the steps of data-processing and model training.

Figure 3
figure 3

Steps for Data Pre-processing.

Figure 4
figure 4

Steps for model training.

Spider monkey optimization (SMO)

SMO are nature inspired algorithms that mimic the behaviour of spider monkeys to optimize solutions. The foraging behaviour of FFSS of spider monkeys is divided into four steps in the proposed method. The group begins by searching for food and measures how far away they are from the food. Group members reposition themselves and assess the distance from the food sources in the second step, which is dependent on the distance from the foods. In addition, the local leader updates the group’s optimal position in the third phase. If the position is not updated after a predetermined number of times, all the group members start hunting for food in multiple directions. The global leader subsequently modifies its ever-best position in the fourth phase and, in an occurrence of stagnation, splits the group into smaller subgroups26. In27 SMO algorithm comprises of following seven steps which are discussed below:

Initialization of the population

In the proposed work, SMO is allocated with the group of 5 monkeys at different locations and the population size P = 50, where \({SM}_{p}\) (p = 1,2,3……. p) and \({SM}_{p}\) is identified as the population for the \({p}^{th}\) monkey26. Each \({SM}_{p}\) is initialized as follow in Eq. (1):

$${SM}_{pq}= {SM}_{minq}+UR \left(0, 1\right)* \left({SM}_{maxq}- {SM}_{minq}\right)$$
(1)

where \({SM}_{maxq}\) and \({SM}_{minq}\) are the upper and lower bound of the search space in the qth dimension and UR is the homogeneously allocated random number in the range (0, 1).

Local leader phase (LLP)

This is the flexible segment of SMO algorithm. In this segment all the spider monkey gets a change to update their positions. The expertise of the local group leader and members determines the modifications made to the monkeys’ present position. Each spider monkey’s fitness value is calculated in its new ___location, if it is greater than that of its previous ___location then it’s get updated otherwise not. Below is the Eq. (2) of position update:

$${SM}_{newpq}+UR\left(\text{0,1}\right)*\left({LL}_{kq}-{SM}_{pq}\right)+\text{UR}(-\text{1,1})*({SM}_{rq}-{SM}_{pq})$$
(2)

where \({LL}_{kq}\) is the qth dimension from the randomly chosen kth SM of a kth local group having their leader position.

Global leader phase (GLP)

The GLP is initialized once after the LLP is evaluated. The ___location of SM is calculated using the Eq. (3) and the ___location of SM is updated is calculated as follows:

$${SM}_{newpq}={SM}_{pq}+UR\left(\text{0,1}\right)*\left({GL}_{kq}+{SM}_{pq}\right)+UR\left(-1,1\right)*({SM}_{rq}-{SM}_{pq})$$
(3)

where \({GL}_{kq}\) is referred as the global leader ___location having qth dimention (q = 1,2,3….M) having the arbitrarily selected index.

Global leader learning phase (GLL)

The algorithm regulates the optimal solution for the entire swarm at this segment. The recognized SM is regarded as the swarm’s global leader. Additionally, the global leader’s position is verified and if it is not updated then the counter associated with the global leader named as global limit count (GLC) is upgraded by 1, otherwise set to 0.

GLC is verified by Global leader and then cross-referenced with the Global Leader limit.

Local leader learning phase (LLL)

The greedy selection model is used for a local group to update the LLL with the SM ___location based on the fitness value of that precise local group. The optimal ___location is to have a value allocated to the local leader. Increase it to 1 if there is no update present. This will be further additional to the LLC.

Local leader decision phase (LLD)

If the local leader is not updated for their position amongst the fixed LLL then the local groups present are having the candidates altered with the random ___location from the step1 that exploits the information from the global leader and local leader based on pursuing Eq. (4).

$${SM}_{newpq}+{SM}_{pq}+UR\left(\text{0,1}\right)*\left({GL}_{kq}-{SM}_{pq}\right)+UR\left(\text{0,1}\right)*({SM}_{rq}-{LL}_{pq})$$
(4)

Global leader decision phase (GLD)

It is analogous to Local Leader Decision phase, If the global leader is not reorganized to certain extents known as GLL, then the global leader splits the swarm into small groups. Equation (5) shows the highest value of the fitness function.

$${fitnesss}_{features \,importance}=\frac{Number \,of \,samples \,that \,reaches \,the \,nodes}{Total \,number \,of \,samples}$$
(5)

Control parameters in SMO

The SMO has primarily four control parameters. The highest number of groups (MG), the global leader limit, the local leader limit, and the perturbation rate (pr)28. The recommended parameter settings and the equation for MG are provided below in Eq. (6).

$$MG=\frac{N}{10}$$
(6)

Exploration and exploitation are the foremost part of any nature-inspired optimization algorithms. Therefore, it is shown that if a population-based algorithm is capable of balancing between exploration and exploitation of the search space, then the algorithm is regarded as an efficient algorithm. In the above steps, GLL and LLL phase are involved to analyse the search operations during deadlock periods. In such cases both the leaders make decision, and an advance exploration is made by the LL in the decision phase, a fission or fusion decision is taken29. Thus, the search speed is well stable in the classification using SMO approach by the fore mentioned step being accomplished.

Proposed SMO-ANN

In our research work, with the implementation of spider monkey optimization and artificial neural network, SMO-ANN is presented to determine network traffic and attack type. Below is the workflow diagram in the Fig. 5.

Figure 5
figure 5

workflow diagram of SMO-ANN.

Result and discussion

System specification

The suggested model was run on a system with the following specifications:

  • Operating system: Windows 10

  • RAM:83 GB

  • GPU: 40 GB (A100)

  • Processor:12th Gen Intel(R)core (TM) i5-1235U

Performance evaluation

Evaluation will be done from binary datasets which is CIC-IDS-2017 and Luflow datasets and from multiclass datasets i.e. NSL-KDD and UNR-IDD datasets.

Binary class dataset

This section provides In-depth evaluation of LU-flow and CIC-IDS 2017 datasets. It reveals high effectiveness in distinguishing the network traffic which will be evaluated by precision, recall, F1 score, confusion matrix and ROC.

Luflow dataset

An outstanding capacity of the model to discriminate between ‘Normal’ and ‘Anomaly’ network traffic is demonstrated by its binary classification performance on the Lu-Flow dataset. The dataset was essential in assessing the prediction robustness of our model because it is well-known for its thorough portrayal of contemporary network risks.

Table 2 represents the results of our analysis on the Lu-Flow, which includes 1,10,58,417 instances. The precision for ‘Normal’ activity is 1.00, while for ‘Anomaly’ it is 1.00. The recalls are also 1.00 and 0.99 respectively. For both classes, these numbers add up to an F1-score of 1.00, which indicates 100% overall accuracy. This performance demonstrates the robustness of the model. The model’s robustness and strategic learning algorithm, which are designed to handle the intricacies and subtleties of network traffic, are supported by its overall accuracy, which in this context is determined to be near to 100%. The ‘Normal’ and ‘Anomaly’ classes have extremely high precision and recall, both metrics above 1.00, indicating the model's reliability and its potential as effective tool for cybersecurity threat identification.

Table 2 Performance result of Binary Classification on Lu-Flow Dataset.

The model’s classification accuracy is described in depth in the following confusion matrix, which is shown in Fig. 6. Remarkably, with 1,361,870 true positives and only 1409 false negatives, the model identified “Normal” behaviour with almost perfect precision. ‘Anomaly’ class indicate that there were 510,070 true positives and 6252 false positives.

Figure 6
figure 6

Confusion Matrix for Lu-Flow dataset.

The AUC achieves near to perfect score of 0.99, and the ROC curve shows as shown in Fig. 7 shows a superlative true positive rate. This accomplishment exhibits the exceptional sensitivity and specificity of the model, which accurately detects almost all positive cases with a very low false positive rate.

Figure 7
figure 7

ROC Curve Evaluation discriminating Thresholds in Luflow.

CICIDS 2017

The model’s exceptional detection skills are demonstrated by the evaluation of the binary classification model on the CICIDS 2017 dataset, which is specifically designed for the evaluation of anomaly-based IDS. Our model's detection ability was put to the test on the CICIDS dataset, which focuses on simulating contemporary low-footprint attack techniques.

Table 3 displays the results obtained from the CICIDS 2017 dataset, which includes 2,016,647 instances. The model retained a high recall rate of 0.98 and 0.93, respectively, with an accuracy of 0.99 for “normal” traffic detection and 0.91 for “anomaly” traffic. F1-scores of 0.98 and 0.92 for “normal” and “anomaly,” respectively, were obtained from this, with an accuracy of 97% overall. Although ‘anomaly’ had a lower accuracy recall, the model successfully identified all ‘typical’ behaviours.

Table 3 Performance result for Binary Classification of CICIDS 2017 dataset.

The model's performance is further supported by the confusion matrix, which is displayed in Fig. 8. Of the 1,675,825 cases that were labelled as Normal, just 23,617 were incorrectly labelled as anomaly, which translates to a very high recall rate and precision. In a similar vein, 317,205 occurrences of Anomaly were accurately predicted by the model whereas just 30,635 false positives were recorded. The model's ability to function as a dependable tool for cybersecurity threat identification was further supported by the astounding overall accuracy rate that resulted from this.

Figure 8
figure 8

Confusion Matrix for CICIDS 2017 Dataset.

The performance of the model is shown by the ROC curve, which has an AUC of 0.96 in Fig. 9. With an ideal balance between genuine positive and false positive rates, this result shows an almost perfect separation between the positive and negative classes.

Figure 9
figure 9

ROC Curve Evaluation discriminating Thresholds in CICIDS 2017.

Multi-class dataset results

UNR-IDD dataset

After a thorough examination of the UNR-IDD dataset, our multiclassification model has proven to be rather predictive, as indicated by the wide range of metrics that we were able to acquire. Our thorough analysis of the ROC curves and confusion matrix has given us profound insights into the capabilities of the model.

Table 4 shows that the UNR-IDD, a benchmark dataset for network intrusion detection systems, was used to assess the suggested classification model. An overall classification accuracy of 92% across 5 different assault types, the model demonstrates outstanding classification accuracy. The precision and recall for attack types like “blackhole” (precision: 0.97, recall: 0.95), “Diversion” (precision: 0.98, recall: 0.97), and “Overflow” (precision: 0.90, recall: 0.95) are particularly noteworthy metrics. The combined performance of Precision, recall and F1-score, with the accuracy of 0.92, shows the model can identify both normal and malicious intrusion attempts.

Table 4 Performance result for Multi Classification for UNR-IDD dataset.

The confusion matrix further demonstrates the model’s proficiency shown in Fig. 10. The 'BENIGN' class, with an all of instances correctly classified (7575 out of 7575), and the 'Diversion' class, with an almost perfect recognition rate (7369 out of 7575), exemplify the model's high sensitivity and specificity. Misclassifications were minimal, as shown by the off-diagonal elements in the confusion matrix, which indicates a strong discriminative power even amongst similar attack categories. The ROC curve is shown below in Fig. 11.

Figure 10
figure 10

Comprehensive Confusion Matrix Analysis of UNR-IDD(Multiclass) Performance.

Figure 11
figure 11

ROC Curve Evaluation discriminating Thresholds in UNR-IDD.

NSL-KDD dataset

After a thorough examination of the NSL-KDD dataset, our multiclassification model has proven to be rather predictive, as indicated by the wide range of metrics that we were able to acquire. Our thorough analysis of the ROC curves and confusion matrix has given us profound insights into the capabilities of the model.

Table 5 shows that the suggested classification model was tested using the network intrusion detection system benchmark dataset, NSL-KDD. With an overall classification accuracy of 99% over 4 different assault types and 309,778 cases, the model demonstrated great classification accuracy 'DOS' (precision: 1.00, recall: 1.00), 'Probe' (precision: 0.99, recall: 0.99), and 'R2L' (precision: 0.99, recall: 0.99) are among the attack types with notable precision and recall. In conclusion, our model achieved a remarkable 99% accuracy, demonstrating its exceptional ability to recognize normal and all assault categories.

Table 5 Performance result for Multi Classification for NSL-KDD dataset.

Evaluation of the model's performance through the confusion matrix (Fig. 12) reveals a highly encouraging distribution of predictions. Notably, the model demonstrates exceptional discernment across a wide range of intrusion detection classes, achieving near-perfect precision and recall for complex attack types such as “DOS”, “Probe” and “R2L”. The detailed analysis provided by the confusion matrix sheds light on the model's strengths and weaknesses, offering a roadmap for potential improvement.

Figure 12
figure 12

Comprehensive Confusion Matrix Analysis of NSL-KDD(Multiclass) Performance.

The ROC curves for most attack categories, particularly "DOS," "Probe," and "Normal," demonstrate exceptional performance. These curves accomplish an Area Under the Curve (AUC) of 1.00, signifying perfect classification. This translates to a complete separation between the TPR and FPR, meaning the model flawlessly distinguishes these attack types from normal network activity with no false positives. The model's proficiency extends beyond frequently encountered attacks. The ROC curve for "U2R" attacks, though not achieving a perfect 1.00 AUC, still exhibits a very high score (potentially close to 1.00 based on the Fig. 13). This indicates excellent discriminative power in identifying these less frequent attacks.

Figure 13
figure 13

ROC Curve Evaluation discriminating Thresholds in NSL-KDD.

State of art

The field of intrusion detection has seen considerable advancements through the integration of various machine learning and optimization techniques. This section compares our Spider Monkey Optimization and Artificial Neural Network (SMO-ANN) approach with existing methodologies, showcasing the enhancements our model brings to the detection capabilities of IDS.

Table 6, presents a comparative analysis of different IDS approaches, emphasizing the performance metrics of accuracy, precision, and F-measure. These metrics are critical for assessing the effectiveness of intrusion detection systems in correctly identifying security threats with minimal errors.

Table 6 State of art comparison of Different approaches for IDS.

Conclusion and future scope

Intrusion detection systems (IDS) play an increasingly vital role in safeguarding digital infrastructures against evolving cyber threats. The effectiveness of these systems hinges on their ability to accurately detect anomalies with minimal false alarms and efficient execution times. To address these challenges, this study introduced the SMO-ANN model, which integrates the robust capabilities of Artificial Neural Networks (ANN) with the adaptive, nature-inspired Spider Monkey Optimization (SMO) algorithm. This hybrid approach has demonstrated remarkable proficiency in optimizing neural network layers, significantly enhancing anomaly detection capabilities as evidenced by rigorous evaluations using precision, recall, F1-score, confusion matrix, ROC curve, and accuracy metrics.

Our findings indicate that the SMO-ANN model achieves exceptional performance, with the Luflow binary dataset achieving an accuracy of 100%, and the NSL-KDD multiclass dataset reaching 99% accuracy. These results underscore the model's potential in handling diverse and complex IDS scenarios effectively.

Despite the promising outcomes, the application of the SMO-ANN model across various intrusion detection scenarios highlights the need for continuous enhancements to address the nuances of real-world data dynamics. Future research will focus on several key areas to further refine and validate the model’s capabilities:

  • Addressing Class Imbalance: One significant challenge in IDS is the class imbalance prevalent in real-world datasets, where rare attack types are underrepresented. Future iterations of the model will explore advanced sampling techniques such as Synthetic Minority Over-sampling Technique (SMOTE) and adaptive resampling to ensure that the model can effectively learn from underrepresented classes without overfitting33.

  • Enhancing Privacy Measures: As privacy concerns become more prominent, particularly with the increasing integration of IDS in sensitive sectors, incorporating privacy-preserving techniques into the SMO-ANN model will be essential. Approaches such as differential privacy and federated learning will be investigated to allow the model to benefit from decentralized data sources while safeguarding user privacy34.

  • Hybrid Deep Learning Models: Exploring hybrid models that combine SMO-ANN with other deep learning architectures could potentially unlock new levels of detection capabilities and efficiencies, tailored to specific types of cyber threats and intrusion scenarios.

By expanding the scope of the SMO-ANN model to incorporate these enhancements, we aim to develop a more adaptive, efficient, and privacy-conscious intrusion detection system. The anticipated advancements will not only improve the model’s accuracy and applicability but also contribute to the broader field of cybersecurity, ensuring robust defences against the cyber threats of tomorrow.