Cyber security is one of the most prominent applications of artificial intelligence. AI has been used not only to build more secure systems but also to breach them. By one estimate, more than 3 trillion dollars have been lost to cyber attacks and other internet fraud, and this already astounding figure is projected to reach 6 trillion dollars by the end of this decade.
One of the most common attacks is the distributed denial of service (DDoS) attack. A DDoS attack is an attempt to disrupt the normal traffic of a target website or server by flooding it with many requests.
There is a special family of cyber attacks known as adversarial attacks, and few systems are able to withstand them. Researchers from Shanghai Jiao Tong University in China have developed IDSGAN, a framework based on generative adversarial networks (GANs). The framework is designed to produce adversarial attacks that can fool and evade an intrusion detection system (IDS).
Idea Behind IDSGAN
In a recent paper, the researchers describe a way to deceive detection systems under the assumption that attackers know nothing about the internals of the target system. IDSGAN uses a generator to transform original malicious traffic records into adversarial ones. A discriminator classifies traffic records and learns to imitate the black-box detection system. The experiments are based on the NSL-KDD dataset: the researchers attack several detection systems and report very good results.
In recent years there has been remarkable work in the field of generative modelling. Ian Goodfellow introduced the generative adversarial network (GAN), which has become a very popular framework. The main idea is that two neural networks, a generator and a discriminator, play a game against each other to find an optimal solution. The researchers note that little research has applied GANs in the security domain.
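The black-box setting described above can be sketched in a few lines. This is only an illustration: `black_box_ids` is a hypothetical stand-in with a made-up threshold rule, and `label_with_black_box` is a helper name invented here. The point is that the attacker only ever sees the IDS's decisions, and those decisions become the training targets for the discriminator.

```python
from typing import Callable, List, Sequence

Record = Sequence[float]


def black_box_ids(record: Record) -> int:
    """Hypothetical stand-in for the target IDS: 1 means attack, 0 means normal.

    The attacker may call it but cannot inspect its rule or parameters.
    """
    return 1 if sum(record) > 1.5 else 0


def label_with_black_box(ids: Callable[[Record], int], records: List[Record]) -> List[int]:
    """Query the IDS and keep only its decisions, which is all a black-box attacker sees."""
    return [ids(r) for r in records]


# Toy, made-up traffic records with three numeric features each.
normal_traffic = [[0.1, 0.2, 0.3], [0.2, 0.1, 0.4]]
adversarial_traffic = [[0.9, 0.8, 0.7], [0.5, 0.4, 0.3]]

# These labels would serve as training targets for the discriminator,
# which learns to imitate the IDS without seeing inside it.
targets = label_with_black_box(black_box_ids, normal_traffic + adversarial_traffic)
print(targets)  # [0, 0, 1, 0]
```

In this toy run the second adversarial record slips past the stand-in IDS, which is the kind of outcome the generator is trained to produce more often.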
The Main Contributions Of The Researchers Are:
- The main contribution is IDSGAN itself, a framework built upon the GAN for attacking a cyber defence system. It produces adversarial malicious traffic to attack the intrusion detection system (IDS).
- The attacks and the IDS are built to mimic those found in the real world.
- The adversarial examples are used to carry out black-box attacks on the IDS.
- IDSGAN shows impressive performance in experiments. The detection rates for the adversarial traffic it produces fall to around 0.
- The researchers also test and discuss the influence of various conditions on IDSGAN. The results under these changing conditions demonstrate the robustness of the framework.
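To make the "detection rate" mentioned above concrete, here is a minimal sketch of how such a rate could be computed. The prediction lists are invented for illustration and are not results from the paper; `detection_rate` is a helper name chosen here.

```python
from typing import Sequence


def detection_rate(predictions: Sequence[int]) -> float:
    """Fraction of malicious records that the IDS flagged as attacks (label 1)."""
    if not predictions:
        raise ValueError("need at least one prediction")
    return sum(predictions) / len(predictions)


# Hypothetical IDS outputs for the same 8 malicious records,
# before and after the records were modified by a generator.
original_preds = [1, 1, 1, 0, 1, 1, 1, 1]
adversarial_preds = [0, 0, 1, 0, 0, 0, 0, 0]

print(detection_rate(original_preds))     # 0.875
print(detection_rate(adversarial_preds))  # 0.125
```

A lower rate on the modified records means more of the malicious traffic evaded the IDS.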
Structure Of IDSGAN And Dataset
The NSL-KDD dataset is widely used as a benchmark for evaluating intrusion detection systems. It contains a training set known as KDDTrain+ and a testing set known as KDDTest+. The dataset holds normal traffic as well as several types of malicious traffic. The malicious categories are Probing (Probe), Denial of Service (DoS), User to Root (U2R) and Remote to Local (R2L). Each record consists of 9 features with discrete values and 32 features with continuous values, making a total of 41 features. Broadly, the features fall into four groups:
- “Intrinsic” features reflect the inherent characteristics of a single connection for general network analysis.
- “Content” features describe the content of connections and indicate whether behaviours related to an attack exist in the traffic.
- “Time-based traffic” features examine the connections in the past 2 seconds that have the same destination host or the same service as the current connection; they comprise the “same host” features and the “same service” features.
- “Host-based traffic” features examine the past 100 connections that have the same destination host or the same service as the current connection, mirroring the “time-based traffic” features.
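The difference between the two traffic windows above can be illustrated with a small sketch. The `Connection` type, the function names and the sample connections are all hypothetical; the real dataset ships these features precomputed. The sketch assumes the history is ordered oldest first.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Connection:
    timestamp: float  # seconds since the start of the capture
    dst_host: str
    service: str


def time_based_same_host(history: List[Connection], current: Connection,
                         window_s: float = 2.0) -> int:
    """Count earlier connections to the same destination host in the past `window_s` seconds."""
    return sum(
        1 for c in history
        if c.dst_host == current.dst_host
        and 0 <= current.timestamp - c.timestamp <= window_s
    )


def host_based_same_host(history: List[Connection], current: Connection,
                         window_n: int = 100) -> int:
    """Count connections to the same destination host among the last `window_n` connections."""
    return sum(1 for c in history[-window_n:] if c.dst_host == current.dst_host)


history = [
    Connection(0.0, "10.0.0.5", "http"),
    Connection(7.5, "10.0.0.5", "http"),
    Connection(8.2, "10.0.0.9", "ftp"),
    Connection(9.0, "10.0.0.5", "ssh"),
]
current = Connection(9.4, "10.0.0.5", "http")

print(time_based_same_host(history, current))  # 2
print(host_based_same_host(history, current))  # 3
```

The time window misses the connection at 0.0 seconds because it is too old, while the connection-count window still sees it. That is why slow attacks that spread their connections over time are more visible in the host-based features.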
The researchers considered several types of GAN. To avoid the non-convergence and instability that GANs are known for, they use the Wasserstein loss, giving an architecture known as the Wasserstein GAN. In IDSGAN, the generator changes a specific subset of features to produce adversarial traffic data for an attack on the IDS. The discriminator is trained to imitate the black-box IDS and to guide the training of the generator. The researchers say in the paper, “The black-box IDS is implemented by machine learning algorithms to detect attacks. By making the weight parameters of the generator different from the IDS in the training, the adversarial examples can be generated to evade the detection of IDS.” The discriminator’s classification results are used to train the generator. For the discriminator training, the adversarial malicious records are derived from the NSL-KDD data.
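The two ideas in this paragraph, changing only selected features and training with a Wasserstein-style loss, can be sketched in plain Python. This is a rough illustration under stated assumptions, not the authors' implementation: the mask, the sample values, the function names and the sign convention of the losses are all choices made here.

```python
from statistics import mean
from typing import List, Sequence


def apply_masked_perturbation(original: Sequence[float],
                              generated: Sequence[float],
                              modifiable: Sequence[bool]) -> List[float]:
    """Take the generator's value only where the mask is True; keep the original elsewhere."""
    return [g if m else o for o, g, m in zip(original, generated, modifiable)]


def critic_loss(scores_normal: Sequence[float], scores_attack: Sequence[float]) -> float:
    """Wasserstein-style loss for the discriminator (assumed sign convention).

    Minimising it pushes scores up for records the IDS labelled as attacks
    and down for records it labelled as normal.
    """
    return mean(scores_normal) - mean(scores_attack)


def generator_loss(scores_adversarial: Sequence[float]) -> float:
    """Minimising the mean score makes adversarial records look normal to the discriminator."""
    return mean(scores_adversarial)


# Hypothetical record: the first two features are left untouched by the mask.
original = [0.0, 1.0, 0.35, 0.80]
generated = [0.9, 0.2, 0.10, 0.05]
modifiable = [False, False, True, True]

adversarial = apply_masked_perturbation(original, generated, modifiable)
print(adversarial)                            # [0.0, 1.0, 0.1, 0.05]
print(critic_loss([0.25, 0.5], [1.5, 1.0]))   # -0.875
print(generator_loss([1.5, 1.0]))             # 1.25
```

In a real setup the scores would come from a neural network and the losses would be minimised by gradient descent; the sketch only shows what quantities each side is trying to push down.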
Experiments And Results
The researchers implemented IDSGAN in PyTorch and ran and evaluated it on a Linux PC with an Intel Core i7-2600. IDSGAN is trained with a batch size of 64 for 100 epochs. The learning rate for both the generator and the discriminator is 0.0001, and the dimension of the noise vector is 9. The table below shows the performance of IDSGAN in different attacks.
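For readers who want to reproduce the setup, the reported settings could be collected in a small configuration object like the one below. The numeric defaults come from the paragraph above; the class name, the helper functions, the uniform noise distribution and the record count of 1000 are assumptions made for illustration.

```python
import math
import random
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class IDSGANConfig:
    batch_size: int = 64
    epochs: int = 100
    generator_lr: float = 1e-4
    discriminator_lr: float = 1e-4
    noise_dim: int = 9


def batches_per_epoch(num_records: int, batch_size: int) -> int:
    """Number of mini-batches needed to cover the data once (last batch may be partial)."""
    return math.ceil(num_records / batch_size)


def sample_noise(dim: int, rng: random.Random) -> List[float]:
    """Draw one noise vector. Assumption: values are uniform in [0, 1)."""
    return [rng.random() for _ in range(dim)]


cfg = IDSGANConfig()
noise = sample_noise(cfg.noise_dim, random.Random(0))

print(len(noise))                               # 9
print(batches_per_epoch(1000, cfg.batch_size))  # 16
```

The noise vector is what gets combined with a malicious record as input to the generator, so its dimension directly affects the size of the generator's input layer.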
The results show that the technique introduced by the researchers can effectively fool today’s intrusion detection systems. The researchers said, “In the future, we will further focus on the improvement of IDSGAN. This improvement will concentrate on two aspects: first, we will apply IDSGAN in more categories of intrusion attacks; second, for the definitive aim of the IDS development, the increase of the IDS’s robustness is our critical work.”