This project will develop a self-learning agent based on Deep Reinforcement Learning (DRL) to explore the attack surface of a virtual machine (VM). It tests whether new zero-day vulnerabilities in IT systems can be discovered with the aid of deep-learning-based artificial intelligence (AI).
A task common to all Security Operations Centre (SOC) teams is to monitor network activity and detect attacks targeting IT systems. They are aided in this task by a series of autonomous intrusion detection systems (IDSs), trained to identify suspicious events. These systems are, however, developed from intelligence on previously identified threats. This reliance on a priori knowledge has two major drawbacks: it does not address successful attacks that went undetected, nor does it capture new ones. The result is an inherent delay between the detection of new threats and the integration of the relevant intelligence into IDSs, a delay that can sometimes be counted in years. In addition, detection systems based on AI usually need existing data sets for training, and these data sets are not always straightforward to acquire because of the requirements attached to them.
In contrast, the proposed agent would be able to proactively discover new, possibly unknown weaknesses in the target IT system. New intelligence will emerge from the learning process, and the attack vectors targeting the discovered vulnerabilities may exhibit previously unseen behaviour. The agent will thus make it possible to adapt and improve detection capability proactively, protecting the IT system under consideration (here a VM) from as-yet-unknown threats.
The project will produce a DRL-based agent that explores the attack surface of the chosen target VM. Its success will be measured by its ability to discover vulnerabilities of the VM autonomously.
This project carries an inherent risk, linked to the difficulty of describing the system state and the possible actions, the scale of the space to explore, and the design of a reward function and a policy that favour medium- to long-term objectives. If it succeeds, however, such an agent brings a considerable strategic advantage for the defence of IT systems. The value of a self-learning AI able to discover new zero-days can hardly be overstated, since it makes it possible to develop a detection system with emergent intelligence. This project should be seen as a first step: a proof of concept that could be extended and generalised within a much larger project to explore the attack surface of many systems and networks. This would considerably improve the security posture of whoever deploys the resulting detection system to monitor their ICT systems.
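To make the point about medium- to long-term objectives concrete, the following minimal sketch shows how a discount factor changes which behaviour looks preferable. It is illustrative only: the reward values, the two episodes and the function name `discounted_return` are assumptions made for this example and are not part of the project design.

```python
def discounted_return(rewards, gamma):
    """Return sum_t gamma**t * rewards[t] for one episode."""
    total = 0.0
    # Accumulate from the last step backwards: G_t = r_t + gamma * G_{t+1}
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Hypothetical episodes (assumed values, for illustration only):
# several costly probing steps followed by a large reward at the end ...
probe_then_find = [-1.0, -1.0, -1.0, -1.0, 100.0]
# ... versus a small immediate gain and nothing afterwards.
quick_small_gain = [5.0, 0.0, 0.0, 0.0, 0.0]

for gamma in (0.1, 0.99):
    long_term = discounted_return(probe_then_find, gamma)
    short_term = discounted_return(quick_small_gain, gamma)
    preferred = "probe_then_find" if long_term > short_term else "quick_small_gain"
    print(f"gamma={gamma}: preferred episode is {preferred}")
```

With a small discount factor the delayed reward is almost invisible to the agent, whereas a value close to 1 makes the longer probing sequence the better choice. Choosing this trade-off is one part of the reward and policy design risk described above.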
The SLATE project will consist of training an AI agent in a virtualised environment. A specific VM will be deployed on a server and subjected to the actions of a DRL-based agent trying to find vulnerabilities. The main outcome of the project is a proof of concept: a demonstration that it is possible to build a DRL-based agent capable of finding vulnerabilities in an IT system without supervision, knowledge injection, or reliance on pre-existing attack tools or attack traces.
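The interaction between agent and target described above can be sketched as a standard reinforcement-learning loop. The example below is a toy under stated assumptions, not the SLATE design: `ToyTargetEnv`, its four abstract actions, the hidden action sequence and the reward values are all hypothetical, and the policies shown are a random one and a scripted one rather than a trained deep network.

```python
import random


class ToyTargetEnv:
    """Hypothetical stand-in for the target VM (not a real system model).

    The 'vulnerability' is triggered by a hidden sequence of abstract actions.
    The observation is how many steps of that sequence have been matched.
    """

    N_ACTIONS = 4
    SECRET = (2, 0, 3)  # assumed sequence, chosen only for illustration

    def __init__(self, max_steps=20):
        self.max_steps = max_steps
        self.reset()

    def reset(self):
        self.progress = 0
        self.steps = 0
        return self.progress

    def step(self, action):
        self.steps += 1
        if action == self.SECRET[self.progress]:
            self.progress += 1
        else:
            self.progress = 0  # a wrong action loses all progress
        found = self.progress == len(self.SECRET)
        reward = 10.0 if found else -0.1  # assumed reward shaping
        done = found or self.steps >= self.max_steps
        return self.progress, reward, done


def run_episode(env, policy):
    """Run one episode; return (total reward, whether the flaw was found)."""
    state = env.reset()
    total, done = 0.0, False
    while not done:
        action = policy(state)
        state, reward, done = env.step(action)
        total += reward
    return total, state == len(env.SECRET)


rng = random.Random(0)
random_policy = lambda state: rng.randrange(ToyTargetEnv.N_ACTIONS)
scripted_policy = lambda state: ToyTargetEnv.SECRET[state]  # knows the answer

print("random:  ", run_episode(ToyTargetEnv(), random_policy))
print("scripted:", run_episode(ToyTargetEnv(), scripted_policy))
```

In this sketch a learning agent would replace `random_policy` and would have to discover the sequence from rewards alone, which mirrors the requirement that no prior knowledge or attack traces are supplied. A real target would expose a far larger state and action space than this toy.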