The secret of AI security: delivering slightly less than 100%
Artificial intelligence is rapidly becoming an indispensable part of modern cyber security technologies. However, without decent training, artificial intelligence (AI) can deliver undesired results.
By Wim De Smet, CTO SecureLink Belgium.
Not too long ago, we detected and analyzed malware samples in a sandbox environment. Once the analysis was complete, the captured malware was easy to detect and block: new detection rules were rolled out to computer systems, protecting them against the latest threats.
Nowadays, this approach has become outdated. Hackers can disguise malware quickly and easily, which means they can reuse a single exploit in thousands or even millions of samples. Manually discovering and blocking all of these samples in time is simply impossible. As soon as a security researcher has dissected a particular type of malware, a new one has already entered the playing field.
Artificial intelligence is the solution to this conundrum. When SecureLink first started looking at AI as a security enabler, many doubted the approach. Today, every self-respecting security company uses AI and machine learning to enhance its security solutions. Using machine learning, algorithms can be trained to detect malware not on the basis of specific signatures, but on general traits that malware samples share. By looking at these traits, even completely new malware can be flagged and stopped. AI is then used to detect and stop threats, both known and unknown, in real time.
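The idea of scoring general traits instead of matching exact signatures can be sketched in a few lines. This is a deliberately simplified illustration, not SecureLink's actual model: the trait names, weights, and threshold below are invented for the example, and a real system would learn its weights from millions of labeled samples.

```python
# Toy trait-based malware scoring. All feature names and weights are
# illustrative assumptions, not a real product's model.

def extract_traits(sample: dict) -> list[float]:
    """Map a (hypothetical) static-analysis report to numeric traits."""
    return [
        sample.get("entropy", 0.0),                    # packed/encrypted code tends to be high-entropy
        float(sample.get("imports_crypto", False)),    # crypto APIs are common in ransomware
        float(sample.get("writes_autorun_key", False)),# persistence is a strong malware signal
        float(sample.get("signed", False)),            # a valid signature lowers suspicion
    ]

# In a real system these weights would be learned from labeled samples.
WEIGHTS = [0.5, 1.0, 2.0, -1.5]
THRESHOLD = 3.0

def is_suspicious(sample: dict) -> bool:
    score = sum(w * t for w, t in zip(WEIGHTS, extract_traits(sample)))
    return score >= THRESHOLD

ransomware_like = {"entropy": 7.8, "imports_crypto": True, "writes_autorun_key": True}
office_tool = {"entropy": 4.2, "signed": True}

print(is_suspicious(ransomware_like))  # True
print(is_suspicious(office_tool))      # False
```

Because the score is built from behavior-like traits rather than byte patterns, a brand-new sample that nobody has dissected yet can still land above the threshold.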
Three layers of security
We first implemented AI and machine learning in our next-gen endpoint protection offering, SecurePrevent Endpoint. Machine learning is not only effective at the endpoint level; it is useful in all layers of security. An AI- and machine-learning-based sensor could, for example, use its intelligence to detect indicators of malware in network traffic, while a differently trained AI could do something similar while analyzing logs.
The effectiveness of AI-driven algorithms is, however, completely dependent on the quality of the training. It's important to make sure that the machine learning model you're using is mature enough. Machine learning in itself isn't hard. Training models to a level at which they're reliable enough to be used in critical environments: that is hard.
Put differently: if the data sets you're using to train your algorithms aren't representative and of the highest quality, your security solution will teach itself the wrong patterns. That way, malware can slip through the cracks, or legitimate software can erroneously be blocked as malware. A unique environment therefore calls for tailor-made training and guidance by human experts.
Training your algorithm
Herein lies the strength of an integrator. Imagine a large organization using home-brewed legacy software for mission-critical applications. Software like that often has properties similar to those of malware code. An insufficiently trained AI would therefore be inclined to block it, harming operations. A well-thought-out integration of AI takes the possibility of false positives seriously. By first running any solution in a 'learning mode', we can evaluate false positives and update the training of a security system. This means we can teach an AI solution to trust certain software, which pushes the algorithm to look for a meaningful difference between the legacy software and real malware.
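The learning-mode feedback loop described above can be sketched as follows. This is a hedged sketch under assumptions: the class and method names (`LearningMode`, `observe`, `review`) are invented for illustration and are not a real product API.

```python
# Sketch of a "learning mode" pass: flag suspicious software instead of
# blocking it, collect the flags for human review, and allowlist confirmed
# false positives so the system learns to trust them.

from dataclasses import dataclass, field

@dataclass
class LearningMode:
    allowlist: set = field(default_factory=set)
    flagged: list = field(default_factory=list)

    def observe(self, file_hash: str, suspicious: bool) -> str:
        if file_hash in self.allowlist:
            return "allow"                 # previously reviewed and trusted
        if suspicious:
            self.flagged.append(file_hash) # recorded for review, not blocked
            return "flag"
        return "allow"

    def review(self, file_hash: str, is_false_positive: bool) -> None:
        """A human analyst's verdict feeds back into what the system trusts."""
        if is_false_positive:
            self.allowlist.add(file_hash)

mode = LearningMode()
mode.observe("legacy-app-hash", suspicious=True)        # flagged, still runs
mode.review("legacy-app-hash", is_false_positive=True)  # analyst: false positive
print(mode.observe("legacy-app-hash", suspicious=True)) # allow
```

In a production deployment the review step would also feed labeled examples back into model training, rather than just maintaining an allowlist.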
The goal is a tailor-made integration of AI security that blocks 100% of known and unknown threats. In the real world, 100% is difficult to achieve, even though the detection ratio of our SecurePrevent Endpoint comes close. The machine learning model powering the solution is updated frequently, but even the 2015 model can still stop current malware to this day. Cylance was one of the few solutions capable of blocking the infamous WannaCry and NotPetya attacks, even before the malware was discovered.
Boosting the result all the way to 100% isn't difficult, but it brings with it the uninviting prospect of false positives. These are annoying, hurt productivity, and can even encourage users to turn off their security solution altogether. The real challenge is getting as close to 100% as possible without ever going too far.
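The trade-off can be made concrete with a small threshold sweep. The scores below are invented purely for illustration, but the shape of the result is real: lowering the detection threshold pushes the detection rate toward 100% while dragging benign software along with it.

```python
# Illustrative numbers only: sweep a detection threshold over scored
# samples to show the trade-off between detection rate and false positives.

malware_scores = [0.95, 0.90, 0.85, 0.60]  # higher = more suspicious
benign_scores  = [0.10, 0.20, 0.55, 0.70]  # legacy tools can score high too

def rates(threshold: float) -> tuple[float, float]:
    detected  = sum(s >= threshold for s in malware_scores) / len(malware_scores)
    false_pos = sum(s >= threshold for s in benign_scores) / len(benign_scores)
    return detected, false_pos

for t in (0.8, 0.65, 0.5):
    d, f = rates(t)
    print(f"threshold {t}: detection {d:.0%}, false positives {f:.0%}")
```

With these toy numbers, reaching 100% detection (threshold 0.5) means erroneously flagging half of the benign software, while a stricter threshold of 0.8 catches 75% of the malware with no false positives at all. That is exactly the "slightly less than 100%" sweet spot the article argues for.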