Explainable AI
This report examines the limitations of current machine learning techniques for detecting
attacks. We surveyed the literature for systems that combine semantics with machine
learning techniques. Hutchins et al. [1] adapted the military concept of the kill
chain, a systematic process for attacking an adversary, to describe the stages of cyberattacks.
Similar kill chain models (also known as intrusion or cyber kill chains) are often used to
represent attacks at a high level [2] [3] [4]. We therefore considered such models as a
possible line of approach for our work.
Problem
Previous work on AI for cybersecurity improved detection accuracy by working
on the statistical aspects of a system (feature extraction [5] [6], model combination [7]).
Current tools may nevertheless raise many false alarms. After a series of false alarms,
human experts may become overloaded and dismiss subsequent alarms as false as well.
Security analysts often consider a flow to be either legitimate or illegitimate, but dealing
with the grey zone can be tricky. For example, a network scan can be legitimate if it is
performed by an administrator; in other cases, it can be the starting point of an attack.
A human expert needs contextual information to decide whether an alarm is false or
whether to remain on guard. Current security tools cannot make this distinction. In
addition, a human analyst is able to relate events to the goal of an attacker (stealing data,
denying service). That is why we want to investigate how to add semantics to our system.
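The network scan example can be illustrated with a minimal sketch of context-aware triage. Everything here is an assumption made for illustration: the `Alert` type, the `ADMIN_HOSTS` set, and the two labels are hypothetical and do not describe any existing tool.

```python
from dataclasses import dataclass

# Hypothetical context: hosts from which administrators are known to scan.
ADMIN_HOSTS = frozenset({"10.0.0.5"})


@dataclass(frozen=True)
class Alert:
    source_ip: str
    kind: str  # e.g. "network_scan"


def triage(alert, admin_hosts=ADMIN_HOSTS):
    """Label an alert using one piece of context: who triggered it."""
    if alert.kind == "network_scan" and alert.source_ip in admin_hosts:
        return "likely_benign"
    # Without supporting context, the alert stays in the grey zone.
    return "investigate"
```

The same scan yields a different label depending on its source, which is the kind of nuance a purely statistical detector does not encode.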
State of the art
Ricard and Owezarski [11] use an ontology to describe the communication between a
vehicle and the rest of the world (entities). The ontology reduced the number of features
(extracted for flows, frames, and packets). Inference rules then eased anomaly detection
and provided contextual information.
Wu et al. [12] treated network traffic as a character stream of words and assumed that
normal and attack traffic can be distinguished by their sequences of words. They encoded
raw traffic data (rather than feature vectors, to preserve the semantics) and used a
classifier to distinguish normal from attack traffic. They also proposed an encoding for
feature vectors, but anomaly detection performance was worse than with encoded raw
data. One possible explanation is that the raw data contained information that helped
detection but was absent from the extracted features.
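To make the idea of treating raw traffic as words concrete, here is a minimal sketch. It is not the encoding proposed in [12]: the fixed-size overlapping byte windows and the function names are assumptions chosen only for illustration.

```python
from collections import Counter


def payload_to_words(payload, n=2):
    """Split a raw payload into overlapping n-byte 'words' (hex strings)."""
    return [payload[i:i + n].hex() for i in range(len(payload) - n + 1)]


def word_counts(payload, n=2):
    """Bag-of-words representation that a classifier could consume."""
    return Counter(payload_to_words(payload, n))
```

Because the words come directly from the bytes, no hand-crafted feature extraction step discards information before the classifier sees it.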
Sarnovsky and Paralic [13] describe a system combining a knowledge model with multiple
machine learning models. The system uses a taxonomy of attack types and subtypes. Their
ontology allows navigating between those attack types and querying the relevant machine
learning models. The classification process is as follows: first, the system separates
attack traffic from normal traffic. Then, it traverses the taxonomy to select the relevant
attack classes and the appropriate models to detect those classes. Finally, the ontology is
queried to select all the possible subtypes of the attack and their related models.
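The three-step process described above can be sketched as follows. This is a simplified illustration, not the implementation of [13]: the taxonomy is a plain dictionary rather than an ontology, and the threshold-based toy models, feature names, and class names are all hypothetical.

```python
# Hypothetical taxonomy: attack class -> possible subtypes.
TAXONOMY = {"dos": ["syn_flood", "udp_flood"], "probe": ["port_scan"]}

# Toy stand-ins for trained models; each maps a flow (dict) to True/False.
def is_attack(flow):
    return flow["pkts_per_s"] > 100 or flow["distinct_ports"] > 20

CLASS_MODELS = {
    "dos": lambda f: f["pkts_per_s"] > 100,
    "probe": lambda f: f["distinct_ports"] > 20,
}
SUBTYPE_MODELS = {
    "syn_flood": lambda f: f["syn_ratio"] > 0.8,
    "udp_flood": lambda f: f["udp_ratio"] > 0.8,
    "port_scan": lambda f: f["distinct_ports"] > 20,
}


def classify(flow, taxonomy=TAXONOMY):
    """Return the path through the taxonomy, from coarse to fine."""
    # Step 1: attack/normal separation.
    if not is_attack(flow):
        return ["normal"]
    # Step 2: select the attack class using the class-level models.
    for attack_class, class_model in CLASS_MODELS.items():
        if class_model(flow):
            # Step 3: query only the subtype models listed for that class.
            for subtype in taxonomy.get(attack_class, []):
                if SUBTYPE_MODELS[subtype](flow):
                    return ["attack", attack_class, subtype]
            return ["attack", attack_class]
    return ["attack"]
```

The returned path doubles as an explanation: an analyst can see at which level of the taxonomy the decision was made.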
Attack lifecycle
Some seemingly benign operations may be steps of an attack (e.g. scans or SMB
communication). Raising an alert early enough to give security analysts time to respond
while avoiding flooding them with false alerts is difficult.
To detect and prevent cyberattacks, work has been done to describe their lifecycle
[2] [3]. The MITRE ATT&CK Framework [3] describes the steps of an attack and lists
techniques associated with each step. Milajerdi et al. [2] propose a system that detects
ongoing attacks from low-level events and generates a high-level graph that summarizes
the attack steps.
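As a rough illustration of mapping low-level events to high-level attack steps, the sketch below groups events per host and reports the stages in order of first appearance. It is far simpler than the system in [2]: the event names and the event-to-stage mapping are hypothetical, and the output is an ordered list rather than a graph.

```python
from collections import defaultdict

# Hypothetical mapping from low-level event types to high-level stage names.
EVENT_TO_STAGE = {
    "port_scan": "Reconnaissance",
    "smb_session": "Lateral Movement",
    "large_upload": "Exfiltration",
}


def summarize_stages(events):
    """Return, per host, the ordered list of distinct stages observed.

    `events` is an iterable of (timestamp, host, event_type) tuples.
    Events with no known stage are ignored.
    """
    stages = defaultdict(list)
    for _, host, event_type in sorted(events):
        stage = EVENT_TO_STAGE.get(event_type)
        if stage is not None and stage not in stages[host]:
            stages[host].append(stage)
    return dict(stages)
```

A summary such as `["Reconnaissance", "Lateral Movement", "Exfiltration"]` for one host is easier for an analyst to act on than the individual events that produced it.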
Currently, our models do not use previous detections to support the current detection.
By contrast, hierarchical temporal memory (HTM), used in [11], keeps and uses information
from previous stages to detect anomalies.
An attack may consist of a rare succession of individually frequent events. As noted
above, other works have tried to identify stages that are common to every attack ([3]
describes multiple attack scenarios). We wonder whether an AI can extract knowledge
about attack patterns and follow an attack in an automated way.
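One simple way to score a "rare succession of frequent events" is to estimate transition probabilities between consecutive events from past traffic and measure how surprising a new sequence is. The sketch below is only an assumption about how this could be done (a first-order Markov model with add-alpha smoothing); it is not taken from the cited works.

```python
import math
from collections import Counter


def fit_transitions(sequences):
    """Count consecutive event pairs and how often each event starts a pair."""
    pair_counts = Counter()
    first_counts = Counter()
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            pair_counts[(a, b)] += 1
            first_counts[a] += 1
    return pair_counts, first_counts


def surprise(seq, pair_counts, first_counts, vocab_size, alpha=1.0):
    """Average negative log-probability of the transitions in `seq`.

    Add-alpha smoothing keeps unseen transitions at a small, non-zero
    probability, so they raise the score instead of making it infinite.
    """
    total, steps = 0.0, 0
    for a, b in zip(seq, seq[1:]):
        p = (pair_counts[(a, b)] + alpha) / (first_counts[a] + alpha * vocab_size)
        total -= math.log(p)
        steps += 1
    return total / steps if steps else 0.0
```

With such a score, a sequence made only of common events can still stand out when their order has rarely or never been observed.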
We will look into works that generalize the functioning of an attack and try to characterize
sequences of packets.
Conclusion
We believe current machine learning techniques alone may not be enough to capture the
logic of a cyberattack. This is why we started investigating works that combine semantics
with machine learning techniques. Previous work has sought to generalize the lifecycle of
an attack. Mapping low-level events to attack stages allows the information to be
presented in a way that is easier for a human expert to understand.
However, we wonder whether some attacks can escape the logic described in the
literature. For this reason, we would like to investigate the use of machine learning
techniques to identify attack patterns.
References
1. Hutchins, E. M., Cloppert, M. J., & Amin, R. M. (2011). Intelligence-Driven Computer
Network Defense Informed by Analysis of Adversary Campaigns and Intrusion Kill Chains.
14.
2. Milajerdi, S. M., Gjomemo, R., Eshete, B., Sekar, R., & Venkatakrishnan, V. (2019).
HOLMES: Real-Time APT Detection through Correlation of Suspicious Information Flows.
2019 IEEE Symposium on Security and Privacy (SP), 1137–1152.
https://doi.org/10.1109/SP.2019.00026
5. Kiran, M., Wang, C., Papadimitriou, G., Mandal, A., & Deelman, E. (2020). Detecting
anomalous packets in network transfers: Investigations using PCA, autoencoder and
isolation forest in TCP. Machine Learning, 109(5), 1127–1143.
https://doi.org/10.1007/s10994-020-05870-y
6. Dromard, J., Roudière, G., & Owezarski, P. (2017). Online and Scalable Unsupervised
Network Anomaly Detection Method. IEEE Transactions on Network and Service
Management, 14(1), 34–47. https://doi.org/10.1109/TNSM.2016.2627340
7. Vanerio, J., & Casas, P. (2017). Ensemble-learning Approaches for Network Security
and Anomaly Detection. Proceedings of the Workshop on Big Data Analytics and Machine
Learning for Data Communication Networks, 1–6.
https://doi.org/10.1145/3098593.3098594
8. Holzinger, A., Kieseberg, P., Weippl, E., & Tjoa, A. M. (2018). Current Advances, Trends
and Challenges of Machine Learning and Knowledge Extraction: From Machine Learning
to Explainable AI. In A. Holzinger, P. Kieseberg, A. M. Tjoa, & E. Weippl (Eds.), Machine
Learning and Knowledge Extraction (Vol. 11015, pp. 1–8). Springer International
Publishing. https://doi.org/10.1007/978-3-319-99740-7_1
9. Kathareios, G., Anghel, A., Mate, A., Clauberg, R., & Gusat, M. (2017). Catch It If You
Can: Real-Time Network Anomaly Detection with Low False Alarm Rates. 2017 16th IEEE
International Conference on Machine Learning and Applications (ICMLA), 924–929.
https://doi.org/10.1109/ICMLA.2017.00-36
10. Le, D. C., & Zincir-Heywood, N. (2021). Anomaly Detection for Insider Threats Using
Unsupervised Ensembles. IEEE Transactions on Network and Service Management,
18(2), 1152–1164. https://doi.org/10.1109/TNSM.2021.3071928
11. Ricard, Q., & Owezarski, P. (2020, January). Ontology Based Anomaly Detection for
Cellular Vehicular Communications. 10th European Congress on Embedded Real Time
Software and Systems (ERTS 2020).
12. Wu, Z., Wang, J., Hu, L., Zhang, Z., & Wu, H. (2020). A network intrusion detection
method based on semantic Re-encoding and deep learning. Journal of Network and
Computer Applications, 164, 102688. https://doi.org/10.1016/j.jnca.2020.102688
13. Sarnovsky, M., & Paralic, J. (2020). Hierarchical Intrusion Detection Using Machine
Learning and Knowledge Model. Symmetry, 12(2), 203.
https://doi.org/10.3390/sym12020203