Classification of Malware Attacks Using Machine Learning in Decision Tree
Abstract
Predicting cyberattacks using machine learning has become imperative since cyberattacks have
increased exponentially due to the stealthy and sophisticated nature of adversaries. To have
situational awareness and achieve defence in depth, using machine learning for threat prediction
has become a prerequisite for cyber threat intelligence gathering. Some approaches to mitigating
malware attacks include the use of spam filters, firewalls, and IDS/IPS configurations to detect
attacks. However, threat actors are deploying adversarial machine learning techniques to exploit
vulnerabilities. This paper explores the viability of using machine learning methods to predict
malware attacks and build a classifier to automatically detect and label an event as “Has
Detection or No Detection”. The purpose is to predict the probability of malware penetration and
the extent of manipulation on the network nodes for cyber threat intelligence. To demonstrate the
applicability of our work, we use a decision tree (DT) algorithm to learn the dataset for evaluation.
The dataset comes from the Microsoft malware prediction challenge on the Kaggle website. We identify probable
cyberattacks on the smart grid and use attack scenarios to determine penetrations and manipulations.
The results show that ML methods can be applied in a smart grid cyber supply chain environment
to detect cyberattacks and predict future trends.
1. INTRODUCTION
The unpredictable nature of cyberattacks and the cascading effects of cybercrimes on the
business system have made it difficult for organizations to predict endpoint attacks. ML assists in
recognizing attack patterns using datasets of previous attacks to predict future attack trends and
responses [1]. Endpoints are the third-party vendor systems, workstations, servers, handheld
mobile devices and AMI devices. Malware attacks have been intensified by the distributed nature of the smart
grid in supply chain systems. Adversaries are using cyberattacks such as cross site scripting, cross
site request forgeries, session hijacking and remote access trojan attacks to commit cybercrimes
such as modification of software, manipulation of online services, manipulation of electronic
products, diversion of e-products and other security misconfigurations. Ford and Siraj 2015
highlighted different issues in the applications of machine learning in cybersecurity, including detecting
phishing and network intrusion, testing security properties of protocols and smart energy
consumption profiling [2].
to learn dataset to detect power system disturbance and cyberattack discrimination [1]. Mohasseb
et al. 2019 applied ML techniques to analyze a dataset from various organizations to improve
classification accuracies [5]. These works are important and contribute to detecting and predicting
cyberattacks using machine learning in the cybersecurity domain. However, there is a limited
focus on smart grid vulnerability from a supply chain perspective, and specifically on threats relating
to inbound and outbound chain contexts that need adequate detection to improve smart grid
security control and decision making.
In this paper, we use ML techniques to learn datasets and build a classifier to automatically
detect and label an event as Has Detection or No Detection. The rationale for choosing the DT
algorithm is that DT represents the major supervised schemes for ML in network security. We use
a dataset from Microsoft malware prediction [6] for our work. To demonstrate the effectiveness of
our approach, we adopt the decision tree algorithm to evaluate our data sets based on the attack
classifications.
The main contribution of this paper is threefold. Firstly, we identify probable cyberattacks on the
smart grid and the vulnerable spots that could be exploited through penetration and
manipulation, based on the telemetry dataset. Secondly, we use attack scenarios to determine the
penetration and the manipulations for the threat predictions on the endpoint nodes. Finally, we
use ML techniques to learn the dataset and use the DT algorithm to predict whether the endpoint
nodes can detect a cyberattack or not, labelled as Has Detection or No
Detection. The results show that ML algorithms using decision tree methods could be applied in
smart grid supply chain predictive analytics to detect cyberattacks and predict future trends.
The rest of the paper is structured as follows: Section 2 presents an overview of related works on
machine learning in the smart grid supply security domain and the existing classification
algorithms. Section 3 considers our approach to evaluating the ML techniques to learn the dataset
and the classification algorithms for the smart grid supply chain, the CPS smart grid infrastructure and
the vulnerable spots and probable attack scenarios. Further, it discusses the data
representation, feature descriptions and extractions as well as the classification algorithm.
Section 4 presents the implementation of the machine learning simulation process, evaluates the
performance of the classifier, determines the average accuracy of the model and predicts the
probability of penetrations on the endpoint nodes. Section 5 presents the results and analysis of
the DT that predicts the cyberattack initiated and the cybercrimes committed or not. Further, we
provide discussions of the several observations identified in the study. Finally, section 6 presents
a conclusion of the study, comparisons of existing works, limitations and future works.
2. RELATED WORKS
This section reviews related works and the state of the art of cybersecurity in machine learning
predictions, decision tree classifications and how they relate to malware attacks on the CPS
environment. This includes identifying previous classification approaches and the specific
datasets and prediction tasks used to classify malware. Sharma et al. 2012
proposed an ML technique for detecting variants of known worms in real-time systems [7].
Tsai et al. 2009 reviewed intrusion detection systems that use ML techniques and
various classifiers in the intrusion detection domain [8]. Wang et al. 2014 performed an empirical
study of adversarial attacks against ML models in the context of detecting malicious
crowdsourcing systems [9]. Bilge et al. 2017 proposed the Risk Teller system, which predicts cyber
incidents by analyzing malicious files and infection records according to the endpoint protection
software installed to determine machines that are at risk [10]. Canali et al. 2014 performed a
correlation analysis on the effectiveness of risk prediction based on user browsing behaviour by
leveraging ML techniques to provide a model that can be used to estimate the risk class of a
given user [4]. Barros 2015 posits that decision trees, and induction methods in general, arose
in machine learning to avoid the knowledge acquisition bottleneck for expert systems [11]. Villano, 2018,
proposed a method of classifying internet logs using ML techniques through a correlation and
normalization process and evaluated a DT algorithm that could predict an attack or not [12].
Soska & Christin 2014 proposed a complementary approach to automatically detect vulnerable
websites before they turn malicious by designing, implementing and evaluating a novel classification
system which predicts whether a given website could be compromised in future [3]. Hink et al.
2014 proposed an ML technique for power system disturbance and cyberattack discrimination by
evaluating various ML methods for an optimal algorithm that is accurate in its classifiers to predict
disturbance discriminators and implications [1]. Yavanoglu et al. 2017 presented a review of
cybersecurity datasets for ML algorithms by analyzing network traffic and detecting abnormalities
used for experiments and evaluation methods considered as baseline classifiers for comparisons
[13].
[Figure: decision tree structure. Labels: Branch, Internal Node.]
discrete outputs as the factors are provided by attribute-value pairs for strategic management
decision making. E.g. Results: Pass or Fail. Cyberattack: Internal or External. Temperature: Hot
or Cold. Outcome: Positive or Negative.
The DT algorithm can identify attribute pairs that were not considered initially in the
classification, such as the source of attack, but can work without those attributes to
minimize inferred errors.
The DT algorithm can handle datasets that have errors in the attribute values and resolve
classification errors in the training and test phases, such as a false positive (FP) output
when network traffic is normal or a false negative (FN) when network traffic is under
cyberattack. The discrete probability outputs provide results that predict ‘True or False’,
‘Yes or No’ and ‘A or B’ outcomes.
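The false positive and false negative cases described above can be counted directly from true and predicted labels. The following is a minimal Python sketch, assuming binary labels where 1 stands for attack traffic and 0 for normal traffic; the function name `confusion_counts` and the sample labels are illustrative assumptions and are not taken from the dataset.

```python
from collections import Counter

def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN and TN for binary labels (1 = attack, 0 = normal)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            counts["TP"] += 1
        elif pred == 1 and truth == 0:
            counts["FP"] += 1  # normal traffic flagged as an attack
        elif pred == 0 and truth == 1:
            counts["FN"] += 1  # attack traffic that was missed
        else:
            counts["TN"] += 1
    return {key: counts[key] for key in ("TP", "FP", "FN", "TN")}

# Hypothetical labels for eight events (not from the paper's dataset).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 0, 0, 1, 0]
print(confusion_counts(y_true, y_pred))
```

Keeping the four counts separate, rather than reporting a single accuracy figure, makes it visible whether the errors are mostly false alarms or mostly missed attacks.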
3. APPROACH
This section considers our approach to evaluating the ML techniques to learn the dataset and the
classification algorithms for the smart grid supply chain. We discuss the smart grid infrastructures
and the vulnerable spots, attack scenarios and the ML approach. The rationale for the ML
approach using DT to predict an attack is to determine the causal relationships amongst the
cyberattacks on a smart grid supply chain system and attempt to predict the malware using
probability distribution methods. Then, based on the classification analysis, we evaluate the
predictive method with appropriate metrics to verify the organizational goal and security goal as
we seek to determine whether a specific cyber threat phenomenon is likely to appear in a similar
event. There are several algorithms for building decision trees, such as ID3 and C4.5
[15]. We discuss the ML methods used, as well as the approaches used for the malware
prediction.
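To make the difference between the tree-building algorithms named above concrete: ID3 chooses splits by information gain (the reduction in entropy), and C4.5 normalizes that gain by the split information to obtain a gain ratio. The Python sketch below illustrates both criteria on class counts. The function names and the example counts are assumptions made for illustration, and the sketch is not the implementation used in this work.

```python
import math

def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent, children):
    """ID3 criterion: parent entropy minus the weighted child entropy."""
    total = sum(parent)
    weighted = sum(sum(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

def gain_ratio(parent, children):
    """C4.5 criterion: information gain divided by the split information."""
    split_info = entropy([sum(child) for child in children])
    if split_info == 0:
        return 0.0
    return information_gain(parent, children) / split_info

# Hypothetical counts: [attack, normal] at the parent and in two branches.
parent = [5, 5]
children = [[4, 1], [1, 4]]
print(round(information_gain(parent, children), 4))
print(round(gain_ratio(parent, children), 4))
```

The gain ratio penalizes attributes that split the data into many small branches, which plain information gain tends to favour.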
[Figure: smart grid infrastructure diagram. Labels: Command Center, Workstation, SCADA Server, Snort, SCADA Network, Sub Station, Switchboard, IED, Firewall, WAN Server, Communication Network, Third Party Vendors, Router, CSC Vendors Systems, Cyber Physical Generation, Threat Actors 1, 2 and 3.]
The adversary could cause cyberattacks (penetration) and cybercrimes (manipulation) on the
CPS. Cyberattacks such as remote access trojan, spear phishing, cross site scripting or session
hijacking could be launched on the intelligent devices and communication networks to penetrate firewalls, IDS/IPS
or the IEDs. After penetrating the system, the adversary could commit cybercrimes by
manipulating the system to cause resonance attacks, DDoS attacks, ID theft and intellectual
property theft, as well as take command and control to monitor and control the core business
processes and operations. We include these attack scenarios in the analysis to determine the
validity of the penetration and manipulations in real-time.
Network Attack: An XSS or session hijacking attack on the CSC network may provide
access to alter the smart metering system and change configurations using a distance
protection scheme to bypass controls in order to manipulate the software in the meter and
prevent the system from recording accurate purchases or billings.
Spyware Attack: The attacker could insert spyware or deploy a ransomware attack
remotely to shut the systems down when the antivirus is outdated and the software is
unpatched, and consequently affect the prepaid card settings and change the configurations
using a distance protection scheme so that the attacker can manipulate and prevent accurate
readings from valid purchases.
Ransomware Attack: The attacker could use reconnaissance and social engineering
tactics to gather intelligence and subsequently initiate a spear phishing attack on targeted
users to shut the system down until a ransom is paid.
Software Manipulation Attack: Most organizations fail to change the hard-coded
password after buying software off the shelf. The attacker could deploy session hijacking
techniques to exploit this vulnerability using advanced persistent threats and command &
control techniques to manipulate the system and consequently cause cybercrimes such
as intellectual property theft, ID theft and industrial espionage.
DDoS or Data Injection: The attacker deploys a DDoS attack that could consequently cause
voltage surges by inserting a rootkit into the OS server to cause a resonance attack on the
smart grid components that makes the power system oscillate.
Island Hopping Attack: On the CSC systems, vendors are more susceptible to
cyberattacks, and the perpetrators are using RAT and island-hopping attacks to gain
access to the major organizations on the supply chain.
Malware: The attacker could insert malware or spyware in the software that is bought off
the shelf that gives the developers access to the system whenever users are prompted to
update their software. That may cause software errors and subsequently lead to
application system manipulations.
performance. Using DT hierarchical data structures for supervised learning provides an input space
that is split into local regions in order to predict the dependent variable for decision making [11].
4. IMPLEMENTATION
This section discusses the implementation of the machine learning simulation process. The
purpose of the study is to use the DT algorithm to predict a cyberattack and label it as Has
Detection or No Detection. The dataset and the machine malware infections were gathered by
Microsoft Defender endpoint protection [6]. Each row in the dataset corresponds to a machine identifier and
indicates whether the Microsoft endpoint can detect malware attacks
on the node. As discussed in section 2, the DT algorithm learns from datasets to approximate
‘if then else’ decision rules and generates branches for the tree nodes and decision nodes. We
follow the process below to build the DT classifier for our prediction.
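As an illustration of what such ‘if then else’ decision rules look like once a tree has been learned, the Python sketch below walks a small hand-written tree for a single record. The tree, the feature names `smart_screen` and `av_products_installed`, and the thresholds are hypothetical and chosen only for illustration; they are not the tree learned in this study.

```python
def predict(node, record):
    """Follow if-then-else tests from the root until a leaf label is reached."""
    while "label" not in node:
        if record[node["feature"]] <= node["threshold"]:
            node = node["if_true"]
        else:
            node = node["if_false"]
    return node["label"]

# Hypothetical tree: internal nodes test one feature, leaves carry a label.
tree = {
    "feature": "smart_screen",
    "threshold": 6.5,
    "if_true": {"label": "Has Detection"},
    "if_false": {
        "feature": "av_products_installed",
        "threshold": 1.5,
        "if_true": {"label": "No Detection"},
        "if_false": {"label": "Has Detection"},
    },
}

# Hypothetical endpoint records.
machine_a = {"smart_screen": 3, "av_products_installed": 2}
machine_b = {"smart_screen": 8, "av_products_installed": 1}
print(predict(tree, machine_a))
print(predict(tree, machine_b))
```

Each internal node corresponds to one decision rule and each leaf to one of the two class labels, so a prediction is simply the path taken from the root to a leaf.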
Windows Defender. The rationale for using the dataset for our work is that the dataset does not
represent Microsoft customers' machines only, as it has been sampled to include a much larger
proportion of malware machines. Thus, we used the dataset for our work to determine whether
there is Has Detection or No Detection on various network nodes for threat predictions. Below are
some of the features from the metadata that are relevant for our work [6].
(2)
5. RESULTS
In this section, we present the analysis of the threat prediction investigation for the two
scenarios and the classification results. We discuss adversarial ML briefly and how adversaries
use ML techniques to exploit vulnerabilities. As discussed in section 3.3, the scenarios use the
DT algorithm to predict the cyberattack initiated and the cybercrime committed.
SCENARIO DT PREDICTIONS
ACCURACY 83% 100%
CYBERATTACKS           P      R      F      RESULTS
XSS/Session Jacking    0.89   0.41   0.75   82%
Spyware/Ransomware     0.89   0.58   0.85   87%
Spear Phishing         0.81   0.37   0.71   75%
Session Hijacking      0.71   0.39   0.64   65%
Rootkit/DDoS           0.66   0.37   0.68   55%
RAT/Island Hopping     0.67   0.30   0.74   68%
Ransomware/Malware     0.89   0.55   0.71   85%
Malware/Spyware        0.87   0.58   0.78   84%
DDoS                   0.78   0.36   0.65   66%
(P = Precision, R = Recall, F = F-Score)
6. DISCUSSION
Predicting cyberattacks in real-time is challenging due to factors such as the type of OS being used,
system refresh rates, time zones, running updates and data in transition. Attacks such as
ransomware or malware may impact the system depending on the OS being used, the origin of
the attack and the time zone. Threat actors could use adversarial machine learning
techniques to exploit vulnerabilities in ML threat predictions.
Ransomware attacks could affect CSC system platforms that use multiple smart screen monitors
by infecting a single screen and then propagating to other monitors on the same network
nodes, locking the screens during run time. Section 4.1 describes the dataset and how each row
in the dataset corresponds to a machine uniquely identified by a Machine Identifier gathered from
global machines that use Microsoft Windows Defender. The FLocker ransomware infects smart
screens and avoids detection as its code is constantly being rewritten to improve its routine
variants and meet changing trends. When launched, the malware identifies the country ID and the
machine ID and activates depending on the motives and intents of the adversary. Figure 4
identifies the vertical resolution rates of the various systems and how the infections propagate
through the systems during run time. The Y-axis indicates the extent of vertical infections and the
X-axis indicates the resolution rates of the infected systems. Malware or ransomware that is
embedded directly into the requested web page in the attack could propagate to other systems.
If the dataset (D) contains examples from n classes, then the Gini index, gini(D), is defined
as:
$$gini(D) = 1 - \sum_{i=1}^{n} p_i^2 \qquad (3)$$
where p_i is the probability of an object being classified to a particular class i, such as infected.
If a dataset (D) is split on the root (R) into two subsets D_1 and D_2, the Gini index gini_R(D) is
expressed as:
$$gini_R(D) = \frac{|D_1|}{|D|}\, gini(D_1) + \frac{|D_2|}{|D|}\, gini(D_2) \qquad (4)$$
The reduction in impurity for the split in the dataset was calculated as:
$$Gini(R) = gini(D) - gini_R(D) \qquad (5)$$
From the DT algorithm, we calculate the information gained after the malware (M) infection trend
test is applied on the smart screen for the classification. A weighted sum of Gini Indices was
calculated using the DT and generated the Has Detection and No Detection tree.
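The Gini index, the weighted Gini index of a split and the reduction in impurity described above can be sketched in a few lines of Python. This sketch assumes the standard definition of the Gini index as one minus the sum of squared class proportions. The root counts [1973, 2027] are the ones reported for the tree in this section, whereas the child counts are hypothetical and serve only to illustrate the calculation.

```python
def gini(counts):
    """Gini index of a node given its class counts: 1 - sum(p_i ** 2)."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)

def gini_split(children):
    """Weighted sum of the Gini indices of the subsets produced by a split."""
    total = sum(sum(child) for child in children)
    return sum(sum(child) / total * gini(child) for child in children)

def gini_reduction(parent, children):
    """Reduction in impurity: parent Gini minus the weighted child Gini."""
    return gini(parent) - gini_split(children)

# [Has Detection, No Detection] counts at the root, as reported in the text.
root = [1973, 2027]
# Hypothetical subsets D1 and D2 whose class totals add up to the root counts.
children = [[1500, 500], [473, 1527]]

print(round(gini(root), 4))
print(round(gini_split(children), 4))
print(round(gini_reduction(root, children), 4))
```

A root Gini close to 0.5 reflects the near-equal class distribution at the root; a split is useful to the extent that the weighted Gini of its subsets falls below that value.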
Figure 6 depicts the DT indicating the results of the Gini index used to measure the probability of
infections of a ransomware attack that may be wrongly classified. The DT root indicates a smart
screen rate of <= 6.5 with a split Gini of 0.5, indicating an equal distribution of the dataset. The
root of the tree has an initial dataset of 4000 as the sample size. The DT algorithm split the
samples into two sets: [1973, 2027]. From the analysis, 1973 were identified as Has Detection and hence
are not vulnerable to the attacks. However, 2027 were found to have No Detection and hence are
vulnerable to malware or ransomware attacks. The branch with Has Detection is indicated as
(True) and the other with No Detection is indicated as (False). The node identified as the
country identifier, with the class Has Detection, showed the values 2531 and 1648 from a
sample size of 3531. A sample size of 2955 has an antivirus product installed. However, the total
physical RAM has a No Detection rate of 1458. The DT split the sample size further until the values
were at the threshold. Figure 5 depicts the Gini index calculated and the information gained after the
DT test is applied.
The results from Scenarios 1 and 2 provide cyber threat intelligence as to what could happen in
the event of a cyberattack without the classifications of the detection rates in Figure 5.
Scenario 1 predicted a higher probability of penetration attacks on the endpoint nodes
after determining the harmonic mean between the Precision, Recall and F-Score with a
percentage score of:
The results indicate that the extent of manipulation to other integrated network systems could
be high with an average accuracy of 85% in a given event.
The extent of manipulations indicates the relevance of the classification of the
cyberattack. The threat intelligence indicates that it could result in cyberattacks such as
Industrial Espionage, Intellectual property theft, Advanced Persistent Threat and
Command & Controls.
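For reference, precision, recall and their harmonic mean are conventionally defined as follows in terms of true positives (TP), false positives (FP) and false negatives (FN). These are the standard definitions; the exact formula used to produce the reported scores is not stated here, so the correspondence is an assumption.

```latex
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F = \frac{2 \cdot P \cdot R}{P + R}
```

Because the harmonic mean is dominated by the smaller of the two values, a classifier with high precision but low recall still obtains a low F-Score.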
Further, various DT algorithms, models and techniques have been implemented using various
datasets for building intrusion detection, anomaly detection and threat prediction systems. Pournouri et al
(2017) proposed a cyber attack analysis using decision tree techniques to learn an open source
intelligence dataset for prediction and for improving cyber situational awareness [21]. Patel and
Prajapati (2018) proposed a study and analysis of decision tree based classification using ID3,
C4.5 and CART algorithms to learn a dataset to determine the best performance accuracy [22].
Moon et al (2017) proposed an intrusion detection system based on a decision tree using
analysis of attack behaviour information to detect the possibility of intrusion for preventing APT
attacks [23]. Sarker et al (2020) presented a machine learning based intrusion detection
security model called “IntruDTree” that evaluated various algorithms on a dataset by ranking the
security features according to their importance and then building a generalized tree for detecting
intrusions [24]. Das & Morris (2018) presented a survey of machine learning and data mining
methods for cybersecurity applications and analytics for intrusion detection and traffic
classifications in emails by evaluating the various classifications algorithms on a dataset for
performance accuracies [25]. Balogun & Jimoh (2015) proposed a hybrid of DT and KNN
algorithms to detect anomaly intrusions [26]. Malik et al (2018) used a hybrid of DT pruning and
BPSO algorithms for network intrusion detection [27]. Rai et al (2016) proposed C4.5 DT
algorithm to construct a model for intrusion detection [28]. Yeboah-Ofori & Boachie (2019)
presented malware attack predictive analytics using various ML classification algorithms in a
majority voting scheme for performance accuracies [29]. Ingre et al (2017) proposed a DT algorithm that
classifies an IDS dataset as normal or attack after learning and testing the dataset [30]. Relan
and Patil (2015) used a variant of C4.5 DT algorithm to implement an IDS by considering discrete
values for classifications [31].
However, none of these works explored the viability of using machine learning methods to predict
malware attacks and build a classifier to automatically detect and label an event as Has Detection
or No Detection in the smart grid supply chain domain to predict the probability of penetration and
the extent of manipulation on the network system nodes for cyber threat intelligence and
situational awareness.
7. CONCLUSION
Our work focused on using ML to learn the dataset and used the DT algorithm to determine whether
the classifier can predict an attack and label the attack as Has Detection or No Detection. In this
paper, we have used a malware prediction dataset from a well-known source for the learning task.
We have used the DT algorithm to model the infections. Although other algorithms can perform
the same task, the DT can handle datasets that have errors in the attribute values and
resolve classification errors in the training and test phases. Based on our results, the predictions were
83% accurate, and we concluded that the supervised learning model performed better in our predictions.
Descriptions of objects may include attributes based on measurement or subjective judgement,
both of which might give rise to errors in the values of the attributes. Some of the objects in the
training set may even have been misclassified. Take, for instance, a malware attack classification
rule from a collection of cyberattack events. An attribute might test for the presence of
propagation of an attack that might give a positive or negative reading at some point. However,
questions remain to be addressed as to what performance evaluation methods could provide
the best performance indicators for threat predictions and cyber threat intelligence gathering that
could provide security control mechanisms. There are limitations in our work, such as comparing
other classification algorithms for predictive analytics due to the invincibility nature of cyberattacks
and the cascading impacts on other system nodes.
Future Works
Future research will focus on using ML techniques on various classification algorithms to learn
the dataset for anomaly detection and to predict cyberattack trends. The approach will help
determine the best performance metrics for cyber threat intelligence and predict future trends.
8. REFERENCES
[1] C. R. B. Hink, J. M. Beaver, M. A. Bukner, T. Morris, U. Adhikari, S. Pan. “Machine Learning
for Power System Disturbance and Cyber-attack Discrimination”. 7th International Symposium
on Resilient Control Systems. IEEE Xplore. 10.1109/ISRCS.2014.6900095. (2014).
[2] V. Ford. A. Siraj. “Application of Machine Learning in Cyber Security”. Conference Paper.
Computer Science Department. Tennessee Tech University. (2014).
[3] K. Soska, N. Christin. “Automatically Detecting Vulnerable Websites Before They Turn
Malicious”. In Proceedings of the 23rd USENIX Security Symposium. Carnegie Mellon
University. ISBN 978-1-931971-15-7 (2014).
[4] D. Canali, L. Bilge, D. Balzarotti. “On the Effectiveness of Risk Prediction Based on User
Browsing Behaviour”. ACM 978-1-4503-2800-5/14/06.
http://dx.doi.org/10.1145/2590296.2590347. (2014). [Accessed 20/04/2020].
[5] A. Mohasseb, B. Aziz, J. Jung, and J. Lee, “Predicting Cyber Security Incidents Using
Machine Learning Algorithms: A case study of Korean SMEs”. University of Portsmouth
Research Portal. (2019).
[7] O. Sharma, M. Girolami J. Sventek, “Detecting Worm Variants using Machine Learning”.
DOI: 10.1145/1364654.1364657 (2007).
[8] C. Tsai, Y. Hsu, C. Lin, W. Lin. “Intrusion detection by machine learning: A review Expert
Systems with Applications”. 36.10, pp. 11994-12000, (2009).
[9] G. Wang. T. Wang. H. Zheng, B. Y. Zhao. “Man vs. Machine: Practical Adversarial Detection
of Malicious Crowdsourcing Workers”. In Proceedings of the 23rd USENIX Security
Symposium San Diego, CA, pp. 239–254, (2014).
[10] L. Bilge, Y. Han, M. D. Amoco, Risk Teller: Predicting the Risk of Cyber Incidents. ACM
ISBN 978-1-4503-4946-8/17/10. https://doi.org/10.1145/3133956.3134022 CCS (2017).
[Accessed 14/12/2019].
[13] O. Yavanoglu. M. Aydos. “A Review of Cyber Security Dataset for Machine Learning
Algorithms”. International Conference on Big Data, IEEE Xplore. DOI:
10.1109//BigData.2007.8258167. (2018).
[14] A. Boschetti. L. Massaron. “Python Data Science Essentials”. 2nd Edition. UK.
ISBN 978-1-78646-213-8. (2016).
[15] J. R. Quinlan. “C4.5: Programs for Machine Learning”. 16, 2333-240 Department of
Computer, John Hopkins University, Baltimore. MD21218. (1994).
[16] W. Wang, Z. Lu, “Cyber Security in Smart Grid: Survey and Challenges”. Elsevier. (2013).
[17] A. Yeboah-Ofori, S. Islam. “Cyber Security Threat Modeling for Supply Chain Organizational
Environments”. Future Internet, 11, 63, doi: 10.3390/611030063, (2019).
[18] Controller and Audit General: Investigation. “Wannacry Cyber-attack and The NHS”.
Department of Health. National Audit Office. UK (2017).
[19] A. Yeboah-Ofori. Islam, S. Brimicombe A: Detecting Cyber Supply Chain Attacks on Cyber
Physical Systems Using Bayesian Belief Network. International Conference on Cyber
Security and Internet of Things. (2019). DOI 10.1109/ICSIoT47925.2019.00014.
[20] Duan, E. (2016). FLocker Mobile Ransomware Crosses to Smart TV. Trend Micro. Security
Intelligence Blog. https://blog.trendmicro.com/trendlabs-security-intelligence/flocker-ransomware-crosses-smart-tv/
[Accessed 10/03/2020].
[21] S. Pournouri, B. Akhgar, P. S. Bayerl. “Cyber Attacks Analysis Using Decision Tree
Techniques for Improving Cyber Situational Awareness” International Conference on Global
Security, Safety and Sustainability. Springer. Vol.360. 2017. DOI: 10.1007/978-3-319-51064-
4_14.
[22] H. Patel, P. Prajapati. “Study and Analysis of Decision Tree Based Classification Algorithms”
International Journal of Computer Science and Engineering. 2018. DOI:
10.26438/ijcse/v6i10.7478.
[23] D. Moon, H. Im, I. Kim, J. H. Park. “DTB-IDS: An Intrusion Detection System Based on
Decision Tree Using Behavior Analysis for Preventing APT Attacks” Springer, The Journal of
Supercomputing 73 2881-2895. 2017. DOI: https://doi.org/10.1007/s11227-015-1604-8.
[25] R. Das, T. Morris. “Machine Learning in Cyber Security”. IEEE Xplore. International
Conference on Computer, Electronic and Communication Engineering. 2018.
DOI: 10.1109/ICCECE.2017.8526232.
[26] A. O. Balogun, R. G. Jimoh. “Anomaly Intrusion Detection Using in Hybrid of Decision Tree
And K-Nearest Neighbor”. Journal of Advances in Scientific Research & Application. 2015.
[27] A.J. Malik, F. A. Khan. “A Hybrid Technique Using Binary Particle Swarm Optimization and
Decision Tree Pruning for Network Intrusion Detection”. Cluster Computing. 21, 667–680.
2018. doi.org/10.1007/s10586-017-0971-8.
[28] K. Rai. M. S. Devi, A. Guleria. “Decision Tree Based Algorithm for Intrusion Detection”.
International Journal Advanced Networked Applications. Vol 7, Issue 04. Pages: 2828. 2016.
[29] A. Yeboah-Ofori, C. Boachie. “Malware Attack Predictive Analytics in a Cyber Supply Chain
Context Using Machine Learning” IEEE Explore. CSIoT pp. 66-77 2019, doi:
10.1109/ICSIoT47925.2019.00019.
[30] B. Ingre, A. Yadav, A. K. Soni “Decision Tree Based Intrusion Detection System for NSL-
KDD Dataset”. International Conference on Information and Communication Technology for
Intelligent Systems. 25–26, pp. 207–218. 2017.