Network Node Fault Identification Based On ML - Final
By
Under Supervision of
ECEN464: Network Node Fault Identification based on ML
Table of Contents
Abstract
Introduction
Literature review
    1. Network Fault Localization
        Rule-based techniques:
        Case-based techniques:
        Probability-based techniques:
        Model-based techniques:
    2. Types of Faults in WSN
        Based on their behavior
        Time-Based faults
        Based on components
    3. The Main Aspects of Faults Management Structure in WSNs
        Error Detection:
        Error Diagnosis:
        Error Recovery:
    4. Machine Learning for Network Fault Management
        Depend on ML in Anomaly Detection
        Depend on ML in Detecting Location-Based Faults
Methodology
Results
Conclusion
References
Appendix
Abstract
The increasing connectivity in modern networks introduces significant challenges in managing the vast
volume of data and the potential for link failures, which can disrupt services if not promptly addressed.
Traditional manual fault recovery techniques are often slow and inefficient. This paper presents ML-LFIL,
a machine learning-based method for fault localization and identification, leveraging traffic engineering
principles. ML-LFIL operates in three stages: link fault detection, differentiation between disconnections
and reconnections, and fault location determination. By utilizing Multi-Layer Perceptron neural networks,
Random Forests, and Support Vector Machines, ML-LFIL analyzes traffic metrics gathered through passive
monitoring, thereby avoiding the drawbacks associated with active probing. Extensive experiments across
various network topologies demonstrate that ML-LFIL achieves rapid and accurate fault detection and
Keywords: link fault detection; fault localization; machine learning; fault recovery techniques; fault
location determination.
Introduction
Managing networks now faces additional difficulties due to the increase in connectivity,
particularly with regard to the sheer amount of data and the possibility of link failures. If not
immediately fixed, these errors, whether they result from disconnections or reconnections, can cause
service disruptions. Manual fault recovery techniques that rely on programs like ping and
traceroute are time-consuming. As a result, effective fault management systems with quick
diagnosis and recovery times are vital. The accuracy of fault detection depends on the algorithms
employed and on the quality of the network data, which can be gathered actively or passively.
Although it is frequently employed, active probing increases network traffic and latency by having
measurement points in the network exchange control packets. By comparison, passive monitoring finds
errors without adding overhead by examining the traffic metrics already in place. By examining
network traffic attributes, machine learning, especially deep learning, offers a chance to address
these issues.
Machine Learning (ML) is especially suitable for managing complex systems. It enables
advanced data analysis and can help achieve network goals, including root-cause analysis and
failure localization, by learning from past data and predicting future responses. Recent
advancements in computational hardware, parallel computing, big data storage, and processing
Function Virtualization (NFV) platforms, have facilitated the application of ML to various networking
challenges.
In this work, we propose a traffic engineering-based machine learning method for fault
localization and identification. Our method, ML-LFIL, consists of three steps: link fault detection,
distinction between disconnections and reconnections, and fault location determination. We train
our model using Multi-Layer Perceptron neural networks, Random Forests, and Support Vector
Machines, which have shown efficacy in classification and regression tasks. Crucially, ML-LFIL
enables rapid fault identification and localization, even in large networks, by learning from real-time
data points. We validate our approach through extensive experiments on various network topologies,
To put it briefly, we have developed a machine learning model to understand traffic behavior
and link faults, adopted a passive monitoring approach, and extensively experimented to
demonstrate the effectiveness of our method. We present performance evaluations, describe our
methodology, summarize related work, and offer suggestions for further research.
Literature review
Link failures in networks might cause a link to detach and then reattach itself without any prompt
replacement. For example, if a wireless node switches access points, it becomes difficult to determine
whether a link has failed or been reconnected, as well as to pinpoint the location of the failed link. An
active probing strategy results in high communication overhead and latency, since investigating the
network by sending signaling messages over various paths takes a long time. Furthermore, a wireless
sensor network (WSN) is made up of several detection stations, also known as nodes, that work together to
perform a variety of tasks, including sensing, communicating, and computing. WSNs are widely used in
fields such as medicine, the military, intelligent security systems, and many more. However, they suffer
from many problems with dependability and fault tolerance. Many studies are being conducted to improve
WSNs' fault tolerance so that they can be used efficiently in critical applications. In this project, we
will use machine learning-based link fault identification.
1. Network Fault Localization
Many authors have presented techniques for localizing link faults. These techniques are broadly
categorized into four types: rule-based techniques, case-based techniques, probability-based
techniques, and model-based techniques.
Rule-based techniques:
These techniques depend on a knowledge base developed by system experts, which is effectively a
series of if-then statements, or the system's rules. However, rule-based systems cannot learn
adaptively, either from past experience or from network dynamics revealed by previously unseen
traffic behavior [2].
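A rule base of this kind can be pictured as a fixed list of if-then checks. The following is a minimal sketch only: the metric names (`latency_ms`, `packet_loss`), the fault labels, and the thresholds are hypothetical and are not taken from [2].

```python
# Illustrative rule base: each rule is (fault label, condition on a metrics dict).
# Metric names, labels, and thresholds are hypothetical, chosen only for this sketch.
RULES = [
    ("link_down", lambda m: m["packet_loss"] >= 1.0),
    ("congestion", lambda m: m["latency_ms"] > 200 and m["packet_loss"] > 0.05),
]


def diagnose(metrics):
    """Return the label of the first rule that fires, or 'healthy' if none does."""
    for label, condition in RULES:
        if condition(metrics):
            return label
    return "healthy"


print(diagnose({"latency_ms": 350, "packet_loss": 0.10}))
```

Because the rules are written by hand and never updated from data, traffic patterns that no rule anticipates simply fall through to the default label, which is the limitation noted above.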
Case-based techniques:
Mainly fault diagnosis by case-based techniques depends on the expert and experience
Probability-based techniques:
These techniques diagnose faults on the basis of likelihood. The probability mass functions
associated with the links show where the link faults are located in the network [2].
Model-based techniques:
These techniques build a mathematical model from a knowledge base to describe the network behaviors.
The model's anticipated traffic patterns are compared with the recently observed network traffic
behaviors. Network errors are identified when observed behaviors diverge from those predicted.
Therefore, in order to diagnose link problems effectively, model-based approaches need precise
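The comparison between predicted and observed behavior can be illustrated with a small sketch. It assumes the model's predictions are already available as a list of traffic values; the function name `detect_deviation` and the 20% relative tolerance are hypothetical choices for illustration, not part of any cited method.

```python
def detect_deviation(predicted, observed, tolerance=0.2):
    """Return the indices where observed traffic departs from the model's
    prediction by more than `tolerance`, measured relative to the prediction."""
    flagged = []
    for i, (p, o) in enumerate(zip(predicted, observed)):
        if p == 0:
            deviates = o != 0  # any traffic is a deviation when none was predicted
        else:
            deviates = abs(o - p) / abs(p) > tolerance
        if deviates:
            flagged.append(i)
    return flagged


predicted = [100.0, 120.0, 110.0, 105.0]  # hypothetical model output
observed = [98.0, 125.0, 40.0, 104.0]     # hypothetical measurements
print(detect_deviation(predicted, observed))
```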
2. Types of Faults in WSN
A wireless sensor network (WSN) is a new type of information acquisition and processing network.
It consists of a large number of low-power sensor nodes. The sensor nodes communicate through a wireless
network. WSN has been widely used in mechanical parameter detection, industrial monitoring, mine safety,
Faults in WSN can also be classified on the basis of behavior, time, component
and location:
a. Hard Faults: SNs are unable to communicate among themselves due to failure in certain
b. Soft Faults: In soft faults, SN can continue to work even in case of failures but sense,
humidity, cosmic rays, vibrations. These types of errors usually occur once and then disappear
b. Intermittent Fault: These types of errors do not occur continuously, as they appear and vanish
c. Permanent Faults: These faults include built-in defects such as faults in chip manufacturing and
burnt-out electronic components. The effects of permanent faults remain until the faulty
d. Potential Faults: This occurs due to depletion of hardware resources which ultimately reduces
the network lifespan. The most common is the energy depletion of the nodes that impacts the
lifetime of the node. SNs require energy for various operations such as data sensing, data
collection, communication and processing. Thus, it is necessary to charge or change the battery
An unexpected change in, or departure from, the normal, acceptable, or standard condition of one or
more system characteristics is referred to as a defect or fault. Three types of failures
occur in WSNs: node, network, and software failures (see figure 2) [3].
a. Node failure: it occurs because large numbers of nodes are deployed in harsh and/or
inaccessible outdoor environments, where they can easily be damaged or destroyed.
Furthermore, each node in a network has a
finite amount of energy that can run out. Sensor failures and low power readings could also be
b. Network failure: the SNs collect the information and transmit the data toward the sink through
communication link. Routing plays an important role in this. So, communication links and
routing layer are another cause of faults in WSN. Path faults, Radio interference,
problems also arise when an enormous number of SNs simultaneously transmit data
on the occurrence of events of interest. This can lead to packet loss. Thus, software
should be programmed so that the applied algorithms reduce congestion
problems [3, 4].
c. Software faults: these include issues brought on by software defects and crashes in the operating
system's processes. While WSNs are frequently impacted by this kind of failure, the likelihood
of it happening is low in comparison to other failures. Understanding a general fault model and
diagnosis methodology is necessary before delving deeper into the fault diagnosis idea [3].
3. The Main Aspects of Faults Management Structure in WSNs
The fault management structure in WSNs consists of three stages: error detection, error diagnosis,
and error recovery, as shown in Figure 3. The following subsections describe the three phases of the
fault management framework.
Error Detection:
Error or fault detection refers to identifying any unexpected failure or damaging forces that
affect a network’s or node’s optimum condition. Based on their performance, fault detection
methods are divided into three categories: centralized, self-supervision, and decentralized [5].
Error Diagnosis:
To apply the fault-tolerance concept correctly, the type of error and the faulty
nodes must be identified. It is important to identify the cause, kind, and effects of failures on the
health of the network [50]. One well-known method uses particular reference nodes at specified
geographic locations in the network to help other nodes determine their own location. To find and look into
network issues, it is necessary to monitor the WSN. Four types of monitoring exist: proactive,
Error Recovery:
The fundamental definition of "recovery" for WSNs is the reconstruction or restoration of the
network to prevent damaged nodes from impairing its optimal functioning. The process of
substituting an ideal state for a malfunctioning one is known as recovery. Depending on the defect,
two fault recovery techniques, forward recovery and backward recovery, may be applied [5].
4. Machine Learning for Network Fault Management
Machine learning (ML) for network node fault identification involves the use of algorithms and
techniques to automatically detect, classify, and predict faults or anomalies in network nodes. This
approach leverages historical data, network telemetry, and various features extracted from network
traffic to build models that can identify abnormal behavior indicative of faults or failures. Recently,
several works have proposed using machine learning techniques for network fault management [1, 3, 4].
There are several applications of ML related to fault management in optical networks. The affected
traffic could be restored by starting a restoration procedure, but it would be preferable to foresee these
degradations and identify the root cause of the (soft) failure so that the lightpath can be rerouted before
it is interrupted. It should be noted that failure localization is necessary in order to schedule maintenance
tasks and to exclude the failed resources from path computation. Proactive failure detection would also
provide planners more time to schedule the rerouting process, for example, during off-peak hours [1].
This section is divided into two parts. The first covers the detection of data-centric faults and the
second covers the detection of location-based faults.
Sensor defects such as erratic, hard-over, spike, drift, and stuck problems have been
categorized by the authors using Support Vector Machines (SVM). The SVM is trained using various kernel
functions in [20]. Increasing the number of features and the amount of training data improved the
classifier's effectiveness, but further increases led to overfitting. Cross-validation strategies are
employed to address the overfitting problem. Moreover, expanding the input sample size improves
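The combination of several kernel functions with cross-validation can be sketched as follows. This is an illustration under stated assumptions, not the setup of the cited work: the data are synthetic, the three kernels are simply common scikit-learn choices, and the feature scaling step is an addition of this sketch.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic stand-in for sensor readings: class 0 = normal, class 1 = faulty (shifted mean).
X = np.vstack([rng.normal(0.0, 1.0, (100, 4)), rng.normal(2.0, 1.0, (100, 4))])
y = np.array([0] * 100 + [1] * 100)

cv_scores = {}
for kernel in ("linear", "rbf", "poly"):
    clf = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    # 5-fold cross-validation: every sample is held out exactly once.
    cv_scores[kernel] = cross_val_score(clf, X, y, cv=5).mean()

for kernel, score in cv_scores.items():
    print(f"{kernel}: {score:.3f}")
```

Comparing the cross-validated scores, rather than the training accuracy, is what exposes a kernel or feature set that has started to overfit.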
The paper [4] focuses on communication link failure in WSNs because, even when all the
nodes are operating as intended, decisions may still be affected by communication link issues.
Consequently, link faults may split the network into separate sets of nodes, which defeats the purpose
and general functioning of the WSN. The Feedforward Neural Network (FFNN), which adapts and learns
through gradient descent, is the foundation of this automatic link failure detection
system. The parameters Packet Delivery Ratio (PDR) and latency are used to evaluate the quality of
networks. When latency is extremely high and PDR is extremely low, a link is deemed to have failed.
Neural networks use these parameters as input, or features. Testbed experiments conducted both
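The failure criterion on PDR and latency can be written down directly. The sketch below is a plain thresholding stand-in, not the FFNN of [4]; the function names and the two thresholds (`min_pdr`, `max_latency_ms`) are assumptions made for illustration.

```python
def packet_delivery_ratio(received, sent):
    """PDR = packets received / packets sent (0.0 when nothing was sent)."""
    return received / sent if sent else 0.0


def link_failed(received, sent, latency_ms, min_pdr=0.5, max_latency_ms=500.0):
    """Flag a link as failed when PDR is very low AND latency is very high.
    Both thresholds are hypothetical defaults."""
    return packet_delivery_ratio(received, sent) < min_pdr and latency_ms > max_latency_ms


print(link_failed(received=20, sent=100, latency_ms=900.0))  # low PDR, high latency
print(link_failed(received=98, sent=100, latency_ms=35.0))   # high PDR, low latency
```

A learned classifier replaces the fixed thresholds with a decision boundary fitted to labeled examples, while keeping the same two inputs as features.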
Methodology
Our methodology focuses on the application of Machine Learning techniques, specifically the K-
Nearest Neighbors (KNN) classifier, to identify faults in an IP network using the SOFI dataset.
This dataset includes network performance data collected over 649 hours, with faults induced
Data Preparation
The dataset consists of two separate CSV files, presumably representing data from two different
SOFI CoreSwitch-I.csv
SOFI CoreSwitch-II.csv
The class labels in these datasets, representing healthy (NE) and faulty (F) network states, are
converted from categorical to binary format (F=0, NE=1) for the application of ML algorithms.
data1['class'] = data1['class'].replace({'F':0,'NE':1})
data2['class'] = data2['class'].replace({'F':0,'NE':1})
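The same encoding can also be expressed with `Series.map`, which turns any label outside the mapping into NaN and so makes unexpected values easy to spot. The sketch below uses a small toy frame rather than the SOFI files; only the `class` column name and the F/NE labels come from the text above.

```python
import pandas as pd

# Toy frame standing in for one SOFI file; only the 'class' column mirrors the text.
toy = pd.DataFrame({"class": ["NE", "F", "NE", "NE", "F"]})

# Explicit mapping: faulty (F) = 0, healthy (NE) = 1. Unknown labels would become NaN.
toy["class"] = toy["class"].map({"F": 0, "NE": 1})

assert toy["class"].notna().all(), "unexpected class label in the data"
print(toy["class"].value_counts().to_dict())
```

Printing the label counts at this point also shows how balanced the two classes are, which matters when accuracy is the only reported metric.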
The dataset includes 34 attributes from which we exclude the class label for input features.
Different models are trained with variations in feature subsets to evaluate their impact on model
performance.
Initial Model: A KNN classifier is trained using all features except the class label.
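The train-on-one-switch, test-on-the-other setup can be sketched on synthetic data as follows. The column names `bits_received` and `response_time` and the generated values are hypothetical. The scaling step is an addition of this sketch and is not part of the reported experiments; it is included because KNN is distance-based, so features with large numeric ranges would otherwise dominate.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_switch_data(seed, n=200):
    """Synthetic stand-in for one core-switch file (column names are hypothetical)."""
    rng = np.random.default_rng(seed)
    label = rng.integers(0, 2, n)  # 0 = faulty, 1 = healthy
    return pd.DataFrame({
        "bits_received": rng.normal(1e6, 1e5, n) * (0.5 + 0.5 * label),
        "response_time": rng.normal(5.0, 1.0, n) + 10.0 * (1 - label),
        "class": label,
    })


train, test = make_switch_data(seed=1), make_switch_data(seed=2)
feature_cols = [c for c in train.columns if c != "class"]

# Scaling (not in the original experiments) keeps large counters from dominating the distance.
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
model.fit(train[feature_cols], train["class"])
accuracy = model.score(test[feature_cols], test["class"])
print(f"accuracy on the second switch: {accuracy:.3f}")
```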
Exclusion of Temporal Features: To evaluate the impact of temporal data (timestamp, range), these
Exclusion of ICMP and Packet-related Features: We train additional models excluding features
that might relate more to network performance variations than faults, such as ICMP ping responses
Further Reduced Feature Set: Further models test the exclusion of an increasingly larger set of
features deemed less relevant or redundant, based on prior analysis or domain knowledge.
model.score(testInputs.to_numpy(), testResults.to_numpy())
Hyperparameter Tuning
For the KNN algorithm, the choice of k (number of neighbors) is crucial. We iterate k from 1 to
29 to find the optimal k that maximizes the accuracy of the classifier on the test dataset.
results = []
for i in range(1, 30):
    model = KNeighborsClassifier(i)
    model.fit(trainInputs, trainResults)
    s = model.score(testInputs.to_numpy(), testResults.to_numpy())
    results.append([i, s])
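Because k is chosen here by its accuracy on the test dataset, the score reported for the best k is likely to be somewhat optimistic. One common alternative, which was not used in this report, is to select k by cross-validation on the training data only and keep the test data for a single final evaluation. The sketch below illustrates this on synthetic data; the data and the 5-fold setting are assumptions.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
# Synthetic two-class training data standing in for the CoreSwitch-I features.
X_train = np.vstack([rng.normal(0.0, 1.0, (150, 3)), rng.normal(2.5, 1.0, (150, 3))])
y_train = np.array([0] * 150 + [1] * 150)

search = GridSearchCV(
    KNeighborsClassifier(),
    param_grid={"n_neighbors": list(range(1, 30))},  # same range of k as in the text
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)  # the test data is never touched during the search

best_k = search.best_params_["n_neighbors"]
print(best_k, round(search.best_score_, 4))
```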
To determine the influence of each feature on the network's health classification, we isolate each
feature alongside temporal data, train a model using these features, and record the accuracy.
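results = {}
for col in data2.drop(['class', 'timestamp', 'SNMP_availability'], axis=1).columns:
    trainInputs = data1[[col, 'timestamp']]
    testInputs = data2[[col, 'timestamp']]
    model = KNeighborsClassifier(5)
    model.fit(trainInputs, trainResults)
    results[col] = model.score(testInputs.to_numpy(), testResults.to_numpy())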
Sorted results help identify the most and least predictive features.
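The ranking step can be illustrated on synthetic data where the answer is known in advance. In this sketch the two feature names are hypothetical, and each feature is evaluated on its own, without the timestamp column that the experiments pair it with.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier


def make_data(seed, n=300):
    """Synthetic data with one feature tied to the label and one that is pure noise."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, n)
    return pd.DataFrame({
        "informative": rng.normal(0.0, 1.0, n) + 4.0 * y,  # shifts with the label
        "noise": rng.normal(0.0, 1.0, n),                  # unrelated to the label
        "class": y,
    })


train, test = make_data(1), make_data(2)

scores = {}
for col in ["informative", "noise"]:
    model = KNeighborsClassifier(n_neighbors=5)
    model.fit(train[[col]], train["class"])
    scores[col] = model.score(test[[col]], test["class"])

# Highest accuracy first: the most predictive single feature leads the list.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
print(ranked)
```

A feature that scores close to the majority-class rate, as the noise column does here, carries little information on its own.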
extended network interface. This parameter highlights potential problems like buffer overflow,
hardware limitations, or configuration errors on the secondary interface, which could lead to data
network interface that contain errors. Errors in inbound packets can result from issues such as
signal degradation, interference, or faulty hardware. This metric is essential for diagnosing data
network interface that contain errors. This parameter is crucial for identifying data integrity issues
on the secondary interface, which can affect the accuracy and reliability of the received data,
P_Bits_received: Quantifies the total number of bits received by the primary network interface.
This metric provides insight into the volume of incoming data, helping to assess the network's
EX_Bits_received: Indicates the total number of bits received by the extended network interface.
This parameter reflects the incoming data volume on the secondary interface, providing valuable
information about the network's ability to manage and distribute traffic efficiently across multiple
interfaces.
interface that are discarded. Packet discards can occur due to network congestion, buffer overflow,
or misconfiguration. Monitoring this metric helps in identifying and resolving issues that impede
extended network interface. This parameter indicates potential problems such as network
congestion, hardware limitations, or misconfiguration on the secondary interface, which can affect
network interface that contain errors. Errors in outbound packets can result from hardware faults,
reliability.
network interface that contain errors. This metric is essential for identifying issues affecting the
accuracy and reliability of outgoing data on the secondary interface, which can lead to
communication failures.
P_Bits_sent: Quantifies the total number of bits sent by the primary network interface. This
parameter reflects the volume of outgoing data, providing insight into the network's capacity to
handle outbound traffic and identifying potential issues related to bandwidth or data transfer rates.
EX_Bits_sent: Indicates the total number of bits sent by the extended network interface. This
metric provides valuable information about the volume of outgoing data on the secondary
interface, helping to assess the network's ability to manage and distribute traffic efficiently.
P_Speed: Measures the operational speed of the primary network interface, expressed in bits per
second (bps). This parameter indicates the data transfer rate capability of the primary interface,
essential for evaluating the network's performance and capacity to handle high-speed data
transmission.
IN_Speed: Measures the speed of the inbound connection, reflecting the data transfer rate
capability for incoming traffic. This metric is crucial for assessing the network's ability to handle
and process incoming data efficiently, ensuring optimal performance and minimal delays.
EX_Speed: Measures the operational speed of the extended network interface, expressed in bits
per second (bps). This parameter indicates the data transfer rate capability of the secondary
interface, essential for evaluating the network's overall performance and ability to manage high-
P_Operational_status: Indicates the current operational state of the primary network interface,
showing whether the interface is active and functioning properly. This parameter helps in
monitoring the availability and reliability of the primary interface, ensuring continuous network
operation.
EX_Operational_status: Reflects the current operational state of the extended network interface,
indicating whether this secondary interface is active and functioning properly. This metric is
important for ensuring the availability and reliability of the secondary interface, supporting overall
network stability.
P_Interface_type: Identifies the type of the primary network interface, such as Ethernet, Wi-Fi, or
fiber optic. This parameter provides context about the physical or logical connection type, helping
in understanding the network's architecture and the capabilities of the primary interface.
IN_Interface_type: Identifies the type of interface used for inbound traffic, providing information
about the physical or logical connection for incoming data. This parameter is crucial for
understanding the network's architecture and the characteristics of the inbound data flow.
EX_Interface_type: Specifies the type of the extended network interface, such as Ethernet, Wi-Fi,
or fiber optic. This parameter provides context about the secondary connection type, helping in
understanding the network's architecture and the capabilities of the extended interface.
Device_uptime: Measures the total time the network device has been operational since its last
restart. This metric is indicative of the device's reliability and stability, helping in identifying
potential issues related to device performance and the need for maintenance or troubleshooting.
SNMP_Availability: Indicates the accessibility of the network device via SNMP. This parameter
is crucial for network management and monitoring, allowing administrators to collect and analyze
that are discarded. Discards can occur due to buffer overflow, misconfiguration, or network
congestion. This metric helps in diagnosing potential issues affecting the network's ability to
interface that contain errors. Errors in inbound packets can result from issues such as signal
degradation, interference, or faulty hardware. This metric is essential for diagnosing data integrity
IN_Bits_received: Quantifies the total number of bits received by the inbound interface. This
metric provides insight into the volume of incoming data, helping to assess the network's capacity
that are discarded. Packet discards can occur due to network congestion, buffer overflow, or
misconfiguration. Monitoring this metric helps in identifying and resolving issues that impede the
interface that contain errors. Errors in outbound packets can result from hardware faults, signal
reliability.
IN_Bits_sent: Quantifies the total number of bits sent by the inbound interface. This parameter
reflects the volume of outgoing data, providing insight into the network's capacity to handle
outbound traffic and identifying potential issues related to bandwidth or data transfer rates.
IN_Operational_status: Indicates the current operational state of the inbound interface, showing
whether the interface is active and functioning properly. This parameter helps in monitoring the
availability and reliability of the inbound interface, ensuring continuous network operation.
Class: Indicates the health status of the network, with 'F' representing a faulty network and 'NE'
representing a healthy network. This parameter is crucial for categorizing network performance
detection. This parameter offers additional context for the dataset's attributes, helping in the
Results
The network node fault identification using machine learning techniques was evaluated
through a series of experiments using a K-Nearest Neighbors (KNN) classifier. The experiments
involved training the model on data from SOFI CoreSwitch-I and testing on data from SOFI
CoreSwitch-II.
The initial model, with k=5, was trained on the full feature set from the first dataset and tested on
To determine the optimal number of neighbors (k), the model was evaluated for k ranging from 1
to 29. The performance of the model was measured and recorded for each k value as follows:
k     Score
1     0.9875
2     0.9833
3     0.9907
4     0.9897
5     0.9902
6     0.9898
7     0.9896
8     0.9894
9     0.9888
10    0.9888
11    0.9874
12    0.9873
13    0.9865
14    0.9862
The results indicated that the highest accuracy of 0.9907 was achieved with k=3, suggesting that
To assess the impact of feature reduction on model performance, several iterations of the model
Model 2: Dropped features timestamp and range. This model configuration tested the influence of
in an accuracy of 0.9902.
extensive reduction aimed to isolate the most critical features for fault identification and achieved
The feature importance evaluation aimed to identify which network metrics most
significantly impacted the accuracy of the KNN classifier in detecting network node faults. This
analysis was conducted by training and testing the model with each feature individually, alongside
the timestamp, and measuring the resulting accuracy. The features were then sorted based on their
The evaluation revealed that several features consistently contributed to high model accuracy,
while some had slightly less impact. Here are the details:
High-Impact Features:
Each of these features achieved an accuracy score of 0.9849. This suggests that the amount of data
being received and sent across different interfaces (primary, external, internal) is crucial for
- EX_Bits_received, IN_Bits_sent:
These features showed a slightly higher accuracy score of 0.9853, indicating that the volume of
data traffic on external and internal interfaces is highly indicative of network health and potential
faults.
Moderate-Impact Features:
0.9849. This indicates that both range-based metrics and ICMP-related measurements (such as
ping, loss, and response time) are critical for fault detection.
P_Inbound_packets_with_errors,EX_Inbound_packets_discarded,
EX_Inbound_packets_with_errors:
These metrics, with scores around 0.9849, emphasize the significance of monitoring errors and
Operational metrics such as device uptime and interface speeds (primary, internal, external) also
played a significant role, suggesting that continuous performance and speed metrics are vital for
fault identification.
Given the high impact of data traffic and error metrics, future models should prioritize
these features. This can help in creating more efficient and accurate models by focusing on the
The relatively close performance scores across different features suggest a degree of
redundancy. Models could potentially achieve similar accuracy with a reduced set of features,
ensures that the model can capture a comprehensive picture of network health.
Features related to ongoing operations like device uptime and speed are crucial. Continuous
monitoring of these metrics allows for real-time fault detection and proactive network
management.
The comprehensive list of features demonstrated that various network metrics had a
relatively similar impact on the model's performance, indicating a high level of feature redundancy.
Feature             Score
P_Bits_received     0.9848
EX_Bits_sent        0.9848
IN_Bits_received    0.9848
EX_Bits_received    0.9852
IN_Bits_sent        0.9852
Conclusion
The final sorted results from the feature importance analysis indicated that network
throughput and packet metrics were the most predictive of network node faults. This provided
Overall, the experiments demonstrated that the KNN classifier could effectively identify
network node faults with varying degrees of accuracy depending on the choice of k and the set of
features used. The feature reduction analysis highlighted the potential for improving model
efficiency without significantly compromising accuracy, while the feature importance evaluation
References
[1] L. Velasco and D. Rafique, “Fault management based on machine learning [invited],” Optica Publishing
Group, https://opg.optica.org/abstract.cfm?uri=OFC-2019-W3G.3
[2] Machine learning-based link fault identification and localization in complex networks | IEEE Journals
[3] Fault detection in wireless sensor network based on Deep Learning Algorithms,
https://www.researchgate.net/publication/351285887_Fault_Detection_in_Wireless_Sensor_Network_Based_on_Deep_Learning_Algorithms
[4] A Survey of Machine Learning for Network Fault Management,
https://www.researchgate.net/publication/350576388_A_Survey_of_Machine_Learning_for_Network_Fault_Management
[5] “[...] wireless sensor networks (WSNs): Survey, classification, and Future Directions,” Sensors (Basel,
Switzerland), https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9415276
Appendix
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

# Load the two SOFI files named in the Methodology section (adjust the paths as needed).
data1 = pd.read_csv('SOFI CoreSwitch-I.csv')
data2 = pd.read_csv('SOFI CoreSwitch-II.csv')

# Encode the class label: faulty (F) = 0, healthy (NE) = 1.
data1['class'] = data1['class'].replace({'F': 0, 'NE': 1})
data2['class'] = data2['class'].replace({'F': 0, 'NE': 1})

# Train on CoreSwitch-I and test on CoreSwitch-II.
trainResults = data1['class']
testResults = data2['class']

# Model 1: all features except the class label, k = 5.
trainInputs = data1.drop(['class'], axis=1)
testInputs = data2.drop(['class'], axis=1)
model = KNeighborsClassifier(5)
model.fit(trainInputs, trainResults)
model.score(testInputs.to_numpy(), testResults.to_numpy())

# Sweep k from 1 to 29 on the full feature set.
results = []
for i in range(1, 30):
    model = KNeighborsClassifier(i)
    model.fit(trainInputs, trainResults)
    s = model.score(testInputs.to_numpy(), testResults.to_numpy())
    results.append([i, s])

# Model 2: drop the temporal features.
model2 = KNeighborsClassifier(5)
trainInputs = data1.drop(['class', 'timestamp', 'range'], axis=1)
model2.fit(trainInputs, trainResults)
testInputs = data2.drop(['class', 'timestamp', 'range'], axis=1)
model2.score(testInputs.to_numpy(), testResults.to_numpy())

# Model 3: drop the ICMP features and the primary inbound discards.
# 'class' is dropped as well so that the label never reaches the model as an input.
drop3 = ['class', 'ICMP_ping', 'ICMP_loss', 'ICMP_response_time',
         'P_Inbound_packets_discarded']
model3 = KNeighborsClassifier(5)
trainInputs = data1.drop(drop3, axis=1)
model3.fit(trainInputs, trainResults)
testInputs = data2.drop(drop3, axis=1)
model3.score(testInputs.to_numpy(), testResults.to_numpy())

# Model 4: further reduced feature set.
drop4 = drop3 + ['P_Inbound_packets_with_errors', 'P_Bits_received',
                 'EX_Inbound_packets_discarded', 'EX_Inbound_packets_with_errors',
                 'Device_uptime', 'IN_Inbound_packets_discarded']
model4 = KNeighborsClassifier(5)
trainInputs = data1.drop(drop4, axis=1)
model4.fit(trainInputs, trainResults)
testInputs = data2.drop(drop4, axis=1)
model4.score(testInputs.to_numpy(), testResults.to_numpy())

# Per-feature evaluation: each feature is paired with the timestamp.
results = {}
candidates = data2.drop(['class', 'timestamp', 'SNMP_availability'], axis=1).columns
for col in candidates:
    trainInputs = data1[[col, 'timestamp']]
    testInputs = data2[[col, 'timestamp']]
    print(col)
    model = KNeighborsClassifier(5)  # k = 5, as in the Methodology section
    model.fit(trainInputs, trainResults)
    s = model.score(testInputs.to_numpy(), testResults.to_numpy())
    results[col] = s
sorted_results = dict(sorted(results.items(), key=lambda x: x[1]))
print(sorted_results)