On The Classification of Fog Computing Applications - A Machine Learning Perspective
Keywords: Fog computing, Edge computing, Cloud computing, Internet of things, Scheduling, Classes of service, Quality of service, Machine learning, Feature selection, Attribute noise, Classification algorithms

Abstract

Currently, Internet applications running on mobile devices generate a massive amount of data that can be transmitted to a Cloud for processing. However, one fundamental limitation of a Cloud is the connectivity with end devices. Fog computing overcomes this limitation and supports the requirements of time-sensitive applications by distributing computation, communication, and storage services along the Cloud to Things (C2T) continuum, empowering potential new applications, such as smart cities, augmented reality (AR), and virtual reality (VR). However, the adoption of Fog-based computational resources and their integration with the Cloud introduces new challenges in resource management, which require the implementation of new strategies to guarantee compliance with the quality of service (QoS) requirements of applications.

In this context, one major question is how to map the QoS requirements of applications onto Fog and Cloud resources. One possible approach is to discriminate the applications arriving at the Fog into Classes of Service (CoS). This paper thus introduces a set of CoS for Fog applications which includes the QoS requirements that best characterize these Fog applications. Moreover, this paper proposes the implementation of a typical machine learning classification methodology to discriminate Fog computing applications as a function of their QoS requirements. Furthermore, the application of this methodology is illustrated in the assessment of classifiers in terms of efficiency, accuracy, and robustness to noise. The adoption of a methodology for machine learning-based classification constitutes a first step towards the definition of QoS provisioning mechanisms in Fog computing. Moreover, classifying Fog computing applications can facilitate the decision-making process of the Fog scheduler.
1. Introduction

Cloud computing enables ubiquitous access to shared pools of configurable resources and services over the Internet that can be rapidly provisioned with minimal management effort (Mell and Grance, 2011). However, with the increasing relevance of the Internet of Things (IoT), mobile and multimedia applications, the transfer delays between the Cloud and an end device have been deemed too long, making connectivity the main limitation in the use of the Cloud (OpenFog Reference Architecture, 2017; Alkassab et al., 2017) for latency-sensitive and mobile applications (Bittencourt et al., 2017; Hu et al., 2017; Kumari et al., 2019).

The plethora of applications running on the Internet has heterogeneous processing and communication demands, as well as Quality of Service (QoS) requirements. While multimedia applications demand processing power and storage space (Byers, 2017), others, such as mission-critical applications, require strict response times. Mobile users need to have continuous access to applications when on the move. Moreover, IoT devices and sensors generate large amounts of data. Not all data need to be sent to the Cloud, while some data have to be processed immediately.

Fog computing aims at coping with these demands by hosting Cloud services on connected heterogeneous devices, typically, but not exclusively, located at the edge of the network (Bonomi et al., 2012; Wang et al., 2019). The Fog provides a geographically distributed architecture for computation, communication, and storage, which targets real-time applications and mobile services. End-users benefit from pre-processing of workloads, geo-distribution of resources, low latency responses, device heterogeneity (Bonomi et al., 2012), and location/content awareness (Deng et al., 2016). The Fog can support the diversity of application requirements in the Cloud to Things (C2T)
continuum, which is comprised of end devices, one or more levels of Fog nodes, and the Cloud. Fog nodes located at the edge of the network are usually limited in resources (Shao et al., 2019). Still, their use involves only brief delays in communication, while the Cloud has a large ("unlimited") number of resources but involves long delays in communication. On the lowest level of this continuum, the initial processing can be carried out, and results passed on to a higher layer in a Fog hierarchy, or to the Cloud itself, for further processing (Byers, 2017; Arkian et al., 2017).

Applications are usually composed of (dependent) tasks. The scheduling of tasks using C2T resources is much more challenging than that of tasks on grids (confined systems) (Batista and Fonseca, 2010; Batista and da Fonseca, 2011; Krauter et al., 2002; Xu et al., 2011) and on Clouds (more homogeneous systems) (Bittencourt et al., 2012; Fonseca and Boutaba, 2015; Genez et al., 2019; Tsai and Rodrigues, 2014; Kliazovich et al., 2016; Wu et al., 2015), due to the considerable heterogeneity of both the demands of applications and the capacity of the devices. Consequently, there is a need for schedulers to analyze various parameters before making decisions as to where tasks and virtualized resources should be run, including consideration of the availability of resources and their cost (Byers, 2017).

It is thus crucial for the efficient provisioning of services that the demands of applications arriving at the edge of the network be well understood and classified so that resources can be assigned for their processing. The mapping of applications onto Classes of Service (CoS) should facilitate the matching between task requirements and resources, since labeling these tasks removes the burden of the analysis of the application requirements by the scheduler. Without a precise classification, the scheduling of application tasks and the allocation of resources can be less than optimal, due to the complexity of dealing with the diversity of QoS and resource requirements. Mapping applications onto Classes of Service is typical in communication network technologies that support QoS, such as LTE, 5G (Ali et al., 2017), and ATM (Cohen et al., 1998) networks, and it is a key element for network providers to be able to offer different grades of service. Moreover, it is essential for the use of efficient classification algorithms to deal with the specific characteristics of C2T.

Techniques for the classification of network traffic have been extensively studied for the past few decades, especially for the provisioning of secure network services (Callado et al., 2009; Finsterbusch et al., 2014). However, very little attention has been paid to the classification of the demands and requirements of applications and services over the Internet (Zhong et al., 2004). Moreover, device mobility and the IoT have introduced new applications to the Internet, and these applications have not been considered in any of the existing classification schemes. This paper contributes the definition of a set of Classes of Service for Fog computing that takes into consideration the QoS requirements of the most relevant Fog applications, thus allowing the differentiation of the demands of a broad spectrum of applications.

The original contribution of this paper is a methodology for the classification of applications into Classes of Service, which considers their QoS requirements. This methodology can be used to design effective classifiers whose output can facilitate the job of schedulers of application tasks. In the scenario assumed in this paper, users subscribe directly or indirectly to Fog infrastructure services. The first packet of a flow contains the QoS requirements of the application generating the packet flow. The proposed classifier will then map this application into a CoS using the information provided in the first packet. The CoS can then be used by a Fog task scheduler to schedule application tasks and allocate resources to these tasks. To the best of our knowledge, no previous paper has addressed the classification of Fog applications. A case study dealing with the classification of a dataset containing Fog application features illustrates the use of this methodology. This methodology constitutes a first step towards the definition of a QoS provisioning framework to facilitate the definition of new business models in Fog computing.

The rest of this paper is structured as follows. Section 2 overviews related work. Section 3 proposes a set of classes of service for Fog computing, and provides a mapping between the recommended classes of service and the layers of the reference architecture presented by the OpenFog Consortium. Section 4 introduces the use of a typical machine learning-based methodology for Fog computing. Section 5 illustrates the implementation of the methodology described in Section 4 with a case study, which includes two scenarios differing by the place where noise is introduced (either in the training set or in the testing set). Finally, Section 6 concludes the paper and points out directions for future research.

2. Related work

Different studies have analyzed application requirements to develop service models for both Cloud and Fog computing. In Cloud computing, these studies have focused on Service Level Agreements (SLA) (Alhamad et al., 2010; Wu et al., 2013; Emeakaroha et al., 2010, 2012) and Quality of Experience (QoE) management (Hobfeld et al., 2012), while in Fog computing, investigations have emphasized processing and analytics for specific applications (Yang, 2017), scheduling of applications to resources (Cardellini et al., 2015), resource estimation (Aazam et al., 2016) and allocation (Wang et al., 2017; He et al., 2018) for the processing of applications, and service placement (Mahmud et al., 2019; Skarlat et al., 2017). Next, these studies are briefly described.

Alhamad et al. (2010) presented nonfunctional requirements of Cloud consumers, and defined the most important criteria for the definition and negotiation of SLAs between consumers and Cloud service providers.

Wu et al. (2013) developed a Software as a Service (SaaS) broker for SLA negotiation. The aim was to achieve the required service efficiently when negotiating with multiple providers. The proposal involved the design of counter-offer strategies and decision-making heuristics which considered time, market constraints, and trade-offs between QoS parameters. Results demonstrated that the proposed approach increases the profit by 50% and the customer satisfaction level by 60%.

Emeakaroha et al. (2010) presented an approach for mapping SLA requirements to resource availability, called LoM2HiS, which is capable of detecting future SLA violations based on predefined thresholds in order to avert these violations.

Emeakaroha et al. (2012) proposed an application monitoring architecture, named Cloud Application SLA Violation Detection architecture (CASViD). CASViD monitors and detects SLA violations at the application layer, and includes tools for resource allocation, scheduling, and deployment. Results showed that the proposed architecture is efficient in monitoring and detecting situations of a single SLA violation.

Hobfeld et al. (2012) discussed the challenges of QoE provisioning for Cloud applications, with emphasis on multimedia applications. The authors also presented a QoE-based classification scheme of Cloud applications aligned to the end-user experience and usage domain.

Yang (2017) investigated common components of IoT systems, such as stream analytics, event monitoring, networked control, and real-time mobile crowdsourcing, for defining an architecture for Fog data streaming.

Cardellini et al. (2015) modified the Storm data stream processing (DSP) system to operate in a geographically distributed and highly variable environment. To demonstrate the effectiveness of the extended Storm system, the authors implemented a distributed QoS-aware scheduling algorithm for placing DSP applications near to the data sources and the final consumers. The main limitation of this study is the instability of the scheduling algorithm, which negatively affects application availability. Results showed that the distributed QoS-aware scheduler outperforms the default centralized one, improving the application performance.
Aazam et al. (2016) developed a method, called MEdia FOg Resource Estimation (MeFoRE), to provide resource estimation on the basis of the service give-up ratio, the record of resource usage, and the required quality of service. The aim was to avoid resource underutilization and enhance QoS provisioning. The MeFoRE methodology uses real IoT traces and traces of the Amazon EC2 service.

Wang et al. (2017) presented an edge architecture, called mobile micro-Cloud, to provide situational awareness to processing elements. The authors introduced an approach for the consistent representation of application requirements for deployment in the mobile micro-Cloud environment.

He et al. (2018) introduced a QoE model which included user-oriented metrics, such as the Mean Opinion Score (MOS) and content popularity, as well as the cost of cache allocation and the transmission rate. The computed QoE value was used in a resource allocation problem formulated as a maximization problem solved by a shortest path tree algorithm. Results showed the benefit of using dynamic allocation to achieve high QoE values.

Mahmud et al. (2019) proposed a QoE-aware application placement policy that prioritizes placement requests according to user expectation and the available Fog capacity. Two fuzzy logic models were employed to map applications to resources. Requests for application placement consider metrics such as service access rate, required resources, and expected processing time. A linear optimization problem ensures that prioritized requests for application placement are mapped to Fog resources so that user QoE is maximized. Results indicated that the policy significantly reduces the processing time and improves resource availability and quality of service.

Skarlat et al. (2017) evaluated the placement of IoT services on Fog resources, taking into account QoS requirements. The authors proposed an approach for the optimal sharing of resources among IoT services by employing a formal model for Fog systems. The authors introduced the Fog Service Placement Problem (FSPP) for placing IoT services on virtualized Fog resources while taking into account constraints such as execution time deadlines. Results showed that the proposed optimization model prevents QoS violations and decreases the execution cost when compared to a purely Cloud-based solution.

The classification methodology introduced in this paper differs from the aforementioned proposals by the definition of a set of Classes of Service for Fog computing and the use of machine learning algorithms to map applications onto these classes. To our knowledge, this is the first study that introduces a machine learning classification methodology to discriminate Fog computing applications on the basis of QoS requirements. It is crucial for the efficient provisioning of services that the demands of applications arriving at the edge of the network be classified so that resources can be assigned for their processing. The related work in Fog computing reported above concentrates on resource allocation and the scheduling of applications. Most of the decisions on resource allocation and scheduling in those papers are limited to information on resource consumption by the applications. They do not consider several QoS requirements, as done in the present manuscript. Moreover, no previous paper has addressed the classification of applications on the Cloud to Things (C2T) continuum. Most of the work dealing with SLAs and QoS/QoE considers only the Cloud. The Fog layers in C2T will increase the capacity of the system to support new applications, especially those with real-time constraints, which cannot be handled by the Cloud.

3. Classes of Service for Fog applications

Fog computing enables new applications, especially those with strict latency constraints and those involving mobility. These new applications will have heterogeneous QoS requirements and will demand Fog management mechanisms to cope efficiently with that heterogeneity. Thus, resource management in Fog computing is quite challenging, calling for integrated mechanisms capable of dynamically adapting the allocation of resources. A very first step in resource management is to separate the incoming flow of requests into Classes of Service (CoS) according to their QoS requirements. The QoS requirements considered in this paper are described next.

Bandwidth. Some applications request a minimally guaranteed throughput, i.e., a Guaranteed Bit Rate (GBR). Multimedia applications are bandwidth sensitive, although some of them use adaptive coding techniques to encode digitized voice or video at a rate that matches the currently available bandwidth.

Delay sensitivity. Some applications involve a specific latency threshold below which latency must be assured, especially real-time applications.

Loss sensitivity indicates the proportion of packets which do not reach their destination.

Reliability is concerned with the ability of the Fog components to carry out the desired operation in the presence of many types of failure. Some applications need to have failed Fog components quickly reestablished so that tasks can be performed within some latency bounds.

Availability provides a measure of how often the resources of the Fog are accessible to end-users. High availability is needed by applications and services that must be running all the time, such as mission-critical applications.

Security refers to the design and implementation of authentication and authorization techniques to protect personal and critical information generated by end users.

Data location indicates where the application data should be stored. Data can be stored locally, at the end device itself; nearby, at a Fog node; or in a remote repository, in the Cloud. The data location requirements of an application depend on factors such as response time constraints, the computational capacity of each Fog layer, and the available capacity on network links.

Mobility is an intrinsic characteristic of many edge devices. Continuity of the offered services should be ensured, even for highly mobile end-users. Continuous connectivity is essential for the processing needed.

Scalability is related to the capability of an application to operate efficiently, even in the presence of an increasing number of requests from end users. The number of users in a Fog can fluctuate due to the mobility of the users, as well as the activation of applications or sensors. Streams of data in big data processing may need to be processed within a specific time frame. The demand on Fog nodes can fluctuate, and resource elasticity needs to be provided to cope with these demands.

The mapping of applications into a set of classes of service is the first step in the creation of a resource management system capable of coping with the heterogeneity of Fog applications. This paper proposes various classes of service for Fog computing: Mission-critical, Real-time, Interactive, Conversational, Streaming, CPU-bound, and Best-effort. These classes will be defined and the typical applications using these classes identified.

The first CoS to be discussed is the Mission-critical (MC) class. It comprises applications with a low event-to-action time bound, regulatory compliance, military-grade security, privacy, and applications in which a component failure would cause a significant increase in the safety risk for people and the environment. Applications include healthcare and hospital systems, medical localization, healthcare robotics, criminal justice, drone operations, industrial control, financial transactions, ATM banking systems, and military and emergency operations.

The Real-time (RT) class, on the other hand, groups applications requiring tight timing constraints in conjunction with effective data delivery. In this case, the speed of response in real-time applications is critical, since data are processed at the same time they are generated. In addition to being delay sensitive, real-time applications often require a minimum transmission rate and can tolerate a certain amount of data loss. This real-time class includes applications such as online gaming, virtual reality, and augmented reality.
The third class is denominated Interactive (IN). In this case, responsiveness is critical, the time between the user request and the actions manifested at the client being less than a few seconds. Moreover, the users of interactive applications can be end devices or individuals. Examples of applications belonging to this class are interactive television, web browsing, database retrieval, server access, automatic database inquiries by tele-machines, polling for measurement collection, and some IoT deployments.

The fourth class is the Conversational (CO) class. These applications include some of the video and Voice-over-IP (VoIP) applications. They are characterized by being delay-sensitive but loss-tolerant, with delays of less than 150 ms not being perceived by humans, delays between 150 and 400 ms being acceptable, and those exceeding 400 ms resulting in completely unintelligible voice conversations. On the other hand, conversational multimedia applications are loss-tolerant: occasional losses cause only occasional glitches in audio or video playback, and these losses can often be partially or fully concealed (Kurose and Ross, 2012).

The fifth class of service is Streaming (ST), which frees the user from having to download entire files, which could incur potentially long delays before playout begins. Streaming applications are accessed by users on demand and must guarantee interactivity and continuous playout to the user. For this reason, the most critical performance measure for streaming video is average throughput (Kurose and Ross, 2012). Additionally, streaming can refer to stored or live content. In both cases, the network must provide each flow with an average throughput that is larger than the content consumption rate. In live transmissions, the delay can also be an issue, although the timing constraints are much less stringent than those of conversational voice: delays of up to 10 s or so from when the user chooses to view a live transmission to when playout begins can be tolerated. Examples of streaming applications are high-definition movies, video (one-way), streaming music, and live radio and television transmissions.

The sixth class is the CPU-Bound (CB) class, which is used by applications involving complex processing models, such as those in decision making, which may demand hours, days, or even months of processing. Face recognition, animation rendering, speech processing, and distributed camera networks are examples of CPU-Bound applications.

The final class is that of Best-Effort (BE). It is dedicated to traditional best-effort applications over the Internet. For Best-effort applications, long delays are annoying but not particularly harmful; the completeness and integrity of the transferred data, however, are of paramount importance. Some examples of the Best-Effort class are e-mail downloads, chats, SMS delivery, FTP, P2P file sharing, and M2M communication.

Table 1 presents the relationship between the applications supported by Fog computing and the requirements of the classes of service explained above. The first column shows the recommended priority level of each class for potential adoption in scheduling systems. Table 2 shows the range of QoS requirement values for each class of service: Bandwidth (Kurose and Ross, 2012; Hobfeld et al., 2012), Reliability (Böhmer et al., 2011), Security (Khan et al., 2017), Data storage (Alhamad et al., 2010; Hobfeld et al., 2012), Data location (Alhamad et al., 2010; Hobfeld et al., 2012; Böhmer et al., 2011), Mobility (Böhmer et al., 2011), Scalability (Alhamad et al., 2010; Hobfeld et al., 2012; Böhmer et al., 2011), Delay sensitivity (Byers, 2017; Hobfeld et al., 2012), and Loss sensitivity (Ali et al., 2013). These ranges are used to generate the synthetic dataset of Fog applications and are employed in the training and testing samples used to evaluate the classifiers in this paper (Section 5).

The reference architecture proposed by the OpenFog Consortium in (OpenFog Reference Architecture, 2017) provides a structural model for Fog-based computation on several tiers of nodes. The tiers differ in relation to the amount and type of work which can be processed on them, the number of sensors, the capacity of the nodes, the latency between nodes, and the reliability and availability of nodes. Nodes at the edge are involved in sensor data acquisition/collection, data normalization, and command/control of sensors and actuators, while nodes that are closer to the Cloud aggregate and transform data into knowledge. As one moves further away from the edge, the overall intelligence and capacity of the system increase.

Fig. 1 presents a distributed multi-layer architecture, based on the OpenFog reference architecture, which is composed of four layers: the Cloud, at the top, a layer of end devices, at the bottom, and two intermediate Fog layers. Fig. 1 also provides a mapping between the proposed classes of service and a multi-layer Fog-Cloud architecture. The bottom layer, composed of IoT and end-devices, sends application requests to the classifier, located on the first Fog layer. An application request is composed of the workflow of tasks and their demands, as well as the QoS requirements of the application. The classifier identifies the CoS of the application and forwards it to the scheduler, which decides where the application should be processed, whether on the first Fog layer, on the second Fog layer, or in the Cloud.

Not all layers are involved in the processing of all tasks. Since Real-time, Interactive, Conversational, and Streaming applications, such as online sensing, object hyperlinking, video conferencing, and stored streaming, are delay-sensitive, these applications must be processed as close as possible to the end user, preferably at nodes located on the first and second Fog layers. CPU-bound applications require many processing resources and, for this reason, can involve all the layers of the reference architecture in the processing of tasks. Best-effort applications, such as e-mails, can be processed in the Cloud, since there are no delay constraints for this class.

The possibility of having a hierarchical layered system is one of the significant differences between Fog computing and edge computing. Edge computing is mainly concerned with bringing the computation facilities closer to the user, however, in a flat, non-hierarchical architecture (Mahmud and Buyya, 2016). A layered architecture can introduce additional communication overhead for processing tasks at different layers. However, it has been shown that, if the scheduling of tasks and resource reservations are properly carried out, processing in a hierarchical architecture can reduce communication latency and task waiting time when compared to a flat architecture (Chekired et al., 2018).

In the scenario assumed in this paper, users subscribe directly or indirectly to Fog infrastructure services. The first packet of a flow contains the QoS requirements of the application generating the packet flow. The proposed classifier will then map this application into a CoS using the information provided in the first packet. Alternatively, the first packet could already carry the CoS of the application. However, such an option would make the CoS adopted by the Fog provider rigid, preventing the redefinition of this CoS for the handling of new applications with unique QoS requirements.
Table 1
Class of Service and their requirements.
(For each Class of Service, listed in order of recommended allocation priority, Table 1 indicates the required bandwidth (GBR), the applicable levels of Reliability (low, important, critical), Security (low, medium, high), Data storage (transient, short duration, long duration), Data location (local, vicinity, remote), Mobility (low, medium, high), and Scalability (low, medium, high), whether the class is delay- and loss-sensitive, and representative applications: e.g., healthcare, criminal justice, financial, military, and emergency applications for MC; online gaming, IoT deployments, industrial control, and virtual and augmented reality for RT; interactive television, object hyperlinking, web browsing, and database retrieval for IN; and voice messaging and VoIP for CO.)
Table 2
Intervals of the QoS requirements for Fog computing.
QoS Requirements Nominal Categories Intervals Class of Service
MC RT IN CO ST CB BE
Bandwidth (Mbps) Low 0 < x ⩽ 1 ✓ ✓ ✓ ✓
Medium 1 < x ⩽ 5 ✓ ✓
High 5 < x ⩽ 1000 ✓
Reliability Low x = 1 ✓ ✓
Important x = 2 ✓ ✓ ✓ ✓
Critical x = 3 ✓
Security Low x = 1 ✓
Medium x = 2 ✓ ✓ ✓
High x = 3 ✓ ✓ ✓
Data storage (h) Transient 0 < x ⩽ 1 ✓ ✓ ✓ ✓
Short duration 1 < x ⩽ 730 ✓
Long duration 730 < x ⩽ 8760 ✓ ✓ ✓
Data location (ms) Local 0 < x ⩽ 10 ✓ ✓
Vicinity 10 < x ⩽ 20 ✓ ✓
Remote 20 < x ⩽ 100 ✓ ✓ ✓
Mobility (Km/h) Low 0 < x ⩽ 5 ✓ ✓ ✓ ✓ ✓
Medium 5 < x ⩽ 25 ✓ ✓
High 25 < x ⩽ 100 ✓ ✓ ✓
Scalability (No. of IoT users/end users) Low 0 < x ⩽ 60 ✓
Medium 60 < x ⩽ 120 ✓ ✓
High 120 < x ⩽ 200 ✓ ✓ ✓ ✓
Delay sensitivity (Interaction latency in ms) Low 1000 < x ⩽ 100000 ✓ ✓ ✓
Moderate 10 < x ⩽ 1000 ✓ ✓
High 0 < x ⩽ 10 ✓ ✓
Loss sensitivity (PELR) Low 10^-3 < x ⩽ 10^-2 ✓
Moderate 10^-6 < x ⩽ 10^-3 ✓ ✓
High 0 < x ⩽ 10^-6 ✓ ✓ ✓ ✓
4. Classification methodology based on machine learning

This section introduces a methodology for choosing and evaluating classifiers for Fog computing applications. It provides a step-by-step procedure for grounded choices of classifiers. Indeed, this methodology can be easily modified for the classification of applications in networked systems.

Classification techniques based on ML aim at mapping a set of new input data to a set of discrete or continuous valued outputs. Fig. 2 summarizes the key steps in the building of a classifier of Fog applications based on ML algorithms. In this paper, the classification steps were executed offline. Indeed, the best performing classifier evaluated in these steps can be executed online in an operational Fog.

The first step is the creation of a labeled dataset containing QoS attributes of Fog applications, which can be either real or synthetic. A real dataset is one collected from a system in operation, while a synthetic one involves data generated by models. Real-world datasets usually contain sensitive data (McGregor et al., 2004) and are often unavailable in order to preserve the confidentiality of user information. Thus, the use of synthetic datasets is quite common, especially in studies of systems yet to be built.

Since the values of the QoS attributes differ widely, these values should be pre-processed to produce compatible ranges of values for classification. Pre-processing includes operations for data transformation, which standardize and consolidate data into more appropriate forms for classification, while data reduction includes the selection and extraction of both features and examples in a database (García et al., 2015; Tan et al., 2005). Data normalization prevents an attribute with large values from dominating the results of the classification, thus improving the predictive power of the model. Feature selection, on the other hand, removes redundant and irrelevant data from the input of the classifier, without compromising critical attribute information (Tan et al., 2005).

Noise is an unavoidable problem when collecting data from real-world systems. It can change the knowledge extracted from the data set and affects the accuracy of the classification, as well as the building time, size, and interpretability of the classifier (Zhong et al., 2004; Zhu and Wu, 2004). Common sources of noise are channel capacity fluctuation, fluctuation in the availability of computational resources, imprecision inherited from measurement tools, and the inability to accurately estimate the true demands of applications.

In such noisy scenarios, robustness is considered more important than raw performance, because robustness gives a priori knowledge of the behavior to be expected from a learning method in the presence of noise whose level is unknown (García et al., 2015). Robustness (Huber, 1981) is defined as the capability of an algorithm to be insensitive to data corruption and, consequently, more resilient to the impact of noise. A copy of the original dataset should be contaminated by the introduction of noise at different levels to check the robustness of a classifier. In this paper, uniform attribute noise levels of 10%, 30%, and 50% are employed. The performance of the classifiers which learned from the original data set is compared to that of those which learned from a noisy data set. The most robust classifiers are those which learned from noisy data sets yet produced results similar to those learned from a noise-free data set (García et al., 2015).

Classification techniques based on ML can then be applied. The performance of the classifier should be assessed by a performance evaluation process, which encompasses both the measurement of performance and the results of statistical tests. The adequacy of performance is usually assessed by metrics such as accuracy, efficiency, and robustness. Statistical testing gathers evidence of the extent to which an evaluation metric on the resampled data sets is representative of the general behavior of the classifier (Naqa et al., 2015).

At this point, the classification model is ready to receive new input for scoring. The new data, however, must also be subjected to a pre-processing step.
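For concreteness, the steps above can be strung together in a short offline script. The sketch below is only an illustration of the pipeline, assuming scikit-learn; the corrupt argument stands in for the noise-injection step discussed above and is not part of any API used in the paper.

```python
# Illustrative offline pipeline for building and evaluating a CoS classifier
# (labeled dataset -> pre-processing -> optional noise injection -> training
# -> performance evaluation), assuming scikit-learn.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

def build_and_evaluate(X, y, corrupt=None, seed=0):
    # 90% of the samples for training, 10% for testing (as in Section 5.1).
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.1, random_state=seed, stratify=y)

    # Pre-processing: z-score normalization fitted on the training set only.
    scaler = StandardScaler().fit(X_tr)
    X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

    # Optionally contaminate a copy of the training set to probe robustness.
    if corrupt is not None:
        X_tr = corrupt(X_tr)

    # Train one candidate classifier and measure its accuracy on the test set.
    model = DecisionTreeClassifier(random_state=seed).fit(X_tr, y_tr)
    return model, accuracy_score(y_te, model.predict(X_te))
```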
5. Classification of Fog applications

This section illustrates, step by step, the methodology presented in the previous section for the classification of Fog applications using the CoS presented previously. Moreover, an example of a Decision Tree that classifies Fog computing applications from the values of their QoS requirements is provided at the end of this section.

5.1. Labeled dataset

To train and test the classifiers employed in this paper, we built a dataset (publicly available at http://bit.ly/34x6X1O) composed of 14,000 mutually exclusive applications generated from data in the intervals of values acceptable for each QoS requirement of the application. 90% of the data were reserved for training, while the remaining 10% were used for testing. It was assumed that each incoming application had additional fields containing nine QoS requirements, from now on referred to as "attributes": Bandwidth, Reliability, Security, Data storage, Data location, Mobility, Scalability, Delay sensitivity, and Loss sensitivity.

Attribute values were assigned by employing a uniform probability distribution within the intervals specified for each CoS in Table 2. An independent random number generator created the values of each attribute. Transient data were removed according to the Moving Average of Independent Replications procedure (Jain, 1991). Attribute values were made up of safe and borderline examples. Safe examples were placed in relatively homogeneous areas concerning the class label. Borderline examples, on the other hand, are located in the area surrounding class boundaries, where different classes overlap. Also, to estimate the robustness of the classifiers, a third group of attribute values, called noisy examples, was generated. The term noisy sample will be used in this paper to refer to the samples generated to represent the corruption of attribute values.

Fig. 3 illustrates the safe samples, labeled as S, the borderline examples, labeled as B, and the noisy samples, labeled as N. The continuous line shows the decision boundary between the two classes.
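As a rough illustration of how such a synthetic dataset can be generated, the fragment below draws attribute values uniformly within per-class intervals in the spirit of Table 2; the two classes and the interval values shown are placeholders, not the actual assignments of Table 2.

```python
# Draw synthetic application samples for a class of service by sampling each
# QoS attribute uniformly within an interval, in the spirit of Table 2.
# The classes, attributes, and intervals below are illustrative placeholders.
import numpy as np

INTERVALS = {
    "RT": {"bandwidth_mbps": (1.0, 5.0), "delay_ms": (0.0, 10.0), "mobility_kmh": (0.0, 5.0)},
    "BE": {"bandwidth_mbps": (0.0, 1.0), "delay_ms": (1000.0, 100000.0), "mobility_kmh": (0.0, 5.0)},
}

def generate_samples(cos, n, seed=0):
    rng = np.random.default_rng(seed)
    spec = INTERVALS[cos]
    samples = []
    for _ in range(n):
        sample = {attr: rng.uniform(low, high) for attr, (low, high) in spec.items()}
        sample["class"] = cos          # label used for supervised training
        samples.append(sample)
    return samples

dataset = generate_samples("RT", 5) + generate_samples("BE", 5)
```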
5.2. Pre-processing

Z-score normalization was used to adjust attribute values defined on different scales. The mean and standard deviation were computed on the training set, and the same mean and standard deviation were then used to normalize the testing set.
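A minimal sketch of this normalization step, assuming NumPy; the training-set statistics are reused unchanged for the testing set (and for any data scored later).

```python
import numpy as np

def zscore_fit(X_train):
    # Per-attribute mean and standard deviation, computed on the training set only.
    mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
    sigma = np.where(sigma == 0, 1.0, sigma)   # guard against constant attributes
    return mu, sigma

def zscore_apply(X, mu, sigma):
    # The same statistics normalize the testing set (and any new input).
    return (X - mu) / sigma

rng = np.random.default_rng(0)                 # placeholder data, 9 QoS attributes
X_train, X_test = rng.random((12600, 9)), rng.random((1400, 9))
mu, sigma = zscore_fit(X_train)
X_train_n, X_test_n = zscore_apply(X_train, mu, sigma), zscore_apply(X_test, mu, sigma)
```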
We reduced the number of input attributes to be used by the classification algorithms. This process, known as dimensionality reduction, removes irrelevant, redundant, and noisy information from the data, often leading to enhanced performance in learning and classification tasks (Roffo, 2016). Two techniques can be used for dimensionality reduction: the first uses feature selection techniques, such as Relief-F (Liu and Motoda, 2007), CFS (Guyon et al., 2002), MCFS (Cai et al., 2010), and the Student's t-test, which rank the given feature set so that the least significant features can be removed from the problem. The second involves feature extraction techniques, such as Principal Component Analysis (PCA), which create new features from the given feature set. The resulting number of features is smaller than that of the initial feature set.

In this paper, the significance of the impact of each input attribute on the system is determined utilizing PCA, guided further by a correlation analysis and the semantics of the Fog computing environment. The correlation analysis and its complementarity with the PCA are explained below.

The correlation analysis is a statistical evaluation technique used to study the strength of the dependence between pairs of attributes. Fig. 4 provides a graphic representation of the correlation matrix among the dataset attributes for Fog applications. The correlation matrix shows that there is a statistical association of more than 50% between the following variables: "Data storage" and "Data location" (0.623), "Data storage" and "Delay sensitivity" (0.596), "Loss sensitivity" and "Mobility" (0.563), "Bandwidth" and "Scalability" (−0.605), and "Data location" and "Delay sensitivity" (0.674). The symbol "−" in the correlation value between the attributes "Bandwidth" and "Scalability" indicates an inverse relationship between the two.

Principal component analysis (PCA) (Pearson, 1901) is a common approach for dimensionality reduction that uses techniques from linear algebra to find new attributes, denominated principal components, which are linear combinations of the original attributes. They are orthogonal to each other and capture the maximum amount of variation in the data. Fig. 5 shows the scree plot of the percentage of variability explained by each principal component. As illustrated by Fig. 5, the first seven principal components explain 94.314% of the total variance. The first component by itself explained less than 35% of the variance, so more components might be needed. Also, Fig. 5 reveals that the first three principal components explain roughly two-thirds of the total variability in the standardized ratings.

In addition to the percentage of variability explained by each principal component, all nine attributes were represented in a bi-plot by a vector. The direction and length of the vector indicate the contribution of each attribute to the two principal components in the plot. For instance, Fig. 6 shows the coefficients of each attribute with respect to the first two principal components. The other five principal components were also plotted in bi-plots. Table 3 shows the contribution of the attributes to each principal component.

The interpretation of the principal components is based on finding which variables are most strongly correlated with each component, that is, which of these coefficients is large, i.e., the farthest from zero in either direction. The decision as to what values should be considered large is a subjective one and reflects knowledge of the system under evaluation. It was determined that a correlation value was relevant to our study when it was above 0.48, since this value is the largest within each principal component, and most of the variables having this high value are highly correlated, as shown by the components of the correlation matrix. These large correlation values are shown in boldface in Table 3.

The principal component results can be interpreted with respect to the value deemed to be significant. The first principal component correlated strongly with three of the original attributes. Thus, the first principal component increases with increasing Data storage, Data location, and Delay sensitivity, suggesting that these three attributes vary together.

On the other hand, the coefficients belonging to the second principal component show that the behavior of the feature Bandwidth opposes the behavior of the features Mobility and Scalability. This reflects the fact that greater mobility is associated with changes in the network topology, which, in turn, increases the fluctuations in communication links and reduces the bandwidth availability. Moreover, since bandwidth is a finite resource, if the number of users connected to the Fog increases, the rate at which each user transmits and receives data decreases.

Other features that reveal an inverse relationship are Data storage and Loss sensitivity, in the fifth principal component, and Mobility and Loss sensitivity, in the seventh principal component.

Finally, the third, fourth, and sixth principal components increase with only one of the values, that is, there is only one variable with a value of 0.48 or higher. These variables are Security, Reliability, and Delay sensitivity, respectively. Accordingly, the third, fourth, and sixth principal components can be interpreted as measures of how necessary the use of isolated nodes is to process the application, how quickly failed Fog components should be reestablished, and how sensitive the Fog application is to the delay.
Fig. 5. Scree plot of the percent variability explained by each principal component.
Fig. 6. Orthonormal principal component coefficients for each variable and principal component scored for each observation (principal components 1 and 2).
Table 3
Attribute coefficients for each principal component.
Attribute    PC1    PC2    PC3    PC4    PC5    PC6    PC7
Bandwidth −0.161 −0.524 −0.449 −0.065 0.079 0.118 0.114
Reliability −0.312 −0.001 0.040 0.702 0.313 0.295 −0.318
Security −0.297 −0.085 0.595 0.283 −0.342 −0.286 −0.021
Data Storage 0.486 0.062 −0.099 0.181 0.490 −0.375 −0.389
Data location 0.497 0.075 0.022 0.213 −0.351 −0.283 0.062
Mobility −0.169 0.491 −0.327 0.350 0.117 −0.207 0.645
Scalability −0.020 0.530 0.337 −0.366 0.309 0.318 −0.014
Delay sensitivity 0.480 0.016 0.016 0.295 −0.273 0.673 0.134
Loss sensitivity −0.213 0.432 −0.460 −0.044 −0.481 0.011 −0.544
Based on the analysis described above, and considering the semantics of the case study with respect to which variables deserve more attention in the Fog computing environment, redundant attributes such as Data location, Data storage, and Mobility have been removed. Thus, six of the original nine attributes were selected and maintained for the classification stage: Delay sensitivity, Scalability, Loss sensitivity, Reliability, Security, and Bandwidth.
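The correlation and PCA analysis above can be reproduced along the following lines; this is a generic sketch, assuming pandas and scikit-learn on placeholder data, not the exact script behind Figs. 4–6 and Table 3.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

ATTRS = ["Bandwidth", "Reliability", "Security", "Data storage", "Data location",
         "Mobility", "Scalability", "Delay sensitivity", "Loss sensitivity"]
X = pd.DataFrame(np.random.default_rng(0).random((14000, 9)), columns=ATTRS)  # placeholder

corr = X.corr()                                       # pairwise correlations (cf. Fig. 4)

pca = PCA().fit(StandardScaler().fit_transform(X))
explained = pca.explained_variance_ratio_ * 100       # scree-plot values (cf. Fig. 5)
loadings = pd.DataFrame(pca.components_.T, index=ATTRS,
                        columns=[f"PC{i + 1}" for i in range(len(ATTRS))])  # cf. Table 3

# Attributes whose absolute loading exceeds the 0.48 threshold on a component
# are the candidates inspected for redundancy before attribute removal.
relevant = loadings.abs().ge(0.48)
```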
Table 5
Test accuracy and RLA results of classifiers trained with noisy datasets.
Noise Level (%) ANFIS DT ANN(1) ANN(2) ANN(3) KNN SVM
Test accuracy results 0 99.271 99.986 100.000 100.000 100.000 100.000 100.000
10 94.429 99.986 99.911 99.950 99.921 99.800 99.997
30 84.900 99.950 99.676 99.800 99.691 99.029 99.836
50 78.564 99.979 90.807 91.370 91.419 99.193 99.600
The robustness of each classifier is quantified by the relative loss of accuracy (RLA), where $Acc_{0\%}$ denotes the test accuracy obtained with the noise-free dataset and $Acc_{x\%}$ the accuracy obtained at noise level $x\%$:

$$RLA_{x\%} = \frac{Acc_{0\%} - Acc_{x\%}}{Acc_{0\%}} \qquad (1)$$
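For illustration, the fragment below injects one plausible form of uniform attribute noise (replacing a fraction of the attribute values by values drawn uniformly within each attribute's range) and evaluates Eq. (1); the exact corruption procedure used in the experiments is not spelled out here, so the helper should be read as an assumption.

```python
import numpy as np

def add_uniform_attribute_noise(X, noise_level, seed=0):
    # Replace a fraction `noise_level` of the attribute values with values
    # drawn uniformly within each attribute's observed min-max range.
    rng = np.random.default_rng(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)
    mask = rng.random(X.shape) < noise_level
    Xn = X.copy()
    Xn[mask] = rng.uniform(lo, hi, size=X.shape)[mask]
    return Xn

def rla(acc_clean, acc_noisy):
    # Relative loss of accuracy, Eq. (1).
    return (acc_clean - acc_noisy) / acc_clean

# Example with the DT accuracies reported in Table 5 (0% and 50% noise).
print(rla(99.986, 99.979))   # ~0.00007, i.e., an almost negligible loss
```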
5.4.1. Classification using a training set with attribute noise and a clean testing set

Table 5 shows the average performance and robustness results for each classification algorithm at each noise level, from 0% to 50%, on training datasets with uniform attribute noise.

As can be observed in Table 5, the DT is the most robust classifier for all noise levels. On the other hand, ANN(1), ANN(2), and ANN(3) present high robustness for noise levels of 10–30%. Conversely, the RLA of the classifiers based on neural networks rises linearly to 9% when the noise level is 50%. The least robust classifier is the ANFIS, for which the loss of accuracy increases exponentially as the proportion of noise rises, to the point that, when the noise level is 50%, its RLA is above 21% relative to the accuracy obtained with a clean dataset.

Fig. 7 shows the accuracy ratio and testing time when training takes place with both clean datasets and those disrupted by uniform attribute noise levels of 10%, 30%, and 50%. A marker identifies each classification algorithm, and a different color identifies each noise level. The light bands indicate the areas of the greatest accuracy or the lowest testing times, and the light-purple intersection of these bands indicates the area where the best results for both accuracy and testing time are found.

The DT algorithm takes only 25 ms to classify, with the greatest accuracy, up to 1400 applications simultaneously arriving at the edge, when training has taken place using datasets with a uniform attribute noise level of 50%.

Fig. 7. Accuracy rates concerning the testing time for classifiers trained with both clean and noisy datasets.

Table 6 presents the results of a two-tailed Wilcoxon signed-rank test (considering a significance level of 0.05) to verify statistical differences between the accuracy results of classifiers trained with noisy datasets. Each cell shows the results of the statistical tests between a single classifier and the others for the four levels of noise (nl). Left '←' and up '↑' arrows indicate the most accurate, while an empty cell refers to "no statistical difference between the pair of classifiers" in that row and column.

Table 6 shows the effect of noisy datasets in training. The performance of ANFIS is the least accurate. ANN(1), ANN(2), and SVM produce a similar performance, while the DT outperforms most of the other classifiers when it is trained using datasets with an attribute noise level equal to or greater than 30%.

5.4.2. Classification using a clean training set and a testing set with attribute noise

Table 7 shows the results for average performance and robustness for each classification algorithm at each noise level, from 0% to 50%, for testing datasets with uniform attribute noise.

As evinced in Table 7, for all classifiers, accuracy decreases exponentially with an increase in the noise level of the testing dataset. In this situation, the most robust classifiers are ANN(2), DT, ANN(3), ANN(1), and KNN.

Fig. 8 illustrates the results of the classification algorithms when both accuracy and testing time are considered, with both clean and noisy datasets (noise levels of 10%, 30%, and 50%). A marker identifies each classification algorithm, and a different color identifies each noise level. The light-blue bands indicate the areas of the greatest accuracy rates or the lowest testing times, and the light-purple intersection of these bands indicates the area where the best results were obtained when considering both accuracy and testing time.
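The pairwise comparisons summarized in Tables 6 and 8 rely on a two-tailed Wilcoxon signed-rank test at the 0.05 level; a minimal sketch with SciPy is shown below, assuming paired accuracy samples (for example, one value per repetition) for two classifiers, which are placeholder values here.

```python
from scipy.stats import wilcoxon

# Paired accuracy samples (placeholders) for two classifiers at one noise level.
acc_dt  = [99.95, 99.93, 99.96, 99.94, 99.97, 99.95, 99.96, 99.92, 99.95, 99.94]
acc_knn = [99.80, 99.78, 99.82, 99.79, 99.81, 99.80, 99.77, 99.83, 99.80, 99.79]

stat, p_value = wilcoxon(acc_dt, acc_knn, alternative="two-sided")
if p_value < 0.05:
    winner = "DT" if sum(acc_dt) > sum(acc_knn) else "KNN"
    print(f"Statistically significant difference (p={p_value:.4f}); more accurate: {winner}")
else:
    print(f"No statistical difference between the pair of classifiers (p={p_value:.4f})")
```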
Table 6
Statistical test for the accuracy of classifiers trained with noisy datasets. nl denotes the percentage of noise
level present in the training dataset.
nl ANFIS DT ANN(1) ANN(2) ANN(3) KNN SVM
ANFIS 0 – ↑ ↑ ↑ ↑ ↑ ↑
10 – ↑ ↑ ↑ ↑ ↑ ↑
30 – ↑ ↑ ↑ ↑ ↑ ↑
50 – ↑ – – – ↑ ↑
DT 0 ← – – – – – –
10 ← – – – – ← –
30 ← – ← ← ← ← ←
50 ← – ← ← ← ← ←
ANN(1) 0 ← – – – – – –
10 ← – – – – – –
30 ← ↑ – – – ← –
50 – ↑ – – – – –
ANN(2) 0 ← – – – – – –
10 ← – – – – ← –
30 ← ↑ – – – ← –
50 – ↑ – – – – –
ANN(3) 0 ← – – – – – –
10 ← – – – – – ↑
30 ← ↑ – – – ← ↑
50 – ↑ – – – – –
KNN 0 ← – – – – – –
10 ← ↑ – ↑ – – ↑
30 ← ↑ ↑ ↑ ↑ – ↑
50 ← ↑ – – – – ↑
SVM 0 ← – – – – – –
10 ← – – – ← ← –
30 ← ↑ – – ← ← –
50 ← ↑ – – – ← –
Table 7
Test accuracy and RLA results of classifiers tested with noisy datasets.
Noise Level (%) ANFIS DT ANN(1) ANN(2) ANN(3) KNN SVM
Test accuracy results 0 99.271 99.986 100.000 100.000 100.000 100.000 100.000
10 72.214 85.693 83.570 86.131 84.031 84.093 81.79
30 41.343 63.114 59.464 63.860 60.467 59.971 55.403
50 27.414 47.264 45.381 48.564 46.176 44.764 40.799
The DT algorithm takes less than 30 ms to classify, with the greatest level of accuracy, up to 1400 applications simultaneously arriving at the edge, when the input is a noisy dataset, as long as the noise level does not exceed 10%.

Table 8 presents the results of a two-tailed Wilcoxon signed-rank test (considering a significance level of 0.05) to verify the statistical differences between the accuracy of the different classifiers when tested with noisy datasets. Each cell shows the result of the statistical test between the pairs of classifiers with different percentages of noise (nl). Left '←' and up '↑' arrows indicate the greatest accuracy, while an empty cell refers to "no statistical difference between the pair of classifiers" in that row and column.

The effect of noisy datasets in testing is shown in Table 8. The performance of ANFIS is the least accurate. Moreover, the DT produces the most accurate classification results independently of the presence of noise.

5.5. Classification model

The final step of the proposed methodology is the selection of a classifier. The results indicate that the DT was the most accurate and robust classifier.

The Decision Tree algorithm does not need to assess all the attributes to classify an application, since various services have exclusive features. For example, mission-critical applications are the only ones for which reliability takes on a "critical" value. Therefore, assessing certain features makes classification a more efficient process. This is an attractive characteristic which makes the Decision Tree an ideal algorithm for the classification of applications in Fog computing. Moreover, the Decision Tree algorithm is easy to interpret, fast for fitting and prediction, and does not use much memory. Given these characteristics, the Decision Tree algorithm can be run by devices such as routers, switches, and servers located on the first Fog layer of the reference architecture introduced by the OpenFog Consortium (OpenFog Reference Architecture, 2017). After classification, the output of the classifier serves as input for the scheduler, also located at the first Fog layer, which decides where the application should be processed.
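To make this deployment concrete, the fragment below trains a Decision Tree on the six selected attributes and labels an incoming request before handing the result to the scheduler; the training data, attribute ordering, and the scheduler call are assumptions for illustration, not the paper's implementation.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

FEATURES = ["delay_sensitivity", "scalability", "loss_sensitivity",
            "reliability", "security", "bandwidth"]          # six selected attributes
COS = ["MC", "RT", "IN", "CO", "ST", "CB", "BE"]

# Placeholder training data standing in for the labeled synthetic dataset.
rng = np.random.default_rng(0)
X_train = rng.random((12600, len(FEATURES)))
y_train = rng.integers(0, len(COS), 12600)

tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

def classify_request(qos_vector):
    # Map the (normalized) QoS attributes carried in the first packet of a flow
    # to one of the seven Classes of Service.
    return COS[int(tree.predict([qos_vector])[0])]

cos = classify_request([0.9, 0.2, 0.8, 0.7, 0.6, 0.3])
# The CoS label is then forwarded to the first-layer Fog scheduler, e.g.:
# scheduler.place(request, cos)        # hypothetical scheduler API
```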
6. Conclusions

This paper has introduced the use of ML classification algorithms as a tool for QoS-aware resource management in Fog computing. First, potential Fog computing applications are grouped into seven CoS according to their QoS requirements. A synthetic database of Fog applications is built from the definition of the intervals of each QoS requirement relevant to a specific class. The dataset is then pre-processed to convert the raw data into a form that can be used by ML techniques. Next, a set of popular ML algorithms is selected and put through the training and testing processes, using the examples in the synthetic database to measure the degree of accuracy and efficiency in their prediction of the CoS to which the application belongs. For this, the synthetic database is contaminated with three different levels of attribute noise. For each noise level, the classifiers undergo training and testing to measure their degree of robustness.
Table 8
Statistical test for the accuracy of classifiers tested with noisy datasets. nl denotes the percentage of noise
level present in the testing dataset.
nl ANFIS DT ANN(1) ANN(2) ANN(3) KNN SVM
ANFIS 0 – ↑ ↑ ↑ ↑ ↑ ↑
10 – ↑ ↑ ↑ ↑ ↑ ↑
30 – ↑ ↑ ↑ ↑ ↑ ↑
50 – ↑ ↑ ↑ ↑ ↑ ↑
DT 0 ← – – – – – –
10 ← – ← – ← ← ←
30 ← – ← – ← ← ←
50 ← – – – – ← ←
ANN(1) 0 ← – – – – – –
10 ← ↑ – ↑ – – ←
30 ← ↑ – ↑ – – ←
50 ← – – ↑ – – ←
ANN(2) 0 ← – – – – – –
10 ← – ← – ← ← ←
30 ← – ← – ← ← ←
50 ← – ← – – ← ←
ANN(3) 0 ← – – – – – –
10 ← ↑ – ↑ – – ←
30 ← ↑ – ↑ – – ←
50 ← – – – – – ←
KNN 0 ← – – – – – –
10 ← ↑ – ↑ – – ←
30 ← ↑ – ↑ – – ←
50 ← ↑ – ↑ – – ←
SVM 0 ← – – – – – –
10 ← ↑ ↑ ↑ ↑ ↑ –
30 ← ↑ ↑ ↑ ↑ ↑ –
50 ← ↑ ↑ ↑ ↑ ↑ –
Fig. 8. Accuracy rates concerning the testing time for classifiers tested with both clean and noisy datasets.

Author contribution

Judy Guevara – the manuscript is part of her Ph.D. thesis. She programmed, generated the data, co-wrote and revised the manuscript. Ricardo Torres – advised on machine learning techniques, co-wrote and revised the manuscript. Nelson Fonseca – thesis supervisor, advised on Fog and networking content, co-wrote and revised the manuscript.

Declaration of competing interest

None.
Acknowledgments Emeakaroha, V.C., Brandic, I., Maurer, M., Dustdar, S., 2010. Low level metrics to high
level slas - lom2his framework: bridging the gap between monitored metrics and sla
parameters in cloud environments. In: 2010 International Conference on High
This work was supported by the Brazilian National Research Agency Performance Computing Simulation, pp. 48–54, https://doi.org/10.1109/HPCS.
CNPq and the Academy of Sciences for the Developing World (TWAS), 2010.5547150.
under process 190172/2014-2 of the CNPq-TWAS program. The authors Emeakaroha, V.C., Ferreto, T.C., Netto, M.A.S., Brandic, I., De Rose, C.A.F., 2012.
Casvid: application level monitoring for sla violation detection in clouds. In: 2012
are also grateful to CAPES (grant No. 88881.145912/2017–01), CNPq IEEE 36th Annual Computer Software and Applications Conference, pp. 499–508,
(grant No. 307560/2016-3), FAPESP (grants Nos. 2014/12236-1, https://doi.org/10.1109/COMPSAC.2012.68.
2015/24494-8, 2016/50250-1, and 2017/20945-0) and the FAPESP- Finsterbusch, M., Richter, C., Rocha, E., Muller, J., Hanssgen, K., 2014. A survey of
payload-based traffic classification approaches. IEEE Commun. Surv. Tutor. 16 (2),
Microsoft Virtual Institute (grants Nos. 2013/50155-0, 2013/50169-1, 1135–1156.
and 2014/50715-9) The authors would like to thank reviewers con- Fonseca, N.L. S.d., Boutaba, R., 2015. Cloud Services, Networking, and Management.
structive comments. John Wiley & Sons.
Judy C. Guevara received her degree in control engineering (2009) and her M.Sc. degree in Information and Communication Sciences (2012) from the Universidad Distrital Francisco José de Caldas, Bogotá, Colombia, and is currently working toward the Ph.D. degree at the Institute of Computing, State University of Campinas (Unicamp), Brazil. Her awards include the Young Researchers and Innovators “Virginia Gutiérrez de Pineda” fellowship (2010), supported by the Colombian Administrative Department of Science, Technology and Innovation, COLCIENCIAS; and the CNPq-TWAS Postgraduate Fellowship (2014). Her research interests focus on resource management and scheduling in Cloud and Fog computing networks.

Ricardo da S. Torres is Professor in Visual Computing at the Norwegian University of Science and Technology (NTNU). He was previously a Professor at the University of Campinas, Brazil (2005-2019). Dr. Torres received a B.Sc. in Computer Engineering from the University of Campinas, Brazil, in 2000 and his Ph.D. degree in Computer Science from the same university in 2004. Dr. Torres has been developing multidisciplinary eScience research projects involving Multimedia Analysis, Multimedia Retrieval, Machine Learning, Databases, Information Visualisation, and Digital Libraries. Dr. Torres is author/co-author of more than 200 articles in refereed journals and conferences and serves as a PC member for several international and national conferences. He currently serves as Senior Associate Editor of the IEEE Signal Processing Letters and as Associate Editor of Pattern Recognition Letters.

Nelson L. S. da Fonseca received the Ph.D. degree in computer engineering from the University of Southern California, Los Angeles, CA, USA, in 1994. He is currently a Full Professor with the Institute of Computing, State University of Campinas, Campinas, Brazil. He has authored or coauthored over 400 papers and has supervised over 60 graduate students. Prof. Fonseca is currently the Vice President Technical and Educational Activities of the IEEE Communications Society (ComSoc). He served as the ComSoc Vice President Publications, Vice President Member Relations, Director of Conference Development, Director of Latin America Region, and Director of On-Line Services. He is the Past Editor-in-Chief of IEEE Communications Surveys and Tutorials, a Senior Editor of the IEEE Communications Magazine, and an Editorial Board Member of Computer Networks and Peer-to-Peer Networking and Applications. He was a recipient of the 2012 IEEE Communications Society (ComSoc) Joseph LoCicero Award for Exemplary Service to Publications, the Medal of the Chancellor of the University of Pisa in 2007, and the Elsevier Computer Network Journal Editor of Year 2001 Award.