Edge Machine Learning for AI-Enabled IoT Devices: A Review
Abstract: In a few years, the world will be populated by billions of connected devices placed in our homes, cities, vehicles, and industries. Devices with limited resources will interact with the surrounding environment and users. Many of these devices will rely on machine learning models to decode the meaning and behavior behind sensors' data, to implement accurate predictions, and to make decisions. The bottleneck will be the sheer number of connected things, which could congest the network; hence the need to incorporate intelligence in end devices using machine learning algorithms. Deploying machine learning on such edge devices alleviates network congestion by allowing computations to be performed close to the data sources. The aim of this work is to provide a review of the main techniques that guarantee the execution of machine learning models on low-performance hardware in the Internet of Things paradigm, paving the way to the Internet of Conscious Things. In this work, a detailed review of models, architectures, and requirements of solutions that implement edge machine learning on Internet of Things devices is presented, with the main goal of defining the state of the art and envisioning development requirements. Furthermore, an example of edge machine learning implementation on a microcontroller is provided, using what is commonly regarded as the machine learning "Hello World".
Keywords: artificial intelligence; machine learning; Internet of Things; edge devices; deep learning
1. Introduction
The Internet of Things (IoT) scenario [1,2] has gained considerable attention in recent years. It encompasses an infrastructure of software and hardware that connects the physical world with the Internet. Due to the explosive growth of interest in this paradigm, the number of IoT devices has increased dramatically in recent years. It has been estimated that by 2025, more than 75 billion devices will be connected to the Internet [3], leading to a significant economic impact on the global market. IoT devices typically have limited computing power and small memories, yet can generate large amounts of data. Low-power, connected systems consisting mainly of sensors will be used in our homes, cities, vehicles, and industries. Cloud computing might be suitable to sustain the growth of the IoT sector, but the delay caused by data transfer is unacceptable for some tasks (e.g., health monitoring), in addition to possible bandwidth saturation. Thus, due to the growing number of connected devices, cloud-only processing could become impractical, leading to greater latency, reduced bandwidth, and privacy and reliability problems [4]. Hence the need to bring computation as close to the data as possible, incorporating intelligence in end devices to limit cloud traffic. This means giving a sort of "consciousness" to the devices, which become able to interact even in the absence of connectivity, elaborating complex behaviors and adapting to rapidly changing situations: a sort of "Internet of Conscious Things". Unfortunately, limitations in the computational capabilities of resource-scarce devices restrict the implementation of complex machine learning (ML) algorithms on them, although several frameworks based on software agents provide reliable and effective solutions for the optimization of different edge computing implementations [5–7]. The tasks that can be delegated to edge elements are related to low-level data fusion [8], while for a deeper understanding of the data (e.g., for decision-making purposes) it is necessary to assign the computation to more capable systems. However, transferring raw data to cloud servers increases communication costs, delays system response, and exposes private data. To address these issues, a practical solution is to process data closer to their sources and transmit to remote servers only the data needed for further cloud processing. Edge computing refers to computations being performed as close to data sources as possible, instead of in remote locations.
Both search engines and digital libraries were used to compile this review. As reported by Google Trends [9], interest from the scientific community in the topics and issues of edge computing has increased in recent years (Figure 1), and a total of 6342 papers are indexed by Scopus [10]. The keyword employed to obtain the papers to be analyzed was "edge computing".
Figure 1. Worldwide Google Trends interest in the search term "edge computing", 2013–2020 (relative search interest).
In order to select the papers to use, only papers written in English were considered. Furthermore, to observe the recent evolution of this field, the selected papers were published from 2014 to the first months of 2020. Finally, a total of about 100 papers were utilized to compile this review.
The distribution of the papers by the country of the first author shows that the countries investigating edge computing systems are mainly China, the USA, the UK, and Italy. China is the country with the highest number of papers, with a total of 2193 papers, 34.7% of the total.
The use of edge machine learning (eML) is a viable way to meet the latency, scalability, and privacy challenges described earlier.
To make this scenario possible, efficient artificial intelligence (AI) algorithms can be deployed on the devices, such as SVMs (support vector machines) [11,12], deep learning (DL) [13,14], and, inter alia, neural networks (NNs) [15,16]. It is noteworthy that NNs require less computational power in the application phase than in the training phase. This feature can be exploited to execute AI algorithms
on devices with limited resources, such as microcontrollers (MCUs), allowing local data processing. In fact, in the learning phase, a large amount of data is used to calculate the weights and biases of the network, thus requiring a high-performing machine in this phase. Once the learning phase has been completed and the network has been trained, the model can be used for inference (in statistics, the process of using data analysis to deduce properties of an underlying probability distribution) with a lower computational demand. In fact, once the coefficients have been calculated, they are stored in the program memory, and the AI algorithm can be executed on a device with a low capacity in terms of RAM (random access memory). ML algorithms are used in different fields such as smart cities [5,17–22], computer vision [23–27], health care [28–33], automotive [34–37], and others.
Concerning these fields, there are different examples of how machine learning can be brought onto edge devices. Anandhalli and Baligar proposed in [18] a video processing algorithm that identified and counted the vehicles on a road. The algorithm was run on a Raspberry Pi 3 (1.2 GHz quad-core ARMv8, 1 GB of RAM) with a built-in camera, using the OpenCV library (Open Source
Computer Vision Library [38]). In [23], the authors developed a face recognition algorithm for law
enforcement agencies within a smart city. A portable wireless camera mounted on the uniform of a
police officer is used to capture the images that are then passed on to a Raspberry Pi3 to perform
facial recognition. The Viola-Jones algorithm [39] is first used to identify faces in the images, then the
ORB algorithm [40] extracts the peculiar features from the faces that are then transmitted to an SVM
algorithm in the cloud to identify people. Therefore, for IoT purposes, the devices must be sufficiently powerful to perform certain tasks, even though, in general, it is possible to embed AI in almost any embedded device by exploiting a suitable class of algorithms. The use of ML algorithms also allows extending the average battery life of the device, power saving being one of the fundamental tasks of the IoT world. For example, the authors in [28] focused on increasing the battery life of a device used for e-health purposes by optimizing sampling times and data transfers using ML algorithms. Any unnecessary data that are transferred, stored, and processed represent a potential waste of energy. Using an SVM algorithm based on the RBF (radial basis function) kernel function [41] and varying the sampling frequency, it was possible to increase the lifetime of the device from 2 weeks to almost 3 years (997 days). In the same scenario, the wearable sensor platform for health care in a
residential environment (SPHERE) [42] is used to classify human activity into three categories:
Sedentary, moderate, and sportive. The data measured by an accelerometer are sent to the MCU using
Serial Peripheral Interface (SPI) protocol. After processing the data, they are packed by the
microcontroller and sent using advertisement mode to a Bluetooth Low Energy (BLE) radio, which
transmits data outside the smart home to a central unit. The use of SVM can also be found in [43],
which highlights how the use of ML algorithms allows greater efficiency and low consumption in
predicting patients' seizures. In fact, seizure prediction is a difficult task due to the variability of the electroencephalography (EEG) signal across patients. A neurostimulator able to identify and react to the onset of a seizure can enable applications that would not otherwise be feasible, such as the possibility of generating a stimulus to suppress the seizure itself. The article demonstrates how ML techniques like SVM can be used to identify possible seizures in each patient. The use of NNs on
embedded devices can be found in [44], which deals with the use of learning algorithms on an
inexpensive robot built to perform sensorimotor tasks. The robot, in particular, learns to track objects by identifying the peculiarities of the object itself. The algorithm uses a CNN (convolutional neural
network) to combine color, brightness, motion, and audio information while training is carried out
using supervised ML algorithms. The images are properly analyzed to eliminate redundant input
data. A Motorola board populated with a 68HC11, an 8-bit MCU with Complex Instruction Set Computer (CISC) architecture, is used; it converts and processes the gyroscope outputs and generates sound feedback and timing signals for the 14 servomotors that provide 3 degrees of freedom for each of the four legs and 2 degrees of freedom for the robot's head. A Charge-Coupled Device (CCD) camera mounted on top of the robot acts as the robot's eye.
As seen from the previous papers, ML algorithms can be implemented on devices with limited computational power, and this can be used to improve the IoT field [45,46], thus enabling edge computing. A related term, fog computing, describes an architecture where the cloud is extended to be
closer to the IoT end devices, thereby improving latency and security by performing computations
near the network edge [4]. Therefore, fog computing also aims to bring the processing phase closer to where the data are generated, but the main difference is where the "intelligence" is located. In fog computing, the processing phase is at the LAN (local-area network) level, in a fog node or IoT gateway. In edge computing, data are mainly processed directly on the devices to which the sensors are attached (physically very close to the sensors). The closer the processing is to the sensor, the better in terms of privacy and power consumption, owing to the reduced energy required for
data transmission. In this scenario, a variety of possibilities of energy harvesting from different
sources paves the way for AI-enabled passive or semipassive IoT sensor platforms [47–51].
This survey focuses on ML systems deployed on edge devices. Section 2 provides a comparison
between the ML algorithms implementable in edge computing. In Section 3, the process of bringing
ML to the edge is analyzed. Section 4 describes edge server-based architectures while in Section 5 the
wireless standards for AI-enabled IoT devices are introduced. Section 6 provides edge-specific
solutions for offloading techniques, detailing the differences among the joint computation alternatives.
Section 7 deals with privacy issues and how to protect user privacy when uploading data. Section 8 describes edge implementations of the training phase in ML design. In Section 9, an example of
edge machine learning implementation is provided, commonly regarded as the machine learning
“Hello World”. The conclusions are, finally, drawn in Section 10.
Bonsai maintains prediction accuracy while minimizing model size and prediction costs. It first learns a single, shallow, sparse tree to reduce the model size, then makes nonlinear predictions through both its internal and leaf nodes. Finally, Bonsai learns a sparse projection matrix that maps all data into a low-dimensional space in which the tree is learned. These properties allow the algorithm to be brought onto tiny devices like IoT ones.
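In compact form (paraphrasing the formulation in [64]; the notation here is ours), Bonsai's prediction for an input x can be written as

    y(x) = \sum_k I_k(x) \, W_k^T (Z x) \circ \tanh( \sigma \, V_k^T (Z x) )

where Z is the learned sparse projection matrix, I_k(x) indicates whether node k lies on the path traversed by x through the tree, W_k and V_k are the sparse predictors at node k, and \sigma controls the sharpness of the nonlinearity.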
The referenced implementation was carried out on an Arduino board populated with an 8-bit ATmega328P microcontroller with a 16 MHz operating frequency, 2 kB of Static Random Access Memory (SRAM), and 32 kB of read-only flash memory, and on the BBC micro:bit, which has a 32-bit ARM Cortex architecture with an operating frequency of 16 MHz, 16 kB of SRAM, and 256 kB of flash.
2.4. SVM
One of the most widely used ML algorithms at the embedded level is the SVM [28,29,43,52].
SVM is a supervised learning algorithm that can be used for both classification and regression
problems. The algorithm discriminates between two or more classes of data by defining an optimal
hyperplane that separates all classes (Figure 3a). The support vectors are the data closest to the
hyperplane, which, if removed, would result in a redefinition of the hyperplane itself. For these
reasons, they are considered the critical elements of the dataset. Usually, the loss function used by the algorithm is the hinge loss, and the optimization method is gradient descent.
Figure 3. (a) Hyperplane that separates two classes of data; (b) kernel trick.
Sometimes the data are linearly separable, but this represents only a subset of cases. In the general case, SVM can efficiently perform classification using the kernel trick. Suppose we face the problem represented in Figure 3b: it is impossible to find a single line separating the two classes in the input space. However, after projecting the data into a higher dimension, it is possible to find the hyperplane that classifies the data. The kernel helps to find a hyperplane in the higher-dimensional space without increasing the computational cost too much.
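As a minimal sketch (our own example based on scikit-learn [11]; the digits dataset and the held-out split are illustrative choices, not taken from the reviewed works), an RBF-kernel SVM can be trained as follows:

from sklearn import datasets, svm

# Load a small built-in digits dataset and train an SVM with an RBF kernel,
# i.e., the kernel trick applied to a nonlinearly separable problem.
X, y = datasets.load_digits(return_X_y=True)
clf = svm.SVC(kernel='rbf', gamma='scale')
clf.fit(X[:-100], y[:-100])            # train on all but the last 100 samples
print(clf.score(X[-100:], y[-100:]))   # accuracy on the held-out samples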
3.1. Architectures
To meet latency requirements, different architectures for fast model inference have been proposed. The research has focused on three important architectures (depicted in Figure 4): (1)
On-device computation, where DNNs are executed on the end device; (2) edge server-based
architectures (the data are sent from the end devices to edge servers for computation); and (3) joint
computation which includes the possibility to have cloud processing.
Figure 4. (a) On-device computation, (b) edge server-based architectures, and (c) joint computation.
candidate by hundreds of random evaluations. Yang et al. [76] selected pruning candidates weighted by energy consumption. Early pruning [77] and dynamic pruning [78] explored how to integrate pruning with better retraining, saving retraining time. A technique that does not fall under pruning or quantization, but which has significant value, is knowledge distillation. It involves creating a smaller DNN that imitates the behavior of a larger one [79]. This is done by training the smaller DNN on the output predictions produced by the larger one, so that the smaller DNN approximates the function learned by the larger one.
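As a minimal illustration of this idea (our own sketch, not the exact method of [79]; the logits values, the temperature, and the three-class setup are placeholders), the small "student" network is trained to match the "teacher's" softened predictions:

import tensorflow as tf

teacher_logits = tf.constant([[2.0, 0.5, 0.1]])  # placeholder teacher output
student_logits = tf.constant([[1.5, 0.7, 0.2]])  # placeholder student output
temperature = 5.0                                # assumed softening temperature

# Soften both distributions, then penalize the student for deviating
# from the teacher's soft targets.
soft_targets = tf.nn.softmax(teacher_logits / temperature)
student_soft = tf.nn.softmax(student_logits / temperature)
distillation_loss = tf.reduce_mean(
    tf.keras.losses.categorical_crossentropy(soft_targets, student_soft))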
Table 1. Hardware used for Internet of Things (IoT) devices that implement edge computing.
similar scenario, but the proposed solution uses transfer learning to reduce the computational resources consumed by each request. Transfer learning enables multiple applications to share the common lower layers of the DNN model and to compute only the higher layers unique to the specific application, thus reducing the overall amount of computation.
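As a minimal sketch of this layer sharing (our own illustration in Keras; the MobileNetV2 backbone, the input size, and the 10-class head are assumptions, not the system of the cited work):

import tensorflow as tf

# Shared, pretrained lower layers: computed once and frozen.
base = tf.keras.applications.MobileNetV2(
    input_shape=(96, 96, 3), include_top=False, weights='imagenet')
base.trainable = False

# Application-specific higher layers: the only part unique to each app.
app_head = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation='softmax')
])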
and TCP/IP software and the high power consumption that made it unsuitable for battery-powered devices. Now, however, new devices support WiFi and TCP/IP software and have reduced
power consumption. The power consumption of these devices can be further reduced by activating
the radio section only for short periods, allowing them to operate for over a year with two AA
batteries [121].
Many different WiFi protocols are available and operate at either 2.4 GHz or 5 GHz. IEEE 802.11n
and IEEE 802.11ac are the most widely used protocols but different versions have been developed in
the past years for higher versatility. WiFi HaLow (IEEE 802.11ah) is designed for low-data-rate, long-range devices [122]. It operates in the sub-GHz ISM band and implements power-saving techniques, such as target wake time (TWT), which wakes up the device at defined intervals for a very short time. HaLow was released in 2016 but is not yet widely used in commercial products. The 802.11af standard [123] has the same target applications as HaLow, but it relies on unused TV spectrum in the UHF and VHF bands and never took off.
The 802.11ax [124] is a more recent WiFi technology that supports higher transfer speeds and also introduces power-saving features such as TWT, making it more attractive for IoT applications. It also includes features that extend the range and allow partitioning the channel into smaller subchannels, reducing the data rates while extending the number of devices that can be reliably connected to an access point and allowing it to scale up to thousands of devices.
Radio-frequency identification (RFID) is a technology that uses sub-GHz ISM bands, designed specifically so that devices without batteries can send a signal [125]. NFC (near-field communication) is a protocol used for very-short-range communication [126]. It operates at 13.56 MHz and is designed to exchange data with another NFC device, allowing bidirectional communication. The low data rate and short communication distance make it suitable only for niche IoT applications.
LoRa (long range) [127,128] is a low-power, wide-area network (LPWAN) technology. It is based on spread-spectrum modulation techniques and can be used to empower the IoT scenario [129]. In [130], a solution of machine learning on edge devices is presented, with LoRa used as a low-power transmission protocol. Implementing machine learning with LoRa allowed reducing the transmitted data by a factor of 512 and extending the battery life by a factor of 3 in that specific scenario. Nowadays,
the most common strategy for processing data is the use of the cloud, but the transmission of large amounts of data requires frequent recharging of the devices, thus defeating the purpose of the
IoT. In addition, IoT applications could require long-distance data transmission, such as for traffic
monitoring. IoT devices must, therefore, have a low-energy profile and sometimes be able to transmit
over great distances for a given scenario [125,130,131]. Furthermore, IoT devices require edge
processing for bandwidth, latency, and privacy issues. Under these conditions, the efficient use of
data reduction and local processing must be coupled with long-range and small-bandwidth
transmission protocols and this could be obtained using LoRa.
SigFox [132] is another LPWAN solution. It is a narrowband technology and allows the use of simpler devices that are available from different manufacturers. However, it requires sophisticated and expensive gateways and access points, the network is controlled by SigFox, and its use requires a fee. On the contrary, LoRa is open and its use is free, with no subscriptions and no constraints on the installation of gateways and network servers. However, the production of LoRa radios is a Semtech exclusive.
LTE (long-term evolution), commonly known as “4G LTE”, is a standard for wireless broadband
communication based on GSM/EDGE and UMTS/HSPA technologies [133]. After its introduction, LTE began to compete with emerging IoT technologies such as BLE, narrowband Internet of Things (NB-IoT), ZigBee, and LoRa. 4G has improved the capabilities of cellular networks, but it is not fully optimized for IoT applications [134,135].
NB-IoT [136] has been introduced to provide low-cost, low-power, wide-area cellular
connectivity for the Internet of Things. NB-IoT is a standards-based low-power wide-area (LPWA)
technology developed to enable a wide range of new IoT devices and services. NB-IoT significantly
improves the power consumption of user devices, system capacity, and spectrum efficiency,
especially in deep coverage. NB-IoT is built [137] from existing LTE functionalities with essential
simplifications and optimizations. At the physical layer, it occupies 180 kHz of spectrum, which is
substantially smaller than LTE bandwidths of 1.4–20 MHz. At the higher layers, simplified LTE
network functions are supported. Compared to other LPWAN solutions, NB-IoT has the great
advantage of eliminating the need for a specific gateway, so sensor data is sent directly to the cloud
server, reducing infrastructure costs.
LTE Cat-M1 [138] is an LPWAN technology that enables cellular services for the IoT world. Compared to
NB-IoT, this technology provides higher data rate and the ability to use voice over the network, but
requires more bandwidth and, therefore, the devices are more complex and expensive. NB-IoT and
Cat-M1 have different and somewhat complementary target applications with the former suitable for
small sensors and meters and the latter for devices that require higher data rates and have a higher
power budget.
The 5G networks and standards are expected to solve the challenges facing 4G networks. 5G is the fifth generation of mobile cellular technologies, networks, and solutions. Although not built just for the Internet of Things (IoT), it will be a major driver of IoT growth. The 5G IoT is a novel, intelligent network based on 5G communication [139], designed to connect sensing regions (sensors) and the processing center (cloud), supported by AI algorithms. It encompasses different emerging technologies, including massive Multiple Input Multiple Output (MIMO) networks, dense static small-cell networks, and mobile small-cell networks. 5G fulfills the needs of the IoT [135]:
- high data rate;
- highly scalable and fine-grained networks;
- very low latency;
- long battery lifetime, to support billions of low-power and low-cost IoT devices.
By reducing communication latency, 5G eliminates part of the bottleneck related to the remote execution of ML algorithms [140].
ML developers should choose the communication technology based on the specific design requirements, as well as on architectures, hardware, latency, and computation strategy [141–143].
Group | Technology | Data Rate | Distance (Indoor/Outdoor) | Works
Contactless | NFC | 424 kbps | 0–4 cm | [126]
Contactless | RFID | 640 kbps | 10–20 m | [125]
LPWAN | LoRa | 0.3–50 kbps | 5–10 km | [127,128,144–148]
LPWAN | SigFox | 100 or 600 bps | 30–50 km | [143,148–151]
WPAN | ZigBee | 250 kbps | 10–100 m | [152–155]
WPAN | Z-Wave | 100 kbps | 100 m | [116,156]
WPAN | Bluetooth LE | 1 Mbps | 10 m/50 m | [102,157–159]
WPAN | Bluetooth 5 | 2 Mbps | 40 m/200 m | [160–162]
WPAN | ANT | 60 kbps | 30 m | [163]
WiFi | IEEE 802.11n | 600 Mbps | 70 m/250 m | [164]
WiFi | IEEE 802.11ax | 9600 Mbps | 30 m/120 m | [124]
WiFi | IEEE 802.11af | 570 Mbps | 280 m/1 km | [165,166]
WiFi | IEEE 802.11ah | 347 Mbps | 140 m/500 m | [122,166,167]
Cellular | NB-IoT | 200 kbps | 280 m/1 km | [136,137,150,168]
Cellular | LTE Cat-M1 | 1 Mbps | 5–100 km | [138]
Cellular | 4G/LTE | 150 Mbps | 15 km | [169]
Cellular | 5G | 10–50 Gbps | 2 km | [170–172]
6. Joint Computation
Although the edge server can accelerate DNN processing, it is not always necessary for edge devices to execute DNNs on edge servers. We introduce three offloading scenarios (Figure 6): (1) partial offloading of a partitioned DNN (the decision is what fraction of the DNN computations should be offloaded), (2) hierarchical architectures (offloading is performed across a combination of edge devices, edge servers, and the cloud), and (3) distributed computing approaches (the DNN computation is distributed across multiple peer devices).
Distributed deep neural networks (DDNNs) perform fast, localized inference using shallow portions of the NN at the edge and on end devices. Due to their distributed nature, DDNNs improve the fusion of the data from network sensors, system fault tolerance, and privacy for users. Generally, a common feature of the edge approaches is that the edge server serves a limited geographical area, so the input data and, thus, their DNN outputs may be similar.
7. Privacy
Both in edge server-based architectures and in joint computation, data are exchanged over the network (e.g., from the end device to the edge server, or from the edge server to the cloud), and they may contain sensitive information. This can lead to privacy issues. In fact, as already mentioned, edge servers work locally in a geographically limited area; therefore, the origin of the data is practically known. Although ML on edge devices reduces the data sent over the network and, therefore, improves privacy by itself, it is possible to further protect the system through additional techniques, such as adding noise to the data or using cryptographic techniques.
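As a minimal sketch of the noise-addition idea (our own illustration; the use of the Laplace mechanism and the sensitivity and epsilon values are assumptions, not a technique prescribed by the reviewed works):

import numpy as np

def privatize(value, sensitivity=1.0, epsilon=0.5):
    # Add Laplace noise scaled by sensitivity/epsilon before uploading,
    # so the exact sensed value is never sent over the network.
    scale = sensitivity / epsilon
    return value + np.random.laplace(0.0, scale)

reading = 36.8                 # e.g., a sensed temperature
print(privatize(reading))      # noisy value sent to the remote server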
8. Training
Thus far, edge computing and deep learning have mostly been discussed assuming that a deep
learning model has already been trained offline on a prebuilt dataset. This section presents a
discussion on training algorithms and hardware for the edge field. Usually, training data produced
by end devices would be sent to the cloud, which would then perform the training and finally
distribute the trained model back to the edge devices. Leaving data at the edge is useful when privacy
is highly desired and also helps to reduce the network bandwidth requirements.
Updating the neural network or computing complex algorithms cannot be completely entrusted to tiny hardware like microcontrollers. Field-programmable gate arrays (FPGAs) and graphics processing units (GPUs) consume too much power for such devices (the FPGA is still a better choice than the GPU, since it is versatile and consumes less power), but they turn out to be excellent for training NNs or running powerful algorithms [186]. GPUs use temporal architectures such as SIMD (single instruction, multiple data) or SIMT (single instruction, multiple threads) to perform the multiply-accumulate (MAC) operations in parallel, and there are software libraries designed for GPUs that optimize the matrix multiplications, e.g., NVIDIA CUDA® Basic Linear Algebra Subprograms (cuBLAS) [187] and the NVIDIA CUDA® Deep Neural Network library (cuDNN) [188]. The matrix multiplications on these platforms can be further improved by applying transforms to the data to reduce the number of multiplications. The fast Fourier transform (FFT) [189] is a well-known approach that reduces the number of multiplications from O(N_o^2 N_f^2) to O(N_o^2 log_2 N_o), where the output size is N_o × N_o and the filter size is N_f × N_f; however, the benefits of FFTs decrease with filter size. Other approaches include Strassen [190] and Winograd [191].
Recently, some very interesting devices have emerged, such as the Hailo-8 deep learning processor [192] from the Hailo company. The Hailo-8 is a processor suitable for high-level deep learning and allows very high performance on end devices with minimal power consumption, size, and cost. In particular, it offers high performance (26 tera-operations per second) and is very efficient and highly flexible (reprogrammable). Google has developed an ASIC (application-specific integrated circuit) dedicated to the TensorFlow library, the TPU (tensor processing unit) [193], whose computational capacity is 180 teraflops. These are examples of AI accelerators, such as neural processing units (NPUs), a class of microprocessors designed to provide hardware acceleration for artificial NNs, machine vision, and ML algorithms for robotics, IoT, and other data-based applications [194]. While hardware
DNN accelerators are quite new, there have already been two branches of design. The first class of accelerators only looked at the data flow, ignoring the memory energy consumption. The second one tried to address the energy consumption due to memory access. The first style of accelerators includes the ConvNet Processor (CNP) [195], Neuflow [196], and the dynamically configurable (DC) CNN [197], proposing customized logic to map convolutions to hardware with more parallelism and flexibility. The second wave of accelerators focused on optimizing memory transfers and data movement. As modern NNs get larger, researchers have realized that memory access and data movement are more critical than the matrix products between layers. Among these accelerators (the TPU is included in this class), DianNao [198] implements an array of multiply-add units to map large DNNs onto its core architecture. It has a customized on-chip buffer to minimize Dynamic Random Access Memory (DRAM) traffic. DaDianNao [199] and ShiDianNao [200] eliminate DRAM access by keeping all weights on-chip. An interesting AI accelerator is the Movidius stick [201], suited for edge computing because it makes it easy to add deep learning capabilities to existing computing platforms. It is designed mainly for computer vision tasks at the edge [202] and allows deploying CNNs in low-power applications that require real-time inferencing. A detailed guide on its use is reported in [203].
Another AI accelerator is Coral [204], a platform from Google that allows realizing devices with local AI, providing hardware acceleration for neural networks at the edge of the network without any help from the cloud. At the core of Coral is Google's Edge TPU, an ASIC chip optimized to run lightweight machine learning algorithms. Many applications are reported by the Coral project itself [205]. A selection of AI accelerator devices that implement edge computing is reported in Table 3.
Table 3. Artificial Intelligence (AI) accelerator devices that implement edge computing.
9. MNIST Example
In this section, we analyze how it is possible to port a NN to the embedded environment using one of the most famous datasets in the machine learning world: MNIST, the ML "Hello World". We show how to create the neural network using the TensorFlow library and subsequently Keras, using techniques such as pruning and quantization to reduce the size of the model. Finally, the algorithm is implemented on the NUCLEO-F746ZG board through the X-CUBE-AI tool.
9.1. Dataset
The MNIST dataset was developed by Yann LeCun [216], Director of the Facebook AI Research center, for the recognition of numeric digits. The dataset was created from a series of documents made available by the NIST (National Institute of Standards and Technology) [217], and the images were normalized in size and centered. In particular, the dataset provides 28 × 28 handwritten-digit images, with a total of 784 pixels per image, split into training and test sets to evaluate the model and guard against overfitting. The training set consists of 60,000 samples and the test set of 10,000 samples. The objective is to write an algorithm that recognizes which digit has been written. Since there are 10 types of digits (numbers from 0 to 9), the problem can be seen as a classification task with 10 possible classes. First, we will show the realization of the NN using a DNN with 2 hidden layers between input and output; then we will show how to implement the same problem using a CNN and Dropout [218] to increase the accuracy of the model.
Since this is a classification problem, the loss function used is the cross entropy applied directly to the softmax output. The function tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits) combines the two operations, making it faster and also numerically stable. As optimization function, it is possible to use an adaptive one, i.e., ADAM. Once the model is defined, we have all the requirements for the training of the NN. Before starting the training, it is necessary to initialize the variables (weights and biases) using the tf.global_variables_initializer() method. TensorFlow uses Xavier [219] as the default initialization.
Then, it is necessary to define the size of the batches, their number as a function of the size of the dataset, and a threshold for the loss function on the validation dataset, so that if the error increases, early stopping prevents overfitting. It should be noted that the threshold is initially set to a high value, so that early stopping will not trigger on the first iteration. The final part of the required code is the actual training, realized through a loop. For each epoch, which corresponds to one complete iteration over the dataset, a loop over the batches is defined, through which the average error of the epoch is computed as the sum of the errors associated with the batches divided by the number of batches. At the end of each epoch, the loss on the validation dataset is computed; if the current validation error is higher than the previous one, the model has conformed too closely to the training data, losing its ability to generalize, and learning must be stopped via early stopping.
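A minimal sketch of this training loop follows (our own TF1-style illustration: the layer sizes, batch size, epoch count, and the synthetic stand-in data are assumptions; in the tutorial, the real MNIST splits are used instead):

import numpy as np
import tensorflow.compat.v1 as tf  # TF1-style API, as used in this tutorial
tf.disable_v2_behavior()

# Synthetic stand-in data so the sketch is self-contained.
train_x = np.random.rand(1024, 784).astype(np.float32)
train_y = np.eye(10)[np.random.randint(0, 10, 1024)].astype(np.float32)
val_x, val_y = train_x[:256], train_y[:256]

x = tf.placeholder(tf.float32, [None, 784])
y_ = tf.placeholder(tf.float32, [None, 10])
hidden = tf.layers.dense(x, 128, tf.nn.relu)          # assumed layer sizes
logits = tf.layers.dense(hidden, 10)
loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_, logits=logits))
train_step = tf.train.AdamOptimizer().minimize(loss)

best_val = 1e9  # high initial threshold: early stopping cannot fire at once
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(50):
        for start in range(0, len(train_x), 128):     # loop over the batches
            sess.run(train_step, feed_dict={x: train_x[start:start + 128],
                                            y_: train_y[start:start + 128]})
        val_loss = sess.run(loss, feed_dict={x: val_x, y_: val_y})
        if val_loss > best_val:  # validation error increased: early stopping
            break
        best_val = val_loss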
The model made with TensorFlow is too heavy in terms of memory occupation for an edge application; in this example, the NN weighed around 15 MB. TensorFlow Lite (TFLite) [220] was created specifically to overcome this problem, proposing a set of tools that help programmers run AI models on embedded, mobile, and IoT devices. In the following, we will show how to use TFLite to bring the NN onto a microcontroller. The workflow that we will follow in this tutorial is the following:
- Definition of the model in Keras (using Tensorflow backend),
- Conversion of the model from Keras to TFLite,
- Implementation of a post-training quantization to further decrease the size of the NN,
- Design of a Graphical User Interface (GUI) to draw the digit, and
- Test on hardware devices.
After defining the model and the parameters for the training phase, it is possible to train the network. With the presented configuration, it is possible to observe that the use of a CNN increases the model accuracy from 96% to 99%.
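Before the conversion, the CNN is defined and trained in Keras. A minimal sketch follows (our own reconstruction; the layer sizes are assumptions chosen to be consistent with the layer shapes later reported in Table 7):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu',
                           input_shape=(28, 28, 1)),   # -> (26, 26, 32)
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D((2, 2)),              # -> (12, 12, 64)
    tf.keras.layers.Dropout(0.25),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')    # 10 digit classes
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
# model.fit(train_images, train_labels, epochs=5)  # train on MNIST first
model.save('Model_Keras_MNIST_CNN_Test.h5')        # file used for conversion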
import tensorflow as tf

# Convert the trained Keras model to the TFLite format
model = 'Model_Keras_MNIST_CNN_Test.h5'
converter = tf.lite.TFLiteConverter.from_keras_model_file(model)
tflite_model = converter.convert()
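The converted model can then be written to disk (a minimal sketch; the output file name is our own choice):

with open('Model_MNIST_CNN.tflite', 'wb') as f:
    f.write(tflite_model)  # serialized FlatBuffer, ready for the device tools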
TensorFlow provides tools (the TensorFlow Model Optimization Toolkit) for the optimization of the model. The toolkit supports techniques used to:
- Reduce latency and inference costs, and
- Implement AI models on edge devices with limited capacity and a low-power profile.
These techniques include post-training quantization and pruning. Unfortunately, the quantization of TFLite models is not supported by X-CUBE-AI; therefore, we selected the NN compression adopted by the ST software.
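For reference, a minimal sketch of how post-training quantization is enabled in the standard TFLite converter (not the ST compression actually adopted in this example):

converter = tf.lite.TFLiteConverter.from_keras_model_file(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # post-training quantization
tflite_quant_model = converter.convert()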
9.5. Pruning
The reduction of the model can be obtained not only with quantization techniques, but also with pruning techniques, which eliminate connections that are not essential for the NN and consequently reduce the number of computations and the memory space demanded by the NN. For this purpose, too, it is possible to use the libraries provided by TensorFlow and their examples [224]. As discussed in the previous paragraph, the network is redefined by importing the tensorflow_model_optimization Application Programming Interfaces (APIs). The APIs can be applied either to a single layer or to the whole model; in our example, we applied them to a single layer. The pruning technique consists of iteratively removing connections between layers, given a sparsity parameter (the percentage of weights eliminated) and a schedule (the pruning frequency). To help the model converge, connections should not be eliminated all at once but at regular intervals; in this example, elimination starts at step 2000 and occurs every 100 steps. Next, it is necessary to define, among the pruning parameters, the end step. Then, the model is defined by setting the pruning parameters and applying them to the NN. Finally, it is possible to convert the model and apply the quantization; a sketch of this flow is shown below.
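A minimal sketch of this flow (our own illustration based on the tensorflow_model_optimization examples [224]; the sparsity target and end step are assumed values, while the begin step of 2000 and the frequency of 100 follow the text):

import tensorflow as tf
import tensorflow_model_optimization as tfmot

pruning_params = {
    'pruning_schedule': tfmot.sparsity.keras.PolynomialDecay(
        initial_sparsity=0.0,   # start from the dense layer
        final_sparsity=0.5,     # assumed target: eliminate 50% of the weights
        begin_step=2000,        # elimination starts at step 2000
        end_step=4000,          # assumed end of the pruning schedule
        frequency=100)          # prune every 100 steps
}
# Apply pruning to a single layer, as done in this example
pruned_dense = tfmot.sparsity.keras.prune_low_magnitude(
    tf.keras.layers.Dense(64, activation='relu'), **pruning_params)
# Training a model containing pruned layers requires this callback:
callbacks = [tfmot.sparsity.keras.UpdatePruningStep()]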
The technique reduces the number of parameters and computations while preserving the model's accuracy in terms of predictions. The main impact is due to quantization, but pruning also contributes by increasing the inference speed and reducing the amount of energy used, thus allowing the use of the AI model on devices with a low energy profile and low computational power.
A GUI is then used to draw the digit that serves as input data. The GUI can be made with PIL [225] (Python Imaging Library) and Tkinter [226], the de facto standard for GUIs in Python. The GUI allows drawing the numeric digit using mouse motion on the canvas object (Figure 7). Once the drawing is finished, it is possible to export the image as a CSV file, a format supported for on-target validation by the X-CUBE-AI tool; a minimal sketch follows.
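A minimal sketch of such a GUI (our own illustration; the canvas size, stroke width, and output file name are assumptions):

import tkinter as tk
import numpy as np
from PIL import Image, ImageDraw

root = tk.Tk()
canvas = tk.Canvas(root, width=280, height=280, bg='black')
canvas.pack()
img = Image.new('L', (280, 280), 0)   # PIL mirror of the canvas content
draw = ImageDraw.Draw(img)

def paint(event):
    # Draw on the canvas and replicate the stroke on the PIL image.
    x, y = event.x, event.y
    canvas.create_oval(x - 8, y - 8, x + 8, y + 8,
                       fill='white', outline='white')
    draw.ellipse([x - 8, y - 8, x + 8, y + 8], fill=255)

def export():
    # Downscale to 28 x 28 (as in MNIST) and export as CSV for X-CUBE-AI.
    small = np.asarray(img.resize((28, 28))) / 255.0
    np.savetxt('digit.csv', small.reshape(1, -1), delimiter=',')

canvas.bind('<B1-Motion>', paint)
tk.Button(root, text='Export CSV', command=export).pack()
root.mainloop()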
Finally, it is possible to proceed to the validation, comparing the defined model with the one generated in C language by the ST tool by feeding both models the same set of data. The validation can be carried out as:
- Validation on desktop: The C model is executed on the PC.
- Validation on target: The generated model is executed on the device of interest. It is necessary to load the code onto the microcontroller and set up a serial communication with the host.
In both cases, the data can be either randomly generated by the tool or imported from outside as a CSV file. After loading the code onto the microcontroller, it is possible to enable the validation on target and to load the data generated using the GUI as input. STM32CubeMX reports the model results, and it is possible to notice that, in this case, the NN effectively recognizes the numeric digit among 10 classes (Figure 9) with an accuracy of 100.00%, root-mean-square error (rmse) = 0.0000, and mean absolute error (mae) = 0.0000. In Figure 10, the results obtained during validation are reported; the calculation took about 330 ms, and the execution time layer by layer is shown in Table 7.
With this example, the full implementation of an eML image recognition application has been designed and put into practice with good performance.
Figure 9. Inference result showing the recognition of the digit "6" drawn by the user (accuracy = 100.00%, root-mean-square error (rmse) = 0.0000, mean absolute error (mae) = 0.0000, 10 classes, 1 sample).
Description | Shape | ms
10004/(2D Convolutional) | (26, 26, 32) | 9.328
10011/(Merged Conv2d/Pool) | (12, 12, 64) | 299.524
10005/(Dense) | (1, 1, 64) | 19.562
10009/(Nonlinearity) | (1, 1, 64) | 0.006
10005/(Dense) | (1, 1, 10) | 0.022
10009/(Nonlinearity) | (1, 1, 10) | 0.014
Total | | 328.458
10. Conclusions
Deploying machine learning on Internet of Things devices reduces network congestion by allowing computations to be performed close to the data sources, preserves privacy when uploading data, and reduces the power consumption of continuous wireless transmission to gateways or cloud servers. The aim of this work was to provide a review of the main techniques that guarantee the execution of machine learning models on low-performance hardware in the Internet of Things paradigm, paving the way to the Internet of Conscious Things.
In this work, a detailed review of models, architectures, and requirements of solutions that implement edge machine learning on IoT devices was presented, with the main goal of defining the state of the art and envisioning development requirements.
The review focused on ML systems deployed on edge devices, providing a comparison between the ML algorithms implementable in edge computing. In addition, the process of bringing ML to the edge was analyzed in detail, considering edge server-based architectures and joint computation, thus envisioning both the absence of data transmission to gateways or servers (and the related effects on privacy and local computational load) and its presence (and its impact on cloud/edge server communications and on the power consumed by remote data transmission).
The actual state of development of edge computing offers a series of varied solutions able to satisfy a plurality of needs. Depending on the requirements (privacy, energy consumption, computational complexity), it is possible to define a set of compatible hardware and software to implement effective AI-enabled IoT solutions.
An example of edge machine learning implementation was provided in the review, demonstrating the effectiveness and ease of use of a suitable edge platform for implementing the machine learning "Hello World".
Author Contributions: Conceptualization, M.M.; methodology, M.M.; software, C.P.; validation, C.P., D.I. and
M.M.; formal analysis, M.M. and C.P.; investigation, C.P.; resources, M.M.; writing—original draft preparation,
C.P., D.I. and M.M.; writing—review and editing, C.P., D.I. and M.M.; supervision, M.M.; project administration,
M.M. All authors have read and agreed to the published version of the manuscript.
Acknowledgments: Programma di Azione Coesione PAC Calabria 2014–2020, Asse Prioritario 12, Azione 10.5.12, is
gratefully acknowledged by one of the authors (D.I.).
References
1. Atzori, L.; Iera, A.; Morabito, G. The Internet of Things: A survey. Comput. Networks 2010, 54, 2787–2805.
2. Mahdavinejad, M.S.; Rezvan, M.; Barekatain, M.; Adibi, P.; Barnaghi, P.; Sheth, A.P. Machine learning for
internet of things data analysis: A survey. Digit. Commun. Networks 2018, 4, 161–175.
3. IoT: Number of Connected Devices Worldwide 2012–2025 | Statista. Available online:
https://www.statista.com/statistics/471264/iot-number-of-connected-devices-worldwide/ (accessed on 21
February 2020).
4. Vahid Dastjerdi, A.; Buyya, R. Fog Computing: Helping the Internet of Things Realize. IEEE Comput. Soc.
2016, 49, 112–116.
5. Liu, Y.; Yang, C.; Jiang, L.; Xie, S.; Zhang, Y. Intelligent Edge Computing for IoT-Based Energy Management
in Smart Cities. IEEE Netw. 2019, 33, 111–117.
6. Wang, X.; Han, Y.; Wang, C.; Zhao, Q.; Chen, X.; Chen, M. In-edge AI: Intelligentizing mobile edge
computing, caching and communication by federated learning. IEEE Netw. 2019, 33, 156–165.
7. Savaglio, C.; Ganzha, M.; Paprzycki, M.; Bădică, C.; Ivanović, M.; Fortino, G. Agent-based Internet of
Things: State-of-the-art and research challenges. Futur. Gener. Comput. Syst. 2020, 102, 1038–1053.
8. Neto, A.R.; Soares, B.; Barbalho, F.; Santos, L.; Batista, T.; Delicato, F.C.; Pires, P.F. Classifying Smart IoT
Devices for Running Machine Learning Algorithms. In Anais do XLV Seminário Integrado de Software e
Hardware; SBC: Nashville, TN, USA, 2018.
9. Edge Computing—Explore—Google Trends. Available online:
https://trends.google.com/trends/explore?date=all&q=edge computing (accessed on 5 March 2020).
10. Scopus Preview - Scopus - Welcome to Scopus. Available online: https://www.scopus.com/ (accessed on 15
March 2020).
11. 1.4. Support Vector Machines—Scikit-Learn 0.22.2 Documentation. Available online: https://scikit-
learn.org/stable/modules/svm.html (accessed on 5 March 2020).
12. Guestrin, C. SVMs, Duality and the Kernel Trick; Lecture Notes, Machine Learning 10-701/15-781; Carnegie Mellon University: Pittsburgh, PA, USA, 2006.
13. Schmidhuber, J. Deep learning in neural networks: An overview. Neural Networks 2015, 61, 85–117.
14. Lecun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444
15. Neapolitan, R.E.; Jiang, X. Neural Networks and Deep Learning. In Artificial Intelligence; CRC Press Taylor&
Francis Group: Boca Raton, FL, USA, 2018.
16. Jordan, M.I.; Bishop, C.M. Neural networks. In Computer Science Handbook, 2nd ed.; CRC Press: Boca Raton,
FL, USA, 2004; ISBN 9780203494455.
17. Merenda, M.; Praticò, F.G.; Fedele, R.; Carotenuto, R.; Corte, D.; Della Corte, F.G. A Real-Time Decision
Platform for the Management of Structures and Infrastructures. Electronics 2019, 8, 1180.
18. Anandhalli, M.; Baligar, V.P. A novel approach in real-time vehicle detection and tracking using Raspberry
Pi. Alex. Eng. J. 2018, 57, 1597–1607.
19. Arasteh, H.; Hosseinnezhad, V.; Loia, V.; Tommasetti, A.; Troisi, O.; Shafie-Khah, M.; Siano, P. Iot-based
smart cities: A survey. In Proceedings of the 2016 IEEE 16th International Conference on Environment and
Electrical Engineering (EEEIC), Florence, Italy, 7–10 June 2016; pp. 2–7.
20. Zanella, A.; Bui, N.; Castellani, A.; Vangelista, L.; Zorzi, M. Internet of Things for Smart Cities. IEEE Internet
Things J. 2014, 1, 22–32.
21. Bibri, S.E. The IoT for smart sustainable cities of the future: An analytical framework for sensor-based big
data applications for environmental sustainability. Sustain. Cities Soc. 2018, 38, 230–253.
22. Kim, T.-H.; Ramos, C.; Mohammed, S. Smart City and IoT. Futur. Gener. Comput. Syst. 2017, 76, 159–162.
23. Sajjad, M.; Nasir, M.; Muhammad, K.; Khan, S.; Jan, Z.; Sangaiah, A.K.; Elhoseny, M.; Baik, S.W. Raspberry
Pi assisted face recognition framework for enhanced law-enforcement services in smart cities. Futur. Gener.
Comput. Syst. 2017, 108, 995–1007.
24. Zhang, T.; Chowdhery, A.; Bahl, P.; Jamieson, K.; Banerjee, S. The design and implementation of a wireless
video surveillance system. In Proceedings of the Annual International Conference on Mobile Computing
and Networking, MobiCom’15: The 21th Annual International Conference on Mobile Computing and
Networking, Paris, France, 7–11 September 2015.
25. Borgia, E. The Internet of Things vision: Key features, applications and open issues. Comput. Commun. 2014,
54, 1–31.
26. Magrini, M.; Moroni, D.; Palazzese, G.; Pieri, G.; Leone, G.; Salvetti, O. Computer Vision on Embedded
Sensors for Traffic Flow Monitoring. In Proceedings of the 2015 IEEE 18th International Conference on
Intelligent Transportation Systems, Las Palmas, Spain, 15–18 September 2015; pp. 161–166.
27. Arshad, B.; Ogie, R.; Barthelemy, J.; Pradhan, B.; Verstaevel, N.; Perez, P. Computer Vision and IoT-Based
Sensors in Flood Monitoring and Mapping: A Systematic Review. Sensors 2019, 19, 5012.
28. Fafoutis, X.; Marchegiani, L.; Elsts, A.; Pope, J.; Piechocki, R.; Craddock, I. Extending the battery lifetime of
wearable sensors with embedded machine learning. In Proceedings of the 2018 IEEE 4th World Forum on
Internet of Things (WF-IoT), Singapore, Singapore, 5–8 February 2018; pp. 269–274.
29. Ieracitano, C.; Mammone, N.; Hussain, A.; Morabito, F.C. A novel multi-modal machine learning based
approach for automatic classification of EEG recordings in dementia. Neural Networks 2020, 123, 176–190.
30. Rajkomar, A.; Dean, J.; Kohane, I. Machine learning in medicine. N. Engl. J. Med. 2019, 380, 1347–1358.
31. Ngiam, K.Y.; Khor, I.W. Big data and machine learning algorithms for health-care delivery. Lancet Oncol.
2019, 20, e262–e273.
32. Beam, A.L.; Kohane, I. Big Data and Machine Learning in Health Care. JAMA 2018, 319, 1317.
33. Amato, F.; López, A.; Peña-Méndez, E.M.; Vanhara, P.; Hampl, A.; Havel, J. Artificial neural networks in
medical diagnosis. J. Appl. Biomed. 2013, 11, 47–58.
34. Syafrudin, M.; Alfian, G.; Fitriyani, N.L.; Rhee, J. Performance Analysis of IoT-Based Sensor, Big Data
Processing, and Machine Learning Model for Real-Time Monitoring System in Automotive Manufacturing.
Sensors 2018, 18, 2946.
35. Kihei, B.; Copeland, J.A.; Chang, Y. Automotive Doppler sensing: The Doppler profile with machine
learning in vehicle-to-vehicle networks for road safety. In Proceedings of the IEEE Workshop on Signal
Processing Advances in Wireless Communications, SPAWC, Sapporo, Japan, 3–6 July 2017.
36. Gharib, M.; Lollini, P.; Botta, M.; Amparore, E.; Donatelli, S.; Bondavalli, A. On the Safety of Automotive
Systems Incorporating Machine Learning Based Components: A Position Paper. In Proceedings of the 48th
Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops, DSN-W
2018, Luxembourg City, Luxembourg, 25–28 June 2018.
37. Luckow, A.; Kennedy, K.; Manhardt, F.; Djerekarov, E.; Vorster, B.; Apon, A. Automotive big data:
Applications, workloads and infrastructures. In Proceedings of the 2015 IEEE International Conference on
Big Data, IEEE Big Data 2015, Santa Clara, CA, USA, 29 October–1 November 2015.
61. Altman, N.S. An introduction to kernel and nearest-neighbor nonparametric regression. Am. Stat. 1992, 46,
175–185.
62. Gupta, C.; Suggala, A.S.; Goyal, A.; Simhadri, H.V.; Paranjape, B.; Kumar, A.; Goya, S.; Udupa, R.; Varma,
M.; Jain, P. ProtoNN: Compressed and accurate kNN for resource-scarce devices. In Proceedings of the
34th International Conference on Machine Learning ICML, Sydney, Australia, 6–11 August 2017; Volume
3, pp. 2144–2159.
63. Gope, D.; Dasika, G.; Mattina, M. Ternary Hybrid Neural-Tree Networks for Highly Constrained IoT
Applications. arXiv 2019, arXiv:1903.01531.
64. Kumar, A.; Goyal, S.; Varma, M. Resource-efficient machine learning in 2 KB RAM for the Internet of
Things. In Proceedings of the 34th International Conference on Machine Learning ICML, Sydney, Australia,
6–11 August 2017; Volume 4, pp. 3062–3071.
65. Haigh, K.Z.; Mackay, A.M.; Cook, M.R.; Lin, L.G. Machine Learning for Embedded Systems : A Case Study;
Technical Report; BBN Technologies: Cambridge, MA, USA, 2015.
66. Chen, J.; Ran, X. Deep Learning With Edge Computing: A Review. Proc. IEEE 2019, 107, 1655–1674.
67. Sze, V.; Chen, Y.H.; Emer, J.; Suleiman, A.; Zhang, Z. Hardware for machine learning: Challenges and
opportunities. In Proceedings of the 2017 IEEE Custom Integrated Circuits Conference (CICC), Austin, TX,
USA, 30 April–3 May 2017; pp.1–8.
68. Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H.
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv 2017,
arXiv:1704.04861.
69. Iandola, F.N.; Han, S.; Moskewicz, M.W.; Ashraf, K.; Dally, W.J.; Keutzer, K. SqueezeNet: AlexNet-level
accuracy with 50x fewer parameters and <0.5MB model size. arXiv 2016, arXiv:1602.07360; pp. 1–13.
70. Han, S.; Mao, H.; Dally, W.J. Deep Compression: Compressing Deep Neural Networks with Pruning,
Trained Quantization and Huffman Coding. arXiv 2015, arXiv:1510.00149; pp. 1–14.
71. TensorFlow. Available online: https://www.tensorflow.org/ (accessed on 11 November 2019).
72. Home—Keras Documentation. Available online: https://keras.io/ (accessed on 11 November 2019).
73. Yao, S.; Zhao, Y.; Zhang, A.; Su, L.; Abdelzaher, T. DeepIoT: Compressing Deep Neural Network Structures
for Sensing Systems with a Compressor-Critic Framework. In Proceedings of the 15th ACM Conference on
Embedded Network Sensor Systems, Delft, The Netherlands, 6–8 November 2017.
74. Molchanov, P.; Tyree, S.; Karras, T.; Aila, T.; Kautz, J. Pruning Convolutional Neural Networks for Resource
Efficient Inference. arXiv 2016, arXiv:1611.06440; pp. 6–10.
75. Anwar, S.; Sung, W. Compact Deep Convolutional Neural Networks With Coarse Pruning. arXiv 2016; pp. 1–10.
76. Yang, T.J.; Chen, Y.H.; Sze, V. Designing energy-efficient convolutional neural networks using energy-
aware pruning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
(CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 6071–6079.
77. Narang, S.; Elsen, E.; Diamos, G.; Sengupta, S. Exploring Sparsity in Recurrent Neural Networks. arXiv
2017, arXiv:1704.05119; pp. 1–10.
78. Guo, Y.; Yao, A.; Chen, Y. Dynamic network surgery for efficient DNNs. In Proceedings of the 30th
International Conference on Neural Information Processing Systems (NIPS’16); Curran Associates Inc.: Red
Hook, NY, USA, 2016; pp. 1387–1395.
79. Hinton, G.; Vinyals, O.; Dean, J. Distilling the Knowledge in a Neural Network. arXiv 2015,
arXiv:1503.02531; pp. 1–9.
80. X-CUBE-AI—AI Expansion Pack for STM32CubeMX—STMicroelectronics. Available online:
https://www.st.com/en/embedded-software/x-cube-ai.html#overview (accessed on 18 November 2019).
81. TensorFlow Lite for Microcontrollers. Available online: https://www.tensorflow.org/lite/microcontrollers
(accessed on 15 March 2020).
82. Arduino Nano 33 BLE Sense with Headers | Arduino Official Store. Available online:
https://store.arduino.cc/arduino-nano-33-ble-sense-with-headers (accessed on 15 March 2020).
83. SparkFun Edge Development Board—Apollo3 Blue—DEV-15170—SparkFun Electronics. Available online:
https://www.sparkfun.com/products/15170 (accessed on 15 March 2020).
84. Artificial Intelligence (AI)—STMicroelectronics. Available online:
https://www.st.com/content/st_com/en/about/innovation---technology/artificial-intelligence.html
(accessed on 15 March 2020).
85. Adafruit EdgeBadge—TensorFlow Lite for Microcontrollers ID: 4400—$35.95 : Adafruit Industries, Unique
& fun DIY electronics and kits. Available online: https://www.adafruit.com/product/4400 (accessed on 15
March 2020).
86. Overview | Espressif Systems. Available online: https://www.espressif.com/en/products/hardware/esp32-
devkitc/overview (accessed on 15 March 2020).
87. Overview | Espressif Systems. Available online: https://www.espressif.com/en/products/hardware/esp-
eye/overview (accessed on 15 March 2020).
88. High-Performing AI Solutions to Transform our Digital World—Arm. Available online:
https://www.arm.com/solutions/artificial-intelligence (accessed on 15 March 2020).
89. New AI technology from Arm delivers intelligence for IoT—Arm. Available online:
https://www.arm.com/company/news/2020/02/new-ai-technology-from-arm (accessed on 15 March 2020).
90. Zhao, Z.; Barijough, K.M.; Gerstlauer, A. DeepThings: Distributed adaptive deep learning inference on
resource-constrained IoT edge clusters. IEEE Trans. Comput. Des. Integr. Circuits Syst. 2018, 37, 2348–2359.
91. Nikouei, S.Y.; Chen, Y.; Song, S.; Xu, R.; Choi, B.Y.; Faughnan, T. Smart surveillance as an edge network
service: From harr-cascade, SVM to a Lightweight CNN. In Proceedings of the 2018 IEEE 4th International
Conference on Collaboration and Internet Computing (CIC), Philadelphia, PA, USA, 18–20 October 2018;
pp. 256–265.
92. Xu, R.; Nikouei, S.Y.; Chen, Y.; Polunchenko, A.; Song, S.; Deng, C.; Faughnan, T.R. Real-Time Human
Objects Tracking for Smart Surveillance at the Edge. In Proceedings of the IEEE International Conference
on Communications, Kansas City, MO, USA, 20–24 May 2018.
93. Chand, G.; Ali, M.; Barmada, B.; Liesaputra, V.; Ramirez-Prado, G. Tracking a person’s behaviour in a smart
house. In International Conference on Service-Oriented Computing; Lecture Notes in
Computer Science; Springer: Cham, Switzerland, 2019.
94. Rosato, D.; Comai, S.; Masciadri, A.; Salice, F. Non-invasive monitoring system to detect sitting people. In
Proceedings of the 4th EAI International Conference on Smart Objects and Technologies for Social Good,
Bologna, Italy, 28–30 November 2018; pp. 261–264.
95. SparkFun Edge Hookup Guide—learn.sparkfun.com. Available online:
https://learn.sparkfun.com/tutorials/sparkfun-edge-hookup-guide/all (accessed on 14 April 2020).
96. Shi, W.; Cao, J.; Zhang, Q.; Li, Y.; Xu, L. Edge Computing: Vision and Challenges. IEEE Internet Things J.
2016, 3, 637–646.
97. Yu-han, T.; Ravindranath, L.; Deng, S.; Chen, T.Y. Continuous, Real-Time Object Recognition on Mobile
Devices Categories and Subject Descriptors, In Proceedings of the 13th ACM Conference on Embedded
Networked Sensor Systems (SenSys ‘15). Association for Computing Machinery, New York, NY, USA,
2015; pp. 155–168.
98. Hung, C.C.; Ananthanarayanan, G.; Bodik, P.; Golubchik, L.; Yu, M.; Bahl, P.; Philipose, M. VideoEdge:
Processing camera streams using hierarchical clusters. In Proceedings of the 2018 IEEE/ACM Symposium
on Edge Computing (SEC), Seattle, WA, USA, 25–27 October 2018; pp. 115–131.
99. Jiang, A.H.; Wong, D.L.K.; Canel, C.; Tang, L.; Misra, I.; Kaminsky, M.; Kozuch, M.A.; Pillai, P.; Andersen, D.G.; et al. Mainstream: Dynamic Stem-Sharing for Multi-Tenant Video Processing. In Proceedings of the 2018 USENIX Annual Technical Conference (USENIX ATC 18), Boston, MA, USA, 11–13 July 2018.
100. Radio Regulations. Available online: https://www.itu.int/pub/R-REG-RR/en (accessed on 22 February
2020).
101. Radio Versions | Bluetooth® Technology Website. Available online: https://www.bluetooth.com/learn-
about-bluetooth/bluetooth-technology/radio-versions/ (accessed on 11 March 2020).
102. Dekimpe, R.; Xu, P.; Schramme, M.; Flandre, D.; Bol, D. A Battery-Less BLE IoT Motion Detector Supplied
by 2.45-GHz Wireless Power Transfer. In Proceedings of the 2018 IEEE 28th International Symposium on
Power and Timing Modeling, Optimization and Simulation, PATMOS, Platja d’Aro, Spain, 2–4 July 2018;
Institute of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2018; pp. 68–75.
103. Bluetooth 5: Go Faster, Go Further. Available online: https://www.bluetooth.com/wp-
content/uploads/2019/03/Bluetooth_5-FINAL.pdf (accessed on 9 April 2020).
104. Bluetooth Special Interest Group (SIG). Bluetooth Core Specification Version 5.2 Feature Overview; Bluetooth SIG: Kirkland, WA, USA, 2020. Available online: https://www.bluetooth.com/wp-content/uploads/2020/01/Bluetooth_5.2_Feature_Overview.pdf (accessed on 20 April 2020).
105. Zigbee Alliance Website. Available online: https://zigbeealliance.org/ (accessed on 11 March 2020).
106. 802.15.4v-2017—IEEE Standard for Low-Rate Wireless Networks—Amendment 5: Enabling/Updating the
Use of Regional Sub-GHz Bands. Available online: https://standards.ieee.org/standard/802_15_4v-
2017.html (accessed on 11 March 2020).
107. Pan, M.S.; Tseng, Y.C. ZigBee Wireless Sensor Networks and Their Applications. In Sensor Networks and Configuration: Fundamentals, Standards, Platforms, and Applications; Springer: Berlin/Heidelberg, Germany, 2007; Volume 16, pp. 349–368.
108. Islam, M.L.; Faizan, C.; Quavi, S.M.A. IOT Based Smart Garbage Monitoring System. Int. J. Comput. Sci.
Eng. 2019, 7, 649–651.
109. Yaqoob, I.; Hashem, I.A.T.; Mehmood, Y.; Gani, A.; Mokhtar, S.; Guizani, S. Enabling communication
technologies for smart cities. IEEE Commun. Mag. 2017, 55, 112–120.
110. Sahitya, G.; Balaji, N.; Naidu, C.D.; Abinaya, S. Designing a wireless sensor network for precision
agriculture using zigbee. In Proceedings of the 7th IEEE International Advanced Computing Conference,
IACC 2017, Hyderabad, India, 5–7 January 2017.
111. Hidayat, T. Internet of Things Smart Agriculture on ZigBee: A Systematic Review. J. Telekomun. dan Komput. 2017, 8, 75–86.
112. Lei, Y.; Wang, T.; Wu, J. Vehicles relative positioning based on ZigBee and GPS technology. In Proceedings
of the ICEIEC 2016 IEEE 6th International Conference on Electronics Information and Emergency
Communication, Beijing, China, 17–19 June 2016.
113. Dong, C.; Chen, X.; Dong, H.; Yang, K.; Guo, J.; Bai, Y. Research on intelligent vehicle infrastructure
cooperative system based on zigbee. In Proceedings of the 2019 5th International Conference on
Transportation Information and Safety (ICTIS), Liverpool, UK, 14–17 July 2019; pp. 1337–1343.
114. Lee, H.J.; Lee, S.H.; Ha, K.S.; Jang, H.C.; Chung, W.-Y.; Kim, J.Y.; Chang, Y.-S.; Yoo, D.H. Ubiquitous healthcare service using Zigbee and mobile phone for elderly patients. Int. J. Med. Inform. 2009, 78, 193–198.
115. Chae, M.J.; Yoo, H.; Kim, J.; Cho, M. Development of a wireless sensor network system for suspension
bridge health monitoring. Autom. Constr. 2012, 21, 237–252.
116. Z-Wave | Safer, Smarter Homes Start with Z-Wave. Available online: https://www.z-wave.com/ (accessed
on 12 March 2020).
117. ANT Protocol | Dynastream Innovations. Available online: https://www.dynastream.com/solutions/ant-
wireless/ (accessed on 9 April 2020).
118. What is ANT+—THIS IS ANT. Available online: https://www.thisisant.com/consumer/ant-101/what-is-ant/
(accessed on 16 April 2020).
119. Mulligan, G. The 6LoWPAN architecture. In Proceedings of the 4th Workshop on Embedded Networked
Sensors, EmNets, Cork, Ireland, 25–26 June 2007.
120. Unwala, I.; Taqvi, Z.; Lu, J. Thread: An IoT protocol. In Proceedings of the IEEE Green Technologies
Conference, Austin, TX, USA, 4–6 April 2018.
121. Shop Humidor Monitoring from Smartphone and Tablet Habueno. Available online:
https://www.habueno.com/shop/?lang=en (accessed on 16 April 2020).
122. Wi-Fi HaLow | Wi-Fi Alliance. Available online: https://www.wi-fi.org/discover-wi-fi/wi-fi-halow (accessed on 12 March 2020).
123. Flores, A.B.; Guerra, R.E.; Knightly, E.W.; Ecclesine, P.; Pandey, S. IEEE 802.11af: A standard for TV white
space spectrum sharing. IEEE Commun. Mag. 2013, 51, 92–100.
124. Bellalta, B. IEEE 802.11ax: High-efficiency WLANS. IEEE Wirel. Commun. 2016, 23, 38–46.
125. Merenda, M.; Iero, D.; Della Corte, F.G. CMOS RF Transmitters with On-Chip Antenna for Passive RFID and IoT Nodes. Electronics 2019, 8, 1448.
126. Lazaro, A.; Villarino, R.; Girbau, D. A Survey of NFC Sensors Based on Energy Harvesting for IoT
Applications. Sensors 2018, 18, 3746.
127. LoRa Alliance® Website. Available online: https://lora-alliance.org/ (accessed on 11 March 2020).
128. Chiani, M.; Elzanaty, A. On the LoRa Modulation for IoT: Waveform Properties and Spectral Analysis.
IEEE Internet Things J. 2019, 6, 8463–8470.
129. Augustin, A.; Yi, J.; Clausen, T.H.; Townsley, W.M. A Study of LoRa: Long Range & Low Power Networks
for the Internet of Things. Sensors 2016, 16, 1466.
130. Suresh, V.M.; Sidhu, R.; Karkare, P.; Patil, A.; Lei, Z.; Basu, A. Powering the IoT through embedded
machine learning and LoRa. In Proceedings of the IEEE World Forum on Internet of Things, WF-IoT,
Singapore, 5–8 February 2018.
131. Merenda, M.; Felini, C.; Della Corte, F.G. A Monolithic Multisensor Microchip with Complete On-Chip RF
Front-End. Sensors 2018, 18, 110.
132. Sigfox—The Global Communications Service Provider for the Internet of Things (IoT). Available online:
https://www.sigfox.com/en (accessed on 5 January 2020).
133. Huang, J.; Qian, F.; Guo, Y.; Zhou, Y.; Xu, Q.; Mao, Z.M.; Sen, S.; Spatscheck, O. An in-depth study of LTE:
Effect of network protocol and application behavior on performance. Comput. Commun. Rev. 2013, 43, 363–
374.
134. Akpakwu, G.A.; Silva, B.J.; Hancke, G.P.; Abu-Mahfouz, A.M. A Survey on 5G Networks for the Internet
of Things: Communication Technologies and Challenges. IEEE Access 2018, 6, 3619–3647.
135. Li, S.; Xu, L.D.; Zhao, S. 5G Internet of Things: A survey. J. Ind. Inf. Integr. 2018, 10, 1–9.
136. GSMA | Narrowband – Internet of Things (NB-IoT) | Internet of Things. Available online:
https://www.gsma.com/iot/narrow-band-internet-of-things-nb-iot/ (accessed on 11 March 2020).
137. Ratasuk, R.; Mangalvedhe, N.; Zhang, Y.; Robert, M.; Koskinen, J.P. Overview of narrowband IoT in LTE
Rel-13. In Proceedings of the 2016 IEEE Conference on Standards for Communications and Networking
(CSCN), Berlin, Germany, 31 October–2 November 2016.
138. Borkar, S.R. Long-term evolution for machines (LTE-M). In LPWAN Technologies for IoT and M2M Applications; Academic Press, 2020; pp. 145–166.
139. Wang, D.; Chen, D.; Song, B.; Guizani, N.; Yu, X.; Du, X. From IoT to 5G I-IoT: The Next Generation IoT-
Based Intelligent Algorithms and 5G Technologies. IEEE Commun. Mag. 2018, 56, 114–120.
140. Morocho-Cayamcela, M.E.; Lee, H.; Lim, W. Machine learning for 5G/B5G mobile and wireless
communications: Potential, limitations, and future directions. IEEE Access 2019, 7, 137184–137206.
141. Al-Sarawi, S.; Anbar, M.; Alieyan, K.; Alzubaidi, M. Internet of Things (IoT) communication protocols:
Review. In Proceedings of the ICIT 2017—8th International Conference on Information Technology,
Amman, Jordan, 17–18 May 2017.
142. Mahmoud, M.S.; Mohamad, A.A.H. A Study of Efficient Power Consumption Wireless Communication Techniques/Modules for Internet of Things (IoT) Applications. Adv. Internet Things 2016.
143. Mekki, K.; Bajic, E.; Chaxel, F.; Meyer, F. A comparative study of LPWAN technologies for large-scale IoT
deployment. ICT Express 2019, 5, 1–7.
144. Choi, C.S.; Jeong, J.D.; Lee, I.W.; Park, W.K. LoRa based renewable energy monitoring system with open
IoT platform. In Proceedings of the International Conference on Electronics, Information and
Communication, ICEIC, Honolulu, HI, USA, 24–27 January 2018.
145. Zhou, Q.; Zheng, K.; Hou, L.; Xing, J.; Xu, R. Design and Implementation of Open LoRa for IoT. IEEE Access
2019, 7, 100649–100657.
146. Wang, S.Y.; Chen, Y.R.; Chen, T.Y.; Chang, C.H.; Cheng, Y.H.; Hsu, C.C.; Lin, Y.B. Performance of LoRa-
based IoT applications on campus. In Proceedings of the IEEE Vehicular Technology Conference, Toronto,
ON, Canada, 24–27 September 2017.
147. Sarker, V.K.; Queralta, J.P.; Gia, T.N.; Tenhunen, H.; Westerlund, T. A survey on LoRa for IoT: Integrating
edge computing. In Proceedings of the 2019 4th International Conference on Fog and Mobile Edge
Computing, FMEC, Rome, Italy, 10–13 June 2019.
148. Poursafar, N.; Alahi, M.E.E.; Mukhopadhyay, S. Long-range wireless technologies for IoT applications: A
review. In Proceedings of the International Conference on Sensing Technology, ICST, Sydney, Australia,
4–6 December 2017.
149. Vejlgaard, B.; Lauridsen, M.; Nguyen, H.; Kovacs, I.Z.; Mogensen, P.; Sorensen, M. Coverage and Capacity
Analysis of Sigfox, LoRa, GPRS, and NB-IoT. In Proceedings of the IEEE Vehicular Technology Conference,
Sydney, Australia, 4–7 June 2017.
150. Ray, B. NB-IoT vs. LoRa vs. Sigfox. Available online: https://www.link-labs.com/blog/nb-iot-vs-lora-vs-sigfox (accessed on 16 April 2020).
151. Zuniga, J.C.; Ponsard, B. Sigfox System Description. IETF 97. Available online: https://datatracker.ietf.org/meeting/97/materials/slides-97-lpwan-25-sigfox-system-description-00 (accessed on 16 April 2020).
152. Froiz-Míguez, I.; Fernández-Caramés, T.M.; Fraga-Lamas, P.; Castedo, L. Design, Implementation and Practical Evaluation of an IoT Home Automation System for Fog Computing Applications Based on MQTT and ZigBee-WiFi Sensor Nodes. Sensors 2018, 18, 2660.
153. Ergen, S.C. ZigBee/IEEE 802.15.4 Summary; UC Berkeley, September 2004. Available online: http://users.eecs.northwestern.edu/~peters/references/ZigtbeeIEEE802.pdf (accessed on 16 April 2020).
154. Li, Y.; Chi, Z.; Liu, X.; Zhu, T. Passive-ZigBee: Enabling ZigBee communication in IoT networks with 1000X+ less power consumption. In Proceedings of the 16th ACM Conference on Embedded Networked Sensor Systems (SenSys 2018), Shenzhen, China, 4–7 November 2018.
155. Patil, S.M.; Moiz Baig, M. Survey on Creating ZigBee Chain Reaction Using IoT. Int. J. Sci. Res. Comput. Sci. Eng. Inf. Technol. 2018, 3, 545–549.
156. Hersent, O.; Boswarthick, D.; Elloumi, O. Z-Wave. In The Internet of Things: Key Applications and Protocols;
John Wiley & Sons: Hoboken, NJ, USA, 2011; ISBN 9781119994350.
157. Raza, S.; Misra, P.; He, Z.; Voigt, T. Building the Internet of Things with bluetooth smart. Ad Hoc Netw. 2017, 57, 19–31.
158. Cha, S.-C.; Chen, J.-F.; Su, C.; Yeh, K.-H. A Blockchain Connected Gateway for BLE-Based Devices in the
Internet of Things. IEEE Access 2018, 6, 24639–24649.
159. Jeon, K.E.; She, J.; Soonsawad, P.; Ng, P.C. BLE Beacons for Internet of Things Applications: Survey,
Challenges, and Opportunities. IEEE Internet Things J. 2018, 5, 811–828.
160. Collotta, M.; Pau, G.; Talty, T.; Tonguz, O.K. Bluetooth 5: A Concrete Step Forward toward the IoT. IEEE
Commun. Mag. 2018, 56, 125–131.
161. Ray, P.P.; Agarwal, S. Bluetooth 5 and Internet of Things: Potential and architecture. In Proceedings of the
International Conference on Signal Processing, Communication, Power and Embedded System, SCOPES
2016, Paralakhemundi, India, 3–5 October 2016.
162. Pau, G.; Collotta, M.; Maniscalco, V. Bluetooth 5 Energy Management through a Fuzzy-PSO Solution for
Mobile Devices of Internet of Things. Energies 2017, 10, 992.
163. López-Matencio, P.; Vales-Alonso, J.; Costa-Montenegro, E. ANT: Agent Stigmergy-Based IoT-Network for
Enhanced Tourist Mobility. Mob. Inf. Syst. 2017, 2017, 1–15.
164. Shrivastava, V.; Rayanchu, S.; Yoon, J.; Banerjee, S. 802.11n under the microscope. In Proceedings of the
ACM SIGCOMM Internet Measurement Conference, IMC, Vouliagmeni, Greece, 20–22 October 2008.
165. IEEE. IEEE Std 802.11ah-2016 (Amendment to IEEE Std 802.11-2016, as Amended by IEEE Std 802.11ai-2016). IEEE Standard for Information Technology—Telecommunications and Information Exchange between Systems—Local and Metropolitan Area Networks—Specific Requirements—Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications—Amendment 2: Sub 1 GHz License Exempt Operation; IEEE Computer Society: Washington, DC, USA, 14 December 2016; doi:10.1109/IEEESTD.2016.7786995.
166. Park, M. IEEE 802.11ah: sub-1-GHz license-exempt operation for the internet of things. IEEE Commun. Mag.
2015, 53, 145–151.
167. Hossain, M.I.; Lin, L.; Markendahl, J. A Comparative Study of IoT-Communication Systems Cost Structure:
Initial Findings of Radio Access Networks Cost. In Proceedings of the 11th CMI International Conference,
2018: Prospects and Challenges Towards Developing a Digital Economy within the EU, PCTDDE 2018,
Copenhagen, Denmark, 29–30 November 2018.
168. Chen, M.; Miao, Y.; Hao, Y.; Hwang, K. Narrow Band Internet of Things. IEEE Access 2017, 5, 20557–20577.
169. Sara, J.J.; Hossain, M.S.; Khan, W.Z.; Aalsalem, M.Y. Survey on Internet of Things and 4G. In Proceedings
of the 2019 International Conference on Radar, Antenna, Microwave, Electronics, and Telecommunications
(ICRAMET), Tangerang, Indonesia, 23–24 October 2019; pp. 1–6.
170. Gemalto. Introducing 5G Networks—Characteristics and Usages. 2016. Available online: https://www.gemalto.com (accessed on 16 April 2020).
171. Martinez, I.S.H.; Salcedo, I.P.O.J.; Daza, I.B.S.R. IoT application of WSN on 5G infrastructure. In
Proceedings of the 2017 International Symposium on Networks, Computers and Communications, ISNCC,
Marrakech, Morocco, 16–18 May 2017.
172. Mumtaz, S.; Bo, A.; Al-Dulaimi, A.; Tsang, K.F. Guest Editorial 5G and beyond Mobile Technologies and Applications for Industrial IoT (IIoT). IEEE Trans. Ind. Inf. 2018, 14, 2588–2591.
173. Kang, Y.; Hauswald, J.; Gao, C.; Rovinski, A.; Mudge, T.; Mars, J.; Tang, L. Neurosurgeon: Collaborative intelligence between the cloud and mobile edge. In Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, Xi'an, China, 8–12 April 2017; pp. 615–629.
174. Teerapittayanon, S.; McDanel, B.; Kung, H.T. Distributed Deep Neural Networks over the Cloud, the Edge
and End Devices. In Proceedings of the 2017 IEEE 37th International Conference on Distributed Computing
Systems (ICDCS), Atlanta, GA, USA, 5–8 June 2017; pp. 328–339.
175. Wang, J.; Zhu, X.; Zhang, J.; Cao, B.; Bao, W.; Yu, P.S. Not just privacy: Improving performance of private
deep learning in mobile cloud. In Proceedings of the 24th ACM SIGKDD International Conference on
Knowledge Discovery & Data Mining, London, UK, 19–23 August 2018; pp. 2407–2416.
176. Du, M.; Wang, K.; Xia, Z.; Zhang, Y. Differential Privacy Preserving of Training Model in Wireless Big Data
with Edge Computing. IEEE Trans. Big Data 2018, doi:10.1109/TBDATA.2018.2829886.
177. Abadi, M.; McMahan, H.B.; Chu, A.; Mironov, I.; Zhang, L.; Goodfellow, I.; Talwar, K. Deep learning with
differential privacy. In Proceedings of the ACM Conference on Computer and Communications Security,
Vienna, Austria, 25–27 October 2016.
178. Xu, C.; Ren, J.; Zhang, D.; Zhang, Y. Distilling at the Edge: A Local Differential Privacy Obfuscation
Framework for IoT Data Analytics. IEEE Commun. Mag. 2018, 56, 20–25.
179. Miao, Q.; Jing, W.; Song, H. Differential privacy–based location privacy enhancing in edge computing. Concurr. Comput. Pract. Exp. 2018, 31, e4735.
180. Dowlin, N.; Gilad-Bachrach, R.; Laine, K.; Lauter, K.; Naehrig, M.; Wernsing, J. CryptoNets: Applying neural networks to encrypted data with high throughput and accuracy. Microsoft Res. Tech. Rep. 2016, 48, 1–12.
181. Dias, M.; Abad, A.; Trancoso, I. Exploring Hashing and Cryptonet Based Approaches for Privacy-
Preserving Speech Emotion Recognition. In Proceedings of the ICASSP, IEEE International Conference on
Acoustics, Speech and Signal Processing, Calgary, AB, Canada, 15–20 April 2018.
182. Morris, A.; Mellis, C. Privacy-preserving classification on deep neural network. IACR Cryptol. ePrint Arch. 2017, 2017, 35.
183. Mao, J.; Chen, X.; Nixon, K.W.; Krieger, C.; Chen, Y. MoDNN: Local distributed mobile computing system
for Deep Neural Network. In Proceedings of the Design, Automation & Test in Europe Conference &
Exhibition (DATE), 2017, Lausanne, Switzerland, 27–31 March 2017; pp. 1396–1401.
184. Chandakkar, P.S.; Li, Y.; Ding, P.L.K.; Li, B. Strategies for Re-Training a Pruned Neural Network in an Edge
Computing Paradigm. In Proceedings of the 2017 IEEE International Conference on Edge Computing
(EDGE), Honolulu, HI, USA, 25–30 June 2017; pp. 244–247.
185. Tao, Z.; Li, Q. eSGD: Communication efficient distributed deep learning on the edge. In Proceedings of the
USENIX Workshop on Hot Topics in Edge Computing, HotEdge 2018, Co-Located with USENIX ATC 2018,
Boston, MA, USA, 10 July 2018.
186. Deep Learning Hardware: FPGA vs. GPU. Available online: https://semiengineering.com/deep-learning-
hardware-fpga-vs-gpu/ (accessed on 18 February 2020).
187. cuBLAS | NVIDIA Developer. Available online: https://developer.nvidia.com/cublas (accessed on 21
February 2020).
188. NVIDIA cuDNN | NVIDIA Developer. Available online: https://developer.nvidia.com/cudnn (accessed on
21 February 2020).
189. Mathieu, M.; Henaff, M.; LeCun, Y. Fast Training of Convolutional Networks through FFTs. arXiv 2013, arXiv:1312.5851.
190. Cong, J.; Xiao, B. Minimizing Computation in Convolutional Neural Networks. Int. Conf. Artif. Neural Networks 2014, 8681, 281–290.
191. Lavin, A.; Gray, S. Fast Algorithms for Convolutional Neural Networks. In Proceedings of the IEEE
Computer Society Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30
June 2016.
192. Hailo–Top Performing AI Chip for Edge Devices. Available online: https://hailo.ai/ (accessed on 18
November 2019).
193. Cloud TPU | Google Cloud. Available online: https://cloud.google.com/tpu/ (accessed on 18 November
2019).
194. US8655815B2—Neural processing unit—Google Patents. Available online:
https://patents.google.com/patent/US8655815B2/en (accessed on 6 March 2020).
195. Farabet, C.; Poulet, C.; Han, J.Y.; LeCun, Y. CNP: An FPGA-based processor for Convolutional Networks.
In Proceedings of the 2009 International Conference on Field Programmable Logic and Applications,
Prague, Czech Republic, 31 August–2 September 2009; Volume 1, pp. 32–37.
196. Farabet, C.; Martini, B.; Corda, B.; Akselrod, P.; Culurciello, E.; LeCun, Y. NeuFlow: A runtime
reconfigurable dataflow processor for vision. In Proceedings of the CVPR 2011 WORKSHOPS, Colorado
Springs, CO, USA, 20–25 June 2011; pp. 109–116.
197. Chakradhar, S.; Sankaradas, M.; Jakkula, V.; Cadambi, S. A dynamically configurable coprocessor for
convolutional neural networks. In Proceedings of the 37th Annual International Symposium on Computer
Architecture, Saint-Malo, France, 19–23 June 2010; pp. 247–257.
198. Nuño-Maganda, M.; Torres-Huitzil, C. A temporal coding hardware implementation for spiking neural
networks. ACM SIGARCH Comput. Archit. News 2011, 38, 2.
199. Chen, Y.; Luo, T.; Liu, S.; Zhang, S.; He, L.; Wang, J.; Li, L.; Chen, T.; Xu, Z.; Sun, N.; et al. DaDianNao: A
Machine-Learning Supercomputer. In Proceedings of the 2014 47th Annual IEEE/ACM International
Symposium on Microarchitecture, Cambridge, UK, 13–17 December 2014; pp. 609–622.
200. Du, Z.; Fasthuber, R.; Chen, T.; Ienne, P.; Li, L.; Luo, T.; Feng, X.; Chen, Y.; Temam, O. ShiDianNao: Shifting
vision processing closer to the sensor. In Proceedings of the 42nd Annual International Symposium on
Computer Architecture, Portland, OR, USA, 13–17 June 2015; pp. 92–104.
201. Intel® Neural Compute Stick 2 | Intel® Software. Available online: https://software.intel.com/en-
us/neural-compute-stick (accessed on 16 April 2020).
202. Othman, N.A.; Aydin, I. A New Deep Learning Application Based on Movidius NCS for Embedded Object
Detection and Recognition. In Proceedings of the 2018 2nd International Symposium on Multidisciplinary
Studies and Innovative Technologies (ISMSIT), Ankara, Turkey, 19–21 October 2018.
203. Get Started with Intel® Neural Compute Stick 2 | Intel® Software. Available online:
https://software.intel.com/en-us/articles/get-started-with-neural-compute-stick (accessed on 16 April
2020).
204. Coral. Available online: https://www.coral.ai/ (accessed on 16 April 2020).
205. Examples | Coral. Available online: https://coral.ai/examples/ (accessed on 16 April 2020).
206. Hochstetler, J.; Padidela, R.; Chen, Q.; Yang, Q.; Fu, S. Embedded deep learning for vehicular edge
computing. In Proceedings of the 2018 3rd ACM/IEEE Symposium on Edge Computing, SEC 2018,
Bellevue, WA, USA, 25–27 October 2018.
207. Marantos, C.; Karavalakis, N.; Leon, V.; Tsoutsouras, V.; Pekmestzi, K.; Soudris, D. Efficient support vector
machines implementation on Intel/Movidius Myriad 2. In Proceedings of the 2018 7th International
Conference on Modern Circuits and Systems Technologies, MOCAST 2018, Thessaloniki, Greece, 7–9 May
2018.
208. Barry, B.; Brick, C.; Connor, F.; Donohoe, D.; Moloney, D.; Richmond, R.; O’Riordan, M.; Toma, V.; Nicholls,
D. Always-on Vision Processing Unit for Mobile Applications. IEEE Micro 2015, 35, 56–66.
209. Liu, Q.; Huang, S.; Han, T. Demo: Fast and accurate object analysis at the edge for mobile augmented
reality. In Proceedings of the 2017 2nd ACM/IEEE Symposium on Edge Computing, SEC 2017, San Jose,
CA, USA, 12–14 October 2017.
210. Lee, S.; Son, K.; Kim, H.; Park, J. Car plate recognition based on CNN using embedded system with GPU.
In Proceedings of the 2017 10th International Conference on Human System Interactions, HSI 2017, Ulsan,
Korea, 17–19 July 2017.
211. Ezra Tsur, E.; Madar, E.; Danan, N. Code generation of graph-based vision processing for multiple CUDA
Cores SoC Jetson TX. In Proceedings of the 2018 IEEE 12th International Symposium on Embedded
Multicore/Many-Core Systems-on-Chip, MCSoC 2018, Hanoi, Vietnam, 12–14 September 2018.
212. Rungsuptaweekoon, K.; Visoottiviseth, V.; Takano, R. Evaluating the power efficiency of deep learning inference on embedded GPU systems. In Proceedings of the 2017 2nd International Conference on Information Technology, INCIT 2017, Nakhonpathom, Thailand, 2–3 November 2017.
213. Chinchali, S.; Sharma, A.; Harrison, J.; Elhafsi, A.; Kang, D.; Pergament, E.; Cidon, E.; Katti, S.; Pavone, M.
Network Offloading Policies for Cloud Robotics: A Learning-Based Approach. In Proceedings of Robotics:
Science and Systems 2019, Freiburg im Breisgau, Germany, 22–26 June 2019.
214. Jana, A.P.; Biswas, A.; Mohana. YOLO based detection and classification of objects in video records. In Proceedings of the 2018 3rd IEEE International Conference on Recent Trends in Electronics, Information and Communication Technology, RTEICT 2018, Bangalore, India, 18–19 May 2018.
215. Chen, T.; Du, Z.; Sun, N.; Wang, J.; Wu, C.; Chen, Y.; Temam, O. DianNao: A small-footprint high-
throughput accelerator for ubiquitous machine-learning. In Proceedings of the International Conference on
Architectural Support for Programming Languages and Operating Systems—ASPLOS, Salt Lake City, UT,
USA, 1–5 March 2014.
216. MNIST Handwritten Digit Database, Yann LeCun, Corinna Cortes and Chris Burges. Available online:
http://yann.lecun.com/exdb/mnist/ (accessed on 16 April 2020).
217. National Institute of Standards and Technology | NIST. Available online: https://www.nist.gov/ (accessed
on 8 January 2020).
218. Cook, L.T.; Zhu, Y.; Hall, T.J. Bioelasticity imaging: II. Spatial resolution. Proc. SPIE Med. Imaging 2000, 3982, 315–325.
219. Glorot, X.; Bengio, Y. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS 2010), Sardinia, Italy, 13–15 May 2010; pp. 249–256.
220. TensorFlow Lite models | TensorFlow. Available online: https://www.tensorflow.org/lite/models
(accessed on 8 January 2020).
221. Sequential—Keras Documentation. Available online: https://keras.io/models/sequential/ (accessed on 8
January 2020).
222. TensorFlow Lite inference. Available online: https://www.tensorflow.org/lite/guide/inference (accessed on
16 April 2020).
223. FlatBuffers: FlatBuffers. Available online: https://google.github.io/flatbuffers/ (accessed on 8 January 2020).
224. Magnitude-based weight pruning with Keras. Available online:
https://www.tensorflow.org/model_optimization/guide/pruning/pruning_with_keras (accessed on 8
January 2020).
225. Python Imaging Library (PIL). Available online: https://pythonware.com/products/pil/ (accessed on 8
January 2020).
226. TkInter—Python Wiki. Available online: https://wiki.python.org/moin/TkInter (accessed on 8 January
2020).
227. NUCLEO-F746ZG—STM32 Nucleo-144 Development Board with STM32F746ZG MCU, Supports
Arduino, ST Zio and Morpho Connectivity—STMicroelectronics. Available online:
https://www.st.com/en/evaluation-tools/nucleo-f746zg.html (accessed on 15 March 2020).
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).