

July 2018

Volume 33 Number 7

[Machine Learning]

Machine Learning with IoT Devices on the Edge


By James McCaffrey

Imagine that, in the not-too-distant future, you're the designer of a smart traffic intersection. Your smart intersection has four video cameras connected to an Internet of Things (IoT) device with a small CPU, similar to a Raspberry Pi. The cameras send video frames to the IoT device, where they're analyzed using a machine learning (ML) image-recognition model, and control instructions are then sent to the traffic signals. The IoT device is also connected to Azure Cloud Services, where information is logged and analyzed offline.

This is an example of ML on an IoT device on the edge. I use the term edge device to mean anything connected to the cloud, where cloud refers to something
like Microsoft Azure or a company’s remote servers. In this article, I’ll explain two ways you can design ML on the edge. Specifically, I’ll describe how to write a
custom model and IO function for a device, and how to use the Microsoft Embedded Learning Library (ELL) set of tools to deploy an optimized ML model to a
device on the edge. The custom IO approach is currently, as I write this article, the most common way to deploy an ML model to an IoT device. The ELL
approach is forward-looking.

Even if you’re not working with ML on IoT devices, there are at least three reasons why you might want to read this article. First, the design principles involved
generalize to other software development scenarios. Second, it’s quite possible that you’ll be working with ML and IoT devices relatively soon. Third, you may
just find the techniques described here interesting in their own right.

Why does ML need to be on the IoT edge? Why not just do all processing in the cloud? IoT devices on the edge can be very inexpensive, but they often have
limited memory, limited processing capability and a limited power supply. In many scenarios, trying to perform ML processing in the cloud has several
drawbacks.

Latency is often a big problem. In the smart traffic intersection example, a delay of more than a fraction of a second could have disastrous consequences.
Additional problems with trying to perform ML in the cloud include reliability (a dropped network connection is typically impossible to predict and difficult to deal with), network availability (for example, a ship at sea may have connectivity only when a satellite is overhead) and privacy/security (when, for example, you're monitoring a patient in a hospital).


This article doesn’t assume you have any particular background or skill set but does assume you have some general software development experience. The
demo programs described in this article (a Python program that uses the CNTK library to create an ML model, a C program that simulates IoT code and a
Python program that uses an ELL model) are too long to present here, but they’re available in the accompanying file download.

What Is a Machine Learning Model?


In order to understand the issues with deploying an ML model to an IoT device on the edge, you must understand exactly what an ML model is. Very loosely
speaking, an ML model is all the information needed to accept input data, make a prediction and generate output data. Rather than try to explain in the
abstract, I’ll illustrate the ideas using a concrete example.

Take a look at the screenshot in Figure 1 and the diagram in Figure 2. The two figures show a neural network with four input nodes, five hidden layer processing nodes and three output layer nodes. The input values are (6.1, 3.1, 5.1, 1.1) and the output values are (0.0321, 0.6458, 0.3221). Figure 1 shows how the model was developed and trained. I used Visual Studio Code, but there are many alternatives.

Figure 1 Creating and Training a Neural Network Model


Figure 2 The Neural Network Input-Output Mechanism

This particular example involves predicting the species of an iris flower using input values that represent sepal (a leaf-like structure) length and width, and petal length and width. There are three possible species of flower: setosa, versicolor and virginica. The output values can be interpreted as probabilities (note that they sum to 1.0) so, because the second value, 0.6458, is the largest, the model's prediction is the second species, versicolor.
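To make the interpretation concrete, here's a minimal Python sketch (my illustration, not code from the article download) that maps the probability vector to a predicted species:

Python

# Map output probabilities to a species name; a sketch, not article code.
import numpy as np

species = ["setosa", "versicolor", "virginica"]
probs = np.array([0.0321, 0.6458, 0.3221])  # model output values
print(species[np.argmax(probs)])  # versicolor (index 1 has the largest probability)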

In Figure 2, each line connecting a pair of nodes represents a weight. A weight is just a numeric constant. If nodes are zero-based indexed, from top to bottom, the weight from input[0] to hidden[0] is 0.2680 and the weight from hidden[4] to output[0] is 0.9381.

Each hidden and output node has a small arrow pointing into the node. These are called biases. The bias for hidden[0] is 0.1164 and the bias for output[0] is
-0.0466.

You can think of a neural network as a complicated math function because it just accepts numeric input and produces numeric output. An ML model on an IoT
device needs to know how to compute output. For the neural network in Figure 2, the first step is to compute the values of the hidden nodes. The value of each
hidden node is the hyperbolic tangent (tanh) function applied to the sum of the products of inputs and associated weights, plus the bias. For hidden[0] the
calculation is:


hidden[0] = tanh((6.1 * 0.2680) + (3.1 * 0.3954) +
                 (5.1 * -0.5503) + (1.1 * -0.3220) + 0.1164)
          = tanh(-0.1838)
          = -0.1817

Hidden nodes [1] through [4] are calculated similarly. The tanh function is called the hidden layer activation function. There are other activation functions that
can be used, such as logistic sigmoid and rectified linear unit, which would give different hidden node values.
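For intuition, here's a small sketch of mine comparing what three common activation functions produce for hidden[0]'s pre-activation sum of -0.1838:

Python

# Compare activation functions on hidden[0]'s pre-activation sum.
# Illustrative sketch; the demo model uses tanh.
import numpy as np

z = -0.1838
print(np.tanh(z))                # -0.1817, tanh (used by the demo)
print(1.0 / (1.0 + np.exp(-z)))  #  0.4542, logistic sigmoid
print(max(0.0, z))               #  0.0, rectified linear unit (ReLU)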

After the hidden node values have been computed, the next step is to compute preliminary output node values. A preliminary output node value is just the sum
of products of hidden nodes and associated hidden-to-output weights, plus the bias. In other words, the same calculation as used for hidden nodes, but without
the activation function. For the preliminary value of output[0] the calculation is:

o_pre[0] = (-0.1817 * 0.7552) + (-0.0824 * -0.7297) +
           (-0.1190 * -0.6733) + (-0.9287 * 0.9367) +
           (-0.9081 * 0.9381) + (-0.0466)
         = -1.7654

The values for output nodes [1] and [2] are calculated in the same way. After the preliminary output node values have been computed, they can be converted to final output probabilities using the softmax activation function. The softmax function is best explained by example. The calculations for the final output values are:

sum = exp(o_pre[0]) + exp(o_pre[1]) + exp(o_pre[2])
    = 0.1711 + 3.4391 + 1.7153
    = 5.3255
output[0] = exp(o_pre[0]) / sum = 0.1711 / 5.3255 = 0.0321
output[1] = exp(o_pre[1]) / sum = 3.4391 / 5.3255 = 0.6458
output[2] = exp(o_pre[2]) / sum = 1.7153 / 5.3255 = 0.3221

As with the hidden nodes, there are alternative output node activation functions, such as the identity function.
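Pulling the three steps together, here's a minimal NumPy sketch of the full input-output computation. This is my illustration, not the article's download code; the weight and bias values are transcribed from Figure 1:

Python

# Complete 4-5-3 iris forward pass; weight values transcribed from Figure 1.
import numpy as np

ih_wts = np.array([[ 0.2680, -0.3782, -0.3828,  0.1143,  0.1269],
                   [ 0.3954, -0.4367, -0.4332,  0.3880,  0.3814],
                   [-0.5503,  0.6453,  0.6394, -0.6454, -0.6300],
                   [-0.3220,  0.4035,  0.4163, -0.3074, -0.3112]])
h_biases = np.array([ 0.1164, -0.1567, -0.1604,  0.0810,  0.0822])
ho_wts = np.array([[ 0.7552, -0.0001, -0.7706],
                   [-0.7297, -0.2048,  0.9301],
                   [-0.6733, -0.2512,  0.9167],
                   [ 0.9367, -0.4276, -0.5134],
                   [ 0.9381, -0.3728, -0.5667]])
o_biases = np.array([-0.0466,  0.4528, -0.4062])

x = np.array([6.1, 3.1, 5.1, 1.1])             # input values
h = np.tanh(x @ ih_wts + h_biases)             # hidden layer, tanh activation
o_pre = h @ ho_wts + o_biases                  # preliminary output values
probs = np.exp(o_pre) / np.sum(np.exp(o_pre))  # softmax activation
print(probs)  # approximately [0.0321 0.6458 0.3221]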

To summarize, an ML model is all the information needed to accept input data and generate an output prediction. In the case of a neural network, this information consists of the number of input, hidden and output nodes, the values of the weights and biases, and the types of activation functions used on the hidden and output layer nodes.

OK, but where do the values of the weights and the biases come from? They're determined by training the model. Training means using a set of data that has known input values and known, correct output values, and applying an optimization algorithm, such as back-propagation, to minimize the difference between computed output values and the known, correct output values.
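In library terms, the training step itself is typically just a few lines. A hedged Keras sketch, where train_x and train_y are assumed NumPy arrays of input values and one-hot encoded correct species:

Python

# Hedged Keras training sketch; train_x/train_y are assumptions, not article code.
model.compile(loss="categorical_crossentropy", optimizer="sgd")
model.fit(train_x, train_y, epochs=500, batch_size=16, verbose=0)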

There are many other kinds of ML models, such as decision trees and naive Bayes, but the general principles are the same. When using a neural network code
library such as Microsoft CNTK or Google Keras/TensorFlow, the program that trains an ML model will save the model to disk. For example, CNTK and Keras
code resembles:

Python

mp = ".\\Models\\iris_nn.model"
model.save(mp, format=C.ModelFormat.CNTKv2) # CNTK
model.save(".\\Models\\iris_model.h5") # Keras

ML libraries also have functions to load a saved model. For example:

Python

mp = ".\\Models\\iris_nn.model"
model = C.ops.functions.Function.load(mp) # CNTK
model = load_model(".\\Models\\iris_model.h5") # Keras
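Once a model has been loaded, making a prediction is one more call. A hedged Keras sketch (the input array shape is an assumption on my part):

Python

# Predict with a reloaded Keras model; a sketch, not article code.
import numpy as np
probs = model.predict(np.array([[6.1, 3.1, 5.1, 1.1]], dtype=np.float32))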

Most neural network libraries have a way to save just a model’s weights and biases values to file (as opposed to the entire model).
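For example, the Keras weights-only counterparts look something like this (a sketch; the file path is illustrative):

Python

# Save and reload only the weights and biases, not the full model. A sketch.
model.save_weights(".\\Models\\iris_wts.h5")
model.load_weights(".\\Models\\iris_wts.h5")
wts = model.get_weights()  # list of NumPy arrays (weights and biases per layer)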

Deploying a Standard ML Model to an IoT Device


The image in Figure 1 shows an example of what training an ML model looks like. I used Visual Studio Code as the editor and the Python API to the CNTK v2.4 library. Creating a trained ML model can take days or weeks of effort, and typically requires a lot of processing power and memory. Therefore, model training is usually performed on powerful machines, often with one or more GPUs. Additionally, as the size and complexity of a neural network increases, the number of weights and biases increases dramatically, so the file size of a saved model also increases greatly.

For example, the 4-5-3 iris model described in the previous section has only (4 * 5) + 5 + (5 * 3) + 3 = 43 weights and biases. But an image classification model
with millions of input pixel values and hundreds of hidden processing nodes can have hundreds of millions, or even billions, of weights and biases. Notice that
the values of all 43 weights and biases of the iris example are shown in Figure 1:

[[ 0.2680 -0.3782 -0.3828  0.1143  0.1269]
 [ 0.3954 -0.4367 -0.4332  0.3880  0.3814]
 [-0.5503  0.6453  0.6394 -0.6454 -0.6300]
 [-0.3220  0.4035  0.4163 -0.3074 -0.3112]]
[ 0.1164 -0.1567 -0.1604  0.0810  0.0822]
[[ 0.7552 -0.0001 -0.7706]
 [-0.7297 -0.2048  0.9301]
 [-0.6733 -0.2512  0.9167]
 [ 0.9367 -0.4276 -0.5134]
 [ 0.9381 -0.3728 -0.5667]]
[-0.0466  0.4528 -0.4062]
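The parameter-count arithmetic generalizes to any single-hidden-layer network; a quick sketch:

Python

# Number of weights and biases in an ni-nh-no neural network; a sketch.
def param_count(ni, nh, no):
    return (ni * nh) + nh + (nh * no) + no

print(param_count(4, 5, 3))  # 43, the iris model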

So, suppose you have a trained ML model. You want to deploy the model to a small, weak IoT device. The simplest solution is to install onto the IoT device the same neural network library software you used to train the model. Then you can copy the saved trained model file to the IoT device and write code to load the model and make a prediction. Easy!

Unfortunately, this approach will work only in relatively rare situations where your IoT device is quite powerful—perhaps along the lines of a desktop PC or
laptop. Also, neural network libraries such as CNTK and Keras/TensorFlow were designed to train models quickly and efficiently, but in general they were not
necessarily designed for optimal performance when performing input-output with a trained model. In short, the easy solution for deploying a trained ML model
to an IoT device on the edge is rarely feasible.

The Custom Code Solution


Based on my experience and conversations with colleagues, the most common way to deploy a trained ML model to an IoT device on the edge is to write
custom C/C++ code on the device. The idea is that C/C++ is almost universally available on IoT devices, and C/C++ is typically fast and compact. The demo
program in Figure 3 illustrates the concept.

Figure 3 Simulation of Custom C/C++ IO Code on an IoT Device

The demo program starts by using the gcc C/C++ tool to compile file test.c into an executable on the target device. Here, the target device is just my desktop
PC but there are C/C++ compilers for almost every kind of IoT/CPU device. When run, the demo program displays the values of the weights and biases of the
iris flower example, then uses input values of (6.1, 3.1, 5.1, 1.1) and computes and displays the output values (0.0321, 0.6458, 0.3221). If you compare Figure 3
with Figures 1 and 2, you’ll see the inputs, weights and biases, and outputs are the same (subject to rounding error).

Demo program test.c implements only the neural network input-output process. The program starts by setting up a struct data structure to hold the number of
nodes in each layer, values for the hidden and output layer nodes, and values of the weights and biases:

C

#include <stdio.h>
#include <stdlib.h>
#include <math.h> // Has tanh()
typedef struct {
int ni, nh, no;
float *h_nodes, *o_nodes; // No i_nodes
float **ih_wts, **ho_wts;
float *h_biases, *o_biases;
} nn_t;

The program defines the following functions:

construct(): initialize the struct
free(): deallocate memory when done
set_weights(): assign values to weights and biases
softmax(): the softmax function
predict(): implements the NN IO mechanism
show_weights(): a display helper

The key lines of code in the demo program main function look like:

C

nn_t net;  // Neural net struct
construct(&net, 4, 5, 3);  // Instantiate the NN
float wts[43] = {  // Specify the weights and biases
  0.2680, -0.3782, -0.3828, 0.1143, 0.1269,
  . . .
  -0.0466, 0.4528, -0.4062 };
set_weights(&net, wts);  // Copy values into NN
float inpts[4] = { 6.1, 3.1, 5.1, 1.1 };  // Inputs
int shownodes = 0;  // Don't show node values
float* probs = predict(net, inpts, shownodes);

The point is that if you know exactly how a simple neural network ML model works, the IO process isn’t magic. You can implement basic IO quite easily.

The main advantage of using a custom C/C++ IO function is conceptual simplicity. Also, because you’re coding at a very low level (really just one level of
abstraction above assembly language), the generated executable code will typically be very small and run very fast. Additionally, because you have full control
over your IO code, you can use all kinds of tricks to speed up performance or reduce memory footprint. For example, program test.c uses type float but,
depending on the problem scenario, you might be able to use a custom 16-bit fixed-point data type.
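To illustrate the fixed-point idea, here's a sketch of mine that assumes a Q8.8 layout (8 integer bits, 8 fractional bits); the demo test.c itself uses plain float:

Python

# Quantize float weights to 16-bit fixed point (Q8.8) and back; a sketch.
import numpy as np

SCALE = 256  # 2^8, so 8 fractional bits

def to_fixed(x):
    return np.round(np.asarray(x) * SCALE).astype(np.int16)

def to_float(q):
    return q.astype(np.float32) / SCALE

q = to_fixed([0.2680, -0.5503, 0.9381])
print(to_float(q))  # close to the originals, within about 1/512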


The main disadvantage of using a custom C/C++ IO approach is that the technique becomes increasingly difficult as the complexity of the trained ML model
increases. For example, an IO function for a single hidden layer neural network with tanh and softmax activation is very easy to implement—taking only about
one day to one week of development effort, depending on many factors, of course. A deep neural network with several hidden layers is somewhat easy to deal
with—maybe a week or two of effort. But implementing the IO functionality of a convolutional neural network (CNN) or a long short-term memory (LSTM) recurrent neural network is very difficult and would typically require much more than four weeks of development effort.

I suspect that as the use of IoT devices increases, there will be efforts to create open source C/C++ libraries that implement the IO for ML models created by
different neural network libraries such as CNTK and Keras/TensorFlow. Or, if there’s enough demand, the developers of neural network libraries might create
C/C++ IO APIs for IoT devices themselves. If you had such a library, writing custom IO for an IoT device would be relatively simple.

The Microsoft Embedded Learning Library


The Microsoft Embedded Learning Library (ELL) is an ambitious open source project intended to ease the development effort required to deploy an ML model
to an IoT device on the edge (microsoft.github.io/ELL). The basic idea of ELL is illustrated on the left side of Figure 4.

Figure 4 The ELL Workflow Process, High-Level and Granular

In words, the ELL system accepts an ML model created by a supported library, such as CNTK, or in a supported format, such as the Open Neural Network Exchange (ONNX) format. The ELL system uses the input ML model to generate an intermediate model as an .ell file. Then the ELL system uses the intermediate .ell model file to generate executable code of some kind for a supported target device. Put another way, you can think of ELL as a sort of cross-compiler for ML models.

A more granular explanation of how ELL works is shown on the right side of Figure 4, using the iris flower model example. The process starts with an ML
developer writing a Python program named iris_nn.py to create and save a prediction model named iris_cntk.model, which is in a proprietary binary format. This
process is shown in Figure 1.

The ELL command-line tool cntk_import.py is then used to create an intermediate iris_cntk.ell file, which is stored in JSON format. Next, the ELL command-line tool wrap.py is used to generate a directory host\build of C/C++ source code files. Note that "host" means to use the settings from the current machine, so a more common scenario would be something like \pi3\build. Then the cmake C/C++ build tool is used to generate a Python module of executable code, named iris_cntk, containing the logic of the original ML model. The target could be a C/C++ executable or a C# executable or whatever is best suited for the target IoT device.

The iris_cntk Python module can then be imported by a Python program (use_iris_ell_model.py) on the target device (my desktop PC), as shown in Figure 5. Notice that the input values (6.1, 3.1, 5.1, 1.1) and output values (0.0321, 0.6457, 0.3221) generated by the ELL system model are the same, subject to rounding, as the values generated during model development (Figure 1) and by the custom C/C++ IO function (Figure 3).

Figure 5 Simulation of Using an ELL Model on an IoT Device


The leading "(py36)" before the command prompts in Figure 5 indicates I'm working in a special Python setting called a Conda environment where I'm using Python version 3.6, which was required at the time I coded my ELL demo.

The code for program use_iris_ell_model.py is shown in Figure 6. The point is that ELL has generated a Python module/package that can be used just like any
other package/module.

Figure 6 Using an ELL Model in a Python Program

Python

# use_iris_ell_model.py
# Python 3.6
import numpy as np
import tutorial_helpers # used to find package
import iris_cntk as m # the ELL module/package
print("\nBegin use ELL model demo \n")
unknown = np.array([[6.1, 3.1, 5.1, 1.1]],
dtype=np.float32)
np.set_printoptions(precision=4, suppress=True)
print("Input to ELL model: ")
print(unknown)
predicted = m.predict(unknown)
print("\nPrediction probabilities: ")
print(predicted)
print("\nEnd ELL demo \n"

The ELL system is still in the very early stages of development, but based on my experience, the system is ready for you to experiment with and is stable enough
for limited production development scenarios.

I expect your reaction to the diagram of the ELL process in Figure 4 and its explanation is something like, “Wow, that’s a lot of steps!” At least, that was my
reaction. Eventually, I expect the ELL system to mature to a point where you can generate a model for deployment to an IoT device along the lines of:

Python

source_model = ".\\iris_cntk.model"
target_model = ".\\iris_cortex_m4.model"
ell_generate(source_model, target_model)

But for now, if you want to explore ELL you'll have to work through several steps. Luckily, the ELL tutorial on the ELL Web site, on which much of this article is based, is very good. I should point out that to get started with ELL you must install ELL on your desktop machine, and installation consists of building C/C++ source code—there's no .msi installer for ELL (yet).


A cool feature of ELL that isn’t obvious is that it performs some very sophisticated optimization behind the scenes. For example, the ELL team has explored ways
to compress large ML models, including sparsification and pruning techniques, and replacing floating point math with 1-bit math. The ELL team is also looking
at algorithms that can be used in place of neural networks, including improved decision trees and k-DNF classifiers.
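To give a taste of what pruning means (my naive illustration, not ELL's actual algorithm), weights with near-zero magnitude, such as the -0.0001 in the iris model's hidden-to-output weights, can be zeroed and then stored sparsely:

Python

# Naive magnitude-based pruning illustration; not ELL's actual technique.
import numpy as np

w = np.array([0.7552, -0.0001, -0.7706, -0.2048, 0.0003])
pruned = np.where(np.abs(w) < 0.01, 0.0, w)
print(pruned)  # tiny weights zeroed; sparse storage can then skip them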

The tutorials on the ELL Web site are quite good, but because there are many steps involved, they are a bit long. Let me briefly sketch out the process so you
can get a feel for what installing and using ELL is like. Note that my commands are not syntactically correct; they’re highly simplified to keep the main ideas
clear.

Installing the ELL system resembles:

Console

> (install several tools such as cmake and BLAS)
> git clone https://github.com/Microsoft/ELL.git
> cd ELL
> nuget.exe restore external/packages.config -PackagesDirectory external
> md build
> cd build
> cmake -G "Visual Studio 15 2017 Win64" ..
> cmake --build . --config Release
> cmake --build . --target _ELL_python --config Release

In words, you must have quite a few tools installed before starting, then you pull the ELL source code down from GitHub and then build the ELL executable tools
and Python binding using cmake.

Creating an ELL model resembles:

Console

> python cntk_import.py iris_cntk.model
> python wrap.py iris_nn_cntk.ell --language python --target host
> cd host
> md build
> cd build
> cmake -G "Visual Studio 15 2017 Win64" .. && cmake --build . --config release

That is, you use ELL tool cntk_import.py to create a .ell file from a CNTK model file. You use wrap.py to generate a lot of C/C++ specific to a particular target IoT
device. And you use cmake to generate executables that encapsulate the original trained ML model’s behavior.

Wrapping Up

To summarize, a machine learning model is all the information needed for a software system to accept input and generate a prediction. Because IoT devices on
the edge often require very fast and reliable performance, it’s sometimes necessary to compute ML predictions directly on a device. However, IoT devices are
often small and weak, so you can’t simply copy a model that was developed on a powerful desktop machine to the device. A standard approach is to write
custom C/C++ code, but this approach doesn’t scale to complex ML models. An emerging approach is the use of ML cross-compilers, such as the Microsoft
Embedded Learning Library.

When fully mature and released, the ELL system will quite likely make developing complex ML models for IoT devices on the edge dramatically easier than it is
today.

Dr. James McCaffrey works for Microsoft Research in Redmond, Wash. He has worked on several Microsoft products, including Internet Explorer and Bing. Dr.
McCaffrey can be reached at [email protected].

Thanks to the following Microsoft technical experts who reviewed this article: Byron Changuion, Chuck Jacobs, Chris Lee and Ricky Loynd

Discuss this article in the MSDN Magazine forum
