
Applicability of TinyML for maintenance predictability

Niklas Exell
Master's thesis in Computer Engineering
Supervisor: Jerker Björkqvist
Åbo Akademi University
Faculty of Science and Engineering
Information Technologies
October, 2023
Abstract

Currently, a diverse set of systems is deployed in which large amounts of data are, or could be, gathered for some kind of analysis, either in real time or retroactively.

One type of analysis that could bring large savings is maintenance prediction based on data gathered from a running machine. In the worst case, it is obvious how detecting a fault in a machine early, in order to prevent a catastrophic failure, can save a lot of money. But there is also value in predicting smaller failures, which might "only" cause downtime: by predicting what kind of maintenance needs to be done, one also gains foresight into what kinds of parts need to be acquired so that the required maintenance can be carried out as smoothly as possible.

Different strategies can be used for this processing, both regarding what kind of processing is done and where it is done.

The processing itself is mostly some form of machine learning: either retroactive analysis, most commonly with some form of classification or clustering algorithm, or a model-based approach in which training data is used to train a model that is then used for inference in real time.

As to where the processing is done, it can either be done centrally, on large servers, or more decentrally on the edge. If the concept of edge computing is taken to the extreme, the computing is done on the microcontrollers that the sensors are connected to. TinyML is the concept of doing machine learning on the edge, on microcontrollers.

This thesis will cover some types of analysis that can be done with machine learning, where this processing should be done, and the possibility of using TinyML to do the bulk of the analysis directly on the microcontrollers that the sensors are connected to. Some examples of potential applications of TinyML, both based on accelerometer data, are also covered.

Keywords:
TinyML, Machine Learning, Embedded Systems, Edge, Decentralised

Contents

1 Preface
2 Introduction
3 Machine learning
  3.1 Supervised Learning
  3.2 Unsupervised Learning
  3.3 Reinforcement Learning
  3.4 Neural Networks
    3.4.1 Training
4 Anomaly Detection
  4.1 Categories of anomaly detection
  4.2 Use cases of anomaly detection
  4.3 Anomaly types
  4.4 Examples of anomaly detection methods
5 Edge Computing
  5.1 Why edge computing?
  5.2 Why not edge computing?
  5.3 Examples of edge computing
  5.4 Microcontrollers on the edge
    5.4.1 Are microcontrollers necessary?
  5.5 Hybrid
  5.6 Environmental impact
6 TinyML
  6.1 What is TinyML?
  6.2 Motivation for TinyML
    6.2.1 Power consumption
    6.2.2 Cost
  6.3 Difference to conventional Machine Learning
  6.4 TensorFlow vs TensorFlow Lite vs TensorFlow Lite Micro
    6.4.1 TensorFlow
    6.4.2 TensorFlow Lite
    6.4.3 TensorFlow Lite Micro
  6.5 Quantization
  6.6 FlatBuffers
  6.7 Adapting models for microcontrollers
  6.8 Computational and hardware need
    6.8.1 Training and deployment
    6.8.2 Deployed
7 TinyML for pattern recognition and maintenance prediction
  7.1 Example of a simple TinyML application: Magic wand
    7.1.1 Performance
    7.1.2 Epilogue
  7.2 Wake up phrase with TinyML
  7.3 Maintenance prediction in marine diesel engines using TinyML
    7.3.1 Collected data
    7.3.2 Analyses
8 Conclusion
9 Summary in Swedish
  9.1 Inledning
  9.2 Maskininlärning
  9.3 Identifiering av avvikelser
  9.4 Kantberäkning
  9.5 Maskininlärning på mikrokontroller - TinyML
  9.6 Analys
  9.7 Sammanfattning
1 Preface

With ever-growing data generation on the edge, processing the data on the edge becomes ever more valuable. Simultaneously, machine learning is growing in popularity. This thesis focuses on the combination of these two in the form of TinyML.

This thesis was written with the guidance of Prof. Jerker Björkqvist, who provided excellent feedback.
2 Introduction

In today's world, with an accelerating number of devices, data is often gathered and later analysed to be used as feedback in some form. Most of this computation is done in the cloud, which means that large amounts of data need to be transported to, stored at, and analysed in a central location. A large amount of computation is therefore needed in one place, large links are needed to send the data to the central location, and a large amount of storage is needed there for all the raw data from all the deployed devices equipped with sensors.

This is why edge computing has emerged, which, as the name implies, does the computing at the edge, as close to where the data is generated as possible. Doing the processing close to where the data is generated also lowers the latency of feedback, and performance can therefore be better. By doing the computation on the edge nodes, cost savings are also achieved by not needing a large amount of central computation or a large link to the central location.

These factors apply especially onboard a ship, where a link to the cloud onshore cannot be taken for granted while out at sea, and use of a high-bandwidth uplink, through Ethernet or WiFi while at port or cellular while close to shore, is limited by lack of proximity to the port/shore.

This thesis will research whether it is possible to predict maintenance needs using machine learning by the use of TinyML, computed on the edge. TinyML enables using ML models on the edge, close to or on the data-gathering embedded devices.

The scope is specifically medium-sized marine diesel engines. Similarly to how a marine engineer onboard a ship can listen to the engines in the engine room, recognize that some sound seems off, and sometimes even pinpoint where the change is coming from, a model should also be able to be trained to detect these changes. The hypothesis of this thesis is that it might be possible to train a model to recognize maintenance needs based on the vibrations gathered by accelerometers.

In this thesis, I will analyse data gathered on a Roll-on/Roll-off ferry that operated in the Baltic Sea between Finland and Sweden. This ship is equipped with four Wärtsilä 12V32 4SA four-stroke diesel engines. Each of the four engines had sensor units with accelerometers attached directly to the engine block and to the engine frameset. The engine room also had two sensor units that were used as a reference as well as to monitor the compute hardware enclosure temperature. All the described sensors were retrofitted.
3 Machine learning

Machine Learning (ML) is a subset of Artificial Intelligence (AI). In Machine Learning, a machine, meaning a computer, "learns" about a phenomenon of interest by using:

• Training data, which consists of as much clear data as possible in which the phenomenon occurs.

• A diverse set of algorithms for analysis, training and inference.

Based on this training data and a Machine Learning algorithm, a model can be created that is able to predict an output based on a given input.

Machine learning can be divided into Supervised, Unsupervised and Reinforcement Learning:

3.1 Supervised Learning

In Supervised Learning (SL) the data is tagged, which means that the algorithm can know from the tags what the desired output is for a set of inputs. When an SL algorithm is fed training data, which consists of input data and the desired output data (the tag), it produces an inferred function which can be used to map previously unseen input data. For this to work well, the algorithm needs to be able to generalize from the data and not simply memorize it exactly, which is called overfitting.

There are multiple algorithms used in SL; some common examples are listed here, followed by a small illustrative sketch:

• Support-vector machines: A supervised learning model that uses classification algorithms to sort data into two groups[1]. Compared to neural networks the computation needed is lower, but the classification also has to be simpler.

• Regression models: Regression is used to describe the relationship between two variables by fitting a line to the observed data. With linear regression, a straight line is fitted to the data, while with both logistic and nonlinear regression a curved line is fitted[2].

• Neural networks are often created with Supervised Learning, meaning labelled training data is used in training (described more below).
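As a minimal illustration of supervised learning (not from the thesis; the data and labels below are made up for the example), the following scikit-learn sketch trains a support-vector machine on tagged examples and uses the inferred function on unseen input:

```python
from sklearn import svm

# Tagged training data: inputs and their desired outputs (the tags).
X_train = [[0.0, 0.1], [0.2, 0.0], [0.9, 1.0], [1.0, 0.8]]
y_train = ["normal", "normal", "anomaly", "anomaly"]

# Fit a support-vector classifier to the labelled data.
clf = svm.SVC(kernel="linear")
clf.fit(X_train, y_train)

# The inferred function can now map previously unseen input data.
print(clf.predict([[0.95, 0.9]]))  # expected: ['anomaly']
```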

3.2 Unsupervised Learning

In Unsupervised Learning (UL) the data is not tagged.

UL can be broadly classified into probabilistic methods and neural networks:

• The two main types of probabilistic UL are principal component analysis and cluster analysis:

  – Principal component analysis is used to find the vectors that best describe the axes along which the data varies (a small sketch follows this list).

  – In cluster analysis the algorithm, as the name implies, segments the data into datasets with common attributes. Hence cluster analysis is used to group untagged data into clusters with commonalities and can be used to detect anomalies.

• Neural networks (described below).
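As a hedged sketch of principal component analysis (the data here is synthetic and purely illustrative), scikit-learn can extract the dominant axes of untagged data:

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic, untagged data that mostly varies along one diagonal direction.
rng = np.random.default_rng(42)
t = rng.normal(size=200)
data = np.column_stack([t, 0.5 * t + 0.1 * rng.normal(size=200)])

# Find the vectors (principal components) that best describe the axes
# along which the data varies.
pca = PCA(n_components=2).fit(data)
print(pca.components_)                # the principal axes
print(pca.explained_variance_ratio_)  # how much variance each axis explains
```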

3.3 Reinforcement Learning

In Reinforcement Learning (RL) the algorithm is given a numerical performance score as guidance toward the desired output. This way the algorithm can achieve an optimal or near-optimal result, in a similar way to what happens in the biological brain with positive reinforcement, such as pleasure or ingestion of food, and negative reinforcement, such as pain or hunger. When the performance of the algorithm is analysed, a notion of so-called regret can be given, which is the difference between the desired outcome and the achieved outcome.

3.4 Neural Networks

Neural Networks (NN) can be both supervised and unsupervised. Neural Networks, sometimes called Artificial Neural Networks (ANN), are computing systems that are loosely inspired by the naturally occurring biological neural networks that exist in the brains of animals.

NNs consist of artificial neurons and edges. An artificial neuron loosely resembles its biological counterpart, the neuron, by processing the received input signal, applying some form of non-linear function to the sum of the inputs (e.g. the sigmoid function), and outputting the new signal to other artificial neurons that may be connected through edges. The signals consist of real numbers. The network is trained by adjusting weights that are associated with the neurons and edges. These weights are multipliers that affect the signal downstream of that neuron or edge. This means that neural networks are weighted graphs.

Neural networks are often arranged in layers, especially in deep learning. If the neurons are connected to all neurons in the layers above and below, the network is called fully connected; however, multiple patterns of connection exist.

The first layer, which receives external input, is called the input layer, and the last layer, which gives output, is called the output layer. Between the input and output layers there can be zero to multiple so-called hidden layers. Single-layer and unlayered networks also exist.

A neural network can be configured to either only feed information forward in the network (feedforward neural network) or have a form of memory of earlier input data (recurrent neural network):

• Feedforward Neural Networks (FNN) are the simplest neural networks since their output is only dependent on a single set of input data. FNNs are set up such that each layer only has connections from the layer directly before it and to the layer directly after it, so that no loops are formed. In other words, the network is a feedforward neural network only if it is a directed acyclic weighted graph. As a consequence, data only flows one way in an FNN.

• Recurrent Neural Networks (RNN), on the other hand, are set up in such a way that connections can form loops. This means that the output of an RNN is dependent on multiple successive sets of input data. An RNN can therefore have internal memory and can either process variable-length input data or more easily process consecutive data sequences where the data is only valid in the correct order, e.g. speech recognition.

  The most common RNN type is the Long Short-Term Memory (LSTM) network, which is an RNN with the addition of long-term memory in the form of an internal state where context relating to the current data sequence can be stored. This long-term memory is used to help with the vanishing gradient problem, though LSTMs can also suffer from it. The vanishing gradient problem occurs because, over many time steps, the gradients carrying the influence of previous input data tend toward zero (or, in the related exploding gradient problem, toward infinity), so the "memory" of earlier inputs is lost. Examples of uses of LSTM networks are speech recognition and machine translation.
3.4.1 Training

When an Artificial Neural Network (ANN) is trained, the term backpropagation is often used. Backpropagation is an algorithm used for the backwards propagation of errors using gradient descent, and it is used to adjust the weight values of the network. This is done backwards, starting from the outputs, since the desired outputs are known and errors can be calculated by comparing the current value and the desired value. Gradient descent is then used to adjust the weights such that the errors are minimized. This is done for each weight, and the new weights are applied at the end of a learning iteration. When training an ANN, multiple learning iterations are used to increase performance.
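As a compact statement of the update rule (notation mine, not from the source): each weight $w_{ij}$ is adjusted against the gradient of the error $E$, with the step size given by the learning rate $\eta$,

$$w_{ij} \leftarrow w_{ij} - \eta \, \frac{\partial E}{\partial w_{ij}}.$$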
Before training is started, the desired type of network needs to be chosen. Depending on the data, the number of inputs and outputs also needs to be chosen before training starts.

When a neural network is trained, some static hyperparameters need to be set before the training is started. Some relate to the network structure, and some to the training algorithm.
Some examples of hyperparameters are[3] (a small training sketch follows this list):

• Network structure:

  – Number of hidden layers and neurons:
    As described above, a NN can have differing numbers of layers and neurons in those layers, which need to be chosen before training starts. Having a network that is too small will cause underfitting.

  – Dropout:
    Dropout is a technique used to avoid overfitting. With dropout, randomly selected neurons are ignored during each training update, which prevents the network from becoming overly reliant on individual neurons and helps it generalize better.

  – Network weight initialization:
    A uniform distribution of starting weights is most commonly used, but sometimes some other scheme might be helpful.

  – Activation function:
    Activation functions are used to introduce non-linearity into the model. Most commonly the rectifier (ReLU) activation function is used; other examples are sigmoid and softmax.

• Training algorithm:

  – Learning rate:
    When the network is trained, a learning rate is set which determines the size of the step taken to adjust the model. A large learning rate shortens the training time, at the expense of the precision of the final model. An adaptive learning rate can be applied in order to decrease training time, increase precision and avoid oscillations of the weights.

Figure 1: Learning rate

  – Momentum:
    Momentum helps to avoid oscillations by taking the direction of the previous step into account when choosing the next step.

  – Number of epochs:
    The number of times the full training data is shown during the training.

  – Batch size:
    The number of samples given to the network before a weight update happens.
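To make these hyperparameters concrete, here is a minimal Keras sketch (layer sizes, rates and counts are illustrative assumptions, not values from the thesis) showing where each one is set:

```python
import tensorflow as tf

# Network structure: hidden layers/neurons, dropout, activation functions.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(16,)),
    tf.keras.layers.Dropout(0.2),                    # dropout rate
    tf.keras.layers.Dense(3, activation="softmax"),  # output layer
])

# Training algorithm: learning rate and momentum live in the optimizer.
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Number of epochs and batch size are passed to fit():
# model.fit(x_train, y_train, epochs=20, batch_size=32)
```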

4 Anomaly Detection

When doing maintenance prediction, anomalies can be good predictors of parts starting to fail, of sub-optimal running, and of maintenance being needed.

As IBM[4] states: "Anomaly detection is a process in machine learning that identifies data points, events, and observations that deviate from a data set's normal behaviour. And, detecting anomalies from time series data is a pain point that is critical to address for industrial applications." Anomaly detection is therefore of large interest, since it helps solve many problems.

4.1 Categories of anomaly detection

As with Machine Learning in general, anomaly detection can also be done in three ways:

• Supervised anomaly detection: As with supervised machine learning, supervised anomaly detection requires labelling of all the data, but here the data is labelled "normal" or "anomaly". Supervised anomaly detection is barely used, due to the need for labelling, which rarely exists, and the challenge of labelling anomalies, since by their nature anomalies can be chaotic.

• Semi-supervised anomaly detection: With semi-supervised anomaly detection, again as with ML, only some of the data is labelled. This makes the labelling easier, since the data points that are known to be normal or anomalous can be labelled as such and the rest left blank. Sometimes a model can be built of what is known to be normal data, so that going forward a prediction can be made as to whether a data point is normal or anomalous.

  Semi-supervised anomaly detection is more common than supervised anomaly detection since it requires less labelling. But labelled anomalies are rare, so there might not be enough data to generalize the anomalies, and there might be anomalies that stem from different phenomena than the captured and labelled anomalies.

• Unsupervised anomaly detection: As with unsupervised machine learning, in unsupervised anomaly detection the classification is not based on labelled data used to build a model, but simply on algorithms that detect data that stands out from the norm.

  Unsupervised methods are the most common since labelling the data is so expensive.
4.2 Use cases of anomaly detection

Anomaly detection can be applied to many things. Some examples, as given in[4], are:

• Outlier detection: Outlier detection is used to detect any outliers, or data that varies largely in range from the normal operating range or state of the system within the training data. All of the data is analyzed to find outliers.

• Novelty detection: Novelty detection is done by training with data that is known to have been gathered during normal operation, without anomalies. The goal then is to analyze testing data to see whether there is any novel behaviour and, if so, to label the data as an anomaly or novelty. To do this, some data that is known to be normal is needed; this is a form of semi-supervised anomaly detection.

• Event extraction: Event extraction can be done if the data operates in different states; for example, a sensor could be on or off. The goal is to detect when sensors behave differently during an event.

• Data cleaning: Data cleaning can be done to remove outliers or sudden infrequent changes in the distribution of data. This is often done before the data is needed for something that benefits from cleaner data.

4.3 Anomaly types

When trying to find anomalies, there are different types of anomalies that can occur in data, as given in[5]:

• Point anomaly: If an anomaly is represented by a single data point that deviates from the rest of the data.

Figure 2: Point anomaly, source:[5]

• Collective anomaly: If an anomaly is represented by multiple data points together.

Figure 3: Collective anomaly, source:[5]

• Contextual anomaly: If an anomaly is represented by a data point in context to other data points. By itself the data point might be normal, but in context it might be anomalous.

Figure 4: Contextual anomaly, source:[5]

4.4 Examples of anomaly detection methods

• K-Means clustering:
  With K-Means clustering, the data, consisting of n observations, is divided into k clusters. The observations are assigned to clusters based on the smallest squared Euclidean distance to the centroid of the cluster; at the start, however, the cluster centroids are just random seed values.

  Although K-Means clustering seems simple and effective, it does have drawbacks:

  – The number of clusters (k) needs to be chosen, though the elbow method is of great assistance for this. In the elbow method, the within-cluster sum of squares is plotted against the number of clusters, and then, by looking at the "elbow" in the graph, a suitable value of k can be read off. This of course requires running the algorithm iteratively while incrementing k.

  – It only works if clusters belong together due to proximity.

  – Finding the optimal clustering is computationally hard (NP-hard in general). However, heuristics can be used that converge quickly to an approximate result.

• Local Outlier Factor[5]:
  Local Outlier Factor (LOF) is a density-based method for finding anomalies. For each observation, the nearest neighbours are calculated. Then, within the computed neighbourhood, the local density is computed as the Local Reachability Density (LRD). Finally, the LOF score is calculated by comparing the LRD of an observation with the LRDs of its nearest neighbours (a small sketch of both methods follows).
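As a hedged illustration of both methods (the data is synthetic; the cluster count, neighbour count and outlier threshold are assumptions for the example, not values from the thesis):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import LocalOutlierFactor

# Synthetic observations: mostly normal points plus a few injected outliers.
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0, 1, size=(500, 2)),
                  rng.normal(6, 0.5, size=(5, 2))])

# K-Means: observations far from their assigned centroid are suspicious.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(data)
dist = np.linalg.norm(data - km.cluster_centers_[km.labels_], axis=1)
km_flags = dist > np.quantile(dist, 0.99)

# Local Outlier Factor: density-based comparison against nearest neighbours.
lof = LocalOutlierFactor(n_neighbors=20)
lof_flags = lof.fit_predict(data) == -1   # -1 marks outliers

print(km_flags.sum(), lof_flags.sum())
```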

Figure 5: Anomaly detection algorithms to choose from, source:[5]

5 Edge Computing

Edge computing at its simplest is putting the computation as close to the edge, where the data is created and used, as possible, or as IBM states:

"Edge computing is a distributed computing framework that brings enterprise applications closer to data sources such as IoT devices or local edge servers. This proximity to data at its source can deliver strong business benefits, including faster insights, improved response times and better bandwidth availability."[6]

5.1 Why edge computing?

When utilizing edge computing, compared to conventional centralized computing (the cloud), a majority of the processing is done close to the source of the data and to the actuators that might need instructions based on the data. This means that the data does not need to be sent to a centralized server, which would add delay and need comparably larger network throughput. Historically this has not been much of a problem, since the number of remote nodes was not that high. However, deployments are growing all the time as cheaper sensor devices become available, which makes more large-scale projects financially feasible, and these can create enormous amounts of data. Intel estimates: "It's estimated that by 2025, 75 percent of data will be created outside of central data centres, where most processing takes place today"[7]. Edge computing is especially beneficial when nodes are deployed in remote places with limited connectivity, such as ships out at sea.

The origins of edge computing can be seen in content delivery networks (CDNs), which were created in the late 1990s, predating edge computing. CDNs are implemented by deploying some of the servers closer to the users in order to reduce latency and the amount of data sent over large distances.

5.2 Why not edge computing?

When a deployed system does not generate significant amounts of data and the uplinks from the nodes are sufficient, a centralized computing system might be preferable for its simplicity, given the large amount of redundancy in many edge computing systems.

5.3 Examples of edge computing

• Autonomous vehicles:
  Autonomous vehicles are probably one of the most talked-about applications. In self-driving cars, decisions need to be made extremely quickly and no delay is tolerated. Hence, the processing needs to be done on the edge, in the car itself, because if mistakes happen people can be seriously injured, if not killed, due to large masses moving at high speeds close to each other.

  Examples of self-driving systems on the OEM side are Tesla's Autopilot and, on the aftermarket side, Comma.ai's Openpilot. While the implementations of these systems vary, with Tesla using more sensors beyond cameras and Comma focusing on just cameras since humans only need sight to drive, they both rely on edge computing to make decisions in time.

• Computer vision:
  For the application of computer vision, edge computing is quite clearly the way to go, due to the large amount of data gathered by the camera sensor(s), especially with moving objects at high resolution. By processing the images at the point of capture, only data that has been interpreted needs to be sent forward, and if actuation is needed it can be done immediately once the processing is done. Examples where computer vision is used are input for autonomous vehicles and barcode readers.

• Industrial automation and control:
  Similarly to how autonomous vehicles need quick response times, industrial automation and control need good response times for decent precision, while vast processing capacity is not needed. An example of this is Programmable Logic Controllers (PLCs), which are used for a large portion of industrial automation due to their ability to be reconfigured more easily compared to the conventional circuits they have largely replaced.

5.4 Microcontrollers on the edge

Since the advantage of edge computing is not needing to send much data forward, if any, because the processing is done at the edge, doing the processing on the devices that the sensors are connected to would be beneficial. This means that most, if not all, of the processing is run on the microcontrollers, and barely any data needs to be sent compared to the amount of raw data gathered from the sensors. The decreased ratio is of course highly dependent on the type of data that is gathered and the processing that is done on the data. Often a microcontroller is needed anyway to connect the sensors, so doing processing on it might be beneficial. The microcontroller of course needs to have enough address space and computational power to sufficiently do the wanted computations, as well as both gather the sensor data and sustain communications to other nodes in the system.

5.4.1 Are microcontrollers necessary?

When looking at edge computing, a microcontroller can look very tempting since it establishes the computation directly next to the sensors. But there are of course pros and cons to using microcontrollers compared to doing the processing further downstream on a system with more performance and a full operating system:

Pros:

• Price: Microcontrollers are significantly cheaper than fully-fledged systems with an OS and all, especially if there is one node for each sensor location. It might also be possible to utilize a microcontroller that is already needed in order to connect a sensor to the system.

• Scale up well: With microcontrollers, once there is an implementation in place it is quite easy to scale up the system if most of the computation is done on the deployed microcontroller devices. If the computation is done more centrally, the processing infrastructure also needs to grow at a sufficient pace such that the computational capacity is sufficient at all times for the data throughput.

• Power consumption: Power consumption is a hard metric to compare and could also be a con. However, when comparing processing on microcontrollers on the edge versus on large servers more centrally, doing it more centrally has more overhead, which most likely means a larger net power consumption. Some overheads are:

  – More data needs to be sent.

  – The servers need to run operating systems, which have some overhead compared to microcontrollers running the code more directly (though microcontrollers can also sometimes have an OS).

  However, in some systems where the edge nodes have a strict power budget, for example due to running on battery and/or photovoltaic cells, it might be better to not do any processing on these nodes and instead send the data to nodes that do not have a strict power budget. Though sending large amounts of data can also need a considerable amount of power, so some pre-processing might be necessary to achieve the lowest power consumption possible.

• Size: A single microcontroller is much smaller than a fully-fledged system, and in many cases the microcontroller and the sensors can be implemented on the same printed circuit board. In some cases this might also make the system mechanically more reliable, since the need for connectors could be reduced.

Cons:

• Development time: It is harder and takes longer to implement everything on microcontrollers versus doing some parts on generic hardware.

• Address space: Larger data sets can be loaded on servers, and therefore a larger context could be used in the computation.

• Performance: More complex computations can be done on servers since they have a larger amount of computational power.

• Centralized implementation: Sometimes it might be better to use an implementation that, for example, relies on a single high-performance system using computer vision to monitor something, instead of a fleet of sensors attached to microcontrollers scattered around.

5.5 Hybrid

When doing computation on gathered data, the approach used does of course not have to be 100% edge computing or 100% centralised. Usually a better approach might be to do some simple pre-processing on the edge that can drastically condense the data that needs to be sent, while only needing a small amount of computational performance on the edge nodes. Then, on a central server, the condensed data from multiple nodes can be further processed with the advantage of a wider context, by using data from multiple nodes. How much processing is done where varies considerably with the implementation at hand.

5.6 Environmental impact

The environmental impact of implementing a system, whether using microcontrollers or a larger computational device, is of course highly dependent on what problem is being solved and the way the problem at hand is approached. If the problem is simply connecting a number of spread-out sensors and doing some processing of the data from those sensors, using microcontrollers for each node of course has a smaller environmental impact, as well as cost, compared to a larger system.

The comparison becomes more complicated if different approaches are used: for example, doing object tracking by using microcontrollers attached to each tracked object, compared to using a single larger system that has the performance to do the task with computer vision. So, keeping this in mind, either can be the better choice depending on the application and implementation.
6 TinyML

In this chapter, I will paraphrase significantly from the book "TinyML: Machine Learning with TensorFlow Lite on Arduino and Ultra-Low-Power Microcontrollers" by Pete Warden & Daniel Situnayake[8].

6.1 What is TinyML?

TinyML is defined as running a neural network at an energy cost of less than 1 mW[8]. Warden clarifies this seemingly arbitrary number as what is needed to achieve a battery life of one year on a coin battery.

6.2 Motivation for TinyML

The two main motivations for using TinyML are power consumption and cost.

6.2.1 Power consumption

As a comparison, the extremely popular and excellent Raspberry Pi boards use on the order of hundreds of milliwatts, and x86 processors in laptops and desktops use between a few watts and a few hundred watts. Both are far too much for long battery life in the field. Microcontrollers, in contrast, use on the order of milliwatts, though this of course depends largely on the duty cycle and clock speed of the implementation. A large factor, however, is what kind of peripherals are connected, as these use power too. Communication radios especially can use a significant amount of energy if used frequently, especially at longer ranges due to the inverse square law.

To compare this to power sources:

• A CR2032 coin battery might hold 2500 J (about one month at 1 mW).

• An AA battery might hold 15000 J (about six months at 1 mW).

• Harvesting temperature differences from industrial machines might generate 1 to 10 mW per square centimetre.

• Indoor photovoltaic cells might generate 1 µW per square centimetre.

• Outdoor photovoltaic cells might generate 1 mW per square centimetre.

So when it comes to self-powered devices, only outdoor photovoltaics and industrial temperature-difference harvesting are currently viable.
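As a quick sanity check of the coin-battery figure (arithmetic mine):

$$t = \frac{E}{P} = \frac{2500\,\mathrm{J}}{1\,\mathrm{mW}} = 2.5\times10^{6}\,\mathrm{s} \approx 29\ \mathrm{days},$$

which matches the quoted one month; correspondingly, $15000\,\mathrm{J}$ at $1\,\mathrm{mW}$ gives roughly six months.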

6.2.2 Cost

When comparing microcontrollers to systems intended to run an operating system as well as the needed application(s), it is noticeable that microcontrollers are much cheaper. As a comparison, the cheapest Raspberry Pi, the Raspberry Pi Zero, is about 5€ and can be used as a server. More typically, however, a larger x86 server is used, which can cost anywhere from 1000€ to 100 000€. When comparing this to 32-bit microcontrollers that are available for much less than 1€, the difference is significant if large deployments are needed. Though the comparison is not straightforward, since a server can do the processing of the data from a large array of sensor-gathering nodes.

These same microcontrollers also benefit, and will continue to benefit, from traditional analogue and electromechanical control circuits being replaced with software-defined alternatives on microcontrollers. As more microcontrollers are produced for devices installed on the edge, this will continue to bring down the price further, as well as increase flexibility.

6.3 Difference to conventional Machine Learning

TinyML differs from conventional ML in a few ways, since it runs on microcontrollers:

• As stated earlier, power consumption is kept low in order to make possible deployments that are self-sufficient in terms of power for long periods (months/years).

• An embedded 32-bit chip means that little RAM is available, a few hundred kilobytes, which means that models have to be kept small.

• No full Linux, since a memory controller and about 1 MB of RAM would be needed, which is not available on a microcontroller. By not having an OS, the system does not have any other processes running than the one being developed, meaning the system is simpler to understand.

• Dynamic memory allocation is often avoided, since it is not needed, and not using it increases reliability and makes the implementation more deterministic.

• Training is not done on the device but on a server/workstation; the model is then quantized and loaded onto the device. This is done because training on the device is simply not possible due to the hardware constraints.

• Weights are often quantized to 8-bit integers after training, before being loaded onto the device, because floating-point arithmetic is not guaranteed on microcontrollers. By going from 32-bit floating point to 8-bit integers, precision is obviously lost. However, training requires the largest dynamic range, and it is still done with 32-bit floating point, so no precision is lost there.

6.4 TensorFlow vs TensorFlow Lite vs TensorFlow Lite Micro

6.4.1 TensorFlow

As Warden and Situnayake compactly describe it: "TensorFlow is Google's open source machine learning library, with the motto 'An Open Source Machine Learning Framework for Everyone.' It was developed internally at Google and first released to the public in 2015."[8]

6.4.2 TensorFlow Lite

However, TensorFlow is aimed at servers, where there are gigabytes of RAM and terabytes of storage. This is why Google developed TensorFlow Lite, with lower size requirements, in order to easily run neural networks on mobile devices. In order to decrease the size, some features were cut:

• No training can be done with TensorFlow Lite. All training of models needs to be done on (a) server(s)/desktop(s) in the fully-fledged TensorFlow.

• It does not support all data types, for example double-precision floating point.

• Less-used operations are dropped.

Because of this, TensorFlow Lite can fit into a few hundred kilobytes, which makes it able to fit into size-constrained applications. TensorFlow Lite also has good support for 8-bit quantization of networks. Comparing the size of 8-bit versus 32-bit values, a 75% saving in space can be achieved, assuming a dense mapping of 8-bit values is supported.
6.4.3 TensorFlow Lite Micro

While TensorFlow Lite, with its constraints, is good and compact enough for mobile devices, microcontrollers have even tighter constraints, and that is what TensorFlow Lite Micro was created for. When the Google team started creating TensorFlow Lite Micro, they knew that they would have a number of constraints when running it on microcontrollers:

• No operating system dependencies
  Some of the platforms they were aiming for don't have operating systems at all, and machine learning algorithms are fundamentally just mathematical calculations, so avoiding OS dependencies enables broad compatibility.

• No standard C or C++ library dependencies at linker time
  Since the devices aimed for with TensorFlow Lite Micro have very limited memory, and even seemingly simple functions can take up relatively large amounts of memory, link-time parts of the libraries were cut to keep the framework lean. An important exception to this is the C math library, which is obviously valuable for all the calculations done.

• No floating-point hardware expected
  As many embedded platforms don't have floating-point arithmetic hardware, the implementation cannot rely on floating point in performance-critical calculations and instead needs to focus on 8-bit integer arithmetic (however, there is also support for floating point if needed).

• No dynamic memory allocation
  In many implementations, microcontrollers need to continuously run without rebooting for extended periods of time, months if not years. Using dynamic memory allocation can cause memory fragmentation and is therefore not reliable. While developing with dynamically allocated memory, it is also significantly harder to know exactly how much memory is used, and a shortage might not cause a problem until after testing. Therefore TensorFlow Lite Micro is implemented using a fixed-size arena specified at initialization. If the specified arena is too small, an error is returned immediately, and the arena will need to be enlarged and the application recompiled.

• Requires C++11
  TensorFlow Lite Micro was written in C++11 in order to keep consistency with TensorFlow Lite and to avoid having to rewrite it from scratch. So the team decided to trade support for older devices for sharing code with TensorFlow Lite.

• It expects 32-bit processors
  Similarly to the last point, the development team decided to keep consistency with TensorFlow Lite by expecting 32-bit processors.

When the team developing TensorFlow Lite Micro was deciding how to implement the model on microcontrollers, they compared the advantages and disadvantages of an interpreted model and code generation.

Interpreted model:
With an interpreted model, the model is loaded into data structures that define the model and that are separate from the executed code, which is static.

Code generation:
With code generation, the model is converted into C or C++ code, with parameters stored as data arrays and the architecture expressed as a series of function calls. The generated code often comprises a single large file with a few entry points, which can be included with the other code needed and then compiled.

Here are some key advantages of code generation:

• Ease of build:
  Since the model is defined directly in code without dependencies, it is easy to integrate: it can just be copied into the code base and then compiled together with the rest of the needed code.

• Modifiability:
  Since all the code is in a single file and without dependencies, it is easy to modify it without needing to know and find what parts of libraries are included.

• Inline data:
  Since the model is implemented in source code, no additional files are needed and therefore no loading or parsing is needed.

• Code size:
  If the platform and model are known, only needed code has to be included, keeping the size down.

Disadvantages of code generation:

• Upgradability:
  If you have locally modified the code and you then want to upgrade to a newer version of the framework, it might entail a significant amount of work to patch your changes and the updated framework together.

• Multiple models:
  Having multiple models with code generation also means that there will be a significant amount of source duplication, which can be harder to support.

• Replacing models:
  Since the model is expressed as generated source code, each time the model needs to be changed a recompilation is needed.

However, the team realized that many of the advantages of code generation can be had by using project generation instead.

Project generation:
In TensorFlow Lite, project generation creates copies of only the files needed to build a model, and optionally sets up IDE-specific project files so that they can be built easily. Project generation retains most of the advantages of code generation but also adds some:

• Upgradability: All source files are copies of the original and kept in the same place in the folder hierarchy. This means that if local changes are made, upstream upgrades can be merged using standard merge tools.

• Multiple and replacement models: As the underlying code is an interpreter, the models can be swapped out without compilation, or multiple models can be used.

• Inline data: Model parameters can still be compiled into the program if needed, so no unpacking or parsing is needed. This is done using the FlatBuffer serialization format.

• External dependencies: All the required header and source files are copied into the project, so no dependencies need to be separately downloaded and installed.

The largest advantage that does not come automatically is the code size, since the interpreter structure makes it hard to know which code paths will never be called. In TensorFlow Lite this can be resolved using the OpResolver mechanism to register only the kernel implementations expected to be used in the application.
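To make the interpreted-model approach concrete, the following sketch uses the desktop Python TensorFlow Lite interpreter (TensorFlow Lite Micro follows the same pattern in C++); the model file name is an assumption:

```python
import numpy as np
import tensorflow as tf

# Load the serialized model into the interpreter's data structures; the
# interpreter code itself is static and works for any compatible model.
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed one (here all-zero) input tensor and run inference.
sample = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], sample)
interpreter.invoke()
prediction = interpreter.get_tensor(output_details[0]["index"])
print(prediction)
```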

6.5 Quantization

Microcontroller hardware is better suited for integer calculations. To be able to use a model that has been trained with floating-point numbers, which is necessary because the values fluctuate greatly during training, the model needs to be quantized from floating-point numbers to integers before deploying it to a microcontroller.

A reduction from 32-bit floating point to 8-bit integers also gives a 75% reduction in the storage needed for the completed model, while not having a noticeable impact on the accuracy of inference.

Another benefit of using 8-bit integers is that many signal processing algorithms also use 8-bit integer multiply-and-accumulate instructions, which means that the same hardware can be utilized for TinyML.

Running a fully quantized model is also more efficient, which gives better latency on almost all devices.

As quantization is an active research field, there are many opinions on how it should be done. For weights, it is somewhat easy, since the range is known for each layer after the training process. However, it is trickier for activations, since the range of the output is not known. If too small a range is used, there will be clipping at the maximum and/or minimum, and if too large a range is used, the accuracy will suffer.

When using TensorFlow and TensorFlow Lite, the quantization is done at the same time as the model is converted from a TensorFlow training environment to a TensorFlow Lite graph. The two types of quantization done when converting to a TensorFlow Lite graph are:

• Post-training weight quantization is the most accessible type of quantization. It quantizes only the weights to 8-bit integers, which reduces the size of the model by 75% compared to 32-bit floating point, but leaves the activation layers as floating-point numbers. This means that the device the model is deployed on would still need floating-point support, which can be rare for embedded devices. On the other hand, this quantization is easier to do, since it does not require knowledge of the activation layers.

• Post-training integer quantization is used to create a model that only contains integers. This means that no floating-point hardware is needed, which is desirable since floating-point hardware can be rare. However, when doing the quantization, context about the ranges of the input needs to be supplied, in the form of example inputs that the model could expect to receive while deployed. Having the right range will result in greater accuracy without clipping.
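A minimal sketch of post-training integer quantization with the TensorFlow Lite converter (the saved-model path, input shape and calibration data are assumptions for illustration):

```python
import numpy as np
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("trained_model_dir")
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# Example inputs supply the context needed to pick activation ranges;
# in practice these should be drawn from real recorded sensor data.
def representative_data_gen():
    for _ in range(100):
        yield [np.random.rand(1, 128, 3).astype(np.float32)]

converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```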

6.6 FlatBuffers

In order to have efficient storage of the model, FlatBuffers are used: "FlatBuffers is an efficient cross-platform serialization library for C++, C#, C, Go, Java, Kotlin, JavaScript, Lobster, Lua, TypeScript, PHP, Python, Rust and Swift. It was originally created at Google for game development and other performance-critical applications."

FlatBuffers are well described by Warden et al.[8] and in the white paper by Google[9], and here are the main points borrowed from there:

• Designed for performance-critical applications, so it works well for embedded systems.

• The serialized form and the runtime in-memory representation are exactly the same, so no parsing or copying is needed.

• With the help of schemas, the FlatBuffer compiler creates native code (C, C++, Python, Java...).

The motivation for using FlatBuffers is to avoid the need to unpack/parse data[10]. "A FlatBuffer is a binary buffer containing nested objects (structs, tables, vectors,..) organized using offsets so that the data can be traversed in place just like any pointer-based data structure. Unlike most in-memory data structures, however, it uses strict rules of alignment and endianness (always little) to ensure these buffers are cross-platform. Additionally, for objects that are tables, FlatBuffers provides forwards/backwards compatibility and general optionality of fields, to support most forms of format evolution."[9] FlatBuffers are generated with the help of a schema, which describes the object types and which is used to compile efficient code for data access.
6.7 Adapting models for microcontrollers

In order to run the models on microcontrollers, the models need to be adapted for them. This is done with a converter that takes a trained model from Python and creates a TensorFlow Lite file. However, there are some things to consider:

• Quantization: as mentioned earlier, by quantizing the weights and activation layers to integers, floating-point hardware, which is not present on many embedded devices, is eliminated as a requirement to run the model. It also makes the model more efficient, both in storage and computation.

• All values that need to be variables during the training process, such as weights, need to be turned into constants.

• Features only needed for training need to be removed.

• While the models are trained on desktops/servers, they can easily become dependent on features of the desktop environment that are not supported on microcontrollers, such as snippets of Python code or advanced operations. This needs to be resolved before deploying onto microcontrollers.

• Most microcontrollers also don't have a filesystem, so the model needs to be compiled into the executable. This also means that every time the model needs to be updated, the executable needs to be recompiled and re-uploaded to the microcontroller, meaning over-the-air updates are somewhat harder. (A small sketch of embedding a model this way follows this list.)

• FlatBuffers are used so that the model data can be loaded into memory without the need to unpack or parse it. A FlatBuffer is exactly the same in memory as in its serialized form; this means that the model can be accessed directly from flash memory without needing to copy or parse it into RAM.
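As mentioned in the list above, a converted model is typically embedded as a byte array in the firmware because most microcontrollers lack a filesystem (the TensorFlow Lite Micro documentation uses `xxd -i` for this step). A hypothetical Python equivalent, with made-up file and symbol names:

```python
def tflite_to_c_array(path="model.tflite", name="g_model"):
    """Emit a C source snippet that embeds the model bytes in the binary."""
    data = open(path, "rb").read()
    body = ", ".join(str(b) for b in data)
    return (f"const unsigned char {name}[] = {{ {body} }};\n"
            f"const unsigned int {name}_len = {len(data)};\n")

with open("model_data.cc", "w") as f:
    f.write(tflite_to_c_array())
```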

6.8 Computational and hardware need

6.8.1 Training and deployment

When doing machine learning on performance-constrained devices, if a model-based approach is used, a model of course needs to be trained. While inference with the completed model does not require much computation, creating it does. So, in the same way as with conventional ML, the computationally intense training is completed on a workstation or server, which is also able to use larger training data sets since it can have vast amounts of RAM and non-volatile storage.

A decision also needs to be made on whether a specific set of hardware is to be used for all devices or whether generalized hardware is to be used. That is, will the system only use a defined type of microcontroller and sensors, or will the system be able to use differing hardware? In the latter case, some normalization needs to be applied to both of the following (a small sketch follows the list):

• The magnitude of the data collected. For example, two different accelerometers might output the same physical acceleration in different magnitudes and data types digitally.

• The sample rate. In order to consistently detect phenomena, the sample rate needs to be consistent, meaning that the time axis is not scaled depending on what sensor is used. This way, when a phenomenon happens, it is captured in the same number of samples independent of what hardware it was captured on.
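A hedged sketch of such normalization (the function, rates and full-scale value are illustrative assumptions, not from the thesis):

```python
import numpy as np
from scipy.signal import resample_poly

def normalize_window(samples, in_rate_hz, out_rate_hz=25, full_scale_g=4.0):
    """Bring a raw accelerometer window from any supported sensor to a
    common sample rate and magnitude scale before feeding the model."""
    # Resample so a phenomenon spans the same number of samples
    # regardless of the sensor's native rate (rates must be integers).
    resampled = resample_poly(samples, out_rate_hz, in_rate_hz, axis=0)
    # Scale raw readings to a common unit, assuming full_scale_g is the
    # sensor's configured measurement range in g.
    return np.asarray(resampled) / full_scale_g
```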

If, in the future, a performance increase of the model is desired based on more collected data, the model in most cases needs to be retrained on more performant hardware. The way a model is loaded onto deployed hardware might also limit how and when the model can be updated. If the model can only be loaded onto the devices by flashing, someone physically needs to be able to reach each device and flash it, which might not be physically or logistically possible due to the locations, as well as the number, of the devices that are deployed. To mitigate this, over-the-air (OTA) updates can be used; however, the system will then require both:

• Some form of communication hardware on the deployed devices, as well as a network in place to support the devices. This might already be a requirement if connectivity to the remote devices is needed for other purposes, for example remote configuration or real-time feedback.

• A device, as well as an implementation, that supports downloading and upgrading the model while deployed. This can be done in two ways:

  – Only keeping the updated model in volatile memory and simply switching to it once downloaded, with the risk of losing the model if power is lost and needing to fetch the model again on bootup from a server. This will of course increase recovery time after a power loss, and might not even be possible if the network connectivity is only intermittent.

  – Writing the model to non-volatile memory. For this, the device needs to have non-volatile memory that can be written to during runtime.

6.8.2 Deployed

Many of the hardware requirements of the devices used with TinyML come from where the system will be deployed, for how long, and what resources are available.

So, when looking at the constraints, there are two opposing ones: power consumption versus computational power and address space.

Power consumption: When it comes to the power used, obviously, the less that is used, the longer a device can be deployed with a set amount of energy stored or generated. As Warden[8] states, an energy cost of 1 mW or below can make many new applications possible. As stated earlier, this forces us to use microcontrollers. Since the overhead in computation and storage to run even a light operating system could consume more power than is available, not running one is often a must. Also, the energy-saving sleep of the device is significantly more complicated when an OS is involved.

Computation and address space: As mentioned earlier, TinyML needs a 32-bit processor to run. This means that while the focus is on embedded low-power devices, not just any microcontroller will do, since decent computational power, as well as enough address space to fit and run the model and the supporting framework, is still needed. The exact need of an application is obviously hard to judge before it is implemented in code and, when a model-based approach is used, a model is trained. Once these are known, the hardware can easily be optimized. This approach is of course not very flexible if no computational/address-space overhead is reserved for future improvements and additions, though unused future-proofing is an unwanted expense.

Things that need to fit into the address space:

• Operating system size: If some feature of an OS is a requirement, the size of the OS, in the configuration required to include the needed features, needs to be included.

• TensorFlow Lite Micro code size: In order to run the model, the TensorFlow Lite Micro code of course needs to be included, so that the neural network and the operators implemented in the model can be used. TensorFlow Lite Micro is designed to work with as little as 20 KB of flash and 4 KB of SRAM in some applications.

• Model data size: The size of the model is very application dependent; the model needs to be large enough to be able to generalize the phenomena at hand.

• Application code size: The application code, again, is very application dependent (as the name implies) and varies depending on the logic needed for the application.

Example of size: As an example of the total size of all the pieces that are needed for a TinyML application to run, I compiled the Magic Wand example in the TinyML GitHub[11]. The next section will describe this example more. With the Arduino NANO 33 BLE as a target, the compiled size is 172720 bytes, which in this case is 17% of the available storage.

Figure 6: Output from Arduino IDE after compilation of the magic wand
example from the TinyML GitHub[11].

7 TinyML for pattern recognition and maintenance prediction

7.1 Example of a simple TinyML application: Magic wand

This example is taken from the TFLite-Micro GitHub repository[11]. It is also covered in the book "TinyML: Machine Learning with TensorFlow Lite on Arduino and Ultra-Low-Power Microcontrollers" by Pete Warden and Daniel Situnayake[8].

In this example, a "magic wand" is built, meaning a microcontroller that is able to detect "spells" that are cast, by recognizing, with the help of TinyML on accelerometer data, gestures that are made. The three gestures that are recognized are the "wing", the "ring" and the "slope".

Figure 7: What the gestures look like

The application can be divided into a few components:

• The main loop:


Runs all the components in order.

• Accelerometer handler:
Reads the values of the accelerometer in a way applicable to the hard-
ware in use and writes them to the model’s input tensor. In the case of
the Arduino Nano 33 BLE Sense, the data is also down-sampled from
119 Hz to 25 Hz.

• TFLite interpreter:
Runs the TensorFlow Lite model. This is the interesting part and will
be covered next.

• Model:
Contains the underlying data about the gestures to be recognized,
gathered during the training phase.

• Gesture predictor:
Takes the output of the model and decides whether a gesture has been
made, based on the probability and the number of consecutive positive
predictions. A sketch of this logic is shown below.

• Output handler:
When a gesture has been recognized, outputs which gesture was recog-
nized to the LED and the serial port, in a way applicable to the
hardware in use.

Figure 8: The components of the magic wand application
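As an illustration of the gesture predictor, here is a minimal Python
sketch of the debouncing step. The threshold and the required number of
consecutive detections below are assumptions, not the values used in the
actual example:

PROBABILITY_THRESHOLD = 0.8  # assumed; the real example tunes this per gesture
REQUIRED_CONSECUTIVE = 3     # assumed number of consecutive positive inferences

consecutive = 0
last_gesture = None

def predict_gesture(probabilities):
    # Return a gesture index only once it has won several inferences in a
    # row with a high enough probability; otherwise return None.
    global consecutive, last_gesture
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    if probabilities[best] < PROBABILITY_THRESHOLD:
        consecutive, last_gesture = 0, None
        return None
    if best == last_gesture:
        consecutive += 1
    else:
        consecutive, last_gesture = 1, best
    return best if consecutive >= REQUIRED_CONSECUTIVE else None

Requiring several consecutive positive inferences trades a little latency
for far fewer spurious detections from a single noisy model output.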

7.1.1 Performance
When executing the shapes I tried to mimic the way Pete Warden performs
them in his presentations on YouTube. As a result, I can consistently
perform detectable wing shapes; however, the detection of the ring and slope
shapes is poor. As Warden states, they should be harder to perform than the
wing, but I am not sure to what degree.
The execution of the shapes is checked by outputting the accelerometer
data after sub-sampling and axis normalization.
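A minimal sketch of how such a check could be scripted on the host side,
assuming the board prints one comma-separated x,y,z triple per line on its
serial port (the port name is also an assumption):

import serial                    # pyserial
import matplotlib.pyplot as plt

samples = []
with serial.Serial("/dev/ttyACM0", 115200, timeout=1) as port:
    while len(samples) < 500:
        line = port.readline().decode(errors="ignore").strip()
        parts = line.split(",")
        if len(parts) != 3:
            continue  # skip incomplete lines
        try:
            samples.append([float(v) for v in parts])
        except ValueError:
            continue  # skip non-numeric lines

# Plot each axis so the shape execution can be inspected visually.
xs, ys, zs = zip(*samples)
for series, label in ((xs, "X"), (ys, "Y"), (zs, "Z")):
    plt.plot(series, label=label)
plt.legend()
plt.show()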

Wing Shape As can be seen in the plot (Figure 10), the accelerometer
data used for prediction is somewhat noisy, so the task of detecting a shape
is challenging, especially considering the model needs to be kept very small.
On the Z axis, a somewhat clear pattern from the motions of the wing
shape can be seen: peaks from the direction changes, but with some noise.
The X and Y axes are noisier, and it is much harder to see any pattern
there beyond a slight one from the rotation of the device during the
execution. What is also interesting is that the model did not detect the
wing shape unless the shape was executed somewhat violently, to the point
where the accelerometer clips. Considering all this, the model does well for
the wing shape.

Figure 9: Illustration of execution of Wing shape [8]

Figure 10: Accelerometer data plot of a successful attempt to detect a wing
shape. When looking down at the Arduino NANO 33 BLE Sense with the USB port
facing us, the axes are: X = Red, Y = Green and Z = Blue.

Ring Shape As can be seen in the plot (Figure 12), the attempts look
similar to the wing movement, but on closer analysis there is a difference.
In the ring, a somewhat constant acceleration toward the centre of the
circle should be seen, while the wing execution should show sudden
direction-change peaks. However, the model is mostly not able to detect the
ring shape being executed.

Figure 11: Illustration of execution of Ring shape [8]

Figure 12: Accelerometer data plot of a few unsuccessful attempts to detect
a ring shape. When looking down at the Arduino NANO 33 BLE Sense with the
USB port facing us, the axes are: X = Red, Y = Green and Z = Blue.

Slope Shape As can be seen in the plot (Figure 14), when executing the
slope shape there is some structure to it, with the three phases of
acceleration (start, direction change and stop), but the model seems to
struggle to detect the execution.

Figure 13: Illustration of execution of Slope shape [8]

Figure 14: Accelerometer data plot of a few unsuccessful attempts to detect
a slope shape. When looking down at the Arduino NANO 33 BLE Sense with the
USB port facing us, the axes are: X = Red, Y = Green and Z = Blue.

Performance conclusion So either I am interpreting the execution of the
shapes wrong, or the model is extremely honed in on the way Warden executes
them.

7.1.2 Epilogue
This example has since been removed[12] from the examples in the repository,
but it can still be found in the repository history.

7.2 Wake-up phrase with TinyML
An example of a TinyML-like approach that is already deployed and that many
people are familiar with is the voice assistants that use a wake-up phrase.
For example, Google’s Google Assistant responds to the wake-up phrase ”Ok
Google” and Amazon’s Alexa responds to the wake-up phrase ”Alexa”. Both have
implementations in IoT devices: Google with the Google Home devices (now
called Nest) and Amazon with the Amazon Echo.
Both of these use light models for the detection of the wake-up phrase.
Because it is possible to detect these somewhat simple phrases locally,
there is no need to send the audio to a server to be recognized. This saves
both a significant amount of network bandwidth and the need for huge server
infrastructure constantly trying to recognize when someone simply wants the
device to respond. Though these devices are not battery powered, the same
assistants are also implemented on mobile phones, which are somewhat battery
constrained (not as much as battery-powered IoT devices, but still) and
benefit from the lighter implementations.
However, once the wake-up phrase has been detected, the following speech,
containing the actual request, is sent to the cloud to be parsed and acted
upon, since it is much more complex to interpret.
Both of these systems are largely proprietary, so from the outside it is
hard to know exactly how they work.

7.3 Maintenance prediction in marine diesel engines
using TinyML
The data used in this example was collected in collaboration between Åbo
Akademi University and Wärtsilä Finland Oy. The collection was done and
documented by Andrei-Raoul Morariu et al. in the article Edge-based
vibration monitoring of marine vessel engines [13], which I will paraphrase.

7.3.1 Collected data


The data was collected on a cruise ferry travelling the Baltic Sea between
Finland and Sweden. The ship uses four Wärtsilä 12V32 4SA four-stroke
diesel engines for its propulsion, which are fitted with sensors on the
engine blocks and stands. The sensors used were Texas Instruments Sensor
BoosterPacks with temperature, accelerometer and gyroscope sensors, enclosed
in metallic boxes to protect them from factors such as temperature and
humidity; these are from now on called SUs (sensor units).
The sensors are connected to Raspberry Pi 4s, from now on called PUs
(processing units), through I2C; each SU was connected to a PU on a
dedicated I2C bus. I2C was chosen due to compatibility between the SUs and
PUs. (A sketch of what reading one SU over I2C could look like is shown
after Figure 16.) The PUs were installed in two enclosures, two PUs in each.
In each enclosure there is also a power supply and a switch to connect the
PUs. These switches are then connected to a switch in the control room,
which in turn connects to an industrial PC that collects the data and sends
it to shore when possible through different means.

Figure 15: Inside the enclosure.

Eight SUs have been placed as follows:

• Two on each engine:

– The first one on the engine block.
– The second one on the engine frameset, in close proximity to the
first one.

• One in each enclosure (of which there are two)

In each enclosure, the SUs were connected to the two PUs in the following
way:

• The first PU is connected to three SUs:

– The SU in the enclosure.
– One SU from the first or third engine.
– One SU from the second or fourth engine.

• The second PU is connected to two SUs:

– One SU from the first or third engine.
– One SU from the second or fourth engine.

This configuration was chosen for redundancy.

Figure 16: Installation setup
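As a hedged sketch of what reading one SU over I2C could look like on a PU,
using the smbus2 library; the device address and register values below are
hypothetical placeholders, since the actual BoosterPack registers would come
from its datasheet:

from smbus2 import SMBus

SENSOR_ADDR = 0x68  # hypothetical I2C address of the accelerometer
ACCEL_REG = 0x3B    # hypothetical first register of a 6-byte x/y/z block

# Each SU sits on a dedicated bus, so a PU serving several SUs would open
# one bus number (/dev/i2c-N) per connected sensor unit.
with SMBus(1) as bus:
    raw = bus.read_i2c_block_data(SENSOR_ADDR, ACCEL_REG, 6)
    # Combine high/low bytes into signed 16-bit axis readings.
    x, y, z = (int.from_bytes(bytes(raw[i:i + 2]), "big", signed=True)
               for i in (0, 2, 4))
    print(x, y, z)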

7.3.2 Analyses
For the analysis, an arbitrary data period was chosen, since the data should
be somewhat cyclic due to the ferry generally operating on the same route.
The acceleration dataset used consists of 20 million samples of four
variables:

• A timestamp in 64-bit (integer) epoch time.

• The acceleration for each axis (x, y, z) in an arbitrary unit as a float64.

The dataset is complete and contains no field without a value (null). There
are, however, some fields with a value of exactly zero for the acceleration
in the x and y axes, even though exact zeros should be rare in accelerometer
data. This might be due either to the system writing a zero in case of an
error, or simply to chance, since the x and y axes hover around zero. The
z-axis does not hover around zero, since it is aligned with gravity.
The number of zero values per axis:

• X: 8689

• Y: 3303

• Z: 0
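For illustration, both the null check and the zero counts above could be
reproduced with a couple of pandas expressions, assuming the axis columns
are named x, y and z as in the dataset description:

print(df.isnull().sum())                 # expect all zeros: no missing fields
print((df[["x", "y", "z"]] == 0).sum())  # exact-zero samples per axis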

In Figures 17, 18 and 19 the acceleration is plotted against time to show
the general behaviour of the data. From these we can see that there is some
noise in the data, more so in the x- and z-axes than in the y-axis.
Interestingly, the noise in the x-axis mostly reaches 2500 but slopes off at
the ends, whereas the noise in the other axes drops to zero arbitrarily
throughout. From this we can also see that the z-axis corresponds to up and
down in the real world, since it has a constant DC bias reflecting the
gravity of the earth.

Figure 17: Raw plot of the acceleration from the accelerometer x-axis; plot
x-axis = date, plot y-axis = acceleration.

Figure 18: Raw plot of the acceleration from the accelerometer y-axis; plot
x-axis = date, plot y-axis = acceleration.

Figure 19: Raw plot of the acceleration from the accelerometer z-axis; plot
x-axis = date, plot y-axis = acceleration.

Since the data in the y-axis is the cleanest, having the fewest outliers, we
will use it, with a threshold of 200 on the maximum value over a time window
(the minimum would work as well), as input data to train a simple model that
determines whether the engine is running. Obviously, this is not something
that requires machine learning to detect, since we can easily know whether
or not the machine is running; rather, it is an example of simple data that
can be analyzed with TinyML, and it shows that a model can be trained on the
collected accelerometer data. With data this simple it is also hard to know
whether the model is overfitted.

Figure 20: Plot of the windows in which the acceleration exceeded the
threshold and the engine is assumed to be running (1 = running). Window
length of 1000 samples; plot x-axis = window index.

Since my computer has limited resources, I could not train any more
sophisticated models. Because the computer lacked sufficient RAM and its
limited computing performance made iterating a very long process, I settled
for linear regression, since it should be able to mimic what we intuitively
did by choosing a threshold.
Code used for data formatting, training and prediction:

import glob

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Importing the data: all acceleration CSVs concatenated in order
path = 'data/20200513_20200609/raw_csv/ME1_SU1_csv/acceleration_0*.csv'
df = pd.concat(map(pd.read_csv, sorted(glob.glob(path))), ignore_index=True)

# Block size for creating the training data
blocksize = 1000
blocks = int(len(df.y) / blocksize)

# Finding the largest amplitude in each block
ymax = [np.max(df.y[x * blocksize:(x + 1) * blocksize])
        for x in range(blocks)]

# Checking whether the max amplitude of each block is over the threshold
ymaxbin = [m > 200 for m in ymax]

# Expanding the per-block labels back to one label per sample
ymaxlong = []
for a in ymaxbin:
    ymaxlong += [a] * blocksize

# Model input data: one acceleration sample per row
datax = np.array(df.y[:blocks * blocksize]).reshape(-1, 1)

# Model output training data
datay = ymaxlong

# Fitting the data to the model
reg = LinearRegression().fit(datax, datay)

# Doing an output prediction on the input data
predicted = reg.predict(datax)

# Averaging the output over 10 samples to get rid of erroneous samples
n = 10
avgResult = np.average(predicted.reshape(-1, n), axis=1)

# Rounding to binary since there are only two classes
predictedbin = avgResult.round()

Below is a plot of the predicted output. We can see that although the
general shape of the wanted output is there, the output varies
significantly, even though the output state should be stable for long
periods.

Figure 21: Plot of the prediction of when the engine is running
(1 = running); plot x-axis = sample index after averaging.

Even though the problem seems simple, we can see that analysing sequential
data can be challenging, and simple linear regression is often not enough;
more sophisticated algorithms are needed. We can also see that training the
model still requires a significant amount of compute performance, even
though it may be possible to run inference on minimal hardware. Both this
example and the TinyML wand example show that analysing sequential data can
be quite challenging.
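As a sketch of one step up in sophistication, the same labels can be reused
to train a classifier on per-window features instead of on raw samples. This
reuses the dataframe df and the window labels ymaxbin from the listing above
and is illustrative only, since the labels were derived from the same
signal:

import numpy as np
from sklearn.linear_model import LogisticRegression

window = 1000
n = len(df.y) // window
y_windows = np.array(df.y[:n * window], dtype=float).reshape(n, window)

# Summarize each window with two features: peak and RMS amplitude.
features = np.column_stack([
    np.abs(y_windows).max(axis=1),
    np.sqrt((y_windows ** 2).mean(axis=1)),
])
labels = np.array(ymaxbin[:n])

clf = LogisticRegression().fit(features, labels)
print(clf.score(features, labels))  # training accuracy over windows

Because each window is classified as a whole, the prediction cannot flicker
within a window, which directly addresses the instability seen in Figure 21.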

8 Conclusion
With the vast amount of data collected today and in the future, there is
clearly plenty that can be done with that data with the help of machine
learning: on more conventional computers with operating systems and perhaps
accelerators, on the edge, and in some cases on the edge on
microcontrollers. As for how the analysis is to be done, as can be seen in
this thesis, there is a vast and ever-growing set of approaches based on new
research and implementations of ways to process the data:

• With post-processing, all the processing is done afterwards.

• With a model-based approach, a model is trained with the use of training
data and then deployed to do inference on data gathered in real time.

• In the future, we might also have models that are trained in real time
as data is gathered.

However, as can also be seen in this thesis, it is important that the
quality of the data is good, in two senses:

• The quality of the data itself, meaning accuracy and precision.

• That the right kind of data is gathered. This requires planning ahead
in regard to the possible end uses for the gathered data. For example,
if the problem to be solved changes after the data has already been
gathered, it might be challenging to utilize the data.

For these reasons, among others, good-quality data is quite valuable, so
gathering as much diverse data as possible can often be of great value, as
long as it is done ethically. As data needs to be gathered from the real
world, the process cannot be as agile as other parts of computer
engineering.
the process can not be as agile as other parts of computer engineering.
As can also be seen in the thesis, TinyML and edge computing might not
always be the right choice for applying machine learning in a system. If the
processing needs a holistic view of the data gathered in the system, a more
centralized approach is better. Similarly, if only small amounts of data are
generated and sufficient uplinks are in place, a more centralized approach
might be preferred due to its simplicity.
In regard to the analysis, I feel that I did not manage to get any truly
meaningful output. I think the task of maintenance prediction would be
better served by a more diverse set of data from the engines, not just
acceleration data. Labelled data of when an engine is not running optimally,
or is about to break, would also be very valuable.
To conclude, when planning to do data analysis it is important to figure out
what the root requirements are and, based on those, select the right kind of
processing to be done as well as where it is to be done.

9 Summary in Swedish
Titel: Tillämplighet av TinyML för förutsägbart underhåll (Applicability of
TinyML for maintenance predictability).

9.1 Introduction
In today’s world we have a growing number of devices, many of which collect
data in some form that is then often used for analysis of the system or for
some kind of response, either regulation or control. As data volumes grow
explosively, problems can arise in how the data is handled, stored and
processed. To help us manage all the generated data, we can design our
systems in different ways depending on the requirements of those systems.
This thesis is about whether it is possible to process the generated data
with the help of edge computing on microcontrollers, specifically machine
learning in the form of TinyML.

9.2 Machine learning
Machine learning is a much-discussed field today, since we have vast amounts
of data collected from different kinds of systems. When we process this data
we cannot always intuitively know how different variables in the data relate
to each other, but with the help of machine learning we can use algorithms
to process the data in order to discover correlations or build models that
describe the behaviour of the system or systems where the data was
collected. These correlations and models can then be used to analyse, for
example, the performance or health of the system or of nodes in the system,
or to predict the behaviour of the system or of nodes in the system.

9.3 Anomaly detection
As the name implies, anomaly detection algorithms are used to find different
kinds of anomalies in a dataset. Finding anomalies in the data is valuable,
because an anomaly can mean either that part of the sensor data is faulty or
that part of the system is broken or about to break. Anomalies can take
different forms depending on the behaviour of the system being monitored.
An anomaly can, for example, be detected when:

• A data point lies statistically so far outside the expected range of
values that the value cannot be correct.

• A data point occurs with a value of deviating magnitude.

• In some systems two or more variables can be strongly correlated, and
if the variables in one or more samples do not follow the correlation
we can suspect that an anomaly has occurred.

9.4 Edge computing
In edge computing, computations are performed at the so-called edge of the
system, that is, on the devices where the data is generated, or physically
relatively close to them. This can give us the following advantages:

• It reduces the amount of raw data that needs to be transmitted.

• Less central computing power is needed.

• The response time decreases when the nodes need to act on locally
gathered data.

Depending on the architecture of the system, there can be a significant
amount of computing power near the edge of the system that in a conventional
system remains unused. These processors can be either microcontrollers or
more conventional computers (x86) with significant performance. The
computational needs vary between systems depending on what we want to derive
from the data. If the computation depends on data from the whole system, it
is usually not worthwhile to do it on nodes near the edge. If, on the other
hand, the computation depends only on data from the node itself or from
nodes in its direct vicinity, we can gain significant advantages by doing
the computation there.

9.5 Machine learning on microcontrollers – TinyML
TinyML is the concept of running machine learning models directly on the
microcontrollers that collect the data. The models are created on more
conventional computers, either servers or workstations. Compared to
conventional data processing, it is harder to do all the computations on
microcontrollers because they have limited resources:

• The processor performance is several orders of magnitude lower than
that of an ordinary (x86) processor.

• The storage capacity is limited, including working memory (RAM),
program memory (flash ROM) and storage memory (non-volatile memory).

• The available energy can also be limited due to where the node has to
be located. This means that we do not always have an ”infinite” amount
of power from the grid; the nodes may be battery powered, sometimes
with solar cells and sometimes without. This in turn means that the
nodes have to be very energy efficient.

But there are also advantages to doing the data processing on the edge on
microcontrollers:

• The need to send large amounts of data to a central server decreases
drastically.

• Since we already need a microcontroller in the edge node to connect the
sensors to, we save resources by using the computing power that already
exists in the nodes, which means that we need less processing power in
servers.

9.6 Analysis
The data analysed was collected from a car ferry travelling the Bothnian Bay
between Vaasa in Finland and Umeå in Sweden. Accelerometer sensors were
placed in the ship’s engine room and on the engines and their mounts, and
the data from these sensors was analysed. An arbitrary time interval was
used for the analysis. The analysis was done with linear regression, to
examine how well the algorithm handles classification of the data.

Figure 22: Plot of the training data where the engine has been assumed to be
running (1 = running). The data is divided into windows of 1000 data points.

Figure 23: The model’s estimate of when the engine is running (1 = running).

From the analysis it can be seen that such a simple algorithm is not very
good at classifying sequential data. The shape of the desired result is
clearly visible in the estimate, but during the periods when the engine is
constantly running, the estimate extremely often jumps to the engine being
off.

9.7 Summary
As with most problems, there is not always one solution that is a perfect
fit for every problem. TinyML is thus one more tool in the toolbox for
solving problems where machine learning may be the solution, but one still
has to consider what the goal of the solution is. TinyML is a good tool when
we want to reduce network usage and latency and maximize the use of
microcontrollers that we already have in service. At the same time, we also
see that TinyML does not change the fact that we still need high-quality
data in order to make the most of it.

References
[1] Bruno Stecanella. Support Vector Machines (SVM) Algorithm Explained.
https://monkeylearn.com/blog/introduction-to-support-vector-machines-svm/.
Accessed 2022-05.
[2] Rebecca Bevans. Simple Linear Regression: An Easy Introduction &
Examples. https://www.scribbr.com/statistics/simple-linear-regression/.
Accessed 2022-05.
[3] Pranoy Radhakrishnan. “What are Hyperparameters? And How to Tune the
Hyperparameters in a Deep Neural Network?” (2017). URL:
https://towardsdatascience.com/what-are-hyperparameters-and-how-to-tune-the-hyperparameters-in-a-deep-neural-network-d0604917584a.
[4] What is anomaly detection?
https://developer.ibm.com/learningpaths/get-started-anomaly-detection-api/what-is-anomaly-detection/.
Accessed 2022-04.
[5] Sahil Garg. Algorithm Selection for Anomaly Detection.
https://medium.com/analytics-vidhya/algorithm-selection-for-anomaly-detection-ef193fd0d6d1.
Accessed 2022-04.
[6] What is edge computing? https://www.ibm.com/cloud/what-is-edge-computing.
Accessed 2021-06.
[7] What is Edge Computing?
https://www.intel.com/content/www/us/en/edge-computing/what-is-edge-computing.html.
Accessed 2021-06.
[8] Pete Warden and Daniel Situnayake. TinyML: Machine Learning with
TensorFlow Lite on Arduino and Ultra-Low-Power Microcontrollers. O’Reilly
Media Inc., 2020.
[9] FlatBuffers white paper.
https://google.github.io/flatbuffers/flatbuffers_white_paper.html. Accessed
2021-10.
[10] FlatBuffers. https://google.github.io/flatbuffers/. Accessed 2021-10.
[11] Magic wand example in the TFLite-Micro repository.
https://github.com/tensorflow/tflite-micro/tree/main/tensorflow/lite/micro/examples/magic_wand.
[12] Commit removing the magic wand example.
https://github.com/tensorflow/tflite-micro/commit/bef8fe8bc6183cc4e1ce852579
[13] Andrei-Raoul Morariu, Wictor Lund, Andreas Lundell et al. “Edge-based
Vibration Monitoring of Marine Vessel Engines”. In: 12th Symposium on
High-Performance Marine Vehicles (HIPER), ed. by Volker Bertram. Conference
dates 12–14 October 2020. Germany: Technische Universität Hamburg-Harburg,
Oct. 2020, pp. 239–250.

