
PROCEEDING OF THE 31ST CONFERENCE OF FRUCT ASSOCIATION

Artificial Intelligence in the IoT Era: A Review of Edge AI Hardware and Software

Tuomo Sipola, Janne Alatalo, Tero Kokkonen, Mika Rantonen
JAMK University of Applied Sciences
Jyväskylä, Finland
{tuomo.sipola, janne.alatalo, tero.kokkonen, mika.rantonen}@jamk.fi

Abstract—The modern trend of moving artificial intelligence computation near to the origin of data sources has increased the demand for new hardware and software suitable for such environments. We carried out a scoping study to find the current resources used when developing Edge AI applications. Due to the nature of the topic, the research combined scientific sources with product information and software project sources. The paper is structured as follows. In the first part, Edge AI applications are briefly discussed, followed by hardware options, and finally the software used to develop AI models is described. There are various hardware products available, and we found as many as possible for this research to identify the best-known manufacturers. We describe the devices in the following categories: artificial intelligence accelerators and processors, field-programmable gate arrays, system-on-a-chip devices, system-on-modules, and full computers from development boards to servers. There seem to be three trends in Edge AI software development: neural network optimization, mobile device software and microcontroller software. We discuss these emerging fields and how the special challenges of low power consumption and machine learning computation are being taken into account. Our findings suggest that the Edge AI ecosystem is currently developing, and it has its own challenges to which vendors and developers are responding.

I. INTRODUCTION

In the last ten years, Artificial Intelligence (AI) solutions have become common in several application areas. In particular, Machine Learning (ML) based solutions are applied to solving a wide range of real-life problems. This variety extends from analysing diseases based on healthcare imaging to predicting energy consumption or detecting anomalous intrusions in network traffic. These AI-based solutions commonly require a large amount of computational capability, which is usually achieved using cloud-based solutions relying on High Performance Computing (HPC) clusters.

However, the rapidly increasing number of Internet of Things (IoT) applications has also raised the number of devices and applications that are producing, collecting and analysing data on the edge of the network. This has naturally increased the interest in applying AI computation on the edge. This concept of using AI near the devices that are producing data is called Edge AI. One of the earliest publications about Edge AI [1] notes two requirements promoting the use of Edge AI: (i) connection robustness and its latency and (ii) privacy issues when uploading data to cloud-based servers. Lee et al. present the following examples of those issues: the AI calculation of self-driving cars must be immediate and cannot be in the cloud behind network latencies or a bad network connection; similarly, uploading video recordings with personal data to cloud-based servers raises privacy considerations [1]. The concept of fog computing, i.e., placing computation nodes below the cloud, between the edge and the cloud [2], covers similar technologies in the same problem space.

However, even if the concept of Edge AI seems to be coherent, there are some known issues with it. As stated by Shi et al. [3], deploying complete AI models (such as deep neural networks) to the Edge device is generally impracticable because of the hardware boundaries; the size of the model is too large or the computational requirements are too high. A potential implementation is to accomplish collaboration between different Edge AI devices and solutions [3]. Bharwaj et al. [4] identify three challenges for Edge AI:

1) Computation-aware learning on IoT. Most IoT devices are power and/or memory constrained, and in that sense computation-aware compression of AI models is required.
2) Data-independent model compression for learning from small data. The original private data sets of big-data models cannot be used for model compression.
3) Communication-aware deployment of deep learning models on multiple IoT devices. Distributing computation with IoT devices could be difficult because of limited communication resources.

As can be seen, IoT-based Edge AI devices produce distinct data. The analysed information must be exchanged between collaborating Edge AI devices in order to achieve sufficient overall capability. Lin et al. [5] introduce a blockchain-based architecture for a knowledge market for trading the knowledge of Edge AI devices. Security and privacy issues of Edge AI should be considered thoroughly, as with any data processing system. Sachdev presented security and privacy issues of Edge AI in digital marketing and concluded that one of the main challenges is how Edge AI can extensively be implemented in that context [6]. Kumar et al. proved, by using the classical k-means algorithm in an Edge AI setting, the feasibility of maintaining privacy preservation of data with Edge AI processing [7].

There have been earlier reviews about Edge AI focusing on different aspects of the emerging field. Wang et al. [8] present

ISSN 2305-7254


a survey of technologies related to Edge AI, emphasising edge intelligence and intelligent edge. Edge intelligence is the concept of deploying machine learning models to the devices using those models on the edge. This is done in order to lower latency and make the applications more reliable. The concept of intelligent edge focuses on the maintenance and management of edge devices. Intelligence via machine learning is used to adaptively control the shared edge resources. The survey also introduces various applicable scenarios for both technologies. A similar classification of the types of Edge AI is introduced by Deng et al. [9], but instead of using the terms edge intelligence and intelligent edge, they use the terms AI on Edge and AI for Edge. In addition to making the distinction between the classes, the paper also reviews the state of the art and grand challenges in both categories. Reuther et al. [10] have surveyed machine learning accelerators. They present and categorise close to a hundred chips and systems, covering everything from low power solutions to data center systems. Li and Liewig have done a similar review [11]. The paper also lists some future trends that AI accelerators might implement. Crespo [12] has collected a list of hardware, software and other resources related to Edge AI in a GitHub project where community members can contribute to share knowledge of the topic. Merenda et al. [13] have carried out a literature review on the topic of running Edge AI on resource constrained devices. They review different algorithms, hardware, infrastructure architectures, wireless standards, privacy issues and solutions, and edge training solutions that can be used with these devices. Furthermore, the authors performed a test deployment of a convolutional neural network model to a real world microcontroller system. Ray [14] has carried out an extensive review of the machine learning state of the art and prospects on embedded devices (TinyML).

This scoping study aims to create an overview of the Edge AI ecosystem and provide answers to the following questions:

1) What application areas can be identified for Edge AI?
2) What Edge AI hardware platforms exist?
3) What Edge AI software packages exist?

The first question is meant to be a cursory glance at the possibilities and latest trends in applications. The hardware and software sections provide a fresh look at the tools available for Edge AI development.

II. METHODOLOGY

This research is structured as a scoping study that describes and summarises an emerging field [15]. We use the stages of a scoping study described by Arksey and O'Malley [16]:

1) Identify the question,
2) Identify relevant studies and product descriptions,
3) Select relevant studies and product descriptions,
4) Chart the data,
5) Collate, summarize, report the results.

Since the topic we are interested in heavily depends on hardware products and software packages, we have changed the process to include vendor marketing material in addition to studies. As queries about the three research questions revealed information about use cases, hardware and software, the found articles and resources were included in the three categories accordingly. For Edge AI applications, we queried Google Scholar and IEEE Xplore with the search phrases "Edge AI" and "Edge AI application". Our search resulted in several articles about hardware platforms, and those were included in the hardware platforms chapter. The goal here was to gain a general overview of different applications, so the most relevant and diverse ones were chosen.

AI hardware platforms were searched in Google with the phrase "edge AI hardware platform". The first 50 results were evaluated. We excluded offerings that focused on providing services or projects centered around data. This method is useful because this is the way most users would search for products. However, we must be aware that search engine optimization for marketing purposes and Google's own ranking can skew the results. New devices and manufacturers were discovered when going through the found product descriptions and results from the software and applications searches. Those were also included in the study where needed. Especially Crespo's list was found useful [12].

For the Edge AI software section, we settled on three distinct categories that can clearly be placed under the Edge AI software term. These categories are neural network model optimization, Edge AI on mobile devices and Edge AI on embedded devices. For the neural network optimization category, we studied recent review publications about the different optimization methods and then reviewed the two most popular neural network frameworks, TensorFlow and PyTorch, for support of these methods. For the Edge AI on mobile devices section, we reviewed the two most popular mobile device operating systems, Android and iOS, for machine learning support. The final category, Edge AI on embedded devices, turned out to be problematic. Finding the software products that belong to this category was a challenging task. There does not seem to be a single search term that reliably finds these projects. The problem might be that the terminology and workflows have not yet properly settled in the Edge AI field of study. We tried search terms such as TinyML software, IoT AI software, embedded devices AI software and microcontrollers AI software, but the results were saturated by blogspam-like content. Finally, the best sources for actual software projects that belong to this category were community-collected lists on open forums [12], [17]. After finding projects that way, some additional projects were discovered by searching survey and benchmarking publications and blog posts with those project names. Often the projects were compared to some other projects that were not listed yet. After the projects were found, their features were evaluated using their publicly available documentation and compared with each other.

III. EDGE AI APPLICATIONS

As with traditional AI solutions, there exists a wide range of applications in the Edge AI world. Because the Edge AI concept is relatively new, the published research papers started


appearing in 2018, and most of them after 2019. Our review of published studies on Edge AI applications identifies five main categories of applications: security, mobile networks, healthcare, voice and image analysis, and frameworks.

AntiConcealer is an Edge AI approach for detecting adversary concealed behaviors in the IoT [18]. Among security solutions, Edge AI is also used for anomaly detection in advanced metering infrastructures [19], while Nawaz et al. introduce an Ethereum blockchain based solution for analysing the data and tracking the parties accessing that analysis data [20].

Examples of Edge AI solutions for mobile communication and networking are a paper about a learning method to support mobile target tracking on the edge platform [21] and another one introducing a resource allocation scheme for 6G [22].

In the healthcare domain, Edge AI is for example applied to detecting diabetic retinopathy [23]. Queralta et al. proposed an architecture for health monitoring [24]. Edge AI is used for predicting diseases such as respiratory diseases [25] and chronic obstructive pulmonary disease [26].

As in traditional AI applications, Edge AI applications are heavily used for voice and image detection and analysis. Shen et al. [27] introduce an Edge AI based human head detection algorithm, and Gamanayake et al. propose an Edge AI based method for image pruning [28]. An Edge AI based solution is implemented for acoustic classification to be deployed in autonomous cars [29]. Miyata et al. [30] illustrate an Edge AI based mobile robot including voice and object recognition. An application for tangible real-world problem solving is the Edge AI based apple detection solution that has been created to count apples and estimate their sizes [31].

There are also different frameworks published for Edge AI solutions, for example NeuroPilot, a cross-platform framework for Edge AI [32], and an Edge AI framework for telemetry collection and utilization, evaluating both graphics card (GPU) and field-programmable gate array (FPGA) platforms [33].

IV. EDGE AI HARDWARE PLATFORMS

The concept of Edge AI is tied to the idea of placing computing power physically near the data source. Any desktop or server rack computer could serve as an edge device. However, many environments are not optimal for such devices. Their size and power consumption are also a major concern. For these reasons, specific Edge AI devices have been designed. Their size and wireless connectivity make them easily attachable to industrial environments. Limited power consumption is also essential when many devices are deployed at once. Moreover, the need for specific mathematical capabilities has given rise to the AI accelerator modules.

Developments in the Edge AI ecosystem drive the devices to be more efficient. Benchmarking hardware platforms has also interested researchers from the computing and power consumption points of view. Baller et al. measured five edge devices and give their recommendations for best performance in continuous and sporadic scenarios [34]. Operating AI inference in industrial conditions could be made more robust by using magnetoresistive random access memory (MRAM) [35], [36]. Energy efficiency is a constant concern with Edge AI, and there are developments in this area, such as Levisse et al. with their functionality enhanced memories [37]. Liu et al. propose hybrid parallelism, which makes hierarchical training of AI models for Edge AI situations efficient [38].

A. AI acceleration units

Special-purpose acceleration units can either be used as additional processors in any electronic device or as machine learning co-processors in devices that are designed to include such capabilities in addition to traditional processing power and connectivity. AI acceleration units are fast at executing vector and tensor computation and have optimal pipelines for machine learning operations, usually neural network methods. Unfortunately, it is difficult to find meaningful details about many of these devices. The Intel Neural Compute Engine is an accelerator for deep neural networks. It supports native FP16 floating point and 8-bit fixed point data types and can be used to deploy neural networks in Caffe and TensorFlow formats [39]. MediaTek's AI Processing Unit (APU) is an AI accelerator with multimedia features. The APU lists TensorFlow, TensorFlow Lite, Caffe and others as supported neural network formats. The APU can perform 8-bit and 16-bit integer and 16-bit floating point calculations. It supports the Android Neural Networks API (NNAPI) and a custom API [40], [41]. The Google Edge TPU is an application specific integrated circuit (ASIC) designed to run TensorFlow Lite models [42]. The NVIDIA Deep Learning Accelerator (NVDLA) is built specifically for neural network operations. Its processors map to the corresponding mathematical operations used during deep learning. It supports a wide range of data types [43]. The Gyrfalcon Matrix Processing Unit (MPE) is built to compute matrix operations related to neural networks [44]. Mythic has created an analog matrix processor called the M1076 Mythic AMP, which uses the Mythic Analog Compute Engine (ACE). Supported data formats are 4-bit, 8-bit and 16-bit integers, and PyTorch, Caffe and TensorFlow models can be used [45]. Syntiant has created a product line of Neural Decision Processors in order to create faster possibilities for neural network solutions, including speech recognition [46], [47], sensor applications [48], [49] and vision [50]. Hailo offers an AI processor that supports 8- and 16-bit numeric representations and TensorFlow and ONNX for software [51].

B. Field-programmable gate arrays

One trend in Edge AI devices is to employ a field-programmable gate array (FPGA) to build a processor suitable for the specific task of using machine learning methods. Because FPGAs allow great flexibility in what the processor does, they are very useful in building AI accelerators. Intel has produced FPGAs whose applications cover Edge AI: MAX V CPLD [52], Cyclone 10 LP FPGA [53] and Cyclone 10 GX FPGA [54]. For example, a CPU intended for IoT and Edge AI has been developed using the MAX 10 FPGA [55]. There have also been, e.g., frameworks using an FPGA for accelerating machine learning in edge environments [56].


C. System-on-a-chip and system-on-module devices

Intel's Movidius Myriad X Vision Processing Unit is a video processor with neural network inference capabilities. It has 16 cores and a dedicated on-chip Neural Compute Engine and can be used with up to 8 high-definition cameras [39]. Intel has also produced a USB device based on the Movidius Myriad X unit [57] and vision accelerators for edge applications [58]. Systems that use the Movidius Myriad X as an optional visual processing unit include the UP Squared 6000 [59], Luxonis DepthAI [60] and Luxonis megaAI [61]. HiSilicon's Kirin 970 is a processor for AI computing. It has a dedicated NPU for AI and features aimed at solving computer vision and audio tasks. It also has connectivity to the cellular network using an LTE modem [62]. Qualcomm's Snapdragon 855+/860 is aimed at photography and gaming. However, the on-device AI engine can perform vector and tensor acceleration. It has an LTE modem for cellular connectivity along with Wi-Fi, Bluetooth and near field communication (NFC) [63]. MediaTek's Helio P90 is also geared towards imaging, photography, and gaming and features cellular connectivity (LTE), Wi-Fi and Bluetooth. The AI system is marketed for image processing [64]. MediaTek also has AIoT Chipset Platforms specifically for IoT and Edge AI cases, including displays [65], voice recognition [66], audio and video processing [67] and AI vision [68], although only the last two have a dedicated AI processor. Other devices using the APU units include the Helio P95 [69], the Dimensity 1000 series [70] and the Dimensity 9000 [71]. MediaTek has also released a short paper about their Edge AI solutions [72]. Nowadays, Rockchip offers two processor models for Edge AI. These processors are aimed at image and voice processing, especially for mobile devices [73], [74]. The Kendryte K210 is a chip designed for face recognition. It uses the TinyYOLO object detection neural network [75]. JeVois-A33 [76] is an open-source camera with computer vision AI capabilities. JeVois-Pro [77] has an internal neural processing unit but can also be upgraded with Coral and Movidius Myriad X units.

D. Coral

Google's Coral Accelerator Module [42] is a solderable module that contains the Edge TPU tensor processing unit. It is also offered using various connectors: Coral USB Accelerator [78], Coral M.2 Accelerator [79], Coral M.2 Accelerator with Dual Edge TPU [80] and Coral Mini PCIe Accelerator [81]. The Coral Dev Board Mini [82] can be used to develop and test applications to be used with the accelerator itself. The Coral System-on-Module [83] is an integrated system that includes the Edge TPU accelerator. The module is meant for deployment into production environments. It has a development board counterpart called the Coral Dev Board [84]. There are also camera [85] and sensor add-ons available [86].

E. Jetson

NVIDIA has produced devices for edge computing using graphical processing units, which can be used for the vector calculations needed in machine learning. These Jetson models with additional connectivity include Jetson Nano [87], Jetson TX2 NX [88], Jetson TX2 4GB [89], Jetson TX2 [90] and Jetson TX2i [91]. As discussed earlier, NVIDIA's NVDLA is a deep learning accelerator. The use of a separate processing unit for neural network calculations releases the GPU for multimedia tasks. There are various Jetson models that use this technology: Jetson Xavier NX 16GB [92], Jetson Xavier NX [93], Jetson AGX Xavier 64GB [94], Jetson AGX Xavier [95], Jetson AGX Xavier Industrial [96], Jetson Orin NX [97] and Jetson AGX Orin [98]. Both the GPU-based and NVDLA-based devices have developer kits available: Jetson Nano Developer Kit [99], Jetson Nano 2GB Developer Kit [100], Jetson Xavier NX Developer Kit [101], Jetson AGX Xavier Developer Kit [102] and Jetson AGX Orin Developer Kit [103]. These kits can be used for prototyping and testing before moving on to the production versions.

F. Gyrfalcon MPE

Gyrfalcon produces its MPE-based devices for Edge AI. Their products cover a wide area of hardware, from MPE processors to servers that use that technology. The Lightspeeur 2801S Neural Accelerator can be deployed as a USB dongle or as an embedded device. It supports TensorFlow, Caffe and PyTorch [44]. The Lightspeeur 5801S Neural Accelerator is the more efficient (operations/Watt) model for consumer edge devices [104]. The Lightspeeur 2803S Neural Accelerator provides even more computational power [105]. The Lacelli Edge Inferencing Server AI Acceleration Subsystem uses Lightspeeur 2803S chips on M.2 cards [106]. Gainboard 2801 provides MPE capabilities via the PCIe connector [107]. Gainboard 2803 does the same for the other neural accelerator [108]. The Janux G31 AI Server is an AI server with 32 MPE cards [109].

G. Mythic

Mythic's unique perspective is using an analog engine to run its M1076 processor [45]. The MP10304 Quad-AMP PCIe Card has four processors [110]. The MM1076 M.2 M key card makes one processor usable via the M.2 bus [111]. The ME1076 M.2 A+E key card offers it in a smaller size and bandwidth [112]. There is also an evaluation system, the MNS1076 AMP [113].

H. Development boards

The BeagleBone AI is an open-source device featuring TI C66x digital signal processor (DSP) cores and TI embedded vision engines (EVE). It is marketed as focusing on everyday automation, including industrial applications. It has USB and Ethernet connectivity, along with Wi-Fi and Bluetooth [114]. The OpenMV Cam is a microcontroller board for machine vision. It has a 480p resolution camera and a USB connection. This small device can run TensorFlow Lite models in addition to multiple basic machine vision tasks [115]. The SparkFun Edge Development Board Apollo3 Blue is a low-power board that can run TensorFlow Lite models [116]. Syntiant's Tiny Machine Learning Development Board uses their NDP101 Neural Decision Processor [117]. STMicroelectronics' STM32 microcontroller units can be used for Edge AI solutions [118], [119]; e.g., the STM32L4 Discovery kit IoT node provides a


development board for IoT [120]. Hailo offers its AI processor via the M.2 and PCIe buses. There are also two evaluation boards available [51]. Other possible Edge AI hardware vendors include Adlink [121], Blaize [122], Aetina [123] and ARM with the Ethos-U65 [124]. Another popular platform is the Raspberry Pi, for example the newest model, the Raspberry Pi 4 [125].

I. Device tables

Tables I and II list the devices and their basic specifications: central processing unit (CPU) and possible graphics processing unit (GPU), neural processing unit (NPU), memory and type of the device. The NPU can also be a digital signal processing (DSP) unit. Maximum indicated RAM is also reported. Not all details were relevant for the device, or available, or they were too ambiguous, so this information is indicated by a dash (–). We follow the hardware taxonomy proposed by Li and Liewig [11], but we have extended the system-on-module category to indicate the connection type. We have also marked server devices as a separate category. Thus, the categories are: system-on-a-chip (SoC), system-on-module (SOM), single-board computer (SBC) and server. SOM connection types include external universal serial bus device (USB), external M.2 card slot device (M.2) and PCIe slot device (PCIe).

V. EDGE AI SOFTWARE

This section lists and explains software projects and tools that are useful in the context of Edge AI. The section is subdivided in the following way. Subsection V-A lists software that is generally useful for preparing neural network models for running on resource constrained devices. Subsection V-B lists software that is useful when deploying machine learning models to modern smartphones and other powerful mobile devices. Subsection V-C lists software that is useful when the target is a microcontroller.

A. Neural network model optimization

When speaking about deep learning, the most popular frameworks are TensorFlow [126], Keras [127] (a high level TensorFlow API) and PyTorch [128], according to the Kaggle 2021 survey [129]. These frameworks are optimized for running on GPUs and other specialised hardware that accelerates the model training process. After the model has been trained, inference is not as computationally expensive an operation, but even that requires moderate amounts of memory and computation power. Devices running on the edge are often resource constrained on that front. There exist techniques that can be applied to deep learning models to reduce the model complexity, memory and computation requirements with little or no effect on the model accuracy [130]. Both TensorFlow and PyTorch have some of these methods built into the frameworks, and they can be used to optimize model performance on low power devices. TensorFlow has collected the tools and documentation about this topic under the TensorFlow Lite subproject [131]. PyTorch has a somewhat similar situation with PyTorch Mobile [132], except that PyTorch Mobile is more of a workflow than a proper subproject and the tooling is included in the main PyTorch API. PyTorch Mobile can also only target mobile devices running Android and iOS, while TensorFlow Lite can additionally target embedded systems.

The study by Alqahtani et al. [130] concluded that quantization is the most effective method for model optimization. In quantization optimization, the 32-bit floating point model parameters are converted to lower precision integers, which allows the devices to use more computationally efficient integer math operations, and the model storage requirements are reduced. Both TensorFlow and PyTorch have documentation and easy-to-use support for model quantization, although the PyTorch API is still in beta.

In addition to quantization, TensorFlow also supports model pruning and weight clustering. In model pruning, the model weights are sparsified so that the model compresses better. The TensorFlow pruning method is explained in detail in [133]. Weight clustering has the same goal of better model compression. The short explanation of weight clustering is that the layer weights are clustered into N clusters and only the centroid of every cluster is saved. Weight clustering is explained in [134]. PyTorch also supports pruning, but it does not currently have built-in support for weight clustering.

B. Edge AI software on mobile devices

Mobile devices that have specialised hardware for neural network acceleration expose the hardware to developers through application programming interfaces (API). On devices that use the Android operating system, the API is called the Android Neural Networks API (NNAPI) [135]. On Apple operating systems, the API is called Core ML [136]. The Android NNAPI is not designed to be used directly by developers. Instead, it is intended to be used through some higher-level API like TensorFlow Lite. TensorFlow Lite neural network models are executed on the specialised hardware with the TensorFlow Lite NNAPI delegate [137]. PyTorch also has NNAPI support, but it is currently still in beta and not very well documented [138]. NNAPI only supports inference on the device, so it cannot be used for on-device learning. The Apple Core ML framework also supports both TensorFlow and PyTorch through its Unified Conversion API [139], and in addition to that, Apple also has the Create ML framework [140]. It is an easy-to-use interface for developers to create machine learning models that work with Core ML. Another difference in comparison to NNAPI is that Core ML supports on-device training, which can be used to personalise a model to a user's needs on-device.

In addition to the Android NNAPI hardware acceleration API, many vendors have their own software development kits (SDK) for running hardware accelerated models on their systems. Qualcomm has the Qualcomm Neural Processing SDK for AI product [141], Huawei has the HUAWEI HiAI Foundation product [142], MediaTek has the MediaTek NeuroPilot product [143], and Samsung has the Samsung Neural SDK product [144], although Samsung no longer provides the SDK to third party developers. Ignatov et al. have a good section about the vendor specific SDKs in [145], [146]. The problem with vendor specific SDKs is that the model created

---------------------------------------------------------------------------- 324 ----------------------------------------------------------------------------

Authorized licensed use limited to: Northeastern University. Downloaded on February 06,2025 at 23:57:26 UTC from IEEE Xplore. Restrictions apply.
______________________________________________________PROCEEDING OF THE 31ST CONFERENCE OF FRUCT ASSOCIATION

TABLE I. HARDWARE DEVICE SPECIFICATIONS

Device C/GPU NPU memory type

Movidius Myriad X [39] 16-core CPU 700 MHz Neural Compute Engine 2.5 MB SoC
Neural Compute Stick 2 [57] 16-core CPU 700 MHz Neural Compute Engine 2.5 MB USB
Vision Accelerator [58] CPU Neural Compute Engine 4 GB M.2/PCIe
UP Squared 6000 [59] 4-core CPU 2.0 GHz, GPU Neural Compute Engine 64 GB SBC
Kirin 970 [62] 8-core CPU, 12-core GPU Dedicated NPU – SoC
Snapdragon 855+/860 [63] 8-core CPU 2.96 GHz, GPU DSP 16 GB SoC
RK1808 [73] 2-core CPU 1.6 GHz NPU (DDR) SoC
RK3399Pro [74] 6-core CPU NPU (DDR) SoC
Helio P90 [64] 8-core CPU 2.2 GHz, GPU APU 2.0 8 GB SoC
Helio P95 [69] 8-core CPU, GPU APU 2.0 8 GB SoC
i300a [65] 4-core CPU 1.5 GHz, GPU – (DDR) SoC
i300b [66] 4-core CPU 1.3 GHz – 3 GB SoC
i350 [67] 4-core CPU 2.0 GHz, GPU APU 1.0 (DDR) SoC
i500 [68] 8-core CPU 2.0 GHz, GPU APU 2-core 500 MHz (DDR) SoC
Dimensity 1000 [70] 8-core CPU, GPU APU 3.0 16 GB SoC
Dimensity 9000 [71] 8-core CPU, GPU APU 590 (DDR) SoC
Beagle Bone AI [114] 2-core CPU 1.5 GHz, GPU 2x DSP, 2x EVE 1 GB SBC
Coral Accelerator Module [42] – Edge TPU – SoC
Coral USB Accelerator [78] – Edge TPU – USB
Coral M.2 Accelerator [79] – Edge TPU – M.2
Coral M.2 Accelerator with Dual Edge TPU [80] – 2x Edge TPUs – M.2
Coral Mini PCIe Accelerator [81] – Edge TPU – PCIe
Coral Dev Board Mini [82] 4-core CPU 1.5 GHz, GPU Edge TPU 2 GB SBC
Coral System-on-Module [83] 4-core CPU 1.5 GHz, GPU Edge TPU 4 GB SOM
Coral Dev Board [84] 4-core CPU 1.5 GHz, GPU Edge TPU 4 GB SBC
JeVois-A33 [76] 4-core CPU 1.34 GHz, GPU – 256 MB SOM
JeVois Pro [77] 6-core CPU, GPU Neural Processing Unit 4 GB SOM
Lightspeeur 2801S Neural accelerator [44] 100 MHz MPE – SoC
Lightspeeur 5801S Neural accelerator [104] 200 MHz MPE – SoC
Lightspeeur 2803S Neural accelerator [105] 250 MHz MPE – SoC
Lacelli Edge Inferencing Server [106] 32-core CPU 4x MPE 32x 8 GB server
Gainboard 2801 [107] – MPE – PCIe
Gainboard 2803 [108] – MPE – PCIe
Janux G31 AI Server [109] 16-core CPU 32x MPE – server
M1076 [45] – ACE – SoC
MP10304 Quad-AMP PCIe Card [110] – 4x ACE – PCIe
MM1076 M.2 M [111] – ACE – M.2
ME1076 M.2 A+E [112] – ACE – M.2
MNS1076 AMP [113] – ACE – SBC
Kendryte K210 [75] 2-core CPU – – SoC
OpenMV Cam [115] CPU 480 MHz – 1 MB SOM
SparkFun Edge Dev Apollo3 Blue [116] CPU 48 MHz – 384 kB SBC
Syntiant Dev Board [117] CPU 48 MHz NDP101 32 kB SBC
STM32L4 [120] CPU 80 MHz – 128 kB SBC
Hailo-8 [51] – Hailo-8 – SoC
Hailo-8 M.2 [51] – Hailo-8 – SOM
Hailo-8 Mini PCIe [51] – Hailo-8 – SOM
Hailo-8 Century Evaluation Platform [51] – Hailo-8 – PCIe
Hailo-8 Evaluation Board [51] – Hailo-8 – SOM
Raspberry Pi 4 [125] 4-core CPU 1.5 GHz – 8 GB SBC
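Several boards in Table I, such as the SparkFun Edge (384 kB) and the STM32L4 (128 kB), only fit models that have been shrunk first. As a minimal sketch of the usual shrinking step — post-training quantization with the TensorFlow Lite converter, assuming TensorFlow is installed; the tiny Keras model and the random calibration data below are placeholders, not anything from the reviewed hardware or paper:

```python
import numpy as np
import tensorflow as tf

# Placeholder model standing in for a real trained network.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1),
])

def representative_data():
    # Calibration samples the converter uses to pick quantization ranges.
    for _ in range(100):
        yield [np.random.rand(1, 8).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
tflite_model = converter.convert()  # serialized flat buffer, ready for the device

with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```

The resulting flat buffer is the same artifact that the NNAPI delegate, the Edge TPU compiler and TensorFlow Lite for Microcontrollers all consume.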


TABLE II. JETSON HARDWARE DEVICE SPECIFICATIONS

Device C/GPU NPU memory type

Jetson Nano [87] 4-core CPU, GPU – 4 GB SOM


Jetson TX2 NX [88] 6-core CPU, GPU – 4 GB SOM
Jetson TX2 4GB [89] 6-core CPU, GPU – 4 GB SOM
Jetson TX2 [90] 6-core CPU, GPU – 8 GB SOM
Jetson TX2i [91] 6-core CPU, GPU – 8 GB SOM
Jetson Xavier NX 16GB [92] 6-core CPU, GPU 2x NVDLA v1, 2x PVA v1 16 GB PCIe
Jetson Xavier NX [93] 6-core CPU, GPU 2x NVDLA v1, 2x PVA v1 8 GB PCIe
Jetson AGX Xavier 64GB [94] 8-core CPU, GPU 2x NVDLA v1, 2x PVA v1 64 GB SOM
Jetson AGX Xavier [95] 8-core CPU, GPU 2x NVDLA v1, 2x PVA v1 32 GB SOM
Jetson AGX Xavier Industrial [96] 8-core CPU, GPU 2x NVDLA v1, 2x PVA v1 32 GB SOM
Jetson Orin NX [97] 8-core CPU 2.0 GHz, GPU 2x NVDLA v2, PVA v2 12 GB SOM
Jetson AGX Orin [98] 12-core CPU 2.0 GHz, GPU 2x NVDLA v2, PVA v2 32 GB SOM
Jetson Nano Developer Kit [99] 4-core CPU 1.42 GHz, GPU – 4 GB SBC
Jetson Nano 2GB Developer Kit [100] 4-core CPU 1.43 GHz, GPU – 2 GB SBC
Jetson Xavier NX Developer Kit [101] 6-core CPU, GPU 2x NVDLA, PVA 8 GB SBC
Jetson AGX Xavier Developer Kit [102] 8-core CPU, GPU 2x NVDLA, PVA 32 GB SBC
Jetson AGX Orin Developer Kit [103] 12-core CPU, GPU 2x NVDLA v2.0, PVA 2.0 32 GB SBC

with one SDK can run only on devices that the vendor-specific SDK supports. For that reason it is better to use the more generic NNAPI interface if possible.

C. Edge AI software for microcontrollers

Even when a device does not have specialized hardware for accelerating model execution, the model can always be executed on the CPU. This means that neural network inference can be performed on-device even on the tiniest microcontrollers, provided the model is small enough to fit into memory. The problem with microcontrollers is that they very often do not run the operating system that projects such as TensorFlow Lite depend on. This can be solved with the TensorFlow Lite for Microcontrollers library [147]. The project was created by merging the uTensor project into the main TensorFlow project [148]. The library is written in C++11 and works on any 32-bit platform. The model data is stored as a C array in read-only program memory on the device, where the library can read it. Thus, the library does not need an operating system or a file system for model creation and inference.

Similar to TensorFlow Lite for Microcontrollers is the deepC project [149]. The project has the same goal of getting neural network models to work on microcontrollers, but the approach is very different. The project includes a compiler that compiles neural network models directly to C++ code that can then be included in the actual project that uses the model. All neural network models that can be stored in the basic neural network variant of the Open Neural Network Exchange (ONNX) format [150] can be compiled with the deepC compiler. The ONNX format has good support in every major deep learning framework, so the project can be used with a variety of different models and is not restricted to models created by TensorFlow Lite, as TensorFlow Lite for Microcontrollers is. The project does not support the ONNX-ML extension of the format, which adds support for machine learning algorithms not based on neural networks.

Somewhat similar to the deepC project are the deep learning compiler projects Glow [151], ONNC [152], TVM [153] and OpenVINO [154]. None of these tools is made specifically for compiling code to microcontroller targets, but many of them support microcontroller chips. Sponner et al. have done a good review and benchmark of these tools targeting embedded platforms in [155]. Of these tools, the TVM project is probably the most interesting. It does not only contain a compiler for compiling the models to target platforms; it also contains an auto-tuning feature that tests different compilation optimizations on the target platform to find more optimal compilation results. The project also includes the microTVM subproject, made specially for compiling models to bare-metal microcontroller targets, although the documentation includes a disclaimer that the project is still under heavy development.

Probably not an exhaustive list, but other libraries and toolkits for converting neural network models to microcontrollers are Neural Network on Microcontroller (NNoM) [156], X-CUBE-AI [157], e-AI [158], eIQ [159], nncase [160], NNCG [161] and the Embedded Learning Library (ELL) [162]. Of these, the NNoM project is the most similar to the TensorFlow Lite for Microcontrollers library and the deepC project. It is vendor independent, but it supports only models created using Keras. The project includes a compiler that compiles the Keras model to pure C code. If the target platform is an ARM Cortex-M processor, the compiler can generate optimized code by utilizing the ARM CMSIS NN Software Library [163], [164]. The ARM CMSIS NN library includes optimized versions of the functions that are often used


in neural network models, but it does not include an automatic conversion tool from other deep learning frameworks, so the conversion step would be manual without a tool like NNoM.

The X-CUBE-AI project is an extension package from STMicroelectronics for their STM32CubeMX product. STM32CubeMX is a graphical user interface that allows users to create configuration and initialization code for STM32 microcontrollers [165]. The extension package supports pretrained machine learning models that are made with TensorFlow Lite or exported to the ONNX standard from some other framework. It outputs an optimized code library that works on STM32 microcontrollers. The e-AI project from Renesas is similar; instead of targeting STM32, the tool generates code for Renesas' own microcontroller families. It supports deep learning models made with TensorFlow, PyTorch or TensorFlow Lite. The third tool in the same class of vendor-specific tools is eIQ by NXP Semiconductors, which supports TensorFlow and ONNX input formats. The compilation target is more modular, as the tool supports several inference engines: it can use TensorFlow Lite, Glow, ARM CMSIS-NN or DeepViewRT [166] to run the model on the target platform. The last vendor-specific tool is nncase. Its generated code targets Kendryte K210 or K510 chips, and it supports TensorFlow Lite and ONNX formats.

Of the last two neural network conversion tools listed, NNCG is more of a research project, and the authors discourage using the tool in production. The project is very similar to NNoM: it converts Keras models to C code. The last listed tool, the Embedded Learning Library (ELL), is made by Microsoft. It is a work in progress, and the authors warn about unexpected API changes. The project documentation is also lacking, with only a few tutorials about deploying machine learning models to Raspberry Pi single-board computers. The project repository commit history shows that the project has received only a few updates in recent years, so the project might be obsolete.

Table III summarizes the vendor-neutral open-source deep learning model compilers and converters in a table format for easier comparison.

The previously listed tools are made for getting neural network inference to work on microcontrollers. In addition to them, more traditional machine learning models can also be used for inference on edge devices. Often the traditional models are not as computationally demanding as neural networks, but the problem is that very often the models are created using some Python-based framework like scikit-learn [167]. It is possible to run Python code on microcontrollers with a project like MicroPython [168], but this creates unnecessary overhead for code execution. It is better to convert the model to more efficient machine code to save as much as possible of the limited resources that microcontrollers have. This is probably not an exhaustive list, but some of the existing projects for doing this conversion are sklearn-porter [169], emlearn [170], m2cgen [171], EmbML [172], micromlgen [173], Micro-LM [174], micro-learn [175] and weka-porter [176]. Except for Micro-LM and weka-porter, these conversion projects convert scikit-learn models to C or C++ code. Support for different models varies. sklearn-porter and m2cgen can also convert a model to some other programming language such as JavaScript or Java. The weka-porter project supports only WEKA [177] decision tree conversions, and the Micro-LM project supports only models trained with the Desk-LM module [174], although the Desk-LM module in turn depends on and uses the scikit-learn library.

VI. CONCLUSION

The Edge AI ecosystem is still in its infancy. Various products and services are offered, but many of them are placed under the umbrella term for marketing purposes. However, the development of Edge AI as a discipline of its own is evident.

Hardware ranges from special-purpose processors and AI accelerators to full servers. For many users, the various system-on-chip solutions can be useful for the final product. On the other hand, the various development boards and single-board computers provide a good starting point and prototyping possibilities. In addition, the USB, M.2 and PCIe bus devices bring the power of AI acceleration to other devices.

Both of the most popular deep learning frameworks, TensorFlow and PyTorch, can be used to do Edge AI. Between them, TensorFlow is more suitable for Edge AI purposes. The framework includes better documentation and more out-of-the-box methods for model optimization than PyTorch. TensorFlow also supports microcontroller targets with the TensorFlow Lite for Microcontrollers subproject, while PyTorch only supports mobile device operating system targets.

Comparing Android and Apple mobile operating system support for hardware AI acceleration, Apple arguably has an edge by supporting on-device training and providing the Create ML framework for creating AI models, in addition to supporting all of the most popular AI frameworks.

Edge AI for microcontrollers comes with the most software offerings. The workflow of getting AI models running on microcontroller hardware has not yet settled on a best practice that everyone uses. There seem to be three competing approaches:
1) Using a runtime that loads the model data from read-only device memory at runtime
2) Using a transcompiler that compiles the model to C or C++ code that can then be used in the project
3) Using a compiler that compiles the model to a library that is statically or dynamically linked to the project
The good thing is that many of these projects support the ONNX model format, which could mean that benchmarking the different projects with the same model might be easier.

In the future, a standardized software API to access hardware acceleration could offer a more productive development experience. Standardized software workflows or, at least, commonly accepted reference specifications would be highly useful. Software terminology needs unification across research and vendors. Furthermore, security considerations should be studied further, as many of the Edge AI solutions could suffer from the same vulnerabilities as common IoT systems.
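Approach 1 in the list above typically ships the model as a C byte array placed in read-only program memory, which is exactly the layout TensorFlow Lite for Microcontrollers reads. A minimal sketch of that conversion step, equivalent to `xxd -i`; the four input bytes are placeholders standing in for a real serialized model:

```python
# Render serialized model bytes as a C array for read-only program memory.
def to_c_array(data: bytes, name: str = "g_model") -> str:
    body = ", ".join(f"0x{b:02x}" for b in data)
    return (f"const unsigned char {name}[] = {{ {body} }};\n"
            f"const unsigned int {name}_len = {len(data)};\n")

# Placeholder bytes; in practice this would be the contents of a .tflite file.
print(to_c_array(b"\x1c\x00\x00\x00"))
```

The generated source file is then compiled into the firmware, so no file system is needed on the device.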


TABLE III. OPEN-SOURCE DEEP LEARNING MODEL COMPILERS

Project | License | Supported models | Tool output | Platform support requirements
TensorFlow Lite for Microcontrollers [147] | Apache-2.0 | TensorFlow | TensorFlow Lite flat buffer | C++ compiler
deepC [149] | Apache-2.0 | ONNX | C++ code | C++ compiler
Glow [151]* | Apache-2.0 | ONNX, Caffe2, TensorFlow Lite | Compiled library bundle (object, header and weight files) | LLVM support
ONNC [152] | BSD-3-Clause | ONNX | C code and binary weight files | C compiler
TVM [153] (microTVM) | Apache-2.0 | TensorFlow, TensorFlow Lite, Keras, PyTorch, ONNX, Core ML, Caffe2, MXNet, PaddlePaddle | C code or compiled object file, graph JSON file and parameter file | C compiler and standard library
NNoM [156] | Apache-2.0 | Keras | C code | C compiler

* Ahead-of-time compilation mode

ACKNOWLEDGMENT

This research was funded by the Regional Council of Central Finland/Council of Tampere Region and European Regional Development Fund as part of the Data for Utilisation – Leveraging digitalisation through modern artificial intelligence solutions and cybersecurity and coADDVA - ADDing VAlue by Computing in Manufacturing projects of JAMK University of Applied Sciences.

The authors would like to thank Ms. Tuula Kotikoski for proofreading the manuscript.

REFERENCES

[1] Y.-L. Lee, P.-K. Tsung, and M. Wu, "Technology trend of edge AI," in 2018 International Symposium on VLSI Design, Automation and Test (VLSI-DAT), 2018, pp. 1–2.
[2] S. Yi, Z. Hao, Z. Qin, and Q. Li, "Fog computing: Platform and applications," in 2015 Third IEEE Workshop on Hot Topics in Web Systems and Technologies (HotWeb), 2015, pp. 73–78.
[3] Y. Shi, K. Yang, T. Jiang, J. Zhang, and K. B. Letaief, "Communication-efficient edge AI: Algorithms and systems," IEEE Communications Surveys & Tutorials, vol. 22, no. 4, pp. 2167–2191, 2020.
[4] K. Bhardwaj, N. Suda, and R. Marculescu, "EdgeAI: A vision for deep learning in the IoT era," IEEE Design & Test, vol. 38, no. 4, pp. 37–43, 2021.
[5] X. Lin, J. Li, J. Wu, H. Liang, and W. Yang, "Making knowledge tradable in edge-AI enabled IoT: A consortium blockchain-based efficient and incentive approach," IEEE Transactions on Industrial Informatics, vol. 15, no. 12, pp. 6367–6378, 2019.
[6] R. Sachdev, "Towards security and privacy for edge AI in IoT/IoE based digital marketing environments," in 2020 Fifth International Conference on Fog and Mobile Edge Computing (FMEC), 2020, pp. 341–346.
[7] H. H. Kumar, K. V R, and M. K. Nair, "Federated k-means clustering: A novel edge AI based approach for privacy preservation," in 2020 IEEE International Conference on Cloud Computing in Emerging Markets (CCEM), 2020, pp. 52–56.
[8] X. Wang, Y. Han, V. C. M. Leung, D. Niyato, X. Yan, and X. Chen, "Convergence of edge computing and deep learning: A comprehensive survey," IEEE Communications Surveys & Tutorials, vol. 22, no. 2, pp. 869–904, 2020.
[9] S. Deng, H. Zhao, W. Fang, J. Yin, S. Dustdar, and A. Y. Zomaya, "Edge intelligence: The confluence of edge computing and artificial intelligence," IEEE Internet of Things Journal, vol. 7, no. 8, pp. 7457–7469, 2020.
[10] A. Reuther, P. Michaleas, M. Jones, V. Gadepally, S. Samsi, and J. Kepner, "Survey of machine learning accelerators," in 2020 IEEE High Performance Extreme Computing Conference (HPEC), 2020, pp. 1–12.
[11] W. Li and M. Liewig, "A survey of AI accelerators for edge environment," in Trends and Innovations in Information Systems and Technologies. WorldCIST 2020, ser. Advances in Intelligent Systems and Computing, Á. Rocha, H. Adeli, L. Reis, S. Costanzo, I. Orovic, and F. Moreira, Eds. Cham: Springer, 2020, vol. 1160, pp. 35–44.
[12] X. Crespo. (2022) AI at the edge. [Online]. Available: https://github.com/crespum/edge-ai
[13] M. Merenda, C. Porcaro, and D. Iero, "Edge machine learning for AI-enabled IoT devices: A review," Sensors, vol. 20, no. 9, 2020. [Online]. Available: https://www.mdpi.com/1424-8220/20/9/2533
[14] P. P. Ray, "A review on TinyML: State-of-the-art and prospects," Journal of King Saud University - Computer and Information Sciences, 2021. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1319157821003335
[15] D. Levac, H. Colquhoun, and K. K. O'Brien, "Scoping studies: advancing the methodology," Implementation Science, vol. 5, 2010.
[16] H. Arksey and L. O'Malley, "Scoping studies: towards a methodological framework," International Journal of Social Research Methodology, vol. 8, pp. 19–32, 2005.
[17] Hattori, J. Doucette, R. Puntaier, and cppRohit. (2021) How to embed/deploy an arbitrary machine learning model on microcontrollers? [Online]. Available: https://ai.stackexchange.com/questions/25775/how-to-embed-deploy-an-arbitrary-machine-learning-model-on-microcontrollers
[18] J. Zhang, M. Z. A. Bhuiyan, X. Yang, T. Wang, X. Xu, T. Hayajneh, and F. Khan, "AntiConcealer: Reliable detection of adversary concealed behaviors in edge AI assisted IoT," IEEE Internet of Things Journal, pp. 1–1, 2021.
[19] R. E. Ogu, C. I. Ikerionwu, and I. I. Ayogu, "Leveraging artificial intelligence of things for anomaly detection in advanced metering infrastructures," in 2020 IEEE 2nd International Conference on Cyberspace (CYBER NIGERIA), 2021, pp. 16–20.
[20] A. Nawaz, T. N. Gia, J. P. Queralta, and T. Westerlund, "Edge AI and blockchain for privacy-critical and data-sensitive applications," in 2019 Twelfth International Conference on Mobile Computing and Ubiquitous Network (ICMU), 2019, pp. 1–2.
[21] J. Zhang, M. Z. A. Bhuiyan, X. Yang, A. K. Singh, D. F. Hsu, and E. Luo, "Trustworthy target tracking with collaborative deep reinforcement learning in edge-AI-aided IoT," IEEE Transactions on Industrial Informatics, vol. 18, no. 2, pp. 1301–1309, 2022.
[22] J. Sanghvi, P. Bhattacharya, S. Tanwar, R. Gupta, N. Kumar, and M. Guizani, "Res6Edge: An edge-AI enabled resource sharing scheme


for C-V2X communications towards 6G," in 2021 International Wireless Communications and Mobile Computing (IWCMC), 2021, pp. 149–154.
[23] G. Mathew, S. Sindhu Ramachandran, and S. V.S., "EdgeAI: Diabetic retinopathy detection in Intel architecture," in 2020 IEEE/ITU International Conference on Artificial Intelligence for Good (AI4G), 2020, pp. 75–80.
[24] J. P. Queralta, T. N. Gia, H. Tenhunen, and T. Westerlund, "Edge-AI in LoRa-based health monitoring: Fall detection system with fog computing and LSTM recurrent neural networks," in 2019 42nd International Conference on Telecommunications and Signal Processing (TSP), 2019, pp. 601–604.
[25] S. O. Ooko, D. Mukanyiligira, J. P. Munyampundu, and J. Nsenga, "Edge AI-based respiratory disease recognition from exhaled breath signatures," in 2021 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology (JEEIT), 2021, pp. 89–94.
[26] ——, "Synthetic exhaled breath data-based edge AI model for the prediction of chronic obstructive pulmonary disease," in 2021 International Conference on Computing and Communications Applications and Technologies (I3CAT), 2021, pp. 1–6.
[27] F.-J. Shen, J.-H. Chen, W.-Y. Wang, D.-L. Tsai, L.-C. Shen, and C.-T. Tseng, "A CNN-based human head detection algorithm implemented on EdgeAI chip," in 2020 International Conference on System Science and Engineering (ICSSE), 2020, pp. 1–5.
[28] C. Gamanayake, L. Jayasinghe, B. K. K. Ng, and C. Yuen, "Cluster pruning: An efficient filter pruning method for edge AI vision applications," IEEE Journal of Selected Topics in Signal Processing, vol. 14, no. 4, pp. 802–816, 2020.
[29] R. Rawat, S. Gupta, S. Mohapatra, S. P. Mishra, and S. Rajagopal, "Intelligent acoustic module for autonomous vehicles using fast gated recurrent approach," in 2021 4th International Conference on Recent Developments in Control, Automation & Power Engineering (RDCAPE), 2021, pp. 345–350.
[30] R. Miyata, O. Fukuda, N. Yamaguchi, and H. Okumura, "Object search using Edge-AI based mobile robot," in 2021 6th International Conference on Intelligent Informatics and Biomedical Sciences (ICIIBMS), vol. 6, 2021, pp. 198–203.
[31] V. Mazzia, A. Khaliq, F. Salvetti, and M. Chiaberge, "Real-time apple detection system using embedded systems with hardware accelerators: An edge AI application," IEEE Access, vol. 8, pp. 9102–9114, 2020.
[32] T.-C. Chen, W.-T. Wang, K. Kao, C.-L. Yu, C. Lin, S.-H. Chang, and P.-K. Tsung, "NeuroPilot: A cross-platform framework for edge-AI," in 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS), 2019, pp. 167–170.
[33] H. Rexha and S. Lafond, "Data collection and utilization framework for edge AI applications," in 1st IEEE/ACM Workshop on AI Engineering – Software Engineering for AI (WAIN@ICSE 2021), Madrid, Spain, May 30–31, 2021. IEEE, 2021, pp. 105–108.
[34] S. P. Baller, A. Jindal, M. Chadha, and M. Gerndt, "DeepEdgeBench: Benchmarking deep neural networks on edge devices," in 2021 IEEE International Conference on Cloud Engineering (IC2E), 2021, pp. 20–30.
[35] M. Suri, A. Gupta, V. Parmar, and K. H. Lee, "Performance enhancement of edge-AI-inference using commodity MRAM: IoT case study," in 2019 IEEE 11th International Memory Workshop (IMW), 2019, pp. 1–4.
[36] V. Parmar, M. Suri, K. Yamane, T. Lee, N. L. Chung, and V. B. Naik, "MRAM-based BER resilient quantized edge-AI networks for harsh industrial conditions," in 2021 IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems (AICAS), 2021, pp. 1–4.
[37] A. Levisse, M. Rios, W.-A. Simon, P.-E. Gaillardon, and D. Atienza, "Functionality enhanced memories for edge-AI embedded systems," in 2019 19th Non-Volatile Memory Technology Symposium (NVMTS), 2019, pp. 1–4.
[38] D. Liu, X. Chen, Z. Zhou, and Q. Ling, "HierTrain: Fast hierarchical edge AI learning with hybrid parallelism in mobile-edge-cloud computing," IEEE Open Journal of the Communications Society, vol. 1, pp. 634–645, 2020.
[39] Intel Corporation. (2022) Intel® Movidius™ Myriad™ X Vision Processing Unit. [Online]. Available: https://www.intel.com/content/www/us/en/products/details/processors/movidius-vpu/movidius-myriad-x.html
[40] MediaTek Inc. (2022) Artificial intelligence. [Online]. Available: https://www.mediatek.com/innovations/artificial-intelligence
[41] ——. (2022) AI for smartphones. [Online]. Available: https://www.mediatek.com/innovations/ai-for-smartphones
[42] Google LLC. (2020) Accelerator Module. [Online]. Available: https://coral.ai/docs/module/datasheet/
[43] NVIDIA Corporation. (2022) NVDLA Primer. [Online]. Available: http://nvdla.org/primer.html
[44] Gyrfalcon Technology Inc. (2022) Lightspeeur 2801S neural accelerator. [Online]. Available: https://www.gyrfalcontech.ai/solutions/2801s
[45] Mythic. (2022) M1076 Analog Matrix Processor. [Online]. Available: https://www.mythic-ai.com/product/m1076-analog-matrix-processor/
[46] Syntiant Corp. (2022) NDP100 neural decision processor. [Online]. Available: https://www.syntiant.com/ndp100
[47] ——. (2022) NDP101 neural decision processor. [Online]. Available: https://www.syntiant.com/ndp101
[48] ——. (2022) NDP102 neural decision processor. [Online]. Available: https://www.syntiant.com/ndp102
[49] ——. (2022) NDP120 neural decision processor. [Online]. Available: https://www.syntiant.com/ndp120
[50] ——. (2022) NDP200 neural decision processor. [Online]. Available: https://www.syntiant.com/ndp200
[51] Hailo. (2022) The world's top performing AI processor for edge devices. [Online]. Available: https://hailo.ai/
[52] Intel Corporation. (2022) MAX® V CPLDs. [Online]. Available: https://www.intel.com/content/www/us/en/products/details/fpga/max/v.html
[53] ——. (2022) Intel® Cyclone® 10 LP FPGA. [Online]. Available: https://www.intel.com/content/www/us/en/products/details/fpga/cyclone/10/lp.html
[54] ——. (2022) Intel® Cyclone® 10 GX FPGA. [Online]. Available: https://www.intel.com/content/www/us/en/products/details/fpga/cyclone/10/gx.html
[55] K. Hagiwara, T. Hayashi, S. Kawasaki, F. Arakawa, O. Endo, H. Nomura, A. Tsukamoto, D. Nguyen, B. Nguyen, A. Tran, H. Hyunh, I. Kudoh, and C.-K. Pham, "A two-stage-pipeline CPU of SH-2 architecture implemented on FPGA and SoC for IoT, edge AI and robotic applications," in 2018 IEEE Symposium in Low-Power and High-Speed Chips (COOL CHIPS), 2018, pp. 1–3.
[56] K. Karras, E. Pallis, G. Mastorakis, Y. Nikoloudakis, J. M. Batalla, C. X. Mavromoustakis, and E. K. Markakis, "A hardware acceleration platform for AI-based inference at the edge," Circuits Syst. Signal Process., vol. 39, no. 2, pp. 1059–1070, 2020.
[57] Intel Corporation. (2022) Intel Neural Compute Stick 2 (Intel NCS2). [Online]. Available: https://www.intel.com/content/www/us/en/developer/tools/neural-compute-stick/overview.html
[58] ——. (2022) Intel vision accelerator. [Online]. Available: https://www.intel.com/content/www/us/en/developer/topic-technology/edge-5g/hardware/vision-accelerator-movidius-vpu.html
[59] Aaeon Europe BV. (2022) UP Squared 6000 Edge Computing Kit. [Online]. Available: https://up-board.org/up-squared-6000/
[60] Luxonis. (2022) DepthAI. [Online]. Available: https://www.crowdsupply.com/luxonis/depthai
[61] ——. (2022) megaAI. [Online]. Available: https://www.crowdsupply.com/luxonis/megaai
[62] HiSilicon (Shanghai) Technologies Co., Ltd. (2022) Kirin 970. [Online]. Available: https://www.hisilicon.com/en/products/Kirin/Kirin-flagship-chips/Kirin-970
[63] Qualcomm Technologies, Inc. (2022) Snapdragon 855+/860 mobile platform. [Online]. Available: https://www.qualcomm.com/products/snapdragon-855-plus-and-860-mobile-platform
[64] MediaTek Inc. (2022) MediaTek Helio P90. [Online]. Available: https://www.mediatek.com/products/mediatek-helio-p90
[65] ——. (2022) i300a (mt8362a). [Online]. Available: https://www.mediatek.com/products/products/aiot/mt8362a
[66] ——. (2022) i300b (mt8362b). [Online]. Available: https://www.mediatek.com/products/products/aiot/mt8362b
[67] ——. (2022) i350. [Online]. Available: https://www.mediatek.com/products/products/aiot/i350-mt8365
[68] ——. (2022) i500 (mt8385). [Online]. Available: https://www.mediatek.com/products/products/aiot/i500
[69] ——. (2022) MediaTek Helio P95. [Online]. Available: https://www.mediatek.com/products/products/smartphones-2/mediatek-helio-p95
[70] ——. (2022) MediaTek Dimensity 1000 Series. [Online]. Available: https://www.mediatek.com/products/products/smartphones-2/dimensity-1000-series

______________________________________________________PROCEEDING OF THE 31ST CONFERENCE OF FRUCT ASSOCIATION

[71] ——. (2022) MediaTek Dimensity 9000. [Online]. Available: https://www.mediatek.com/products/products/smartphones-2/mediatek-dimensity-9000
[72] P.-K. Tsung, T.-C. Chen, C.-H. Lin, C.-Y. Chang, and J.-M. Hsu, “Heterogeneous computing for edge AI,” in 2019 International Symposium on VLSI Design, Automation and Test (VLSI-DAT), 2019, pp. 1–2.
[73] Rockchip Electronics Co. (2022) RK1808. [Online]. Available: https://www.rock-chips.com/a/en/products/RK18_Series/2019/0529/989.html
[74] ——. (2022) RK3399Pro. [Online]. Available: https://www.rock-chips.com/a/en/products/RK33_Series/2018/0130/874.html
[75] Canaan Inc. (2022) Kendryte K210. [Online]. Available: https://canaan.io/product/kendryteai
[76] JeVois Smart Machine Vision. (2022) JeVois-A33. [Online]. Available: https://www.jevoisinc.com/pages/hardware
[77] ——. (2022) JeVois-Pro. [Online]. Available: https://www.jevoisinc.com/products/jevois-pro-deep-learning-smart-camera
[78] Google LLC. (2019) USB Accelerator datasheet. [Online]. Available: https://coral.ai/docs/accelerator/datasheet/
[79] ——. (2019) M.2 Accelerator. [Online]. Available: https://coral.ai/docs/m2/datasheet/
[80] ——. (2020) M.2 Accelerator with Dual Edge TPU. [Online]. Available: https://coral.ai/docs/m2-dual-edgetpu/datasheet/
[81] ——. (2019) Mini PCIe Accelerator. [Online]. Available: https://coral.ai/docs/mini-pcie/datasheet/
[82] ——. (2020) Dev Board Mini datasheet. [Online]. Available: https://coral.ai/docs/dev-board-mini/datasheet/
[83] ——. (2019) System-on-Module. [Online]. Available: https://coral.ai/docs/som/datasheet/
[84] ——. (2020) Dev Board datasheet. [Online]. Available: https://coral.ai/docs/dev-board/datasheet/
[85] ——. (2020) Camera. [Online]. Available: https://coral.ai/docs/camera/datasheet/
[86] ——. (2019) Environmental sensor board. [Online]. Available: https://coral.ai/docs/enviro-board/datasheet/
[87] NVIDIA Corporation. (2022) Jetson Nano. [Online]. Available: https://developer.nvidia.com/embedded/jetson-nano
[88] ——. (2022) Jetson TX2 NX Module. [Online]. Available: https://developer.nvidia.com/embedded/jetson-tx2-nx
[89] ——. (2022) Jetson TX2 4GB Module. [Online]. Available: https://developer.nvidia.com/embedded/jetson-tx2-4gb
[90] ——. (2022) Jetson TX2 Module. [Online]. Available: https://developer.nvidia.com/embedded/jetson-tx2
[91] ——. (2022) Jetson TX2i Module. [Online]. Available: https://developer.nvidia.com/embedded/jetson-tx2i
[92] ——. (2022) Jetson Xavier NX 16GB. [Online]. Available: https://developer.nvidia.com/embedded/jetson-xavier-nx-16gb
[93] ——. (2022) Jetson Xavier NX. [Online]. Available: https://developer.nvidia.com/embedded/jetson-xavier-nx
[94] ——. (2022) Jetson AGX Xavier 64GB. [Online]. Available: https://developer.nvidia.com/embedded/jetson-agx-xavier-64gb
[95] ——. (2022) Jetson AGX Xavier. [Online]. Available: https://developer.nvidia.com/embedded/jetson-agx-xavier
[96] ——. (2022) Jetson AGX Xavier Industrial. [Online]. Available: https://developer.nvidia.com/embedded/jetson-agx-xavier-i
[97] ——. (2022) Jetson Orin NX. [Online]. Available: https://developer.nvidia.com/embedded/jetson-orin-nx
[98] ——. (2022) Jetson AGX Orin. [Online]. Available: https://developer.nvidia.com/embedded/jetson-agx-orin
[99] ——. (2022) Jetson Nano Developer Kit. [Online]. Available: https://developer.nvidia.com/embedded/jetson-nano-developer-kit
[100] ——. (2022) Jetson Nano 2GB Developer Kit. [Online]. Available: https://developer.nvidia.com/embedded/jetson-nano-2gb-developer-kit
[101] ——. (2022) Jetson Xavier NX Developer Kit. [Online]. Available: https://developer.nvidia.com/embedded/jetson-xavier-nx-devkit
[102] ——. (2022) Jetson AGX Xavier Developer Kit. [Online]. Available: https://developer.nvidia.com/embedded/jetson-agx-xavier-developer-kit
[103] ——. (2022) Jetson AGX Orin Developer Kit. [Online]. Available: https://developer.nvidia.com/embedded/jetson-agx-orin-developer-kit
[104] Gyrfalcon Technology Inc. (2022) Lightspeeur 5801S neural accelerator. [Online]. Available: https://www.gyrfalcontech.ai/solutions/lightspeeur-5801
[105] ——. (2022) Lightspeeur 2803S neural accelerator. [Online]. Available: https://www.gyrfalcontech.ai/solutions/2803s
[106] ——. (2022) Lacelli edge inferencing server AI acceleration subsystem. [Online]. Available: https://www.gyrfalcontech.ai/solutions/lacelli-edge-inferencing-server/
[107] ——. (2022) Gainboard 2801 AI for the data center, private & public cloud. [Online]. Available: https://www.gyrfalcontech.ai/solutions/gainboard-2801s/
[108] ——. (2022) Gainboard 2803. [Online]. Available: https://www.gyrfalcontech.ai/solutions/gainboard-2803s/
[109] ——. (2022) Janux G31 AI server. [Online]. Available: https://www.gyrfalcontech.ai/solutions/janux-inference-server/
[110] Mythic. (2022) MP10304 Quad-AMP PCIe Card. [Online]. Available: https://www.mythic-ai.com/product/mp10304-quad-amp-pcie-card/
[111] ——. (2022) MM1076 M.2 key card. [Online]. Available: https://www.mythic-ai.com/product/mm1076/
[112] ——. (2022) ME1076 M.2 A+E key card. [Online]. Available: https://www.mythic-ai.com/product/me1076/
[113] ——. (2022) MNS1076 AMP evaluation system. [Online]. Available: https://www.mythic-ai.com/product/evaluation-system/
[114] BeagleBoard.org Foundation. (2022) BeagleBone® AI. [Online]. Available: https://beagleboard.org/ai
[115] OpenMV, LLC. (2022) OpenMV Cam H7 R2. [Online]. Available: https://openmv.io/collections/cams/products/openmv-cam-h7-r2
[116] SparkFun Electronics. (2022) SparkFun Edge Development Board - Apollo3 Blue. [Online]. Available: https://www.sparkfun.com/products/15170
[117] Syntiant Corp. (2022) Syntiant tiny machine learning development board. [Online]. Available: https://www.syntiant.com/tinyml
[118] STMicroelectronics. (2021) Cartesiam. [Online]. Available: https://cartesiam.ai/
[119] ——. (2022) Artificial intelligence ecosystem for STM32. [Online]. Available: https://www.st.com/content/st_com/en/ecosystems/artificial-intelligence-ecosystem-stm32.html
[120] ——. (2022) B-L475E-IOT01A – STM32L4 Discovery kit IoT node, low-power wireless, BLE, NFC, SubGHz, Wi-Fi. [Online]. Available: https://www.st.com/en/evaluation-tools/b-l475e-iot01a.html
[121] ADLINK Technology Inc. (2021) Edge AI platforms. [Online]. Available: https://www.adlinktech.com/en/Inference_Platform
[122] Blaize. (2022) New AI edge computing – edge AI hardware & software. [Online]. Available: https://www.blaize.com/
[123] Aetina Corporation. (2022) Industrial GPGPU & embedded edge AI computing solutions for critical imaging applications. [Online]. Available: https://www.aetina.com/
[124] Arm Limited. (2022) Ethos-U65 machine learning processor (NPU). [Online]. Available: https://www.arm.com/products/silicon-ip-cpu/ethos/ethos-u65
[125] Raspberry Pi Foundation. (2022) Raspberry Pi 4. [Online]. Available: https://www.raspberrypi.com/products/raspberry-pi-4-model-b/
[126] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng, “TensorFlow: Large-scale machine learning on heterogeneous systems,” 2015, software available from tensorflow.org. [Online]. Available: https://www.tensorflow.org/
[127] F. Chollet et al., “Keras,” https://keras.io, 2015.
[128] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Kopf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, and S. Chintala, “PyTorch: An imperative style, high-performance deep learning library,” in Advances in Neural Information Processing Systems 32, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, Eds. Curran Associates, Inc., 2019, pp. 8024–8035. [Online]. Available: http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf
[129] Kaggle. (2021) State of data science and machine learning 2021. [Online]. Available: https://www.kaggle.com/kaggle-survey-2021
[130] A. Alqahtani, X. Xie, and M. W. Jones, “Literature review of deep network compression,” Informatics, vol. 8, no. 4, 2021. [Online]. Available: https://www.mdpi.com/2227-9709/8/4/77


[131] TensorFlow. (2022) TensorFlow Lite. [Online]. Available: https://www.tensorflow.org/lite/
[132] PyTorch. (2022) PyTorch Mobile. [Online]. Available: https://pytorch.org/mobile/home/
[133] M. H. Zhu and S. Gupta, “To prune, or not to prune: Exploring the efficacy of pruning for model compression,” 2018. [Online]. Available: https://openreview.net/forum?id=S1lN69AT-
[134] S. Han, H. Mao, and W. J. Dally, “Deep compression: Compressing deep neural network with pruning, trained quantization and Huffman coding,” in 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings, Y. Bengio and Y. LeCun, Eds., 2016. [Online]. Available: http://arxiv.org/abs/1510.00149
[135] Android. (2022) Android Neural Networks API. [Online]. Available: https://developer.android.com/ndk/guides/neuralnetworks/
[136] Apple. (2022) Core ML Framework. [Online]. Available: https://developer.apple.com/documentation/coreml
[137] Android. (2022) TensorFlow Lite NNAPI delegate. [Online]. Available: https://www.tensorflow.org/lite/performance/nnapi
[138] PyTorch. (2022) (Beta) Convert MobileNetV2 to NNAPI. [Online]. Available: https://pytorch.org/tutorials/prototype/nnapi_mobilenetv2.html
[139] Apple. (2022) Unified Conversion API. [Online]. Available: https://coremltools.readme.io/docs/unified-conversion-api
[140] ——. (2022) Create ML Framework. [Online]. Available: https://developer.apple.com/machine-learning/create-ml/
[141] Qualcomm Technologies, Inc. (2022) Qualcomm Neural Processing SDK for AI. [Online]. Available: https://developer.qualcomm.com/software/qualcomm-neural-processing-sdk
[142] HUAWEI. (2022) HUAWEI HiAI Foundation. [Online]. Available: https://developer.huawei.com/consumer/en/hiai#Foundation
[143] MediaTek Inc. (2022) MediaTek NeuroPilot. [Online]. Available: https://neuropilot.mediatek.com/
[144] Samsung. (2022) Samsung Neural SDK. [Online]. Available: https://developer.samsung.com/neural/overview.html
[145] A. Ignatov, R. Timofte, W. Chou, K. Wang, M. Wu, T. Hartley, and L. Van Gool, “AI benchmark: Running deep neural networks on Android smartphones,” in Proceedings of the European Conference on Computer Vision (ECCV) Workshops, September 2018.
[146] A. Ignatov, R. Timofte, A. Kulik, S. Yang, K. Wang, F. Baum, M. Wu, L. Xu, and L. Van Gool, “AI benchmark: All about deep learning on smartphones in 2019,” in 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), 2019, pp. 3617–3635.
[147] TensorFlow. (2021) TensorFlow Lite for Microcontrollers. [Online]. Available: https://www.tensorflow.org/lite/microcontrollers
[148] Z. Shelby. (2019) uTensor and Tensor Flow Announcement. [Online]. Available: https://os.mbed.com/blog/entry/utensor-and-tensor-flow-announcement/
[149] AI Technology & Systems. (2021) deepC. [Online]. Available: https://github.com/ai-techsystems/deepC
[150] ONNX. (2019) Open Neural Network Exchange: ONNX. [Online]. Available: https://onnx.ai/
[151] N. Rotem, J. Fix, S. Abdulrasool, S. Deng, R. Dzhabarov, J. Hegeman, R. Levenstein, B. Maher, N. Satish, J. Olesen, J. Park, A. Rakhov, and M. Smelyanskiy, “Glow: Graph lowering compiler techniques for neural networks,” CoRR, vol. abs/1805.00907, 2018. [Online]. Available: http://arxiv.org/abs/1805.00907
[152] W.-F. Lin, D.-Y. Tsai, L. Tang, C.-T. Hsieh, C.-Y. Chou, P.-H. Chang, and L. Hsu, “ONNC: A compilation framework connecting ONNX to proprietary deep learning accelerators,” in 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS), 2019, pp. 214–218.
[153] T. Chen, T. Moreau, Z. Jiang, H. Shen, E. Q. Yan, L. Wang, Y. Hu, L. Ceze, C. Guestrin, and A. Krishnamurthy, “TVM: End-to-end optimization stack for deep learning,” CoRR, vol. abs/1802.04799, 2018. [Online]. Available: http://arxiv.org/abs/1802.04799
[154] A. Demidovskij, Y. Gorbachev, M. Fedorov, I. Slavutin, A. Tugarev, M. Fatekhov, and Y. Tarkan, “OpenVINO deep learning workbench: Comprehensive analysis and tuning of neural networks inference,” in 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), 2019, pp. 783–787.
[155] M. Sponner, B. Waschneck, and A. Kumar, “Compiler toolchains for deep learning workloads on embedded platforms,” in Research Symposium on Tiny Machine Learning, 2021. [Online]. Available: https://openreview.net/forum?id=O0kjwqJyhNd
[156] J. Ma, “A higher-level Neural Network library on Microcontrollers (NNoM),” Oct. 2020. [Online]. Available: https://doi.org/10.5281/zenodo.4158710
[157] STMicroelectronics. (2022) X-CUBE-AI - AI expansion pack for STM32CubeMX. [Online]. Available: https://www.st.com/en/embedded-software/x-cube-ai.html
[158] Renesas Electronics Corporation. (2022) e-AI Solution. [Online]. Available: https://www.renesas.com/us/en/application/key-technology/artificial-intelligence/e-ai
[159] NXP Semiconductors. (2022) eIQ® ML Software Development Environment. [Online]. Available: https://www.nxp.com/design/software/development-software/eiq-ml-development-environment:EIQ
[160] Canaan Inc. (2022) nncase. [Online]. Available: https://github.com/kendryte/nncase
[161] O. Urbann, S. Camphausen, A. Moos, I. Schwarz, S. Kerner, and M. Otten, “A C code generator for fast inference and simple deployment of convolutional neural networks on resource constrained systems,” in 2020 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS), 2020, pp. 1–7.
[162] Microsoft Corporation. (2018) Embedded Learning Library (ELL). [Online]. Available: https://microsoft.github.io/ELL/
[163] Arm Ltd. (2021) CMSIS NN Software Library. [Online]. Available: https://www.keil.com/pack/doc/CMSIS/NN/html/index.html
[164] L. Lai, N. Suda, and V. Chandra, “CMSIS-NN: Efficient neural network kernels for Arm Cortex-M CPUs,” CoRR, vol. abs/1801.06601, 2018. [Online]. Available: http://arxiv.org/abs/1801.06601
[165] STMicroelectronics. (2022) STM32CubeMX - STM32Cube initialization code generator. [Online]. Available: https://www.st.com/en/development-tools/stm32cubemx.html
[166] Au-Zone Technologies. (2022) DeepViewRT™ Inference Engine. [Online]. Available: https://www.embeddedml.com/deepviewrt
[167] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, “Scikit-learn: Machine learning in Python,” Journal of Machine Learning Research, vol. 12, pp. 2825–2830, 2011.
[168] George Robotics Limited. (2022) MicroPython. [Online]. Available: https://micropython.org/
[169] D. Morawiec, “sklearn-porter,” transpile trained scikit-learn estimators to C, Java, JavaScript and others. [Online]. Available: https://github.com/nok/sklearn-porter
[170] J. Nordby, “emlearn: Machine Learning inference engine for Microcontrollers and Embedded Devices,” Mar. 2019. [Online]. Available: https://doi.org/10.5281/zenodo.2589394
[171] N. Titov, I. Zeigerman, V. Yershov, and others. (2022) m2cgen. [Online]. Available: https://github.com/BayesWitnesses/m2cgen
[172] L. Tsutsui da Silva, V. M. A. Souza, and G. E. A. P. A. Batista, “EmbML Tool: Supporting the use of supervised learning algorithms in low-cost embedded systems,” in 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI), 2019, pp. 1633–1637.
[173] S. Salerno. (2022) MicroML. [Online]. Available: https://github.com/eloquentarduino/micromlgen
[174] F. Sakr, F. Bellotti, R. Berta, and A. De Gloria, “Machine learning on mainstream microcontrollers,” Sensors, vol. 20, no. 9, 2020. [Online]. Available: https://www.mdpi.com/1424-8220/20/9/2638
[175] A. P. Singh and S. Chaudhari, Embedded Machine Learning-Based Data Reduction in Application-Specific Constrained IoT Networks. New York, NY, USA: Association for Computing Machinery, 2020, pp. 747–753. [Online]. Available: https://doi.org/10.1145/3341105.3373967
[176] D. Morawiec. (2022) weka-porter. [Online]. Available: https://github.com/nok/weka-porter
[177] I. H. Witten, E. Frank, M. A. Hall, and C. J. Pal, The WEKA Workbench. Online Appendix for “Data Mining: Practical Machine Learning Tools and Techniques”, 4th ed. Morgan Kaufmann, 2016. [Online]. Available: https://www.cs.waikato.ac.nz/ml/weka/Witten_et_al_2016_appendix.pdf
