Practical Attestation For Edge Devices Running Compute Heavy Machine Learning Applications-3485832.3485909
ACSAC ’21, Dec 6–10, 2021, Virtual Event, USA Ismi Abidi, Vireshwar Kumar, and Rijurekha Sen
environments within the same device: (1) the untrusted environment (or normal world) running the operating system (OS) with the application software, and (2) the trusted environment (or secure world) executing functions which cannot be altered by any software running in the untrusted environment. In this architecture, the attestation is carried out in the secure world to validate that the kernel running in the normal world has not been tampered with [1, 2, 19, 20, 35]. In the conventional software attestation (CSA) schemes [49], the secure world performs the integrity assessment by computing a cryptographic hash of the current kernel and comparing it with the pre-stored gold hash of the deployed kernel. We highlight that in CSA, the integrity check of the entire kernel has to be conducted in each attestation event. This is because attesting parts of the software sequentially in different attestation events can leave the system vulnerable to a roving malware which can switch its location among the unattested parts of the kernel to avoid detection [4, 11].

Unfortunately, computing the cryptographic hash of the entire kernel takes millions of clock cycles, which can translate into a significant execution time in a typical IoT device. For instance, on a Raspberry Pi 3 Model B board running at 1.2 GHz, utilizing Raspbian OS in the normal world and OPTEE kernel in the secure world, CSA takes around 2 seconds.

In CSA, if the inter-attestation time (i.e., the time between two attestation events) is long, the attacker cannot be detected if it can compromise the kernel, carry out the malicious activities, and remove its trace in between the attestation events [9, 53]. However, if the inter-attestation time is short, the device cannot find enough time to run its applications. Specifically, it is impractical to utilize CSA in the IoT devices running a compute-heavy application, e.g., traffic intersection control using continuous real-time image processing with a deep neural network algorithm. Moreover, it is impossible to implement CSA for devices running EdgeML algorithms for safety-critical applications, which cannot be halted.

In this paper, we propose PracAttest, an attestation framework that employs the TEE architecture to facilitate a novel performance-vs-security trade-off in IoT devices running extremely compute-intensive EdgeML workloads (Section 4). In PracAttest, the OS kernel is divided into short segments. Then, in each runtime attestation event, PracAttest randomly selects one segment and attests it. It then determines an attestation interval bound based on the CPU usage of the application. It finally selects the inter-attestation time randomly between zero and the attestation interval bound. PracAttest triggers the next attestation event after this random inter-attestation time.

We point out that the kernel segmentation allows PracAttest to attest any segment independently and check its integrity. The application-driven determination of inter-attestation time brings forth the trade-off between the device security and the application performance. Further, the runtime randomization in the inter-attestation time and in selecting the kernel segment introduces unpredictability in exactly when the attestation is triggered and what segment is attested, respectively. This makes it difficult for an attacker to monitor the attestation events and target the device during the time windows when the attestation is not happening.

To evaluate the efficacy of PracAttest, we develop three real-world EdgeML workloads: PolIoT, TrafIoT, and FaceIoT (Section 5). PolIoT measures air pollution data and correlates pollution with road traffic using deep neural networks (DNN and LSTM) algorithms on vehicle-mounted low-cost IoT devices. TrafIoT utilizes the background subtraction and optical flow-based computer vision methods for the traffic density estimation and uses Reinforcement Learning based intersection control [12]. FaceIoT recognizes the human face in a captured image and matches it against the stored valid faces of individuals using one DNN for face detection and a second DNN for face matching. We highlight that the post-deployment attestation on edge devices is effective only if, before deployment, the edge computing software is verified according to the privacy, security, and other specifications. Thus, we also statically analyze and verify these requirements of the three EdgeML applications using Java Object-sensitive ANAlysis (JOANA). We show that JOANA successfully verifies our IoT benchmarks with no false negatives and a limited number of false positives that are manually vetted.

We demonstrate empirically that the application performance and security trade-offs are not adequately balanced in CSA and its trivial extensions in the above-mentioned edge computing applications. The conventional attestation mechanism gives precedence to the security over the performance of the device, i.e., the integrity of the software is monitored with the desirable granularity by stalling the application execution, leading to lower application performance. We also consider an attestation scheme called Shadow-Box [35] as our baseline. Shadow-Box considers the application's performance at the cost of security. Our results validate that PracAttest provides 50x-80x speedup in the attestation time compared to this baseline, achieving the desired security requirements without significantly affecting EdgeML application performance (Section 6). Our EdgeML benchmarks and the PracAttest framework have been open-sourced for field adoption and further optimizations by the research community¹. Our major contributions are as follows.

• We design and build PracAttest, a novel runtime attestation framework that triggers the attestation of a randomly selected segment of the OS kernel at a randomly selected time. This skillful combination of the kernel segmentation and randomization enables PracAttest to realize short inter-attestation time (giving very little opportunity for any software tampering to go undetected) without significantly degrading the device's EdgeML application performance.
• We present three real-world EdgeML benchmarks, PolIoT, TrafIoT, and FaceIoT, that use state-of-the-art deep neural networks (e.g., DNN and LSTM) and other computer vision algorithms (e.g., background subtraction and optical flow). Since the post-deployment software attestation is typically facilitated along with the pre-deployment software verification of the security and privacy requirements, we verify our EdgeML benchmarks for the sake of completeness.
• Using the three realistic workloads, we empirically show the shortcomings in balancing security-vs-performance trade-off by existing attestation schemes and highlight the advantages of PracAttest on actual Raspberry Pi as an edge device.

¹ https://github.com/iabidi/attestation
Figure 1: Comprehensive system architecture for securing IoT devices. The one-time pre-deployment verification of security
requirements is conducted based on the policies specified by the IoT device developers and users (e.g., the EdgeML data servers
and clients). The runtime attestation of the deployed software (running in the normal world) is conducted with the help of
the hardware root of trust (in the secure world). The attacker aims to compromise the software running in the normal world.
selected segment. After the deployment, when an attestation event is triggered at runtime, the secure world performs the following.
(1) It takes over the control of the device, halting any EdgeML application running in the normal world.
(2) It randomly selects only one segment among all the kernel segments, and attests that segment. This addresses the first open issue of what part of the kernel to attest.
(3) It retrieves the last CPU usage sample recorded in the normal world before the start of the attestation event.
(4) It defines an attestation interval bound which is set to be directly proportional to the value of the CPU usage sample.
(5) It selects the inter-attestation time randomly between zero and the attestation interval bound. The next attestation event is triggered after this inter-attestation time. This addresses the second open issue of when to attest the kernel.
(6) Finally, it grants the control back to the normal world, resuming any stalled application.

We highlight that in each attestation event, only one kernel segment is attested. The inter-attestation time is determined based on the attestation interval bound, which in turn is directly proportional to the CPU usage. This ensures that the inter-attestation time is small when the CPU usage is low, i.e., the attestation is mostly performed when the device is not running the EdgeML application. The randomization of the inter-attestation time and the kernel segments introduces unpredictability in exactly when the attestation is triggered and what segment is attested, respectively. This makes it difficult for an attacker to guess and target specific kernel segments of the device during the time windows when the attestation is not happening. In this way, PracAttest resolves the two aforementioned issues of when and what to attest while enabling an advantageous coexistence between the attestation and application execution. Below, we elaborate on the design choices undertaken in the runtime mechanisms of PracAttest. The notations utilized in this paper are presented in Table 2.

Table 2: Notation utilized for the parameters in PracAttest.

Notation   Description
$n$        Number of kernel segments
$k$        Number of malignant segments
$l$        Number of segments selected for attestation
$P_f$      Probability of failure to detect malware
$T_d$      Duration of an instance of the application
$T_m$      Maximum attestation interval for the device
$\delta$   Design parameter affecting the application
$T_e$      Duration of an attestation event
$t_b$      Attestation interval bound
$t_a$      Inter-attestation time

4.1 Selecting Kernel Segment
In PracAttest, the OS kernel is not attested at once but rather in segments. During an attestation event, the cryptographic hash of a randomly selected segment is matched with the corresponding gold hash. If an adversary makes any change in the selected kernel segment, the attestation fails, and the attack is detected. Let the time taken in each attestation event be denoted by $T_e$. We note that the value of the duration of one attestation event $T_e$ can be determined experimentally for an IoT device and is agnostic to the application.

During multiple attestation events spread over time, PracAttest is able to randomly select and attest multiple kernel segments. This decreases the impact of attestation on the application performance, but it also decreases the system security, i.e., the ability to detect any malignant kernel segments. To analyze the impact of PracAttest on security, we provide the following probabilistic guarantee. Let the kernel memory be divided into $n$ equal segments. Also, let there be $k$ malignant segments. Over a time period, PracAttest runs $l$ attestation events in which it randomly selects $l$ segments for the attestation. Out of these $l$ segments, PracAttest fails to detect the attack, i.e., the intrusion by malware, if it does not detect any malignant segment. Let the probability of failure of PracAttest in detecting the attack be denoted by $P_f$. Now we consider two types of malware: roving and non-roving malware.

4.1.1 Roving Malware. A roving malware can first infect a kernel segment, carry out malicious actions, move to another segment, and restore the previous segment to its benign state. By utilizing the capability of moving among the kernel segments stealthily, the malware can attempt to avoid detection by PracAttest. In this case, the probability of failure can be expressed as:

$$P_f = \left(\frac{n-k}{n}\right)^{l}. \tag{1}$$

4.1.2 Non-Roving Malware. A malware that remains static in infected segments and does not move between kernel segments is called a non-roving malware. In this case, the probability of the failure of PracAttest in detecting the malware can be expressed as:

$$P_f = \frac{\binom{n-k}{l}}{\binom{n}{l}} = \frac{(n-k)! \cdot (n-l)!}{(n-k-l)! \cdot n!}. \tag{2}$$

It is clear from both the above equations that as the number of selected segments $l$ increases or the number of malignant segments $k$ increases, the probability of failure $P_f$ decreases.

4.2 Determining Inter-Attestation Time
A typical EdgeML application performs real-time sensing, makes certain computations, stores the relevant data, and finally communicates its decisions to the concerned entities. This sense-compute-store-communicate cycle can either be periodic or event-driven based on the nature of the application. In a periodic EdgeML application, typically, there exists a sleep instruction to control the data processing frequency and avoid device throttling. On the other hand, in an event-driven EdgeML application, the CPU is usually not busy before and after the event. PracAttest performs a fine-grained CPU usage sampling to record such patterns to find the appropriate low CPU usage windows to attest the kernel segments.

PracAttest needs to halt the application and switch to the secure world to attest the kernel segment. Hence, to limit the impact on the application, the time for the attestation and the time for retrieving the CPU usage samples must be selected skillfully. To address this challenge systematically, PracAttest proceeds as follows.
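The two failure probabilities of Section 4.1 can be evaluated numerically. The following is a minimal sketch, assuming only the formulas in Eq. (1) and Eq. (2); the function names are hypothetical, and the values k = 10 and l = 1608 in the usage example are assumed roundings of k/n = 0.5% and l/n = 75.5% for n = 2130.

```python
from math import comb


def p_fail_roving(n, k, l):
    """Eq. (1): probability that all l attestation events miss the k bad segments."""
    return ((n - k) / n) ** l


def p_fail_non_roving(n, k, l):
    """Eq. (2): probability that l distinct segments contain none of the k bad ones."""
    # math.comb returns 0 when l > n - k, so the result is 0.0 in that case.
    return comb(n - k, l) / comb(n, l)


if __name__ == "__main__":
    n, k, l = 2130, 10, 1608  # illustrative values, see the assumptions above
    print("roving     P_f =", p_fail_roving(n, k, l))
    print("non-roving P_f =", p_fail_non_roving(n, k, l))
```

For the same n, k, and l, the non-roving value is never larger than the roving one, because attesting distinct segments cannot revisit a segment already found benign.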
4.2.1 Setting Maximum Attestation Interval. PracAttest selects the inter-attestation time randomly between zero and an attestation interval bound. The maximum attestation interval, denoted by $T_m$, is defined as the attestation interval bound when the CPU usage is detected to be 100%. Let the mean inference time of an instance of the application's EdgeML algorithm running without any interruption be denoted by $T_d$. The time $T_d$ can be observed experimentally by running the application. Then, the maximum attestation interval is selected based on this mean inference time such that $T_m = \delta \cdot T_d$. Here, $\delta$ is a design parameter such that $0 \le \delta \le 1$. This design ensures that at least one attestation event is triggered during each instance of the application, but the extension in the application's mean inference time due to the attestation remains limited.

4.2.2 CPU Usage Sampling. The CPU usage sampling allows PracAttest to attest the kernel by following the execution profile of the application. This way, when the application is not running, PracAttest can aggressively attest the kernel segments. When the application is being executed, it halts the application only briefly. Due to the architectural limitations, the processor can be either in the normal world or in the secure world. Since the application runs in the normal world and the CPU usage of the application can only be recorded when the application is running, the sampling is performed in the normal world (as shown in Figure 3). In PracAttest, the CPU usage samples are collected in the normal world by an application and then retrieved by the secure world to determine the inter-attestation time.

[Figure 3 (labels recovered from the figure): Secure World; Slow Attestation; Faster Attestation; Attestation under PracAttest.]

[...] attestation interval bound is selected. Similarly, if the CPU usage is low, a small attestation interval bound is selected.

4.2.4 Randomizing Inter-Attestation Time. We point out that determining the inter-attestation time using only the CPU usage value brings forth a critical security vulnerability. In this case, since the attacker can also track the CPU usage of the device, it can easily guess the inter-attestation time. To exploit this vulnerability, the attacker can execute the modified software right after an attestation event when the application is supposed to execute. To avoid detection, it can bring back the benign state of the software right before the next attestation event. To prevent such an attack, PracAttest also uses a pseudo-random number generator along with the CPU usage sampling to determine the inter-attestation time. The inter-attestation time, denoted by $t_a$, is selected randomly from the uniform distribution between zero and the attestation interval bound $t_b$. As seen from Figure 3, when the CPU usage is low, faster attestation takes place (denser purple lines), as a small $t_a$ is selected. We note that this randomness in inter-attestation time is bounded by the maximum attestation interval $T_m$. The random inter-attestation time mitigates the predictability of the attestation. Hence, even if the adversary obtains the CPU usage pattern or tweaks the CPU usage, the inter-attestation time remains unpredictable.

5 VERIFIED EDGEML BENCHMARKS
To assess the impact of attestation on edge computing applications, we need real-world workloads that can run on edge devices. These workloads will help us evaluate the efficacy of state-of-the-art attestation mechanisms, and then evaluate PracAttest's advantages over such baselines. Here, we describe three such workloads and discuss verification of some desirable properties of these applications.
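Returning to the runtime mechanism of Section 4, one attestation event can be sketched as follows. This is an illustrative sketch, not the authors' secure-world implementation: SHA-256 is an assumed hash function, all function names are hypothetical, the interval bound is taken as t_b = T_m · (CPU usage / 100) so that it is proportional to CPU usage and equals T_m at 100%, and δ = 0.5 with T_d = 0.9 s are assumed example values.

```python
import hashlib
import random


def split_into_segments(image, segment_size=4096):
    """Cut a kernel image (bytes) into fixed-size segments."""
    return [image[i:i + segment_size] for i in range(0, len(image), segment_size)]


def compute_gold_hashes(segments):
    """Pre-deployment step: one reference digest per segment (SHA-256 assumed)."""
    return [hashlib.sha256(seg).digest() for seg in segments]


def interval_bound(cpu_usage_percent, t_m):
    """t_b is proportional to CPU usage and equals t_m at 100% usage."""
    usage = min(max(cpu_usage_percent, 0.0), 100.0)
    return t_m * usage / 100.0


def attestation_event(segments, gold, cpu_usage_percent, t_m, rng):
    """Attest one random segment and draw the next inter-attestation time.

    Returns (segment index, True if the segment matches its gold hash, t_a).
    """
    index = rng.randrange(len(segments))
    intact = hashlib.sha256(segments[index]).digest() == gold[index]
    t_b = interval_bound(cpu_usage_percent, t_m)
    t_a = rng.uniform(0.0, t_b)  # uniform draw between zero and the bound
    return index, intact, t_a


if __name__ == "__main__":
    rng = random.Random(7)              # hypothetical seed, for repeatability only
    kernel = bytes(range(256)) * 64     # 16 KB stand-in image, i.e. 4 segments
    segments = split_into_segments(kernel)
    gold = compute_gold_hashes(segments)
    t_m = 0.5 * 0.9                     # T_m = delta * T_d with assumed values
    print(attestation_event(segments, gold, cpu_usage_percent=80.0, t_m=t_m, rng=rng))
```

A deployed system would draw its randomness from a source the normal world cannot observe; the seeded generator here only keeps the sketch repeatable.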
[Figure 4: CPU usage over time (s) for (a) air pollution measurement (PolIoT), (b) traffic signal control (TrafIoT), and (c) event-driven face identification (FaceIoT).]
DNN-based image inference result, and one SVM-based IMU inference result every two seconds. Figure 4a presents the CPU usage of the device running PolIoT.

Table 3: Concurrent threads running in PolIoT.

face into a known or unknown face. If the face is known, it will record the person's details in the database, such as name and time of entry. However, if the face is unknown, it will raise an alert. We use FaceIoT as a representative event-driven EdgeML application. Figure 4c presents the CPU usage when running FaceIoT.
highly contentious. For instance, whether the urbanization should happen at the expense of the green cover [26, 65, 66], whether polluting industries should be shut down causing unemployment [16, 28, 38], whether farmers should incur economic losses to dispose of crop residues using non-polluting means [22, 23, 56], or whether on-road private vehicles should be reduced causing commuter hardships [24, 36, 64]. In this context, guaranteeing that the PolIoT software does not favor particular sides in a policy debate is necessary. For instance, the software should not deliberately reduce the PM values when the GPS value indicates that the device is near favored industries while boosting the PM values near industries targeted for shutdown. Ensuring non-interference across the sensors' data in the software is therefore needed.

Software Vulnerability Check: Our EdgeML benchmarks are safety-critical in nature, whether it is signal control at intersections (TrafIoT) or intrusion detection in buildings (FaceIoT). Data or code integrity violations or malware attacks on the system running these applications can lead to safety hazards. Therefore, it is important to check for the presence of known vulnerabilities in the software [27, 32, 70]. As there are many existing tools [13, 14] for checking software vulnerabilities like buffer overflow/underflow or string formatting, we do not discuss these specifications in detail. We instead focus on the data privacy and non-interference requirements, which need some analysis using off-the-shelf software verification tools.

5.2.2 Verification of Privacy and Non-Interference Properties. We employ the static information flow analysis, which is a standard method for software verification [59]. Typically, the information flow control (IFC) is modeled in a system by defining the start point of the information flow in a program as the source and the end point as the sink. There are several methods to define information flow policies. One method is to label the data, variables, and expressions in the program with security levels. These levels are then modelled as a lattice [17]. The lattice provides a way to check flows among different variables of the program, flagging all flows from higher to lower security levels as forbidden.

Data Privacy: To model this property, we select a lattice with two levels {low, high} representing the low and high privacy requirements, respectively. We define that the information labeled as low is only allowed to flow into the information labeled as high, and not vice versa. This lattice is used to classify the sources and sinks in our software into the two privacy levels. Specifically, the data and the deployment partner's server URL are considered as the source and sink, and labeled with low and high privacy level, respectively.

Non-Interference: To model the non-interference property, we employ the aforementioned lattice-based information flow verification. Here, we demonstrate it in the context of the PolIoT application. We model the lattice as shown in Figure 5. In the figure, each sensor datatype is given a separate label. The information labeled as trusted is only allowed to flow into the information labeled as GPS and server, and not vice versa. Similarly, for all other sensors, the information can flow from trusted to any of the sensor labels and the server label. Any flow between two sensor labels is illegal. For example, setting the PM value based on the GPS value would lead to an illegal flow between the PM and GPS in the resultant dependence graph. To check these types of illegal flows, a non-interference IFC policy is defined such that one sensor node is considered as a source and the other as a sink, and vice versa.

[Figure 5 content: server = ⊤ at the top; the sensor labels PM, GPS, BME, IMU, and camera in the middle; trusted = ⊥ at the bottom.]
Figure 5: Lattice for PolIoT non-interference requirements (any information flow between sensor nodes is considered illegal).

Among various tools that we explored [25, 45, 52], JOANA is the one that works directly on Java bytecode. Hence, it is convenient to use for large external software libraries that are needed to implement realistic EdgeML applications. JOANA can verify both sequential as well as multi-threaded programs. It is one of the sound open-source verification tools available and requires few annotations. The above-mentioned properties make it a good choice for our verification requirements. The application executable is provided as the input to JOANA, where the IFC policies are specified. JOANA then raises alerts based on all violations of the IFC policies it finds in the executable. The developer has to iteratively go through these alerts and fix them unless they are false positives (vetted in collaboration with the respective stakeholders). When all true positive alerts are fixed, the executable is copied to the edge devices for deployment. As shown in Appendix A for the sample case of the PolIoT application, JOANA correctly catches all data privacy and non-interference violations at very low and manually verifiable false positive rates.

6 EVALUATION
In this section, we evaluate our post-deployment recurring attestation mechanism for compute-heavy EdgeML workloads.

6.1 Experimental Setup
The experiments are performed on a Raspberry Pi 3B board with quad-core ARM Cortex A53 processors @1.2 GHz and 1 GB RAM. The Linux OS attestation scheme, PracAttest, is implemented using the virtualization. We run Raspbian Linux OS in the normal world and OPTEE kernel in the secure world [67]. The Linux IMA is configured to check the application integrity. PolIoT and FaceIoT benchmarks are implemented using OpenCV DNN APIs, and the TrafIoT benchmark utilizes the OpenCV Computer Vision Library APIs. The C++ APIs, called with Java wrapper applications, are utilized for enabling software verification with JOANA on the Java bytecodes. The verified EdgeML software runs in the normal world Linux, while OPTEE in the secure world runs PracAttest to ensure Linux integrity. To realize PracAttest, the Linux kernel of size around 8.5 MB is divided into 2130 segments each of size 4 KB.
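The segmentation figures in the setup can be cross-checked with a little arithmetic. The sketch below assumes 4 KB means 4096 bytes and that each gold hash is 256 bits long (the hash length quoted for CSA later in the evaluation); the constant and function names are hypothetical.

```python
SEGMENT_SIZE_BYTES = 4 * 1024   # one kernel segment (4 KB assumed to be 4096 bytes)
NUM_SEGMENTS = 2130             # number of segments reported in the setup
HASH_BYTES = 256 // 8           # one 256-bit gold hash per segment

# Kernel bytes covered by the segments: 8,724,480 bytes, about 8.3 MiB.
covered_bytes = NUM_SEGMENTS * SEGMENT_SIZE_BYTES

# Secure memory for all gold hashes: 68,160 bytes, about 66.6 KiB.
gold_hash_bytes = NUM_SEGMENTS * HASH_BYTES


def segments_needed(image_bytes, segment_bytes=SEGMENT_SIZE_BYTES):
    """Number of fixed-size segments needed to cover an image (ceiling division)."""
    return -(-image_bytes // segment_bytes)


if __name__ == "__main__":
    print(covered_bytes, gold_hash_bytes, segments_needed(covered_bytes))
```

Under these assumptions the gold-hash storage is consistent with the roughly 66 KB of secure memory that the evaluation attributes to PracAttest.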
[Figure: CDF of the inter-attestation time (ms) under CPU sampling and under inter-attestation time randomization for (a) PolIoT, (b) TrafIoT, and (c) FaceIoT.]
Table 4: Mean attestation time, mean inference time, and malware detection rate for each attestation scheme (time entries are mean/SD in seconds).

Attestation Scheme                                     Mean Attestation Time/SD (s)      Mean Inference Time/SD (s)       Malware
                                                       PolIoT    TrafIoT   FaceIoT       PolIoT    TrafIoT   FaceIoT      Detection Rate
No Attestation                                         -         -         -             0.9       0.53      1.36         0
Conventional Software Attestation [49]                 2         2         2             -         -         -            1
Shadow-Box [35]                                        701/147   947/120   18.9/2        1.04/0.3  0.58/0.1  1.45/0.2     1
  against Non-Roving Malware
PracAttest without Time and Segment Randomization      35.2/1.8  29.7/1.4  43.6/1.8      0.96/0.2  0.54/0.1  1.39/0.1     0
  against Roving Malware (k/n = 0.5%, l/n = 100%)
PracAttest without Segment Randomization               19.1/1.9  16.2/1.0  20.4/1.4      0.95/0.2  0.54/0.1  1.39/0.2     0
  against Roving Malware (k/n = 0.5%, l/n = 100%)
PracAttest                                             14.3/2    12.2/0.8  15.4/1.1      0.95/0.2  0.54/0.1  1.4/0.2      1 − 5 · 10^−4
  against Roving Malware (k/n = 0.5%, l/n = 75.5%)
PracAttest                                             14.3/2    12.2/0.8  15.4/1.1      0.95/0.2  0.54/0.1  1.4/0.2      1 − 10^−8
  against Non-Roving Malware (k/n = 0.5%, l/n = 75.5%)
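The improvement factors over Shadow-Box follow directly from the mean attestation times in Table 4. The sketch below only divides the tabulated means; the dictionary and function names are hypothetical, and the ratios should be read as approximate because the table entries are rounded.

```python
# Mean attestation times in seconds, copied from Table 4.
SHADOW_BOX = {"PolIoT": 701.0, "TrafIoT": 947.0, "FaceIoT": 18.9}
PRACATTEST_VARIANTS = {
    "no time or segment randomization": {"PolIoT": 35.2, "TrafIoT": 29.7, "FaceIoT": 43.6},
    "time randomization only": {"PolIoT": 19.1, "TrafIoT": 16.2, "FaceIoT": 20.4},
    "full PracAttest": {"PolIoT": 14.3, "TrafIoT": 12.2, "FaceIoT": 15.4},
}


def speedup_over_baseline(baseline, scheme):
    """Ratio of baseline to scheme mean attestation time, per application."""
    return {app: baseline[app] / scheme[app] for app in baseline}


if __name__ == "__main__":
    for name, times in PRACATTEST_VARIANTS.items():
        ratios = speedup_over_baseline(SHADOW_BOX, times)
        print(name, {app: round(r, 1) for app, r in ratios.items()})
```

For the two periodic workloads the ratios land close to the rounded factors quoted in the evaluation, while for FaceIoT the two schemes stay within the same order of magnitude.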
6.5 PracAttest Performance-Security Trade-off
Table 4 shows the security and performance metrics for the three benchmark applications in different attestation schemes, including Conventional Software Attestation (CSA) [49], Shadow-Box [35], and PracAttest. In CSA, the entire kernel is attested in each attestation event. While only one gold hash (of length 256 bits) of the entire kernel is needed in CSA, PracAttest must use 66 KB of secure memory to store the gold hash values corresponding to 2130 kernel segments. PracAttest gains application performance at the cost of a mild increase in secure memory usage and attestation time. We also consider Shadow-Box as a reasonable baseline that considers the application performance at the cost of security. Shadow-Box samples CPU usage every second and categorizes the measured CPU usage into three different levels: 0-30%, 31-70%, and 71-100%. Then, the inter-attestation time is selected as 5 ms, 500 ms, and 2 s corresponding to the three levels, respectively. In a periodic application like PolIoT and TrafIoT, sampling every second means that the sample will fall in the high CPU usage value with high probability before hitting a low CPU usage value. This coarse-grained CPU usage monitoring misses low CPU usage windows quite frequently. Hence, the inter-attestation times remain high, giving a high mean attestation time. Most importantly, interrupting the attestation to sample the CPU usage makes Shadow-Box vulnerable to the roving malware.

PracAttest significantly improves the mean attestation times of periodic applications such as PolIoT and TrafIoT. In Table 4, we also observe the impact of the successive design choices utilized in PracAttest. For instance, we consider PracAttest without the time and segment randomization while using the inter-attestation time $t_a$ proportional to the CPU usage with a bound $T_m$ on maximum values based on the application. This scheme gives 20x and 33x improvement over Shadow-Box in terms of the mean attestation time for PolIoT and TrafIoT, respectively. We also consider PracAttest without segment randomization while randomizing the inter-attestation time. This scheme gives 37x-59x improvement over Shadow-Box. Finally, we observe that PracAttest gives 50x-80x improvement, with a very high probability of catching the malware. While Table 4 gives the value of the selected metrics for a particular value of $l/n$ in segment randomization, Figure 9 shows how mean attestation time grows with $l/n$ for a roving malware.

The event-driven application, FaceIoT, has plenty of idle windows, as shown in Figure 4(c), which can be utilized for aggressive kernel attestation. Therefore, Shadow-Box and PracAttest have the same order of mean attestation times for FaceIoT. PracAttest's security advantages are thus more pronounced when the EdgeML application is compute-heavy, with small idle windows as in PolIoT and TrafIoT.

We also note that PracAttest slightly improves the mean inference time compared to Shadow-Box for all three benchmark applications. Thus the tremendous security benefits (in terms of the mean attestation time) do not come at a cost to application performance (in terms of the mean inference time), but on the contrary, application performance also improves from choosing carefully when to attest. We point out that the slight increase in the inference latencies relative to running without attestation is well within our benchmarks' requirements, as seen from the first row in Table 4, where only the applications are run without any attestation mechanism in place.

[Figure 9 plot: mean attestation time (s) versus percent of segments selected for attestation (l/n), with curves for PolIoT, TrafIoT, and FaceIoT.]
Figure 9: Mean attestation time for varying ratio of random segments and total segments (n = 2130) required for hashing to achieve a fixed $P_f$ (≈ 0).

7 CONCLUSION AND FUTURE WORK
In this paper, we propose PracAttest, a practical OS kernel attestation scheme for edge devices running compute-heavy EdgeML applications. Unlike the conventional software attestation scheme, which provides security at the cost of the application performance, PracAttest brings forth an advantageous security-vs-performance trade-off. We also present three EdgeML benchmarks and verify

REFERENCES
[1] Tigist Abera, N Asokan, Lucas Davi, Jan-Erik Ekberg, Thomas Nyman, Andrew Paverd, Ahmad-Reza Sadeghi, and Gene Tsudik. 2016. C-FLAT: control-flow attestation for embedded systems software. In Proceedings of the ACM SIGSAC Conference on Computer and Communications Security. 743–754.
[2] Tigist Abera, Raad Bahmani, Ferdinand Brasser, Ahmad Ibrahim, Ahmad-Reza Sadeghi, and Matthias Schunter. 2019. DIAT: Data Integrity Attestation for Resilient Collaboration of Autonomous Systems. In NDSS.
[3] Shimaa Ahmed, Amrita Roy Chowdhury, Kassem Fawaz, and Parmesh Ramanathan. 2020. Preech: A system for privacy-preserving speech transcription. In 29th USENIX Security Symposium (USENIX Security 20). 2703–2720.
[4] Muhammad Naveed Aman, Mohamed Haroon Basheer, Siddhant Dash, Jun Wen Wong, Jia Xu, Hoon Wei Lim, and Biplab Sikdar. 2020. HAtt: Hybrid remote attestation for the Internet of Things with high availability. IEEE Internet of Things Journal 7, 8 (2020), 7220–7233.
[5] Manos Antonakakis, Tim April, Michael Bailey, Matt Bernhard, Elie Bursztein, Jaime Cochran, Zakir Durumeric, J Alex Halderman, Luca Invernizzi, Michalis Kallitsis, et al. 2017. Understanding the mirai botnet. In 26th USENIX Security Symposium (USENIX Security 17). 1093–1110.
[6] Sebastian P Bayerl, Tommaso Frassetto, Patrick Jauernig, Korbinian Riedhammer, Ahmad-Reza Sadeghi, Thomas Schneider, Emmanuel Stapf, and Christian Weinert. 2020. Offline model guard: Secure and private ML on mobile devices. In 2020 Design, Automation & Test in Europe Conference & Exhibition (DATE). IEEE, 460–465.
[7] Bloomberg. 2021. Retrieved Jun 1, 2021 from https://www.bloomberg.com/news/articles/2021-03-09/hackers-expose-tesla-jails-in-breach-of-150-000-security-cams
[8] Ferdinand Brasser, David Gens, Patrick Jauernig, Ahmad-Reza Sadeghi, and Emmanuel Stapf. 2019. SANCTUARY: ARMing TrustZone with User-space Enclaves. In NDSS.
[9] Sergey Bratus, Nihal D'Cunha, Evan Sparks, and Sean W Smith. 2008. TOCTOU, traps, and trusted computing. In International Conference on Trusted Computing. 14–32.
[10] Xavier Carpent, Karim Eldefrawy, Norrathep Rattanavipanon, Ahmad-Reza Sadeghi, and Gene Tsudik. 2018. Reconciling remote attestation and safety-critical operation on simple IoT devices. In 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC). IEEE, 1–6.
[11] Xavier Carpent, Norrathep Rattanavipanon, and Gene Tsudik. 2018. Remote attestation of IoT devices via SMARM: Shuffled measurements against roving malware. In 2018 IEEE International Symposium on Hardware Oriented Security and Trust (HOST). 9–16. https://doi.org/10.1109/HST.2018.8383885
[12] Sachin Chauhan, Kashish Bansal, and Rijurekha Sen. 2020. EcoLight: Intersection Control in Developing Regions Under Extreme Budget and Network Constraints. Advances in Neural Information Processing Systems 33 (2020).
[13] Edmund Clarke, Daniel Kroening, and Flavio Lerda. 2004. A Tool for Checking ANSI-C Programs. In Tools and Algorithms for the Construction and Analysis of Systems (TACAS 2004) (Lecture Notes in Computer Science, Vol. 2988), Kurt Jensen
and Andreas Podelski (Eds.). Springer, 168–176.
their data privacy and non-interference requirements with zero [14] Lucas Cordeiro, Pascal Kesseli, Daniel Kroening, Peter Schrammel, and Marek
false negatives and acceptable false positives. Through these bench- Trtik. 2018. JBMC: A Bounded Model Checking Tool for Verifying Java Bytecode.
In 30th International Conference on Computer Aided Verification.
marks, we demonstrate that our attestation tool, PracAttest, gives [15] Victor Costan and Srinivas Devadas. 2016. Intel SGX Explained. IACR Cryptol.
50x-80x improved runtime for kernel attestation over state-of-the- ePrint Arch. 2016, 86 (2016), 1–118.
art baseline, at negligible overhead on the ML application perfor- [16] Nandini Dasgupta. 2015. Tall Blunder. Retrieved Apr 12, 2019 from https:
//www.downtoearth.org.in/coverage/tall-blunder-22419
mance. With edge devices becoming an integral part of our lives, [17] Dorothy E Denning. 1976. A lattice model of secure information flow. Commun.
our practical immediately deployable attestation mechanism, Pra- ACM 19, 5 (1976), 236–243.
cAttest, can play an important role in securing ML at the edge. [18] Don Kurian Dennis, Yash Gaurkar, Sridhar Gopinath, Chirag Gupta, Moksh Jain,
Ashish Kumar, Aditya Kusupati, Chris Lovett, Shishir G Patil, and Harsha Vard-
In the future, we plan to explore the dynamic attack scenarios han Simhadri. 2020. EdgeML: Machine Learning for resource-constrained edge
where attesting the static code (as done in this paper) does not devices. URL https://github.com/Microsoft/EdgeML (2020).
[19] Ghada Dessouky, Tigist Abera, Ahmad Ibrahim, and Ahmad-Reza Sadeghi. 2018.
suffice [2, 57, 63]. Such attacks could potentially be detected by Litehax: lightweight hardware-assisted attestation of program execution. In
monitoring the control flow path, when an EdgeML application IEEE/ACM International Conference on Computer-Aided Design (ICCAD). 1–8.
runs. However, it is impractical to continuously explore and an- [20] Ghada Dessouky, Shaza Zeitouni, Thomas Nyman, Andrew Paverd, Lucas Davi,
Patrick Koeberl, N Asokan, and Ahmad-Reza Sadeghi. 2017. Lo-fat: Low-overhead
alyze all valid control paths in the secure memory of the edge control flow attestation in hardware. In Proceedings of the 54th Annual Design
device. For detecting such attacks, we plan to examine the poten- Automation Conference. ACM, 24.
tial of a PracAttest-like mechanism involving random selection [21] Jian Ding and Ranveer Chandra. 2019. Towards low cost soil sensing using Wi-Fi.
In The 25th Annual International Conference on Mobile Computing and Networking.
and inspection of control flow paths. The design of such a secu- 1–16.
rity mechanism will again focus on minimizing the impact on the [22] Down To Earth. 2018. Crop burning: Haryana farmers to launch a state-wide
protest. Retrieved Apr 12, 2019 from https://www.downtoearth.org.in/news/air/
EdgeML application’s performance. Another promising extension crop-burning-haryana-farmers-to-launch-a-state-wide-protest-61889
of our work is to investigate the feasibility of further minimizing the [23] Down To Earth. 2018. Crop burning: Why are Punjab farmers defying government
computational overhead of the remote attestation for very low-end ban. Retrieved Apr 12, 2019 from https://www.downtoearth.org.in/news/air/
crop-burning-why-are-punjab-farmers-defying-government-ban-61869
resource-constrained IoT platforms, e.g., wearables.
[24] Ecotech. 2016. Odd-Even Policy, Delhi, Explained. Retrieved Apr 12, 2019 from https://www.ecotech.com/odd-even-policy-delhi-explained
[25] Marco Eilers and Peter Müller. 2018. Nagini: a static verifier for Python. In International Conference on Computer Aided Verification. Springer, 596–603.
[26] The Indian Express. 2018. 14,000 of 21,000 trees to be axed for redevelopment of south Delhi colonies: Govt. Retrieved Apr 12, 2019 from http://tinyurl.com/ybys6zro
[27] US Food and Drug Administration. 2017. Firmware Update to Address Cybersecurity Vulnerabilities Identified in Abbott’s (formerly St. Jude Medical’s) Implantable Cardiac Pacemakers: FDA Safety Communication. Retrieved Feb 26, 2021 from https://www.fda.gov/medical-devices/safety-communications/firmware-update-address-cybersecurity-vulnerabilities-identified-abbotts-formerly-st-jude-medicals
[28] Carnegie Council for Ethics in International Affairs. 2004. Workers’ Rights and Pollution Control in Delhi. Retrieved Apr 12, 2019 from https://www.carnegiecouncil.org/publications/archive/dialogue/2_11/section_2/4451
[29] Jiahao Gao, Zhiwen Hu, Kaigui Bian, Xinyu Mao, and Lingyang Song. 2020. AQ360: UAV-aided air quality monitoring by 360-degree aerial panoramic images in urban areas. IEEE Internet of Things Journal 8, 1 (2020), 428–442.
[30] Dennis Giffhorn. 2012. Slicing of Concurrent Programs and its Application to Information Flow Control. Ph.D. Dissertation. Karlsruher Institut für Technologie, Fakultät für Informatik.
[31] Tiago Gomes, Sandro Pinto, Adriano Tavares, and Jorge Cabral. 2015. Towards an FPGA-based edge device for the Internet of Things. In IEEE Conference on Emerging Technologies & Factory Automation (ETFA). 1–4.
[32] Google. 2019. PHA Family Highlights: Triada. Retrieved Feb 26, 2021 from https://security.googleblog.com/2019/06/pha-family-highlights-triada.html
[33] Jürgen Graf, Martin Hecker, and Martin Mohr. 2013. Using JOANA for Information Flow Control in Java Programs - A Practical Guide. In Proceedings of the 6th Working Conference on Programming Languages (ATPS’13) (Lecture Notes in Informatics (LNI) 215). Springer Berlin / Heidelberg, 123–138.
[34] Le Guan, Peng Liu, Xinyu Xing, Xinyang Ge, Shengzhi Zhang, Meng Yu, and Trent Jaeger. 2017. Trustshadow: Secure execution of unmodified applications with arm trustzone. In Proceedings of the 15th Annual International Conference on Mobile Systems, Applications, and Services. 488–501.
[35] Seunghun Han, Junghwan Kang, Wook Shin, HyoungChun Kim, and Eungki Park. 2018. Shadow-BoxV2: The Practical and Omnipotent Sandbox for ARM. Blackhat-ASIA (2018).
[36] Deccan Herald. 2016. Delhi’s odd-even scheme has no impact: study. Retrieved Apr 12, 2019 from https://www.deccanherald.com/content/666902/delhis-odd-even-scheme-has.html
[37] Ahmad Ibrahim, Ahmad-Reza Sadeghi, and Gene Tsudik. 2018. Us-aid: Unattended scalable attestation of iot devices. In 2018 IEEE 37th Symposium on Reliable Distributed Systems (SRDS). IEEE, 21–30.
[38] Aditya Nigam in Revolutionary Democracy. 2001. Industrial Closures in Delhi. Retrieved Apr 12, 2019 from http://www.revolutionarydemocracy.org/rdv7n2/industclos.htm
[39] Trent Jaeger, Reiner Sailer, and Umesh Shankar. 2006. PRIMA: policy-reduced integrity measurement architecture. In Proceedings of the eleventh ACM symposium on Access control models and technologies. ACM, 19–28.
[40] Jongmin Jo, Sucheol Jeong, and Pilsung Kang. 2020. Benchmarking GPU-Accelerated Edge Devices. In IEEE International Conference on Big Data and Smart Computing (BigComp). 117–120.
[41] Jair Ferreira Júnior, Eduardo Carvalho, Bruno V Ferreira, Cleidson de Souza, Yoshihiko Suhara, Alex Pentland, and Gustavo Pessin. 2017. Driver behavior profiling: An investigation with different smartphone sensors and machine learning. PLoS one 12, 4 (2017), e0174959.
[42] Ashish Kumar, Saurabh Goyal, and Manik Varma. 2017. Resource-efficient machine learning in 2 KB RAM for the Internet of Things. In International Conference on Machine Learning (ICML). 1935–1944.
[43] Ralf Küsters, Tomasz Truderung, Bernhard Beckert, Daniel Bruns, Michael Kirsten, and Martin Mohr. 2015. A hybrid approach for proving noninterference of Java programs. In 2015 IEEE 28th Computer Security Foundations Symposium. IEEE, 305–319.
[44] Aditya Kusupati, Manish Singh, Kush Bhatia, Ashish Kumar, Prateek Jain, and Manik Varma. 2018. FastGRNN: A fast, accurate, stable and tiny kilobyte sized gated recurrent neural network. In Proceedings of the 32nd International Conference on Neural Information Processing Systems (NIPS). 9031–9042.
[45] Shuvendu K Lahiri, Chris Hawblitzel, Ming Kawaguchi, and Henrique Rebêlo. 2012. Symdiff: A language-agnostic semantic diff tool for imperative programs. In International Conference on Computer Aided Verification. Springer, 712–717.
[46] Matthew Leon. 2020. The Dark Side of Unikernels for Machine Learning. arXiv preprint arXiv:2004.13081 (2020).
[47] En Li, Liekang Zeng, Zhi Zhou, and Xu Chen. 2019. Edge AI: On-demand accelerating deep neural network inference via edge computing. IEEE Transactions on Wireless Communications 19, 1 (2019), 447–457.
[48] Fu Li, Hai Zhang, Huan Che, and Xiaochen Qiu. 2016. Dangerous driving behavior detection using smartphone sensors. In 2016 IEEE 19th International Conference on Intelligent Transportation Systems (ITSC). IEEE, 1902–1907.
[49] Pieter Maene, Johannes Götzfried, Ruan De Clercq, Tilo Müller, Felix Freiling, and Ingrid Verbauwhede. 2017. Hardware-based trusted computing architectures for isolation and attestation. IEEE Trans. Comput. 67, 3 (2017), 361–374.
[50] Larry W McVoy, Carl Staelin, et al. 1996. lmbench: Portable Tools for Performance Analysis. In USENIX annual technical conference. San Diego, CA, USA, 279–294.
[51] Francesca Meneghello, Matteo Calore, Daniel Zucchetto, Michele Polese, and Andrea Zanella. 2019. IoT: Internet of threats? A survey of practical security vulnerabilities in real IoT devices. IEEE Internet of Things Journal 6, 5 (2019), 8182–8201.
[52] Andrew C. Myers, Lantian Zheng, Steve Zdancewic, Stephen Chong, and Nathaniel Nystrom. 2006. Jif 3.0: Java information flow. http://www.cs.cornell.edu/jif
[53] Ivan De Oliveira Nunes, Sashidhar Jakkamsetti, Norrathep Rattanavipanon, and Gene Tsudik. 2020. On the TOCTOU problem in remote attestation. arXiv preprint arXiv:2005.03873 (to appear in CCS 2021) (2020).
[54] Amitangshu Pal and Krishna Kant. 2019. Water flow driven sensor networks for leakage and contamination monitoring in distribution pipelines. ACM Transactions on Sensor Networks (TOSN) 15, 4 (2019), 1–43.
[55] Sandro Pinto and Nuno Santos. 2019. Demystifying Arm TrustZone: A comprehensive survey. ACM Computing Surveys (CSUR) 51, 6 (2019), 1–36.
[56] The Pioneer. 2017. Farmers protest Punjab Government’s orders. Retrieved Apr 12, 2019 from https://www.dailypioneer.com/2017/state-editions/farmers-protest-punjab-governments-orders.html
[57] Davide Quarta, Marcello Pogliani, Mario Polino, Federico Maggi, Andrea Maria Zanchettin, and Stefano Zanero. 2017. An experimental security analysis of an industrial robot controller. In IEEE Symposium on Security and Privacy (S&P). IEEE, 268–286.
[58] Saeed Saadatnejad, Mohammadhosein Oveisi, and Matin Hashemi. 2019. LSTM-based ECG classification for continuous monitoring on personal wearable devices. IEEE journal of biomedical and health informatics 24, 2 (2019), 515–523.
[59] Andrei Sabelfeld and Andrew C Myers. 2003. Language-based information-flow security. IEEE Journal on selected areas in communications 21, 1 (2003), 5–19.
[60] Reiner Sailer, Xiaolan Zhang, Trent Jaeger, and Leendert Van Doorn. 2004. Design and Implementation of a TCG-based Integrity Measurement Architecture. In USENIX Security symposium, Vol. 13. 223–238.
[61] Khaled Saleh, Mohammed Hossny, and Saeid Nahavandi. 2017. Driving behavior classification based on sensor data fusion using LSTM recurrent neural networks. In 2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC). IEEE, 1–6.
[62] Weisong Shi, Jie Cao, Quan Zhang, Youhuizi Li, and Lanyu Xu. 2016. Edge computing: Vision and challenges. IEEE Internet of Things Journal 3, 5 (2016), 637–646.
[63] Zhichuang Sun, Bo Feng, Long Lu, and Somesh Jha. 2020. OAT: Attesting operation integrity of embedded devices. In 2020 IEEE Symposium on Security and Privacy (SP). IEEE, 1433–1449.
[64] Hindusthan Times. 2016. Air cleaner this April than last year, says body studying odd-even. Retrieved Apr 12, 2019 from https://tinyurl.com/y4uk9u47
[65] Hindustan Times. 2018. 16,500 trees: A huge price for south Delhi’s redevelopment projects. Retrieved Apr 12, 2019 from https://tinyurl.com/y73te44m
[66] Hindustan Times. 2018. One tree cut every hour over last 13 years, says Delhi govt data. Retrieved Apr 12, 2019 from https://www.hindustantimes.com/delhi-news/one-tree-cut-every-hour-over-last-13-years-says-delhi-govt-data/story-uJBiGcLemQIOCvIfP7rwpN.html
[67] TrustedFirmware.org. 2020. Retrieved Sep 21, 2020 from https://optee.readthedocs.io/_/downloads/en/3.9.0/pdf/
[68] Rohit Verma, Gyanesha Prajjwal, Bivas Mitra, and Sandip Chakraborty. 2018. Mining spatio-temporal data for computing driver stress and observing its effects on driving behavior. In Proceedings of the 26th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. ACM, 452–455.
[69] Xiaofei Wang, Yiwen Han, Victor CM Leung, Dusit Niyato, Xueqiang Yan, and Xu Chen. 2020. Convergence of edge computing and deep learning: A comprehensive survey. IEEE Communications Surveys & Tutorials 22, 2 (2020), 869–904.
[70] Jianliang Wu, Yuhong Nan, Vireshwar Kumar, Dave Jing Tian, Antonio Bianchi, Mathias Payer, and Dongyan Xu. 2020. BLESA: Spoofing Attacks against Reconnections in Bluetooth Low Energy. In 14th USENIX Workshop on Offensive Technologies (WOOT 20).
[71] Tien-Ju Yang, Yu-Hsin Chen, and Vivienne Sze. 2017. Designing energy-efficient convolutional neural networks using energy-aware pruning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 5687–5695.
[72] Yuzhe Yang, Zhiwen Hu, Kaigui Bian, and Lingyang Song. 2019. ImgSensingNet: UAV vision guided aerial-ground air quality sensing system. In IEEE Conference on Computer Communications (INFOCOM). 1207–1215.
[73] Shuochao Yao, Yiran Zhao, Aston Zhang, Lu Su, and Tarek Abdelzaher. 2017. DeepIoT: Compressing deep neural network structures for sensing systems with a compressor-critic framework. In Proceedings of the 15th ACM Conference on Embedded Network Sensor Systems. 1–14.
Table 6: Results demonstrating that the number of false positives starts decreasing once we identify the cause.
from the file write operation. These exceptions are added when the Java byte code is converted to PDG for the analysis by JOANA. The execution of line 10 in Listing 3 depends on whether the file write is executed successfully or not. If it throws an exception, the program will terminate. Therefore, JOANA shows a flow from GPS to BME, i.e., no BME data is obtained if the file write of GPS is unsuccessful. Similar false positives have also been reported in the existing literature [30]. This violation is not dangerous for our non-interference policy, as the sensors are still not affecting each other's values, and it is therefore safe to ignore.
Table 6 shows the number of violations detected by JOANA at the PDG level for the entire PolIoT software in six scenarios labeled A-F. We find that although the number of violations is high, they emanate from just a few lines in the code. It can be seen from Table 6 that once we start decreasing the number of file writes, the number of violations also decreases. This validates that the file write exceptions are the primary cause of the encountered false alarms.
There is a second kind of false alarm raised by JOANA for PolIoT. As shown in scenario E in Table 6, the number of violations does not become zero in spite of having no file writes. Our Java program has three classes: IMU, Camera and PGB (for PM, GPS and BME) running concurrently. They invoke the same class for getting the timestamps of the sensor readings. The timestamp is returned in a local variable which is not shared between the threads. However, JOANA still reports an illicit flow from the source in the IMU class to the sinks in the PGB and Camera classes. Similarly, the source in the Camera class shows illicit flows to the sinks in the other two classes. JOANA merges the three invocations of the timestamp class in different threads. We observe that if the timestamp is present in just one of the threads, we get the last case (scenario F) in Table 6 with zero violations. Therefore, it is safe to ignore this false alarm.
In essence, IoT developers have to iteratively go through the alarms raised by JOANA, fixing the true alarms and analyzing the false alarms to see if they can be safely ignored. As JOANA catches all violations and has few, easy-to-analyze false positives, we believe that this one-time manual process brings forth a significant security impact at an acceptable overhead.
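The first kind of false alarm described above can be illustrated with a small Java sketch. It is not the PolIoT code or Listing 3: the class name FalseAlarmSketch, the methods writeToFile, stamp and readSensors, and the failWrite flag are all hypothetical, and the sketch catches the exception instead of terminating so that the effect is observable. It shows only that the BME step is control-dependent on the GPS file write succeeding, even though no GPS value ever reaches the BME reading. The shared stamp helper loosely mirrors the common timestamp class mentioned in the text, but the sketch is single-threaded and does not model how JOANA merges invocations across threads.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration; names are assumptions, not taken from PolIoT. */
public class FalseAlarmSketch {

    /** Stand-in for a file write; failWrite simulates an I/O error. */
    public static void writeToFile(String record, boolean failWrite) throws IOException {
        if (failWrite) {
            throw new IOException("simulated write failure for " + record);
        }
    }

    /** Shared helper used by every sensor; the result stays in a caller-local variable. */
    public static String stamp(String sensor) {
        return sensor + "@" + System.nanoTime();
    }

    /** Returns the names of the sensor steps that completed. */
    public static List<String> readSensors(boolean failGpsWrite) {
        List<String> completed = new ArrayList<>();
        try {
            String gps = stamp("GPS");       // GPS reading (source)
            writeToFile(gps, failGpsWrite);  // may throw
            completed.add("GPS");

            // Reached only if the GPS write did not throw: a control
            // dependence on the GPS step, but no GPS data flows into bme.
            String bme = stamp("BME");
            writeToFile(bme, false);
            completed.add("BME");
        } catch (IOException e) {
            completed.add("ABORTED");
        }
        return completed;
    }

    public static void main(String[] args) {
        System.out.println(readSensors(false)); // [GPS, BME]
        System.out.println(readSensors(true));  // [ABORTED]
    }
}
```

A dependence-graph analysis that models the exceptional edge out of the GPS write would plausibly report a GPS-to-BME flow here, which matches the paper's argument that such an alarm reflects control flow rather than one sensor's value influencing another's.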