Abstract— Users of embedded and cyber-physical systems expect dependable operation for an increasingly diverse set of applications and environments. Reactive self-diagnosis techniques either use unnecessarily conservative guardbands, or do not prevent catastrophic failures. In this letter we utilize machine-learning techniques to design a prediction engine in order to predict failures on-device in embedded systems. We evaluate our prediction engine's effectiveness for predicting temperature behavior on a mobile system-on-chip, and propose a realizable hardware implementation for the use-case.

1 Harbin Institute of Technology, 1162620312 at stu.hit.edu.cn
2 Center for Embedded and Cyber-physical Systems, UC Irvine, {minjun.seo,bdonyana,dutt,kurdahi} at uci.edu
*This work was partially supported by NSF Grant CCF-1704859

I. Introduction

The complexity of embedded system platforms and the applications they support is continuously increasing: they run large and evolving applications on heterogeneous multi- or many-core processing platforms. Examples include automated and autonomous driving, smart buildings, Industry 4.0, and personal medical devices. Such systems are required to provide dependable operation for the user while dealing with a large number of internal and external variabilities, threats, and uncertainties over their lifetimes.

To provide such dependable operation, self-diagnosis techniques are developed for early detection of degradation and imminent failures, in order to maximize system life-cycle. These techniques can be combined with unsupervised platform self-adaptation to meet performance and safety targets. Self-diagnosis techniques that are reactive may (a) not be sufficient to address catastrophic failures, or (b) take overly conservative approaches that hinder performance. For example, consider thermal management of an embedded system-on-chip (SoC). One technique is to define a temperature threshold, and throttle performance (e.g., via dynamic voltage-frequency scaling (DVFS)) when the threshold is exceeded. This approach is reactive and must act conservatively to prevent overheating; the conservative frequency throttling may degrade performance, potentially unnecessarily. If the temperature behavior could be predicted, a proactive approach could manage the temperature without sacrificing performance excessively. However, system dynamics such as temperature can behave nonlinearly, and are hard to predict without workload knowledge.

Machine learning techniques such as neural networks are useful for identifying complex system dynamics. However, neural networks are complex and difficult to deploy on power-constrained embedded systems. In this paper, we propose a failure prediction technique for embedded systems using long short-term memory (LSTM), a type of recurrent neural network (RNN). We demonstrate the effectiveness of our predictor for predicting temperature behavior with respect to a threshold on an ODROID-XU3 [9] platform, making it a candidate for mitigating overheating failures and implementing efficient control policies. We specify an implementation that is realizable in hardware on low-power embedded systems. The specific contributions are as follows:
• We propose a method for hardware hazard prediction called the Long Short-Term Prediction Model.
• We propose an architecture and hardware implementation of a non-intrusive prediction engine based on the Long Short-Term Prediction Model to predict temperature behavior in embedded systems.
• We evaluate the predictor using measured temperature data from an ODROID-XU3.

II. Background and Related Work

When modern systems-on-chip (SoCs) operate near peak performance for extended periods, power dissipation can increase the temperature to the point that it adversely impacts chip reliability. If we can provide proactive thermal management, we can avoid potentially dangerous execution scenarios. Proaction requires prediction. A number of strategies have been proposed for on-chip thermal prediction, and the methods can be classified into two categories. The first builds prediction models from measured temperature and power consumption [21], [19], [14], [15], [17]. The second builds the prediction model indirectly using equations, without thermal measurements [5], [4], [7]. Separately, there have been many successful applications of machine learning techniques to failure detection or prediction in large-scale systems. With sufficient sensor input, machine learning models can extract complex or subtle dynamics, potentially resulting in accurate predictions when applied to new execution scenarios. Failure prediction has been proposed using support vector machines (SVMs) [10], [3], convolutional neural networks (CNNs) [16], and a combination of techniques [8].

RNNs are naturally suited for learning temporal sequences and modeling time-series behaviors, and have been applied to predict various behaviors in large-scale systems [20], [13], [6]. In [6], the authors compare an RNN solution with an LSTM solution, and observe that LSTMs significantly outperform RNNs in terms of accuracy. In [18], [2], [11], LSTMs are used in other domains for time-series predictions such as water quality estimation, stock transaction prediction, mechanical states, and more. The authors compare LSTM networks with alternatives such as back-propagation neural networks, online sequential extreme learning machines, and support vector regression machines (SVRM), and demonstrate the superiority of LSTMs.

III. Contributions

We propose a method for predicting runtime behavior in hardware: the Long Short-Term Prediction Engine. In this section, we describe how our predictor is composed by walking through our use-case: predicting runtime temperature behavior on an embedded system-on-chip. Our goal is to predict temperature behavior such that critical thermal scenarios can be detected in advance and avoided, with a solution that can feasibly be integrated in an embedded SoC. Our SoC consists of four ARM A15 cores with a shared L2 cache connected via bus. We measure total power and temperature of the entire core cluster, as well as per-core utilization. To generate workloads, we use a configurable synthetic microbenchmark [12]. The microbenchmark can stress the architecture across a wide range, and we generated a "general-purpose" workload by executing the microbenchmark in phases that exercised different behavior along these various dimensions. We execute different sequences on multiple cores to emulate different applications, both to train the model and to test its performance.

The prediction engine consists of two parts: a short-term binary model and a long-term regression model. The short-term binary model makes precise predictions quickly, useful for subtle changes, i.e., anticipating violations of a temperature threshold. The long-term regression model can make a prediction further in advance, useful for predicting general behavior in less-critical scenarios, i.e., predicting temperature trends in a safe state.

A. Short-Term Binary Model

The short-term binary model is used to predict unwanted behavior, i.e., constraint violation. In our case, in which we have a temperature threshold we do not want to violate, the short-term binary model is utilized when the measured temperature is nearing the threshold. In this scenario, a slight rise in temperature will cause a failure (violation of the constraint), so it is important to have a high recall rate. The recall rate must be tuned carefully to balance accuracy and overhead.

1) Model Definition

Our initial short-term binary model is defined as follows:
• Input: temperature, core utilization, power.
• Output: probability of failure (after applying a decision threshold, the model produces a binary result: '0' refers to normal and '1' refers to failure).

2) Model Training

Figure 1 shows measured temperature data from the ODROID-XU3¹. We first isolate the data above the critical point (85 °C) to use as training data. Because the range of this data is reduced, we amplify the changes in the data to increase its variation. When performing amplification at runtime, we must consider constraints such as the real-time hardware implementation and the short failure intervals. We therefore create a method called Sliding Average Amplification to efficiently preprocess the data and increase its variation, and apply it to the four input features. The method takes local data (5 timesteps) and uses min-max normalization to amplify the values. The following equations show the calculation of Sliding Average Amplification, where D(t) is the feature value at time t and n is the number of timesteps treated as local data:

average(t) = (1/n) · Σ_{i=0..n} D(t−i)    (1)

max(t) = MAX{D(t−n), D(t−n+1), ..., D(t)}    (2)

min(t) = MIN{D(t−n), D(t−n+1), ..., D(t)}    (3)

amplified(t) = (D(t) − average(t)) / (max(t) − min(t)) × 100    (4)

¹ Our use-case system, containing the described SoC.
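To make the computation concrete, here is a minimal Python sketch of Sliding Average Amplification, assuming a window of n = 5 timesteps; the guard for flat windows where max(t) = min(t) is our addition, since the equations leave that case implicit:

```python
import numpy as np

def sliding_average_amplification(d, n=5):
    """Amplify local variation of one feature series d per eqs. (1)-(4).

    d is assumed to be a 1-D NumPy array; the same routine is applied
    independently to each of the four input features.
    """
    out = np.zeros(len(d))
    for t in range(len(d)):
        window = d[max(0, t - n):t + 1]    # local data: last n+1 samples
        avg = window.mean()                 # eq. (1)
        rng = window.max() - window.min()   # eqs. (2)-(3)
        # eq. (4); flat windows (rng == 0) are mapped to 0 by assumption
        out[t] = 0.0 if rng == 0 else (d[t] - avg) / rng * 100.0
    return out
```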
Figure 2 shows the amplified data along with the original: the orange curve is the original data and the blue curve is the amplified data.

3) Improved Loss Function

Our initial binary model still has a significant issue: it is trained with imbalanced data. Normal samples (i.e., non-critical temperatures) account for nearly 99.5% of the training data. Due to the low ratio of failure samples (i.e., critical temperatures), the model becomes highly confident in predicting normal behavior, which is misleading. We augment the classic binary cross-entropy loss function with weights in order to increase the model's sensitivity to failure samples, where y is the actual label and ŷ is the predicted value. The weight factor α is determined empirically from the rate of failure samples in the training data:

Loss = −(α · y · log ŷ + (1 − α) · (1 − y) · log(1 − ŷ))    (5)

α = 0.992    (6)
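For reference, a NumPy sketch of the weighted loss; the clipping epsilon is our addition, to keep the logarithms finite:

```python
import numpy as np

def weighted_bce(y_true, y_prob, alpha=0.992, eps=1e-7):
    """Class-weighted binary cross-entropy, eqs. (5)-(6).

    y_true: ground-truth labels (1 = failure, 0 = normal)
    y_prob: predicted failure probabilities
    alpha:  weight on the rare failure class (eq. 6)
    """
    y_prob = np.clip(y_prob, eps, 1.0 - eps)   # avoid log(0)
    loss = -(alpha * y_true * np.log(y_prob)
             + (1.0 - alpha) * (1.0 - y_true) * np.log(1.0 - y_prob))
    return loss.mean()
```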
4) Model Structure

We propose the simplest structure of an RNN prediction model that provides the required accuracy, in order to minimize the hardware overhead. The LSTM internal structure is defined in the following equations, where x_t is the input feature vector, h_t is the output, W and b are weights and biases, c_t is the cell state, and ⊙ denotes element-wise multiplication:

i_t = σ(W_xi · x_t + W_hi · h_{t−1} + b_i)    (7)
f_t = σ(W_xf · x_t + W_hf · h_{t−1} + b_f)    (8)
o_t = σ(W_xo · x_t + W_ho · h_{t−1} + b_o)    (9)
c̃_t = tanh(W_xc · x_t + W_hc · h_{t−1} + b_c)    (10)
c_t = f_t ⊙ c_{t−1} + i_t ⊙ c̃_t    (11)
h_t = o_t ⊙ tanh(c_t)    (12)
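A direct NumPy transcription of one cell update follows; the dict packaging of the weights is ours, and real implementations typically fuse the four gates into a single matrix multiply:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step, mirroring eqs. (7)-(12); W and b are dicts
    keyed by the equation subscripts (e.g., W['xi'], b['i'])."""
    i = sigmoid(W['xi'] @ x_t + W['hi'] @ h_prev + b['i'])        # (7)
    f = sigmoid(W['xf'] @ x_t + W['hf'] @ h_prev + b['f'])        # (8)
    o = sigmoid(W['xo'] @ x_t + W['ho'] @ h_prev + b['o'])        # (9)
    c_tilde = np.tanh(W['xc'] @ x_t + W['hc'] @ h_prev + b['c'])  # (10)
    c_t = f * c_prev + i * c_tilde                                 # (11)
    h_t = o * np.tanh(c_t)                                         # (12)
    return h_t, c_t
```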
Figure 3 (black and blue) illustrates the architecture of the proposed RNN/LSTM model, which contains two RNN/LSTM layers (the RNN and LSTM structures provide comparable accuracy, as shown in Section IV), one fully connected layer, and one binary classification layer based on sigmoid activation. The input features are time sequences of temperature, per-core utilization, and power. After the calculation of time step t in the first layer, the result is conveyed both to step t+1 of the same layer and to step t of the second layer; at the same time, the step t+1 input is added into the next step's calculation. Each RNN/LSTM layer spans 8 time steps with 64 hidden units. At the last time step, the result is passed to a fully-connected layer and a sigmoid layer for classification. The output is the failure probability; when the value is greater than 0.9, we classify it as a failure and output 1.
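A sketch of an equivalent network in Keras (our rendering; the letter specifies the structure, not a framework). The final Dense layer folds the fully-connected and sigmoid layers of Figure 3 into one:

```python
import numpy as np
import tensorflow as tf

N_STEPS, N_FEATURES = 8, 6   # 8 time steps; temperature, power, 4 core loads

model = tf.keras.Sequential([
    tf.keras.Input(shape=(N_STEPS, N_FEATURES)),
    tf.keras.layers.LSTM(64, return_sequences=True),  # layer 1: per-step h
    tf.keras.layers.LSTM(64),                         # layer 2: last step only
    tf.keras.layers.Dense(1, activation='sigmoid'),   # failure probability
])
# In training, the weighted loss of eq. (5) would replace the plain BCE.
model.compile(optimizer='adam', loss='binary_crossentropy')

# Per the text, a probability above 0.9 is reported as a failure (output 1).
prob = model.predict(np.zeros((1, N_STEPS, N_FEATURES)))
print(int(prob[0, 0] > 0.9))
```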
Fig. 3: Integrated model structure. The structures are shared between the short-term binary model and the long-term regression model, depending on which is active. Functionality and structure specific to the short-term binary model is shown in blue, and that specific to the long-term regression model in red.

B. Long-Term Regression Model

The long-term regression model is used to predict behavior in the safe state, i.e., while the system is below the critical temperature.
The benchmarks vary in instructions-per-cycle (IPC), utilization, and cache miss rate, exercising the processor across a wide range. For unicore workloads, we first run each benchmark on one core to emulate a stable workload state; we then combine multiple benchmarks and start them one by one to emulate a changing workload state on one core. For multicore workloads, we assign different benchmarks to different cores and start them simultaneously. For shifting workloads, we assign the same benchmarks to different cores and start them at different times.

Raw data collected from the ODROID-XU3 does not initially appear stable, making filtering essential.² After trying several filters to smooth the raw data, and considering hardware feasibility, we conclude that data preprocessed by a recursive average filter produces the most accurate model. The filter size for each input is determined empirically.

² Data is stored in a userspace buffer, sampled from sensors via kernel drivers every 5 ms.
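The letter does not give the filter equation; a common recursive form of the moving-average filter, assumed here, updates the mean incrementally instead of re-summing the window. One filter instance would run per input feature, with the window size n chosen empirically as described above:

```python
from collections import deque

class RecursiveAverageFilter:
    """Sliding mean maintained recursively: avg += (new - oldest) / n."""

    def __init__(self, n):
        self.n, self.buf, self.avg = n, deque(), 0.0

    def update(self, x):
        self.buf.append(x)
        if len(self.buf) > self.n:
            self.avg += (x - self.buf.popleft()) / self.n
        else:
            self.avg += (x - self.avg) / len(self.buf)  # window still filling
        return self.avg
```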
3) Model Structure

LSTM is naturally suited to storing long-term memory; therefore, to handle the long-term cases, we choose an LSTM structure for our model. Compared to the short-term model, more historical data is needed to ensure precision when predicting over a large temperature range far in advance, which increases the model's time step count and execution time. We therefore apply stateful LSTM theory in the cell structure, feeding the output cell state back as the initial state. In this way, the structure can retain long-term memory and adapt better.

Figure 3 (black and red) illustrates the architecture of the proposed LSTM model. The input features are time sequences of temperature, per-core utilization, and power. After the calculation of step t, the cell state is recycled into the next term's calculation. There are 8 time steps in the LSTM layer and 64 hidden units in each cell. We need 16 previous steps for prediction; therefore, the cell state is passed on for initialization every second iteration.
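A self-contained sketch of this cell-state recycling (placeholder random weights; a real deployment would load trained values): the (h, c) pair that closes one 8-step window initializes the next, so two chained windows cover the required 16 steps of history.

```python
import numpy as np

HID, FEAT = 64, 6
rng = np.random.default_rng(0)
# Placeholder weights with the shapes eqs. (7)-(12) require.
W = {k: 0.1 * rng.standard_normal((HID, FEAT if k[0] == 'x' else HID))
     for k in ('xi', 'hi', 'xf', 'hf', 'xo', 'ho', 'xc', 'hc')}
b = {k: np.zeros(HID) for k in 'ifoc'}
sig = lambda z: 1.0 / (1.0 + np.exp(-z))

def lstm_window(x_win, h, c):
    """Run one 8-step window (eqs. 7-12), threading (h, c) through it."""
    for x in x_win:
        i = sig(W['xi'] @ x + W['hi'] @ h + b['i'])
        f = sig(W['xf'] @ x + W['hf'] @ h + b['f'])
        o = sig(W['xo'] @ x + W['ho'] @ h + b['o'])
        c = f * c + i * np.tanh(W['xc'] @ x + W['hc'] @ h + b['c'])
        h = o * np.tanh(c)
    return h, c

# Stateful chaining: the state that closes window k initializes window k+1,
# so two consecutive 8-step windows span 16 steps of history.
h, c = np.zeros(HID), np.zeros(HID)
for win in np.split(rng.standard_normal((16, FEAT)), 2):
    h, c = lstm_window(win, h, c)
```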
C. Hardware Implementation Framework

To integrate the short- and long-term models, we specify a single shared-hardware implementation that supports all of Figure 3. A judgement module receives temperature values from the sensor and decides which model to activate: if the temperature is ≥ 85 °C, the short-term prediction model is activated and its weights are loaded into the model structure; if it is < 85 °C, the long-term prediction model is activated. To reduce structural overhead, the core LSTM and fully connected layers are partially shared, composed with the least common parameters (LSTM: 8 time steps, 64 hidden units; fully connected: 4 hidden units). The excess time steps can be stored in a state buffer and fed back (Figure 3).

Using the LSTM implementation of Chang et al. [1], we calculate an overhead of 12960 flip-flops, 7201 LUTs, and 16 BRAMs. The LSTM hardware is 20 times faster than the Zynq ZC7020's ARM-based hard-core processor (4.4 µs per inference), and 44 times more power-efficient (performance-per-watt) than a software implementation on the Zynq ZC7020.
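Behaviorally, the judgement module is a threshold multiplexer over the shared structure. A Python sketch of the control flow follows; the 85 °C threshold and weight reloading come from the text, while the class and the load_weights hook are hypothetical packaging:

```python
THRESHOLD_C = 85.0  # switch point between the two models

class JudgementModule:
    """Activates the model whose weights should occupy the shared structure."""

    def __init__(self, short_term_model, long_term_model):
        self.short, self.long = short_term_model, long_term_model
        self.active = None

    def predict(self, temperature_c, features):
        model = self.short if temperature_c >= THRESHOLD_C else self.long
        if model is not self.active:
            model.load_weights()   # hypothetical hook: reload shared structure
            self.active = model
        return model.predict(features)
```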
IV. Evaluation

We evaluate the effectiveness of our Short-Term Binary Model and our Long-Term Regression Model separately, using additional measured data from the ODROID-XU3. The measured data consists of the model input data sampled at 5 ms intervals. We perform sensitivity analyses of the LSTM/RNN models for different parameters and structures.

A. Short-Term Binary Model Evaluation

1) Evaluation Metrics

The output of the short-term binary model is a binary classification. We evaluate the model by average precision (AP) score and F1 score. The average precision score summarizes a precision-recall curve as the weighted mean of the precision achieved at each recall threshold, with the increase in recall from the previous threshold used as the weight:

AP = Σ_n (R_n − R_{n−1}) · P_n    (13)

where P_n and R_n are the precision and recall at the nth threshold. The F1 score is a measure of a test's accuracy, defined as the harmonic mean of the test's precision (P) and recall (R), conveying a balance between the two:

F1 = (2 × P × R) / (P + R)    (14)
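Both metrics can be computed off the shelf; a scikit-learn sketch with illustrative arrays (not our measured data):

```python
import numpy as np
from sklearn.metrics import average_precision_score, f1_score

y_true = np.array([0, 0, 1, 1, 0, 1])                 # measured failures
y_prob = np.array([0.1, 0.4, 0.95, 0.80, 0.3, 0.92])  # predicted probabilities

ap = average_precision_score(y_true, y_prob)          # eq. (13)
f1 = f1_score(y_true, (y_prob > 0.9).astype(int))     # eq. (14) at the 0.9 cut
print(f"AP = {ap:.2f}, F1 = {f1:.2f}")
```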
Fig. 4: Sample prediction of one workload. Binary events (i.e., experiencing critical temperature) are predicted and observed.

2) Evaluation Results

The model can predict up to 8 steps (40 ms) ahead. The F1 score is 0.43 and the AP score is 0.78. The latency of the short-term binary model is 0.088 ms (based on execution in Python, without hardware acceleration). Figure 4 shows the prediction result for one dataset: the orange shows measured failures and the blue shows predicted failures. Observe that there are a number of mispredicted failures (false positives). This is preferable to false negatives (non-predicted failures), as we are trying to anticipate and potentially avoid undesirable system states. In fact, in the experiment shown in Figure 4, the recall value is 1, which means that all measured failures are predicted, i.e., we have no false negatives.

a) Model Structure Tradeoffs

To ensure the practical utility of our hardware predictor in low-power embedded systems, it is important to balance precision and complexity. Considering the feasibility constraints, we explore the impact of several hyper-parameters and layer structures on model performance. The parameters include RNN type, model structure, number of hidden neurons, decimal digits, and number of time steps. We evaluate the RNNs and LSTMs based on AP, F1, recall (performance), runtime, and degree of prediction. Figure 5 shows how the different hyper-parameters affect model performance: the left y-axes measure AP, F1, and recall scores, while the right y-axes measure the time it takes to generate one prediction; solid lines refer to the model with LSTM layers, and dotted lines to the model with RNN layers. Figure 5a shows how the number and type of layers affect performance: LSTM has better accuracy, and prediction time increases with the number of layers, so a 2-layer LSTM is the best choice. Figure 5b shows how the number of previous timesteps affects performance: beyond five timesteps, accuracy plateaus while prediction time increases, so five timesteps is the best choice. Figure 5c shows how the number of neurons affects performance: accuracy plateaus beyond 32 neurons, so we choose 32 neurons for the network. Figure 5d shows how the number of decimal digits influences performance: two digits is the minimum that maintains accuracy. Figure 5e shows how accuracy degrades as the prediction moves further in advance.

Fig. 5: Effect of hyper-parameters on model performance (solid lines: LSTM; dotted lines: RNN; left axes: AP/F1/recall; right axes: prediction latency). (a) Comparison for number of network layers. (b) Comparison for number of time steps considered in the network. (c) Comparison for number of neurons. (d) Comparison for number of decimal places (precision) used in the model. (e) Comparison for prediction distance.

B. Long-Term Regression Model

For the regression model, we use mean absolute error (MAE) to evaluate accuracy, where P_i is the temperature predicted k steps in advance and M_{i+k} is the temperature measured at step i + k:

MAE = (1/n) · Σ_{i=1..n} |P_i − M_{i+k}|    (15)
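A NumPy sketch of this k-step-aligned error, assuming pred[i] is the forecast issued k steps before meas[i + k]:

```python
import numpy as np

def mae_k_ahead(pred, meas, k=64):
    """Mean absolute error of k-step-ahead predictions, eq. (15)."""
    n = len(pred) - k
    return np.mean(np.abs(pred[:n] - meas[k:k + n]))
```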
Figure 6 shows a sample time plot of one experiment. The orange dashed line shows the measured temperature 64 steps (320 ms) in advance, and the blue line is the temperature predicted in real time. The latency of the long-term regression model is 0.108 ms (without hardware acceleration). The MAE achieved by the predictor for 320 ms in advance is 0.018. The highest accuracy achieved by existing prediction methods is 0.024 MAE [17], and the longest prediction step is 500 ms [4], which we improve by 25% and 36%, respectively.

V. Conclusion

We propose a new LSTM-based method for hardware hazard prediction called the Long Short-Term Prediction Engine. The prediction engine uses two models to provide predictions for both urgent and normal conditions, which have different prediction requirements. The integrated model is trained and tested on data collected on the ODROID-XU3 platform. The short-term model makes precise binary predictions near critical conditions 40 ms in advance, and reaches a 0.78 average precision score. The long-term model outputs temperature values up to 320 ms in advance with an MAE of 0.018. We simplify the structure of the network and its hyper-parameters to find one suited for hardware realization, sharing parts of the network and automatically switching between the two models according to temperature.

References

[1] A. X. M. Chang, B. Martini, and E. Culurciello, "Recurrent neural networks hardware implementation on FPGA," arXiv preprint arXiv:1511.05552, 2015.
[2] Z. Chen, Y. Liu, and S. Liu, "Mechanical state prediction based on LSTM neural network," in Chinese Control Conference, 2017.
[3] A. Chigurupati, R. Thibaux, and N. Lassar, "Predicting hardware failure using machine learning," in Reliability and Maintainability Symposium, 2016.
[4] R. Cochran and S. Reda, "Consistent runtime thermal prediction and control through workload phase detection," in ACM/IEEE Design Automation Conference, 2010.
[5] A. K. Coskun, T. S. Rosing, and K. C. Gross, "Utilizing predictors for efficient thermal management in multiprocessor SoCs," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2009.
[6] F. D. d. S. Lima, G. M. R. Amaral, L. G. d. M. Leite, J. P. P. Gomes, and J. d. C. Machado, "Predicting failures in hard drives with LSTM networks."
[7] Y. Ge, Q. Qiu, and Q. Wu, "A multi-agent framework for thermal aware task migration in many-core systems," IEEE Transactions on Very Large Scale Integration Systems, 2012.
[8] I. Giurgiu, J. Szabo, D. Wiesmann, and J. Bird, "Predicting DRAM reliability in the field with machine learning," in ACM/IFIP/USENIX Middleware Conference: Industrial Track, 2017.
[9] Hardkernel, "ODROID-XU," Tech. Rep. [Online]. Available: http://www.hardkernel.com/main/main.php
[10] R. Kumar, S. Vijayakumar, and S. A. Ahamed, "A pragmatic approach [...] and big data technologies," in IEEE International Advance Computing Conference, 2014.
[11] S. Liu, G. Liao, and Y. Ding, "Stock transaction prediction modeling and analysis based on LSTM," in IEEE Conference on Industrial Electronics and Applications, 2018.
[12] T. Mück, S. Sarma, and N. Dutt, "Run-DMC: Runtime dynamic heterogeneous multicore performance and power estimation for energy efficiency," in International Conference on Hardware/Software Codesign and System Synthesis, 2015.
[13] S. Huang, C. Fung, K. Wang, P. Pei, Z. Luan, and D. Qian, "Using recurrent neural networks toward black-box system anomaly prediction," in IEEE/ACM International Symposium on Quality of Service, 2016.
[14] S. Sharifi, D. Krishnaswamy, and T. S. Rosing, "Prometheus: A proactive method for thermal management of heterogeneous MPSoCs," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2013.
[15] G. Singla, G. Kaur, A. K. Unver, and U. Y. Ogras, "Predictive dynamic thermal and power management for heterogeneous mobile platforms," in Design, Automation & Test in Europe Conference & Exhibition, 2015.
[20] C. Xu, G. Wang, X. Liu, D. Guo, and T. Liu, "Health status assessment and failure prediction for hard drives with recurrent neural networks," IEEE Transactions on Computers, 2016.
[21] I. Yeo, C. C. Liu, and E. J. Kim, "Predictive dynamic thermal management for multicore systems," in ACM/IEEE Design Automation Conference, 2008.