State of Charge Prediction of EV Li-Ion Batteries Using EIS: A Machine Learning Approach
Energy
journal homepage: www.elsevier.com/locate/energy
Article info

Article history:
Received 6 October 2020
Received in revised form 29 December 2020
Accepted 12 February 2021
Available online 16 February 2021

Index Terms:
Electric vehicle
Electrochemical impedance spectroscopy
Li-ion batteries
Machine learning

Abstract

Due to the significantly complex and nonlinear behavior of li-ion batteries, forecasting the state of charge (SOC) of the batteries is still a great challenge. Accurate SOC estimation is nevertheless essential for the proper operation of batteries while the battery is monitored by the battery management system (BMS). To this end, this paper employs informative measurements of electrochemical impedance spectroscopy (EIS) in machine learning (ML) models, i.e., a linear regression model and Gaussian process regression (GPR), to accurately predict the SOC of li-ion batteries. First, a feature sensitivity analysis of the data is conducted to extract the most reliable features, i.e., the EIS impedances which are highly correlated with SOC, from EIS measurements. Then, the models are fed with the chosen features. The models are trained on the input features to establish the mapping relationship between the selected features and the SOC. The results demonstrate that the error of the GPR model is less than 3.8%. Considering onboard EIS measurements, this method can be practically embedded in the battery management system for accurate measurements of the SOC of li-ion batteries and ensure the proper and efficient operation of battery-powered electric vehicles.
© 2021 Elsevier Ltd. All rights reserved.
https://doi.org/10.1016/j.energy.2021.120116
0360-5442/© 2021 Elsevier Ltd. All rights reserved.
I. Babaeiyazdi, A. Rezaei-Zare and S. Shokrzadeh Energy 223 (2021) 120116
estimation for SOCs at sub-zero temperatures and SOCs below 10%. Therefore, due to the internal complex chemical reaction process and uncertain external operating conditions of batteries, modeling the batteries based on the ECM methods is challenging for estimating the battery characteristics in real-life operation [13,14]. Physics-based models (PBMs) provide insights through chemical and electrochemical dynamics, such as li-ion diffusion and Ohmic effects [13]. However, to estimate SOC using PBMs, partial differential equations must be solved by the BMS controller, which imposes a heavy computational burden [15]. Data-driven models depend only on historical data and do not need complicated equivalent or mathematical models. However, the challenge of data-driven models is the acquisition of informative inputs to construct a robust model for predicting the battery characteristics. Additionally, effective extraction of features from historical data remains a challenging task [16]. In Ref. [17], the SOC of the battery is predicted by a neural network (NN) that uses the voltage, current, temperature, and power of the battery as input features. SOC prediction has also been conducted in Ref. [18], employing an NN and random forest/tree; voltage, current, and cycling number serve as the inputs of the machine learning (ML) black box in that paper. In another study [19], a support vector machine along with Gaussian methods estimates the SOC of the battery and extracts feature variables based on the charging curve. However, all of these data-driven models that use terminal voltage as an input feature may lose accuracy, because the terminal voltage of the battery drops suddenly at the end of discharge and therefore does not provide reliable data for low SOCs [20]. Thus, identifying and extracting reliable features has become the main bottleneck in the adoption of data-driven approaches, and more research is required in this regard.

On the other hand, EIS measurements over a wide range of frequencies provide rich information about the dynamic characteristics of the battery and pave the way for precise estimation of the battery status. Nevertheless, none of the reviewed papers have adopted the EIS measurements directly as input data for machine learning models to predict SOC, except for [21], in which the EIS data obtained for SOCs above 30% and at room temperature were utilized in a deep NN. That model does not employ EIS data over a wide range of temperatures and at different SOC points [21], and such data exclusion decreases the accuracy and reliability of the model. Also, the reported error of the model of [21] is less than 5%.

This study investigates the effectiveness of the EIS measurement data for estimating the SOC of li-ion batteries using machine learning techniques. In contrast to Ref. [21], which uses all the EIS impedances from the EIS spectrum to estimate SOC, only the EIS impedances highly correlated with SOC are used in this paper. The advantages of the proposed method are higher model accuracy and a lower computational burden, obtained by eliminating irrelevant input features, i.e., EIS impedances with low correlations. Therefore, highly correlated impedances are first identified and then extracted from EIS spectrum measurements obtained at SOCs from 0% to 100%. The chosen impedances are utilized as input features for the linear regression model and Gaussian process regression (GPR). The models are trained on the input features to establish the mapping relationship between the selected frequencies and the SOC. Finally, the trained models are employed to achieve SOC prediction.

Moreover, since the machine learning algorithm depends neither on the model of the battery nor on the method by which the battery is charged/discharged, and only the input and output of the dataset matter here, the model can predict the SOC by interpolating or extrapolating the dataset, regardless of the charging or discharging mode of the battery. The SOC can be precisely estimated for aged batteries if the EIS measurements dataset is available for degraded batteries with the state of health (SOH) between 60% and 100%. The battery's degradation was not considered in this paper because EIS measurements for different SOHs are unavailable in the dataset utilized. In contrast to many other studies that only take into account the EIS data obtained at above-zero temperatures, this study considers the EIS data for both above-zero and sub-zero temperatures, i.e., as low as −20 °C. The results demonstrate an error of less than 3.8% for the GPR model. Considering the online and on-board EIS measurement [22–24], this method can be practically embedded in the BMS for accurate measurements of SOC.

The paper is organized as follows: in Section 2, the electrochemical impedance spectroscopy measurement is fully explained. In the next section, the methodology for extracting reliable features and building the prediction models based on linear regression and the GPR algorithm is discussed. Section 4 introduces the results of the built-up models for predicting the SOC, and the last section is dedicated to the conclusions.

2. Electrochemical impedance spectroscopy

EIS is a non-destructive and information-rich test which is conducted by applying a galvanostatic or potentiostatic excitation signal over a wide range of frequencies to obtain the impedance of the battery during charging and discharging [25]. The excitation signals in the galvanostatic and potentiostatic methods are commonly sinusoidal current and voltage, and the corresponding responses are voltage and current, respectively. Based on these waveforms, the electrochemical impedance of the battery can be calculated. The impedance of the battery is obtained from the following equations in galvanostatic mode [26]:

$$\Delta I = I_{\max} \sin(2\pi f t), \tag{1}$$

$$\Delta V = V_{\max} \sin(2\pi f t + \phi), \tag{2}$$

$$Z(f) = \frac{V_{\max}}{I_{\max}} e^{j\phi}, \tag{3}$$

where ΔI is a sinusoidal current at frequency f, which is superimposed on the dc charging or discharging current and results in ΔV and phase angle φ. Accordingly, Eq. (3) shows that the battery's impedance is frequency-dependent and characterized by its magnitude and phase angle. Fig. 1 indicates a typical EIS spectrum. The horizontal axis indicates the real part of the impedance, and the vertical axis shows the negative of the imaginary part of the impedance. The EIS spectrum is drawn over a wide range of

Fig. 1. Typical EIS spectrum of li-ion battery.
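As an illustrative sketch of Eqs. (1)–(3), not taken from the paper, the impedance at one excitation frequency can be recovered from sampled current and voltage waveforms by correlating each with a sine and a cosine at that frequency. The function names, the sampling rate `fs`, and the numerical values below are assumptions chosen only for the example, and the record is assumed to span a whole number of excitation periods.

```python
import cmath
import math


def estimate_phasor(samples, f, fs):
    """Complex amplitude A*exp(j*p) of a signal modelled as A*sin(2*pi*f*t + p).

    Assumes the record covers a whole number of periods of frequency f.
    """
    n = len(samples)
    # Correlate with sin and cos at the excitation frequency (single-bin DFT).
    s = sum(x * math.sin(2 * math.pi * f * k / fs) for k, x in enumerate(samples))
    c = sum(x * math.cos(2 * math.pi * f * k / fs) for k, x in enumerate(samples))
    # For A*sin(wt + p): (2/n)*s = A*cos(p) and (2/n)*c = A*sin(p).
    return complex(2 * s / n, 2 * c / n)


def impedance(i_samples, v_samples, f, fs):
    """Z(f) = (V_max / I_max) * exp(j*phi), as in Eq. (3)."""
    return estimate_phasor(v_samples, f, fs) / estimate_phasor(i_samples, f, fs)


# Synthetic galvanostatic example (all values are hypothetical).
f, fs, n = 10.0, 1000.0, 1000           # 10 Hz excitation, 1 s record = 10 periods
i_max, v_max, phi = 0.5, 0.01, -0.3     # A, V, rad
t = [k / fs for k in range(n)]
delta_i = [i_max * math.sin(2 * math.pi * f * tk) for tk in t]          # Eq. (1)
delta_v = [v_max * math.sin(2 * math.pi * f * tk + phi) for tk in t]    # Eq. (2)

z = impedance(delta_i, delta_v, f, fs)
print(abs(z), cmath.phase(z))  # magnitude in ohms, phase in radians
```

Repeating this at each frequency of the sweep yields the real and imaginary parts that are plotted in an EIS spectrum.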
In this study, the experimental data from Ref. [27] have been utilized, where a Panasonic NCR18650PF lithium-ion battery, an NCA chemistry cell similar to the cells used in Tesla's electric cars [28], was tested. The battery specifications are presented in Table 1. In the test, EIS measurements were conducted over SOCs from 0% to 100% and a temperature range of −20 °C to 25 °C for a frequency sweep of 1 mHz to 6 kHz. Fig. 2 shows the battery's

3. Methodology

This section is dedicated to the feature sensitivity analysis, which captures the highly correlated EIS features, i.e., the EIS impedances that are highly correlated with the SOC of the battery. The selected reliable features are then utilized for training and testing of the machine learning models.
$$r_{X,Y} = \frac{E(XY) - E(X)E(Y)}{\sqrt{E(X^2) - E(X)^2}\,\sqrt{E(Y^2) - E(Y)^2}}, \tag{4}$$
where E is the expected value operator, and X and Y are two random variables. Fig. 4 shows the heatmap of the dataset [27] used in this study at different temperatures. Fig. 4 is a 2-D graphical representation that indicates the dependency between the features of the dataset. The features in the dataset are the electrochemical impedances at the frequencies at which they were measured. In this case, the number of features is 54, since the impedances were measured at 54 frequencies, sweeping from 1 mHz to 6 kHz. In Fig. 4, the correlation between the features varies from −1 to 1. Positive correlation is shown on a spectrum from light to dark red, and negative correlation on a spectrum from light to dark blue. Positive and negative correlations mean that the output varies in the same or the opposite direction, respectively, as the input variables. The heatmap is symmetric; thus, the last row or the last column represents the relation of the input variables, i.e., the impedances at different frequencies, with the output, i.e., the SOC of the battery. Fig. 4(a)–(e) show the heatmap of the dataset at temperatures from 25 °C down to −20 °C, respectively. Fig. 4(a) shows that the first few features are highly and negatively correlated with the SOC, and these features lie in the high- and mid-frequency regions of the EIS spectrum. In addition to what is depicted in Fig. 4(a), it is apparent from Fig. 4(b)–(e) that as the temperature decreases, some other features from the mid-frequency region appear to be positively correlated with the SOC. Another remarkable result deduced from Fig. 4 is that the low frequencies are significantly less correlated with the SOC. The reason is that

Fig. 2. EIS spectrum of the battery at (a) SOC of 50% and different ambient temperatures, (b) zoomed-in version of (a).
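The feature sensitivity analysis can be sketched in a few lines: compute the Pearson coefficient of Eq. (4) between each impedance feature and the SOC, then keep the features whose correlation reaches a chosen threshold. The feature names, the toy values, and the use of the absolute value of the coefficient (so that strongly negative correlations are also kept) are assumptions made for this example, not details confirmed by the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient written in the expectation form of Eq. (4)."""
    n = len(x)
    ex, ey = sum(x) / n, sum(y) / n
    exy = sum(a * b for a, b in zip(x, y)) / n
    ex2 = sum(a * a for a in x) / n
    ey2 = sum(b * b for b in y) / n
    return (exy - ex * ey) / (math.sqrt(ex2 - ex ** 2) * math.sqrt(ey2 - ey ** 2))


def select_features(features, soc, corr_value):
    """Names of features with |r| >= corr_value against the SOC (assumed rule)."""
    return [name for name, values in features.items()
            if abs(pearson(values, soc)) >= corr_value]


# Toy data: SOC in percent and two hypothetical impedance features.
soc = [0.0, 25.0, 50.0, 75.0, 100.0]
features = {
    "z_6kHz": [5.0, 4.1, 3.0, 2.2, 1.0],   # strongly (negatively) correlated
    "z_1mHz": [2.0, 3.5, 1.0, 3.0, 2.2],   # weakly correlated
}

selected = select_features(features, soc, corr_value=0.9)
print(selected)
```

Raising `corr_value` shrinks the feature set, which is the trade-off between relevance and retained information that the results section discusses.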
Therefore:
where y_i is the i-th case of the dependent variable Y, x_ij is the value of the j-th independent variable (X_j) for the i-th case of the dependent variable, β_0 is the Y-intercept of the regression surface, each β_j is the slope of the regression surface with respect to variable X_j, and finally ε_i is the random error component for the i-th case. In each equation in Eq. (5), the error is distributed with zero mean and standard deviation, and it is independent of the errors in the other equations. Since the variables are fixed quantities, the randomness of Y results from the randomness of the error terms in each equation; in terms of correlation, however, the input variables are treated as random variables, and the input variables are independent of the error terms. In matrix notation, Eq. (5) can be written as [30]:
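$$Y = X\beta + \varepsilon \tag{7}$$

where:

$$Y = [y_1 \; y_2 \; \cdots \; y_n]^T, \qquad \beta = [\beta_1 \; \beta_2 \; \cdots \; \beta_{k+1}]^T, \qquad X = \begin{bmatrix} x_{11} & \cdots & x_{1(k+1)} \\ \vdots & \ddots & \vdots \\ x_{n1} & \cdots & x_{n(k+1)} \end{bmatrix} \tag{8}$$

Fig. 3. EIS spectrum of the battery at (a) 25 °C, (b) 0 °C, and (c) −20 °C and at different SOC levels.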
Fig. 4. Heatmap for feature sensitivity analysis of EIS spectrum at (a) 25 °C, (b) 10 °C, (c) 0 °C, (d) −10 °C, (e) −20 °C.
and Y is the target vector, ε is the error vector, which is a column vector of length n, and β is the vector of parameters, which is a column vector of length k + 1. Matrix X is the input matrix, which is an n by k + 1 matrix. To make predictions, β and ε should be calculated. The structure of the regression model is shown in Fig. 6.

3.3. Gaussian process regression (GPR)

For a given training dataset T = {(x_i, y_i); i = 1, 2, …, n} with n pairs of inputs x_i, which may have one or more features, and outputs y_i, the GPR model computes the predictive distribution of unobserved test data with y* as output and x* as input [31]. In this study, X and Y are defined as X = [x_1, …, x_n]^T and Y = [y_1, …, y_n]^T, respectively. In this case, x_i = [EIS impedances] is the vector of EIS impedances and the output y_i is the SOC of the cells. It is also assumed that y_i = f(x_i) + ε_i, where ε_i ~ N(0, σ²) is independent and identically distributed Gaussian noise. The outputs F = (f(x_1), …, f(x_n)) are modeled as a Gaussian random field F
$$\Delta^2 = K(x^*, x^*) - K(x^*, X)\left[K(X, X) + \sigma^2 I\right]^{-1} K(X, x^*) \tag{11}$$
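A minimal NumPy sketch of the GPR prediction step follows. The variance line mirrors Eq. (11); the mean is the standard GPR predictive mean K(x*, X)[K(X, X) + σ²I]⁻¹Y, stated here as an assumption because the corresponding equation is not shown above. The squared-exponential kernel, its hyperparameters, and the toy data are likewise assumptions for illustration and are not the authors' settings.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel matrix between row-wise inputs a and b (assumed choice)."""
    d2 = np.sum(a ** 2, axis=1)[:, None] + np.sum(b ** 2, axis=1)[None, :] - 2.0 * a @ b.T
    return signal_var * np.exp(-0.5 * np.maximum(d2, 0.0) / length_scale ** 2)


def gpr_predict(X, Y, x_star, noise_var=1e-2, **kernel_args):
    """Predictive mean and variance of a zero-mean GP at the test inputs x_star."""
    K = rbf_kernel(X, X, **kernel_args) + noise_var * np.eye(len(X))   # K(X, X) + sigma^2 I
    K_s = rbf_kernel(x_star, X, **kernel_args)                         # K(x*, X)
    K_ss = rbf_kernel(x_star, x_star, **kernel_args)                   # K(x*, x*)
    mean = K_s @ np.linalg.solve(K, Y)
    cov = K_ss - K_s @ np.linalg.solve(K, K_s.T)                       # Eq. (11)
    return mean, np.diag(cov)


# Toy one-feature example: 8 training points plus one test point far from the data.
X = np.linspace(0.0, 1.0, 8)[:, None]
Y = np.sin(2.0 * np.pi * X[:, 0])
x_star = np.vstack([X, [[3.0]]])

mean, var = gpr_predict(X, Y, x_star, noise_var=1e-6, length_scale=0.2)
print(mean[:3], var[-1])
```

The predictive variance stays close to zero at the training inputs and grows toward the prior variance far from them, which is what makes the GPR output usable as a confidence indicator for an SOC estimate.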
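Fig. 5. PairGrid for reliable features of EIS spectrum at 25 °C.

3.5. Accuracy evaluation

3.5.1. R-squared

Goodness-of-fit R-squared (R²) is defined as [32]:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \tag{12}$$

$$\mathrm{MAE} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{\hat{y}_i} \right| \tag{13}$$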
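The two indices can be computed directly from measured and predicted SOC values, as sketched below. The sketch assumes that Eq. (13) normalizes each absolute error by the predicted value ŷ_i and expresses the result in percent; that reading of the equation, the function names, and the sample values are assumptions for illustration.

```python
def r_squared(y_true, y_pred):
    """Goodness-of-fit R^2 as in Eq. (12)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def mae_percent(y_true, y_pred):
    """Mean absolute error in percent, normalized by the predicted value (assumed form of Eq. (13))."""
    n = len(y_true)
    return 100.0 / n * sum(abs((y - p) / p) for y, p in zip(y_true, y_pred))


# Hypothetical measured and predicted SOC values in percent.
soc_true = [10.0, 50.0, 90.0]
soc_pred = [12.5, 50.0, 80.0]

r2 = r_squared(soc_true, soc_pred)
mae = mae_percent(soc_true, soc_pred)
print(r2, mae)
```

Because the error is divided by the prediction, a predicted SOC of exactly zero would make the index undefined, so such samples would need separate handling in practice.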
evaluation indices discussed in the previous section have been tabulated in Table 2 for different conditions.

Table 2

ML model           Corr-value  Temp (°C)        R_squared  MAE     RMSE
Linear Regression  0.5         25 (TS = 0.2)    0.978      4.8456  5.9924
                               10 (TS = 0.4)    0.74245    8.9317  20.098
                               0 (TS = 0.4)     0.62       17.444  20.098
                               −10 (TS = 0.2)   0.9894     2.9056  3.3387
                               −20 (TS = 0.2)   0.9825     4.230   4.3
Linear Regression  0.7         25 (TS = 0.2)    0.975      4.8456  5.9924
                               10 (TS = 0.4)    0.6983     9.5468  17.913
                               10 (TS = 0.6)    0.9429     5.5629  8.4434
                               0 (TS = 0.4)     0.7348     13.664  16.797
                               0 (TS = 0.5)     0.8764     9.7318  11.481
                               −10 (TS = 0.2)   0.9877     3.0525  3.602
                               −20 (TS = 0.2)   0.977      4.9398  4.8428
Linear Regression  0.9         25 (TS = 0.2)    0.9524     6.2756  8.275
                               10 (TS = 0.2)    0.78241    15.237  17.694
                               10 (TS = 0.6)    0.899      6.9951  11.216
                               0 (TS = 0.4)     0.4345     15.094  24.528
                               0 (TS = 0.5)     0.8785     9.0877  11.382
                               0 (TS = 0.6)     0.9259     6.9159  8.877
                               −10 (TS = 0.3)   0.9828     3.2369  3.5628
                               −20 (TS = 0.5)   0.875      6.5511  9.9834
                               −20 (TS = 0.6)   0.93       6.11    6.8883
GPR                0.9         25 (TS = 0.2)    0.998      1.3409  1.4602
                               10 (TS = 0.4)    0.9817     3.8178  4.4068
                               0 (TS = 0.3)     0.9036     8.6819  9.9630
                               −10 (TS = 0.3)   0.9838     2.7223  3.4537
                               −20 (TS = 0.2)   0.9880     2.7493  3.5530

As mentioned earlier, the datasets have been introduced to the ML models with different partitioning. The default partitioning is that 80% of the dataset is dedicated to the training set, and 20% (test_size (TS) = 0.2) is dedicated to the test set. However, to see the effect of TS, other values for this variable have also been taken into account. For the corr_value of 0.5, different evaluation indices have been obtained at different temperatures, as presented in Table 2. Based on the values of R_squared, MAE, and RMSE, the linear regression model predicts the battery's SOC well at the mentioned temperatures except for 10 °C and 0 °C. The MAE for the 25 °C, −10 °C, and −20 °C temperatures is less than 4.9%, but for 10 °C and 0 °C, the MAEs are 8.9% and 17.5%, respectively. Moreover, it is clear that for a TS of 0.4 the highest accuracy is achieved for temperatures of 10 °C and 0 °C. As for the corr_value of 0.7, we can observe that for the temperatures of 25 °C, −10 °C, and −20 °C, the evaluation indices have not changed significantly. However, in the cases of the 10 °C and 0 °C temperatures, the improvement of the evaluation criteria is noticeable, such that the MAEs have reduced to 5.5% and 9.7%, respectively. Moreover, one may observe the influence of TS at the mentioned temperatures: as the TS increases, an increase in R_squared and a reduction in MAE and RMSE are observed. For the corr_value of 0.9, the MAE for all the cases is less than 7%. Since the extracted features are reliable, it is expected that the MAE and RMSE decrease, but on the contrary, they increase. This is because when only highly correlated features are selected, most of the other features are lost, and the machine learning model may lose accuracy if the dataset is not big enough. Thus, the performance of the model over a dataset is of importance. Although the linear regression model functions properly for the corr_value of 0.9, with a maximum error of 7%, a more accurate and reliable model, i.e., GPR, is used for this corr_value. The GPR model results for the corr_value of 0.9 and the best TS have been presented in Table 2. The MAE for the 25 °C, −10 °C, and −20 °C temperatures is less than 2.8%, but for 10 °C and 0 °C, the MAEs are 3.8% and 8.7%, respectively. As an example, the training and test data and their predicted values at different temperatures have been shown in Fig. 8.

Fig. 8 showcases the SOC's predicted values versus f1 and f2, among the highly correlated features.

The study results demonstrate that, in addition to identifying and extracting reliable features, the learning ability of the model and the partitioning of the data for training are highly crucial for precise prediction. Considering the effects of the above-mentioned elements, we also observed that the GPR model outperforms the linear regression model. The proposed method will be implemented in the BMS for online measurement of EIS and SOC prediction, utilizing the potential approaches from Refs. [22–24], such as the fractional-order equivalent circuit model (FOECM) and pseudo-random sequences (PRS), which are fast and easily implementable for measuring EIS with low measurement time and low complexity.
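The partitioning experiment described in this section can be sketched as follows: split the data with a given test_size (TS), fit the linear model Y = Xβ + ε by least squares on the training part, and score it on the test part. This is a simplified illustration on synthetic data, not the authors' code; the helper names, the random seeds, and the generated "impedance" features are all assumptions.

```python
import numpy as np


def train_test_split(X, y, test_size, seed=0):
    """Random split; test_size is the fraction of samples held out (TS)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(round(test_size * len(X)))
    test, train = idx[:n_test], idx[n_test:]
    return X[train], X[test], y[train], y[test]


def fit_linear(X, y):
    """Least-squares estimate of [intercept, slopes] for y = X beta + eps."""
    A = np.column_stack([np.ones(len(X)), X])   # prepend a column of ones for the intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta


def predict_linear(beta, X):
    return beta[0] + X @ beta[1:]


# Synthetic stand-in for two selected impedance features and the SOC (hypothetical).
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(50, 2))
soc = 80.0 - 30.0 * X[:, 0] + 10.0 * X[:, 1] + rng.normal(0.0, 0.2, size=50)

X_train, X_test, y_train, y_test = train_test_split(X, soc, test_size=0.2)
beta = fit_linear(X_train, y_train)
rmse = float(np.sqrt(np.mean((y_test - predict_linear(beta, X_test)) ** 2)))
print(beta, rmse)
```

Re-running the last block with other `test_size` values reproduces the kind of TS comparison reported in Table 2, on whatever dataset is substituted for the synthetic one.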
5. Conclusions
Fig. 8. SOC prediction of the proposed model at temperatures of (a) 25 °C, (b) 10 °C, (c) −10 °C, (d) 25 °C.