Enhanced Short-Term Load Forecasting: Integrating Empirical Mode Decomposition with TCN-based Deep Learning Techniques for Improved Accuracy

Abstract

In recent years, the residential load forecasting problem has been gaining renewed interest due to the advent

of smart meters and data analytics. These advancements have provided granular data and sophisticated tools

for analysis, making it possible to account for many factors that affect short-term electric load. The

superposition of these factors leads to the load being non-linear and non-stationary, presenting a complex

challenge that requires innovative forecasting approaches to accurately predict electricity consumption.

Separating different load components from the original load series can help to improve the accuracy of

prediction, but the direct modeling and predicting of the decomposed time series components will give rise

to multiple random errors and increase the workload of prediction. For performance comparison, two state-

of-the-art deep learning methods, namely gated recurrent unit (GRU) and long short-term memory (LSTM),

are selected. Results indicate that the proposed method effectively captures the peaks typically present in

residential loads, thereby enhancing forecast accuracy. Additionally, the performance of EMD-based models

improves when the test data exhibits more peaks. The evaluation is conducted using Smart*, a public dataset

containing residential load measurements.

Keywords: short-term load forecasting, smart meter data, data analytics, empirical mode decomposition,

TCN, LSTM, GRU.

1. Introduction

We live in an age of data analytics, in which decisions are increasingly data-driven. With the many

opportunities opened up by data analytics applications, industries are now looking for ways to incorporate data

analytics into their operational strategies. The power sector is no exception and is looking for new strategies to

improve its operations and control. Fueling this further are the recent advancements in Advanced

Metering Infrastructure (AMI) and the wide deployment of smart meters to residential customers
worldwide. Smart meters measure and communicate electrical consumption data from customer premises to

the energy provider at an interval stipulated by the Utility. The measurement data accumulated and sent

to the Utility over a period draws the attention of data scientists seeking to extract useful insight for the betterment

of both energy suppliers and energy consumers. For example, a Utility can attract customers to participate in

a demand response (DR) program by offering appropriate incentives to reduce their energy bills. An accurate

estimate of the energy bill in advance could help the customers’ budgeting. Statistical analysis and visual

presentation of energy consumption will enable the customers to understand their consumption behavior and

adjust their energy usage to reduce costs or contribute to DR programs. Load prediction is the key driving

factor of all these services. Large prediction errors can lead to significant losses for the Utility and its

customers. Hence, improving the prediction accuracy is a crucial issue.

Load prediction is a forecast of future energy demand based on time series of past recordings of

energy consumption. Depending upon the forecast duration, load prediction is classified into short-term,

medium-term, and long-term [1]. A forecast duration ranging from a few minutes to a week in the future is

categorized as short-term. The work presented in this paper investigates the hourly forecast of residential

load and hence falls under the short-term category. The prediction of residential load will be robust if

forecasting is carried out for a short range of time in the future because residential loads are characterized by

a sudden rise and fall in electricity demand leading to the occurrence of peaks in a pattern. These peaks are

irregular, and their time of occurrence cannot be determined a priori because customers’ consumption

behavior is uncertain. This poses a challenge in modeling peaks as there is only a minimal dependency of

present consumption on its past data. Thus, the peaks have to be modeled precisely within a short time. Hence

short-term electricity forecasting is preferable in this case.

This paper experimented with three state-of-the-art deep learning (DL) methods, namely GRU,

LSTM, and TCN, to model the residential load signal. It is found from the results that the predictions at peaks

are not accurate. Authors of [3] also conclude that DL algorithms poorly predict peaks for hourly mean load

prediction. Hence, a novel hybrid method is proposed based on EMD with TCN to improve the short-term
forecast accuracy of residential loads. To the best of the authors' knowledge, no research has attempted to use

EMD in residential load prediction. The forecast covers two residential customers, namely Home1 and

Home2, whose Smart meter measurement data is accessible publicly from the Smart* dataset available in the

UMass Trace Repository online [4]. Two sets of experiments are conducted for each DL algorithm, one using

the decomposed data by EMD and another without using EMD, namely original data.

The rest of the paper is organized as follows. Related works on load forecasting are discussed in

section 2. The steps and stages involved in data preparation and the model development process are

presented in section 3. A detailed discussion of experiments and the analysis of the results are reported in

section 4, and finally, the research conclusions are presented in section 5.

2. Related works

As the electric load series is non-linear, unstable, and relatively random, many models and methods

have certain limitations in short-term load forecasting [5]. Since the mid-20th century, various statistical-

based linear time series prediction methods have been proposed, and these methods generally need a precise

mathematical model to present the relationship between load and input factors. Haida [6] proposed a

regression-based daily peak load forecasting method and conversion technique. Khashei [7] predicted the

hourly load changes by establishing an autoregressive integrated moving average model (ARIMA). Holt [8]

used an exponentially weighted moving average prediction model to predict non-seasonal and seasonal series

with additive or multiplicative error structures. However, the forecasting model based on the statistical

method is relatively simple and requires high stability of load series, which cannot accurately reflect the non-

linear characteristics of load data. With the development of artificial intelligence technology, machine

learning methods, such as artificial neural networks (ANN), support vector machine (SVM), and random

forest (RF), have been widely used in the field of short-term load forecasting [9]. Reference [10] proposed a

short-term load forecasting method based on an improved variable learning rate backpropagation (BP) neural

network, and the experiment results show the method has high accuracy and real-time performance.
Fu Y [11] used the SVM to predict the hourly electricity load of a building and achieved good results.

In Reference [12], RF was used to predict the load for the next 24 hours, and the predicted performance of

the model was analyzed in detail. Although the machine learning method performs better in the nonlinear

relationship of the load series and has achieved good results in the field of load forecasting, there are still

some shortcomings. A load series is a complex time series, and machine learning methods handle temporal

features poorly and require manual selection of those features [13]. The rapid

development of deep learning provides researchers with new ways to solve this problem. The DL method

mainly refers to the deep neural network which contains multiple hidden layers and has a specific structure

and training method. It has been widely used in many fields, such as speech recognition [14] and image

processing [15]. At present, it has also been discussed in the field of electricity load forecasting. Mocanu [16]

used the deep belief network composed of conditional restricted Boltzmann machines (CRBM) to predict the

load of a single residential building. Compared with the shallow artificial neural network and support vector

machine, the results improved considerably. Reference [17] established a predictive model based on the LSTM neural

network to predict the short-term electricity consumption of individual residential users.

Aowabin Rahman [18] used recurrent neural network (RNN) to predict hourly consumption of a

safety building in Utah and residential buildings in Texas, and the results have lower relative errors compared

to multilayer perceptron networks. Due to the limited learning ability of deep belief networks for time series

features, RNNs have been widely discussed in short-term load forecasting for their unique structure.

However, RNNs are known to suffer from exploding and vanishing gradients. Based on the

RNN, the GRU network mitigates the exploding and vanishing gradient problem by adding a

gate structure that controls the influence of previous time steps [19], so that it can better process time series.

In recent years, various combination models have been introduced to improve the accuracy of short-term

electricity load forecasting. Among them, the combination of signal decomposition method and machine

learning method has been widely studied [20]. Rana [21] used a wavelet neural network to decompose the

load series into sub-series with different frequencies, and then established a prediction model for each sub-

series, and obtained more accurate prediction results. However, it is necessary to choose the wavelet basis
function manually for the wavelet transform. EMD is another method of signal decomposition. Instead of

setting the basis function in advance, it can decompose the signal according to the characteristics of the data

itself, and the basis function is directly generated from the signal itself in the process of decomposition [22].

Each sub-series contains only part of the characteristics of the original load series, which makes it much simpler

than the original load series, so that more accurate prediction results can be obtained. The EMD method

has therefore been widely used in the field of load forecasting.

Guo [23] used the SVR and autoregression (AR) models to predict the high frequency and residual

components decomposed by EMD, respectively. Jatin [24] combined the EMD method with the LSTM model

to forecast the load demand for a given season and date and obtained better results than the single prediction

model. The hybrid models mentioned above are mainly different in the decomposition algorithm or the

prediction model, but the established process is almost the same. Most of the earlier works discussed here

had focused on load forecasting at the aggregate level like country, state, or city. They had little or no access

to fine-grained meter readings from residential customers, mainly due to device restrictions. Earlier research

works used readings taken manually, usually at monthly or bi-monthly intervals. In recent times,

residential load forecasting has gained attention due to the arrival of smart meters and the availability of fine-

grained measurements from residential customers [25,26]. Smart meter data also contributes to one of the

Urban Flows, and the load forecasting using this data will add value in the context of Urban Computing, an

emerging area in Big Data Analytics [27–29].

This paper uses smart meter data collected at a frequency of 1 Hz from two residential buildings [30].

This research proposes a hybrid method using EMD in tandem with TCN to improve residential load forecast

accuracy. EMD, a technique proposed by N. Huang, decomposes a time series of non-linear and

non-stationary nature into several stationary time series called intrinsic mode functions (IMFs) and one

residue [31]. Since the residential load signal is non-linear and non-stationary, EMD can transform the

residential load signal into many IMFs and one residue. Then any ML algorithm can be used to model the

individual IMFs and the residue. The sum of predicted values obtained from the individual models will
produce the final forecast result. In this work, EMD with TCN, LSTM, and GRU are experimented with to

compare the performance of the proposed method.
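
The summation step described above relies on EMD being an exact additive decomposition of the signal. As a brief worked statement, with $M$ denoting the number of extracted IMFs (a symbol introduced here only for illustration), the load signal and its forecast can be written as:

```latex
x(t) = \sum_{i=1}^{M} \mathrm{IMF}_i(t) + r(t),
\qquad
\hat{x}(t) = \sum_{i=1}^{M} \widehat{\mathrm{IMF}}_i(t) + \hat{r}(t)
```

Here the hatted terms stand for the forecasts produced by the individual component models, so errors made on separate components add up in the final forecast.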

In summary, the main contributions of this paper can be condensed as follows:

1) By using EMD to decompose the time series into IMFs and residue, the model captures both high-

frequency and low-frequency components, providing a detailed representation of the underlying

patterns in the data.

2) The TCN component leverages dilated convolutions and hierarchical temporal blocks to model

long-range dependencies in the data, addressing the limitations of traditional recurrent neural

networks (RNNs) and LSTMs in capturing temporal dynamics over extended periods.

3) The hybrid approach demonstrates superior performance in forecasting tasks, as evidenced by

lower error metrics, namely root mean square error (RMSE), mean square error (MSE), and mean absolute

percentage error (MAPE), compared to standalone models. This accuracy is crucial for

applications such as power consumption forecasting.

3. Materials and methods

3.1 Empirical mode decomposition (EMD)

Empirical mode decomposition (EMD) is an adaptive signal processing technique developed by N.E.

Huang and his colleagues in the late 1990s. It is designed to analyze non-linear and non-stationary time series

data. Its ability to handle non-linear and non-stationary data makes it a powerful tool for modern signal

processing challenges. Unlike traditional methods such as Fourier and wavelet transform, which rely on

predefined basis functions, EMD operates directly on the data to decompose it into a set of Intrinsic Mode

Functions (IMFs). These IMFs are simple oscillatory modes, each capturing a specific frequency component

of the original signal. The primary advantage of EMD is its data-driven approach, which makes it particularly

effective for handling complex signals with time-varying characteristics. The workflow of the EMD process

is depicted in Fig. 1. The flowchart represents the steps involved in the EMD process to decompose a time

series power consumption data 𝑥(𝑡) into IMFs and a residue. The procedure begins with inputting the time
series data 𝑥(𝑡), which represents historical power consumption data. The first step involves identifying all

local maxima and minima points in the time series data. These points are then used to create upper 𝑢(𝑡) and

lower 𝑙(𝑡) envelopes through cubic spline interpolation, ensuring that the data's peaks and troughs are

accurately captured. Next, the mean envelope 𝑚(𝑡) is computed by averaging the upper and lower envelopes.

$$ m(t) = \frac{u(t) + l(t)}{2} \tag{1} $$

The difference between the original time series data and the mean envelope, denoted as 𝑑(𝑡), is then

calculated to isolate the oscillatory component of the data, which is a candidate IMF.

$$ d(t) = x(t) - m(t) \tag{2} $$

The process checks if 𝑑(𝑡) satisfies the conditions of an IMF, which require that the number of

zero-crossings and the number of extrema are equal or differ by at most one, and that the mean of the upper

and lower envelopes is zero. If 𝑑(𝑡) does not qualify, the sifting steps are repeated with 𝑑(𝑡) in place of 𝑥(𝑡).

If 𝑑(𝑡) qualifies as an IMF, it is extracted, and the residue, 𝑟(𝑡), is computed.

𝑟(𝑡) = 𝑥(𝑡) − 𝐼𝑀𝐹 (𝑡) (3)

The algorithm then assesses whether the residue is monotonic, meaning it has no more oscillatory

components that can be decomposed further. If the residue is not monotonic, the process repeats with the

residue as the new input time series data. This iterative process continues until all IMFs are extracted and the

final residue is monotonic, indicating the end of the decomposition.


Fig.1. Flowchart of EMD process

This systematic approach of decomposing the time series data into IMFs and a residue is crucial for

power forecasting, as it allows for the separation of different frequency components, making it easier to

analyze and predict future power consumption trends.
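
The sifting procedure of this section can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the envelopes are built by piecewise-linear interpolation instead of cubic splines, the end points of the series are used as extra envelope anchors, and sifting runs for a fixed number of iterations (`sift_iters`, a hypothetical parameter) instead of testing the IMF conditions.

```python
import math


def local_extrema(x):
    """Return the indices of interior local maxima and minima of x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, idx):
    """Piecewise-linear envelope through the points (i, x[i]) for i in idx.

    Assumption: linear interpolation stands in for the cubic splines of the
    text, and both end points of the series are added as anchors.
    """
    knots = sorted(set([0] + idx + [len(x) - 1]))
    env = [0.0] * len(x)
    for a, b in zip(knots, knots[1:]):
        for t in range(a, b + 1):
            w = (t - a) / (b - a)
            env[t] = (1 - w) * x[a] + w * x[b]
    return env


def sift_once(x):
    """One sifting step: subtract the mean envelope from the signal."""
    maxima, minima = local_extrema(x)
    upper = envelope(x, maxima)
    lower = envelope(x, minima)
    mean = [(u + l) / 2 for u, l in zip(upper, lower)]   # Eq. (1)
    return [xi - mi for xi, mi in zip(x, mean)]          # Eq. (2)


def emd(x, max_imfs=8, sift_iters=10):
    """Split x into a list of IMF candidates and a final residue."""
    imfs, residue = [], list(x)
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residue)
        if len(maxima) + len(minima) < 2:   # too few extrema: stop
            break
        d = residue
        for _ in range(sift_iters):         # fixed sifting count (assumption)
            d = sift_once(d)
        imfs.append(d)
        residue = [r - c for r, c in zip(residue, d)]    # Eq. (3)
    return imfs, residue


# Synthetic test signal: slow wave + fast wave + linear trend
signal = [math.sin(0.3 * t) + 0.5 * math.sin(2.0 * t) + 0.01 * t
          for t in range(200)]
imfs, residue = emd(signal)
rebuilt = [sum(c[t] for c in imfs) + residue[t] for t in range(len(signal))]
```

By construction, the extracted components and the residue add back up to the input, which is the property the hybrid forecast relies on.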

3.2 Temporal convolution network

This section presents a brief introduction to the computational technique TCN used in energy

consumption forecasts. TCNs are a type of convolutional neural network with a specific design that makes
them suitable for handling time series. TCNs satisfy two main principles: the network’s output has the same

length as the input sequence (similarly to LSTM networks), and they prevent leakage of information from

the future to the past by using causal convolutions [32]. Causal convolution is different from standard

convolution because the convolution operation performed to get the output at t does not accept future values

as inputs. This implies that, using a kernel size $k$, the output $O_t$ is obtained using the values of

$X_{t-(k-1)}, X_{t-(k-2)}, \ldots, X_{t-1}, X_t$. Zero-padding of length $k - 1$ is used at every layer to maintain the same

length as the input sequence. This ensures consistency in the length of the input and output sequences across

the network layers.
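
A causal convolution with left zero-padding of length 𝑘 − 1 can be illustrated with a short plain-Python sketch. The function name and the convention that `weights[j]` multiplies the input `j` steps in the past are assumptions made for this example.

```python
def causal_conv1d(x, weights):
    """Causal 1D convolution with left zero-padding of length k - 1.

    weights[j] multiplies x[t - j], so output[t] only depends on
    x[t - (k - 1)], ..., x[t] and never on future values.
    """
    k = len(weights)
    padded = [0.0] * (k - 1) + list(x)
    out = []
    for t in range(len(x)):
        # padded[t + k - 1] is x[t]; stepping back j positions gives x[t - j]
        out.append(sum(weights[j] * padded[t + k - 1 - j] for j in range(k)))
    return out


# Kernel size 3 with unit weights: a running sum over the last three inputs
y = causal_conv1d([1, 2, 3, 4], [1, 1, 1])
```

The output has the same length as the input, and changing a later input value leaves all earlier outputs unchanged.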

(a) Standard convolution block with two

layers with kernel size 3

(b) Causal convolution block with two

layers with kernel size 3


(c) Dilated causal convolution block with two

layers with kernel size 2, dilation rate 2

Fig.2. Differences between (a) standard convolutional network, (b) causal convolutional network,

and (c) dilated causal convolutional network. Source [33]

Furthermore, intending to capture longer-term patterns, TCNs (Temporal Convolutional Networks)

use one-dimensional dilated convolutions. In the context of convolutional operations, dilation refers to

skipping specific values between the inputs, effectively increasing the receptive field of the network without

resorting to pooling operations. This approach prevents the loss of resolution, a common issue with

traditional pooling methods [34]. Dilated convolution involves inserting gaps between the values in the

convolutional kernel. As illustrated in Fig.2 (c), a dilation rate of 𝑑 places the inputs of the convolutional

operation 𝑑 steps apart, skipping 𝑑 − 1 values between them. This spacing allows the network to capture a broader context of the

input sequence. The complete dilated causal convolution operation over consecutive layers can be formulated

as follows [35]:
$$ x_l^t = g\left( \sum_{k=0}^{K-1} w_l^k \, x_{l-1}^{t-(k \times d)} + b_l \right) \tag{4} $$

where $x_l^t$ is the output of the neuron at position $t$ in the $l$-th layer; $K$ is the width of the convolutional

kernel; $w_l^k$ stands for the weight of position $k$; $d$ is the dilation factor of the convolution; and $b_l$ is the bias

term. Rectified linear units (ReLU) layers are used as activation functions [36], mathematically expressed

as,

𝑔(𝑥) = max(0, 𝑥) (5)

In simple terms, 𝑔(𝑥) represents the output of the ReLU activation function for a given input 𝑥, and

max(0, 𝑥) is the part of the equation that takes the maximum value between 0 and the input 𝑥. So, the ReLU

function essentially activates a neuron if the input is positive, allowing information to pass through, and

deactivates it (sets the output to 0) if the input is negative. This helps introduce non-linearity to the neural

network, enabling it to learn complex patterns and relationships in the data. ReLU is popular due to its

simplicity and effectiveness in training deep networks such as TCNs.
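
Equations (4) and (5) can be combined into a small single-channel sketch of one dilated causal layer. The function names are hypothetical, and inputs that fall before the start of the sequence are treated as zero, which is equivalent to left zero-padding.

```python
def relu(v):
    """Eq. (5): g(x) = max(0, x)."""
    return max(0.0, v)


def dilated_causal_layer(x_prev, w, b, d):
    """Eq. (4) for a single channel.

    x_prev: outputs of layer l - 1, w: kernel weights w[0..K-1],
    b: bias, d: dilation factor. Inputs before t = 0 count as zero.
    """
    out = []
    for t in range(len(x_prev)):
        s = b
        for k in range(len(w)):
            idx = t - k * d
            if idx >= 0:
                s += w[k] * x_prev[idx]
        out.append(relu(s))
    return out


# Kernel size 2, dilation 2: each output adds x[t] and x[t - 2]
y = dilated_causal_layer([1, 2, 3, 4, 5], w=[1, 1], b=0, d=2)
```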

Another common approach to further increase the network’s receptive field is to concatenate several

TCN blocks, as can be seen in Fig.3 [37]. However, this leads to deeper architecture with a significant

increase in parameters, which complicates the learning procedure. For this reason, a residual connection is

added to the output of each TCN block. Residual connections were proposed by [38] to improve performance

in very deep architectures, and consist of adding the input of a TCN block to its output:

𝑜 = 𝑔(𝑥 + 𝐹(𝑥)) (6)

where 𝑥 is the input to the TCN block, 𝐹(𝑥) represents the function implemented by the layers within

the TCN block, 𝑔(∙) is the ReLU activation function and 𝑜 is the output of the TCN block.
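
As a minimal sketch of Eq. (6), the residual connection can be written as a wrapper around any block function. The name `residual_block` and the stand-in lambda for 𝐹 are assumptions for illustration; the sketch also assumes that 𝐹(𝑥) has the same length as 𝑥.

```python
def residual_block(x, block_fn):
    """Eq. (6): o = g(x + F(x)), element-wise, with g = ReLU."""
    fx = block_fn(x)
    if len(fx) != len(x):
        raise ValueError("F(x) must have the same length as x")
    return [max(0.0, xi + fi) for xi, fi in zip(x, fx)]


# Stand-in for the convolutional layers of a block: F(x) = 0.5 * x
o = residual_block([1.0, -2.0, 3.0], lambda v: [0.5 * a for a in v])
```

Because the input is added back unchanged, the block only has to learn a correction to the identity mapping, which is what eases training of deep stacks.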

A strong forecasting model requires each output entry to depend on all previous input entries. This

implies a receptive field equal to the input length. Conventional convolutional layers, with a specified kernel

size, limit the dependency to a subset of input entries. To ensure a broader influence and capture long-term

dependencies, especially when stacking layers, it's crucial to extend the receptive field for each output entry

to encompass the entire input sequence. More generally, a 1D convolutional network with 𝑛 layers and a

kernel size 𝑘 has a receptive field 𝑟 of size,

𝑟 = 1 + 𝑛 ∗ (𝑘 − 1) (7)

To calculate the number of layers 𝑛 needed for full coverage, set the size of the receptive field to the input length 𝑙 and then

solve for 𝑛, rounding up to the next integer:

𝑛 = ⌈(𝑙 − 1)/(𝑘 − 1)⌉ (8)

This means that, given a fixed kernel size, the number of layers required is linear in the length of the

input tensor, which will result in networks that become very deep very fast, leading to models with a very
large number of parameters that take longer to train. Furthermore, a high number of layers has been shown

to lead to degradation problems related to the gradient of the loss function. One way to increase the receptive

field size while still keeping the number of layers relatively small is to introduce dilation to the convolutional

network.
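
The linear growth in depth can be checked numerically. The sketch below evaluates Eq. (7) and the layer count obtained by solving it for 𝑛 and rounding up; the input length of 24 (one day of hourly values) is a hypothetical choice for illustration, not a setting reported in this paper.

```python
import math


def receptive_field(n_layers, kernel_size):
    """Eq. (7): receptive field of n undilated 1D convolutional layers."""
    return 1 + n_layers * (kernel_size - 1)


def layers_needed(input_length, kernel_size):
    """Smallest n whose receptive field covers the whole input."""
    return math.ceil((input_length - 1) / (kernel_size - 1))


n = layers_needed(24, 3)      # 12 layers for a 24-step input, kernel size 3
r = receptive_field(n, 3)     # 25, which covers all 24 input steps
```

Doubling the input length to 48 doubles the required depth to 24 layers, which is the scaling problem that dilation addresses.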

With dilation, every additional layer adds a value of 𝑑 ∗ (𝑘 − 1) to the current receptive field width,

where 𝑑 is computed as 𝑑 = 𝑏^𝑖, with 𝑖 representing the number of layers below the new layer.

Consequently, the width of the receptive field 𝑤 of a TCN with exponential dilation of base 𝑏, kernel size 𝑘,

and the number of layers 𝑛 is given by,

$$ w = 1 + \sum_{i=0}^{n-1} (k - 1) \cdot b^{i} = 1 + (k - 1) \cdot \frac{b^{n} - 1}{b - 1} \tag{9} $$

Fig.3. Temporal Convolutional Networks (TCN) model with 3 stacked blocks. Each block has 3

convolutional layers with kernel size 2 and dilations [1, 2, 4]. Source [33]
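
The receptive field of a dilated stack can be computed directly from its list of dilations. The sketch below is an illustration under the assumption of one convolution per dilation level; it checks the layer-by-layer sum against the closed form for exponential dilation and applies it to the configuration named in the caption of Fig. 3.

```python
def receptive_field_from_dilations(dilations, kernel_size):
    """Each layer with dilation d widens the receptive field by d * (k - 1)."""
    return 1 + sum(d * (kernel_size - 1) for d in dilations)


def receptive_field_closed_form(n_layers, kernel_size, base=2):
    """Closed form for dilations base**0, ..., base**(n - 1), base > 1."""
    return 1 + (kernel_size - 1) * (base ** n_layers - 1) // (base - 1)


one_block = receptive_field_from_dilations([1, 2, 4], kernel_size=2)         # 8
three_blocks = receptive_field_from_dilations([1, 2, 4] * 3, kernel_size=2)  # 22
```

With kernel size 2, a single block with dilations [1, 2, 4] sees 8 time steps, and three such blocks stacked see 22, whereas three undilated layers would see only 4.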

All these characteristics make TCNs a very suitable deep learning architecture for complex time series

problems. The main advantage of TCNs is that, similarly to RNNs, they can handle variable-length inputs

by sliding the one-dimensional causal convolutional kernel. Furthermore, TCNs are more memory efficient

than recurrent networks due to the shared convolution architecture which allows them to process long

sequences in parallel. In RNNs, the input sequences are processed sequentially, which results in higher
computation time. Moreover, TCNs are trained with the standard backpropagation algorithm, hence avoiding

the gradient problems of the backpropagation-through-time algorithm (BPTT) used in RNN [39].

3.3 Model development

This section outlines the steps involved in building the proposed hybrid model. The block diagram

in Fig. 4 illustrates a hybrid EMD and TCN technique for predicting power consumption. The process begins with the

input data, which consists of historical power consumption data. This data is then subjected to EMD

decomposition to decompose the data into IMFs and a residual component. The decomposition process

allows for the separation of the signal into multiple IMFs and one residue, capturing different frequency

components of the data. Next, the IMFs and the residual component undergo data preprocessing. This

involves normalizing and reshaping the data to ensure it is in a suitable format for further processing.

Normalization scales the data to a standard range while reshaping adjusts the data structure to match the

requirements of the subsequent processing steps.
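
The preprocessing step can be sketched as follows. The text does not state which normalization or window length was used, so min-max scaling to [0, 1] and a sliding window of `lookback` past values are assumptions made purely for illustration.

```python
def min_max_scale(series):
    """Scale a series to [0, 1]; also return (lo, hi) to undo the scaling."""
    lo, hi = min(series), max(series)
    span = (hi - lo) or 1.0          # guard against a constant series
    return [(v - lo) / span for v in series], lo, hi


def make_windows(series, lookback):
    """Reshape a series into (input window, next value) training pairs."""
    inputs, targets = [], []
    for i in range(len(series) - lookback):
        inputs.append(series[i:i + lookback])
        targets.append(series[i + lookback])
    return inputs, targets


scaled, lo, hi = min_max_scale([2, 4, 6])
X, y = make_windows([1, 2, 3, 4, 5], lookback=2)
```

In the hybrid model this would be applied to each IMF and to the residue separately, so every component has its own scaling range.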

Each IMF and the residue are then processed individually using a TCN. The TCN is a deep learning

architecture specifically designed for sequence modeling tasks, making it ideal for time series data such as

power consumption. The TCN layers apply convolutional operations to capture temporal dependencies in the

data, extracting relevant features from each IMF and the residue. After processing through the TCN layers,

the feature maps for each IMF and the residue are flattened. Flattening converts the multi-dimensional feature

maps into one-dimensional vectors, facilitating their combination in the next step. These flattened features

are then concatenated to form a comprehensive feature vector that includes temporal features from all IMFs

and the residue. This concatenated feature vector is then passed through a fully connected layer, also known

as a dense layer. Fig. 4 depicts the framework of the proposed hybrid model.
Fig.4. Framework of hybrid EMD-TCN model

The fully connected layer transforms the features by learning complex relationships between them,

ultimately enhancing the predictive capability of the model. The output of the fully connected layer is used

to generate power forecasts. These forecasts represent the predicted power consumption values based on the

input historical data. The final output block displays the predicted power values, providing valuable insights

for power consumption prediction.

This hybrid approach leverages the strengths of both EMD and TCN, combining the signal

decomposition capability of EMD with the sequence modeling prowess of TCN. By decomposing the data

into IMFs and processing each component separately, the model can effectively capture both high-frequency

and low-frequency patterns in the data. The TCN ensures that temporal dependencies are accurately modeled,

leading to more precise power consumption predictions.
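
The flatten, concatenate and dense steps of this section can be sketched in plain Python. The feature maps, weights and bias below are hypothetical placeholders; in the actual model they would come from the trained TCN branches and the trained fully connected layer.

```python
def flatten(feature_map):
    """Turn a (time steps x channels) feature map into a flat list."""
    return [v for row in feature_map for v in row]


def concatenate(vectors):
    """Join several flat feature vectors into one."""
    out = []
    for vec in vectors:
        out.extend(vec)
    return out


def dense(features, weights, bias):
    """Single-output fully connected layer: weighted sum plus bias."""
    if len(features) != len(weights):
        raise ValueError("one weight per feature is required")
    return sum(w * f for w, f in zip(weights, features)) + bias


# Hypothetical TCN feature maps for two IMFs and the residue (2 steps x 2 channels)
maps = [
    [[0.1, 0.2], [0.3, 0.4]],
    [[0.0, 0.1], [0.2, 0.0]],
    [[1.0, 1.1], [1.2, 1.3]],
]
features = concatenate(flatten(m) for m in maps)
forecast = dense(features, [0.1] * len(features), bias=0.0)
```

Unlike a plain sum of per-component forecasts, the dense layer can learn a different weight for every feature of every component.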

3.4 Evaluation metrics

The root mean squared error (RMSE), the mean square error (MSE), the mean absolute error (MAE),

the mean absolute percentage error (MAPE), and the mean bias error (MBE) were chosen as model

evaluation metrics. They are defined as follows:


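$$ RMSE = \sqrt{\frac{1}{N} \sum_{k=1}^{N} (y_k - \hat{y}_k)^2} \tag{10} $$

$$ MSE = \frac{1}{N} \sum_{k=1}^{N} (y_k - \hat{y}_k)^2 \tag{11} $$

$$ MAE = \frac{1}{N} \sum_{k=1}^{N} |y_k - \hat{y}_k| \tag{12} $$

$$ MAPE = \frac{1}{N} \sum_{k=1}^{N} \left| \frac{y_k - \hat{y}_k}{y_k} \right| \times 100\% \tag{13} $$

$$ MBE = \frac{1}{N} \sum_{k=1}^{N} (y_k - \hat{y}_k) \tag{14} $$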

where $y_k$ is the $k$-th sample value in $y$, $\hat{y}_k$ is the $k$-th forecasted value and $N$ is the total number of

samples.
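
For reference, the five metrics can be computed with a few lines of plain Python. The function name is hypothetical; note that MAPE is undefined whenever an actual value is zero, which the sketch does not guard against.

```python
import math


def forecast_errors(y_true, y_pred):
    """Return RMSE, MSE, MAE, MAPE (in percent) and MBE for two sequences."""
    n = len(y_true)
    err = [a - p for a, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in err) / n
    return {
        "RMSE": math.sqrt(mse),
        "MSE": mse,
        "MAE": sum(abs(e) for e in err) / n,
        "MAPE": 100.0 * sum(abs(e / a) for e, a in zip(err, y_true)) / n,
        "MBE": sum(err) / n,
    }


scores = forecast_errors([2.0, 4.0], [1.0, 5.0])
```

In this small example one value is under-forecast and one over-forecast by the same amount, so MBE is zero even though MAE is not; MBE measures systematic bias, not accuracy.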

4. Experiments and result analysis

The smart meter data collected as a time series from the UMass Trace Repository [40] is processed and

aggregated into hourly average load data. Two individual residences, Home1 and Home2, are chosen for the experiments.

Each house is modeled independently. The number of measurements is not uniform across the two homes:

the duration of data collected from Home2 is shorter than that from Home1. To maintain consistency, the average

hourly consumption of both residences for one month is considered. Hence a total of (31 x 24) 744

data points for each house is used in this research for hourly load prediction. Among these, 80% of the data is

used for training, and the remaining 20% is used for testing. The load forecasting is carried out using the

proposed hybrid EMD-TCN algorithm and EMD with the traditional DL algorithms LSTM and GRU. Figs.

5-10 show the prediction results in graphical form for the different forecasting models. The error values of the
proposed EMD-TCN method are generally lower than those of the other hybrid deep learning network

architectures.
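
The data preparation described in this section can be sketched as follows. The sketch assumes one reading per second (3600 per hour) and a chronological split in which the last 20% of the hours form the test set; the text does not state whether the split was chronological or random.

```python
def hourly_means(samples, samples_per_hour=3600):
    """Average consecutive blocks of readings into hourly mean load values."""
    hours = len(samples) // samples_per_hour
    return [
        sum(samples[h * samples_per_hour:(h + 1) * samples_per_hour])
        / samples_per_hour
        for h in range(hours)
    ]


def chronological_split(series, train_fraction=0.8):
    """Keep the time order: the first part trains, the rest tests."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


# One month of hourly values: 31 days x 24 hours = 744 points
month = list(range(31 * 24))
train, test = chronological_split(month)
```

For a 744-point month this gives 595 training hours and 149 test hours.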

Fig.5. Model performance with (a) TCN (b) EMD-TCN for Home1

Fig.6. Model performance with (a) TCN (b) EMD-TCN for Home2

Fig.7. Model performance with (a) LSTM (b) EMD-LSTM for Home1

Fig.8. Model performance with (a) LSTM (b) EMD-LSTM for Home2

Fig.9. Model performance with (a) GRU (b) EMD-GRU for Home1

Fig.10. Model performance with (a) GRU (b) EMD-GRU for Home2

Figs. 11-16 depict the power forecasting results using the various deep learning models. The graphs show

the power consumption (in kW) over time (in hours). The blue line represents the original power

consumption data, while the red line represents the data forecasted by the model. The original data (blue line)

displays significant variability with pronounced peaks at regular intervals, indicating periods of high power

consumption. The forecasted data (red line) aligns closely with the original data towards the later part of the

time series, indicating that the models capture the overall trend and seasonality of the

power consumption data. Among these, Fig. 11(b) and Fig. 12(b) show that the EMD-TCN model captures

the timing and magnitude of the peaks reasonably well, which is crucial for applications that rely on peak

demand forecasting for grid stability and energy management.

Fig.11. Power consumption forecasting results using (a) TCN (b) EMD-TCN model for Home1

Fig.12. Power consumption forecasting results using (a) TCN (b) EMD-TCN model for Home2

Fig.13. Power consumption forecasting results using (a) LSTM (b) EMD-LSTM model for Home1

Fig.14. Power consumption forecasting results using (a) LSTM (b) EMD-LSTM model for Home2

Fig.15. Power consumption forecasting results using (a) GRU (b) EMD-GRU model for Home1

Fig.16. Power consumption forecasting results using (a) GRU (b) EMD-GRU model for Home2

Table 1 presents the error values of Home1 for various forecasting models, including TCN, EMD-TCN,

LSTM, EMD-LSTM, GRU, and EMD-GRU, evaluating their performance on power consumption

data. The error metrics include RMSE, MSE, MAE, MAPE, and MBE. The hybrid models (EMD-TCN,

EMD-LSTM, and EMD-GRU) generally show improved performance over their non-hybrid counterparts

(TCN, LSTM, and GRU) across all metrics. Specifically, EMD-TCN achieves the lowest error rates in RMSE

(0.069), MSE (0.004), MAE (0.009), MAPE (0.032), and MBE (0.001), indicating its superior accuracy and

reliability for power consumption forecasting. EMD-LSTM also shows significant improvements compared

to the standard LSTM model, with lower errors in all metrics. Similarly, EMD-GRU outperforms the GRU

model, particularly in MAPE and MBE. These results highlight the effectiveness of incorporating EMD into

forecasting models, significantly enhancing their accuracy and robustness. Similarly, Table 2 shows the errors

for Home2. Here too, EMD-TCN achieves the lowest error on most metrics. Hence, the hybrid EMD and TCN model

combines the strengths of both techniques to enhance the accuracy and robustness of time series forecasting.
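For reference, the five error metrics named above can be computed as in the following Python sketch. It uses the standard textbook definitions, which are assumed here rather than taken from this work. In particular, the sign convention for MBE (forecast minus actual) and the reporting of MAPE as a fraction rather than a percentage are assumptions, and the function name `forecast_errors` is hypothetical.

```python
import numpy as np


def forecast_errors(actual, forecast):
    """Return RMSE, MSE, MAE, MAPE and MBE for a forecast.

    Assumptions: residual = forecast - actual, MAPE is returned as a
    fraction (multiply by 100 for percent), and actual contains no zeros.
    """
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    if actual.shape != forecast.shape:
        raise ValueError("actual and forecast must have the same shape")
    if np.any(actual == 0):
        raise ValueError("MAPE is undefined when actual contains zeros")

    residual = forecast - actual
    mse = np.mean(residual ** 2)
    return {
        "RMSE": float(np.sqrt(mse)),               # root mean squared error
        "MSE": float(mse),                         # mean squared error
        "MAE": float(np.mean(np.abs(residual))),   # mean absolute error
        "MAPE": float(np.mean(np.abs(residual / actual))),
        "MBE": float(np.mean(residual)),           # mean bias error (signed)
    }


if __name__ == "__main__":
    print(forecast_errors([2.0, 4.0, 4.0], [1.0, 4.0, 6.0]))
```

RMSE, MSE and MAE measure the size of the errors, MAPE scales them by the actual load, and MBE keeps the sign, so it indicates whether a model tends to over- or under-forecast.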
Table 1. Model performance statistics for Home1

Error metric   TCN     EMD-TCN   LSTM    EMD-LSTM   GRU     EMD-GRU
RMSE           0.171   0.069     0.173   0.140      0.320   0.178
MSE            0.036   0.004     0.036   0.019      0.200   0.038
MAE            0.121   0.009     0.125   0.016      0.285   0.125
MAPE           0.047   0.012     0.126   0.044      1.208   0.054
MBE            0.008   0.001     0.053   0.021      0.151   0.049

Table 2. Model performance statistics for Home2

Error metric   TCN     EMD-TCN   LSTM    EMD-LSTM   GRU     EMD-GRU
RMSE           0.456   0.300     0.738   0.521      1.627   0.720
MSE            0.905   0.249     0.538   0.236      1.992   0.864
MAE            0.164   0.026     0.030   0.365      0.952   0.426
MAPE           0.238   0.024     0.130   0.258      9.456   5.682
MBE            0.300   0.015     0.125   0.368      0.926   0.548

Table 3. Training and testing time for various forecasting models

Model      Training time (s)   Testing time (s)
TCN        20                  0.1008
LSTM       35                  21
GRU        40                  25
EMD-TCN    53                  27
EMD-LSTM   59                  28
EMD-GRU    65                  30

The integration of hybrid methods like EMD-TCN is anticipated to increase computational complexity.

To evaluate this, simulations were conducted on the Home1 dataset using an Intel Core i5 processor with 64
GB of RAM in MATLAB R2023a. The training and testing times for both direct methods and hybrid

EMD methods are summarized in Table 3. The results indicate that hybrid EMD methods generally require

more time for training compared to their direct counterparts. Among the hybrid methods, EMD-TCN is the

fastest to train, taking 6 s less than EMD-LSTM (53 s versus 59 s). In terms of testing time, EMD-LSTM and

EMD-GRU perform comparably to their direct equivalents, and the testing time for EMD-TCN (27 s) is slightly

lower than that for the other EMD methods (28 s and 30 s). This efficiency can be attributed to the inherent speed

of TCN over basic single-layer neural networks.
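The training overhead introduced by the EMD step can be read directly from Table 3. The short Python sketch below redoes that arithmetic. The timing values are copied from Table 3, while the names `TIMES` and `training_overhead` are introduced here for illustration only.

```python
# (training time, testing time) in seconds, copied from Table 3.
TIMES = {
    "TCN": (20.0, 0.1008),
    "LSTM": (35.0, 21.0),
    "GRU": (40.0, 25.0),
    "EMD-TCN": (53.0, 27.0),
    "EMD-LSTM": (59.0, 28.0),
    "EMD-GRU": (65.0, 30.0),
}


def training_overhead(direct):
    """Return (extra seconds, ratio) of the EMD hybrid over its direct model."""
    base = TIMES[direct][0]
    hybrid = TIMES["EMD-" + direct][0]
    return hybrid - base, hybrid / base


if __name__ == "__main__":
    for name in ("TCN", "LSTM", "GRU"):
        extra, ratio = training_overhead(name)
        print(f"EMD-{name}: +{extra:.0f} s training, {ratio:.2f}x the direct model")

    # Difference in training time between the two fastest hybrid models.
    gap = TIMES["EMD-LSTM"][0] - TIMES["EMD-TCN"][0]
    print(f"EMD-TCN trains {gap:.0f} s faster than EMD-LSTM")
```

Reporting both the absolute difference and the ratio makes it clear whether a comparison refers to seconds saved or to a speed-up factor.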

5. Conclusion

Recent advancements in power sector infrastructure have highlighted the importance of accurate

residential load forecasting. The unpredictable consumption patterns of residential customers often lead to

random peaks in load signals, complicating accurate forecasting. Experimental evaluations with three state-

of-the-art DL models have demonstrated poor forecast accuracy during these peak periods. To address this,

a novel hybrid method combining EMD with TCN is proposed. This approach specifically targets peak

modeling to enhance forecast accuracy. A high-resolution smart meter dataset from two residential customers

was used to evaluate the proposed algorithm. Experimental results indicate that the EMD-TCN hybrid

method outperforms the other tested models, particularly in scenarios with frequent peaks in the residential

load signal. However, the method's performance diminishes in the absence of peaks, suggesting that future

enhancements could involve an ensemble of models. This research is timely and significant, as accurate peak

predictions impact various aspects of the energy sector, including demand response, battery control, electric

vehicle charge scheduling, and real-time electricity pricing. Accurate peak forecasts can also help utilities

manage schedulable loads more effectively.

References

[1] Khuntia SR, Rueda JL, van der Meijden MAMM. Forecasting the load of electrical power systems in mid- and long-term
horizons: a review. IET Gener Transm Distrib 2016;10(16):3971–7. http://dx.doi.org/10.1049/iet-gtd.2016.0340.
[2] Jacob M, Neves C, Vukadinović Greetham D. Forecasting and assessing risk of individual electricity peaks. Cham: Springer
International Publishing; 2020, p. 15–37. http://dx.doi.org/10.1007/978-3-030-28669-9_2.
[3] Singh RP, Gao PX, Lizotte DJ. On hourly home peak load prediction. In: 2012 IEEE third international conference on smart
grid communications. 2012, p. 163–8. http://dx.doi.org/10.1109/SmartGridComm.2012.6485977.
[4] Smart*. Umass repository onine. 2012, URL http://traces.cs.umass.edu/index.php/Smart/Smart.
[5] Singh, P.; Dwivedi, P. Integration of new evolutionary approach with artificial neural network for solving
short term load forecast problem. Appl. Energy 2018, 217, 537–549.
[6] Haida, T. Regression based peak load forecasting using a transformation technique. IEEE Trans. Power Syst.
1994, 9, 1788–1794.
[7] Khashei, M.; Bijari, M. A novel hybridization of artificial neural networks and ARIMA models for time series
forecasting. Appl. Soft Comput. J. 2011, 11, 2664–2675. Energies 2019, 12, 1140 17 of 18
[8] Holt, C. Forecasting seasonals and trends by exponentially weighted moving averages. Int. J. Forecast. 2004,
20, 5–10.
[9] Srivastava, A.K.; Pandey, A.S.; Singh, D. Short-term load forecasting methods: A review. In Proceedings of
the International Conference on Emerging Trends in Electrical Electronics & Sustainable Energy Systems
(ICETEESES), Sultanpur, India, 11–12 March 2016; pp. 130–138.
[10] Wang, Y.; Niu, D.; Ji, L. Short-term power load forecasting based on IVL-BP neural network technology. Syst. Eng. Procedia
2012, 4, 168–174.
[11] Fu, Y.; Li, Z.; Zhang, H.; Xu, P. Using Support Vector Machine to Predict Next Day Electricity Load of Public Buildings
with Sub-metering Devices. Procedia Eng. 2015, 121, 1016–1022.
[12] Lahouar, A.; Ben, H.S.J. Day-ahead load forecast using random forest and expert input selection. Energy Convers. Manag.
2015, 103, 1040–1051.
[13] Tong, C.; Li, J.; Lang, C. An efficient deep model for day-ahead electricity load forecasting with stacked denoising auto-
encoders. J. Parallel Distrib. Comput. 2018, 117, 267–273.
[14] Heigold, G.; Vanhoucke, V. Multilingual acoustic models using distributed deep neural networks. In Proceedings of the IEEE
International Conference on Acoustics, Speech and Signal Processing, Vancouver,
BC, Canada, 26–31 May 2013; pp. 8619–8623.
[15] He, K.; Zhang, X.; Ren, S. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition(CVPR), Las Vegas, NV, USA , 26 June–1 July 2016;
pp. 770–778.
[16] Mocanu, E.; Nguyen, P.H.; Gibescu, M.; Kling, W.L. Deep learning for estimating building energy consumption. Sustain.
Energy Grids Netw. 2016, 6, 91–99.
[17] Kong, W.; Dong, Z.Y.; Jia, Y. Short-Term Residential Load Forecasting based on LSTM Recurrent Neural
Network. IEEE Trans. Smart Grid 2017, 10, 49–53.
[18] Rahman, A.; Srikumar, V.; Smith, A.D. Predicting electricity consumption for commercial and residential
buildings using deep recurrent neural networks. Appl. Energy 2018, 201, 372–385.
[19] Mohamed, M. Parsimonious Memory Unit for Recurrent Neural Networks with Application to Natural
Language Processing. Neurocomputing 2018, 314, 48–64.
[20] Qiu, X.; Ren, Y.; Suganthan, P.N. Empirical Mode Decomposition based ensemble deep learning for load
demand time series forecasting. Appl. Soft Comput. 2017, 54, 246–255.
[21] Rana, M.; Koprinska, I. Forecasting electricity load with advanced wavelet neural networks. Neurocomputing
2016, 182, 118–132.
[22] Huang, N.E.; Shen, Z.; Long, S.R. The Empirical Mode Decomposition and the Hilbert Spectrum for
Nonlinear and Non-Stationary Time Series Analysis. Proc. R. Soc. Lond. A Math. Phys. Eng. Sci. 1998,
454, 903–995.
[23] Fan, G.F.; Peng, L.; Hong, W.C. Electric load forecasting by the SVR model with differential empirical mode
decomposition and auto regression. Neurocomputing 2016, 173, 958–970.
[24] Bedi, J.; Toshniwal, D. Empirical Mode Decomposition Based Deep Learning for Electricity Demand
Forecasting. IEEE Access 2018, 6, 49144–49156.
[25] Imani M, Ghassemian H. Residential load forecasting using wavelet and collaborative representation transforms. Appl Energy
2019;253:113505. http://dx.doi.org/10.1016/j.apenergy.2019.113505.
[26] Kong W, Dong ZY, Jia Y, Hill DJ, Xu Y, Zhang Y. Short-term residential load forecasting based on LSTM recurrent neural
network. IEEE Trans Smart Grid 2019;10(1):841–51.
[27] Ouyang K, Liang Y, Liu Y, Tong Z, Ruan S, Rosenblum D, et al. Fine-grained urban flow inference. IEEE Trans Knowl Data
Eng 2020;1. http://dx.doi.org/10.1109/TKDE.2020.3017104.
[28] Liu Y, Liang Y, Ouyang K, Liu S, Rosenblum D, Zheng Y. Predicting urban water quality with ubiquitous data - A data-
driven approach. IEEE Trans Big Data 2020;1. http://dx.doi.org/10.1109/TBDATA.2020.2972564.
[29] Zheng Y, Capra L, Wolfson O, Yang H. Urban computing: Concepts, methodologies, and applications. ACM Trans Intell
Syst Technol 2014;5(3).http://dx.doi.org/10.1145/2629592.
[30] Barker S, Mishra A, Irwin D, Cecchet E, Shenoy P, Albrecht J. Smart* an open data set and tools for enabling research in
sustainble homes. In: Proceedings of the 2012 workshop on data mining applications in sustainability. SustKDD ’12, ACM; 2012.
[31] Huang NE, Shen Z, Long SR, Wu MC, Shih HH, Zheng Q, et al. The empirical mode decomposition and the Hilbert spectrum
for nonlinear and non-stationary time series analysis. Proc R Soc Lond Ser A 1998;454(1971):903–98.
http://dx.doi.org/10.1098/rspa.1998.0193.
[32] Bai, S.; Kolter, J.Z.; Koltun, V. An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence
Modeling. arXiv, arXiv:1803.01271, 2018.
[33] Pedro Lara-Benitez, Manuel Carranza-García , Jose M. Luna-Romera and Jose C. Riquelme, “Temporal Convolutional
Networks Applied to Energy-Related Time Series Forecasting”, MDPI, applied sciences, 2020.
[34] Yu, F.; Koltun, V. Multi-Scale Context Aggregation by Dilated Convolutions. In Proceedings of the 4th International
Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, 2016.
[35] Lara-Benítez, P.; Carranza-García, M.; García-Gutiérrez, J.; Riquelme, J. Asynchronous dual-pipeline deep learning
framework for online data stream classification. Integr. Comput.-Aided Eng. 2020.
[36] Nair, V.; Hinton, G. Rectified linear units improve Restricted Boltzmann machines. In Proceedings of the ICML 2010—27th
International Conference on Machine Learning, Haifa, Israel, pp. 807–814, 2010.
[37] Van den Oord, A.; Dieleman, S.; Zen, H.; Simonyan, K.; Vinyals, O.; Graves, A.; Kalchbrenner, N.; Senior, A.W.;
Kavukcuoglu, K. WaveNet: A Generative Model for Raw Audio. arXiv, arXiv:1609.03499, 2016.
[38] He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference
on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, pp. 770–778, 2016.
[39] Pascanu, R.; Mikolov, T.; Bengio, Y. On the difficulty of training Recurrent Neural Networks. arXiv, arXiv:cs.LG/1211.5063,
2012.
[40] Smart*. Umass repository onine. 2012, URL http://traces.cs.umass.edu/index.php/Smart/Smart.
