Enhanced Short
Abstract
In recent years, the residential load forecasting problem has been gaining renewed interest due to the advent
of smart meters and data analytics. These advancements have provided granular data and sophisticated tools
for analysis, making it possible to account for many factors that affect short-term electric load. The
superposition of these factors leads to the load being non-linear and non-stationary, presenting a complex
challenge that requires innovative forecasting approaches to accurately predict electricity consumption.
Separating different load components from the original load series can help to improve the accuracy of
prediction, but the direct modeling and predicting of the decomposed time series components will give rise
to multiple random errors and increase the workload of prediction. For performance comparison, two state-
of-the-art deep learning methods, namely gated recurrent unit (GRU) and long short-term memory (LSTM),
are selected. Results indicate that the proposed method effectively captures the peaks typically present in
residential loads, thereby enhancing forecast accuracy. Additionally, the performance of EMD-based models
improves when the test data exhibits more peaks. The evaluation is conducted using Smart*, a public dataset
Keywords: short-term load forecasting, smart meter data, data analytics, empirical mode decomposition,
1. Introduction
We live in an age of data analytics, in which decisions are increasingly data-driven. With the
opportunities opened up by data analytics applications, industries are now looking for ways to incorporate data
analytics into their operational strategies. The power sector is no exception and is looking for new strategies to
improve its operations and control. This trend is fueled further by recent advancements in Advanced
Metering Infrastructure (AMI) and the wide deployment of smart meters to residential customers
worldwide. Smart meters measure electrical consumption at customer premises and communicate it to
the energy provider at an interval stipulated by the Utility. The measurement data accumulated by
the Utility over time attracts data scientists, who extract useful insights for the benefit
of both energy suppliers and energy consumers. For example, a Utility can attract customers to participate in
a demand response (DR) program by offering appropriate incentives to reduce their energy bills. An accurate
estimate of the energy bill in advance could help the customers’ budgeting. Statistical analysis and visual
presentation of energy consumption will enable the customers to understand their consumption behavior and
adjust their energy usage to reduce costs or contribute to DR programs. Load prediction is the key driving
factor of all these services. Any error in the prediction will lead to a significant loss for the Utility and its
Load prediction is a forecast of future energy demand based on time series of past recordings of
energy consumption. Depending upon the forecast duration, load prediction is classified into short-term,
medium-term, and long-term [1]. A forecast duration ranging from a few minutes to a week in the future is
categorized as short-term. The work presented in this paper investigates the hourly forecast of residential
load and hence falls under the short-term category. The prediction of residential load will be robust if
forecasting is carried out for a short range of time in the future because residential loads are characterized by
a sudden rise and fall in electricity demand leading to the occurrence of peaks in a pattern. These peaks are
irregular, and their time of occurrence cannot be determined a priori because customers’ consumption
behavior is uncertain. This poses a challenge in modeling peaks as there is only a minimal dependency of
present consumption on its past data. Thus, the peaks have to be modeled precisely within a short time. Hence
This paper experimented with three state-of-the-art deep learning (DL) methods, namely GRU,
LSTM, and the temporal convolutional network (TCN), to model the residential load signal. The results show
that the predictions at peaks are not accurate. The authors of [3] also conclude that DL algorithms predict
peaks poorly in hourly mean load prediction. Hence, a novel hybrid method based on empirical mode
decomposition (EMD) with TCN is proposed to improve the short-term
forecast accuracy of residential loads. To the best of the authors' knowledge, no research has attempted to use
EMD in residential load prediction. The forecast covers two residential customers, namely Home1 and
Home2, whose smart meter measurement data is publicly accessible from the Smart* dataset available in the
UMass Trace Repository online [4]. Two sets of experiments are conducted for each DL algorithm: one using
the data decomposed by EMD and another using the original data without EMD.
The rest of the paper is organized into discussions on load forecasting problems (related works) in
section 2. Various steps and stages involved in data preparation and the model development process are
presented in section 3. A detailed discussion of experiments and the analysis of the results are reported in
2. Related works
As the electric load series is non-linear, unstable, and relatively random, many models and methods
have certain limitations in short-term load forecasting [5]. Since the mid-20th century, various statistical-
based linear time series prediction methods have been proposed, and these methods generally need a precise
mathematical model to present the relationship between load and input factors. Haida [6] proposed a
regression-based daily peak load forecasting method and conversion technique. Khashei [7] predicted the
hourly load changes by establishing an autoregressive integrated moving average model (ARIMA). Holt [8]
used an exponentially weighted moving average prediction model to predict non-seasonal and seasonal series
with additive or multiplicative error structures. However, the forecasting model based on the statistical
method is relatively simple and requires high stability of load series, which cannot accurately reflect the non-
linear characteristics of load data. With the development of artificial intelligence technology, machine
learning methods, such as artificial neural networks (ANN), support vector machine (SVM), and random
forest (RF), have been widely used in the field of short-term load forecasting [9]. Reference [10] proposed a
short-term load forecasting method based on an improved variable learning rate backpropagation (BP) neural
network, and the experiment results show the method has high accuracy and real-time performance.
Fu [11] used the SVM to predict the hourly electricity load of a building and achieved good results.
In Reference [12], RF was used to predict the load for the next 24 hours, and the predicted performance of
the model was analyzed in detail. Although machine learning methods handle the nonlinear
relationships in the load series better and have achieved good results in the field of load forecasting, they still
have some shortcomings. The load series is a complex time series, and machine learning methods process
temporal features poorly and require manual selection of these features [13]. The flourishing
development of deep learning provides researchers with new ways to solve this problem. The DL method
mainly refers to the deep neural network which contains multiple hidden layers and has a specific structure
and training method. It has been widely used in many fields, such as speech recognition [14] and image
processing [15]. At present, it has also been discussed in the field of electricity load forecasting. Mocanu [16]
used a deep belief network composed of conditional restricted Boltzmann machines (CRBM) to predict the
load of a single residential building. Compared with the shallow artificial neural network and support vector
machine, the results improved considerably. Reference [17] established a predictive model based on the LSTM neural
Rahman [18] used a recurrent neural network (RNN) to predict the hourly consumption of a
safety building in Utah and residential buildings in Texas, and the results have lower relative errors compared
to multilayer perceptron networks. Because deep belief networks have limited ability to learn time series
features, RNNs have been widely discussed in short-term load forecasting for their unique structure.
However, RNNs are known to suffer from exploding and vanishing gradients. Building on the
RNN, the GRU network mitigates these gradient problems by adding a
gate structure that controls the influence of previous time steps [19], so that it can better process time series.
In recent years, various combination models have been introduced to improve the accuracy of short-term
electricity load forecasting. Among them, the combination of signal decomposition method and machine
learning method has been widely studied [20]. Rana [21] used a wavelet neural network to decompose the
load series into sub-series with different frequencies, and then established a prediction model for each sub-
series, and obtained more accurate prediction results. However, it is necessary to choose the wavelet basis
function manually for the wavelet transform. EMD is another method of signal decomposition. Instead of
setting the basis function in advance, it can decompose the signal according to the characteristics of the data
itself, and the basis function is directly generated from the signal itself in the process of decomposition [22].
Each sub-series contains only part of the characteristics of the original load series, which makes it much simpler
than the original load series, so that more accurate prediction results can be obtained, and the EMD method
Guo [23] used the SVR and autoregression (AR) models to predict the high frequency and residual
components decomposed by EMD, respectively. Jatin [24] combined the EMD method with the LSTM model
to forecast the load demand for a given season and date and obtained better results than the single prediction
model. The hybrid models mentioned above are mainly different in the decomposition algorithm or the
prediction model, but the established process is almost the same. Most of the earlier works discussed here
had focused on load forecasting at the aggregate level like country, state, or city. They had little or no access
to fine-grained meter readings from residential customers, mainly due to device restrictions. Earlier research
works used manually taken meter readings, usually at monthly or bi-monthly intervals. In recent times,
residential load forecasting has gained attention due to the arrival of smart meters and the availability of fine-
grained measurements from residential customers [25,26]. Smart meter data also contributes to one of the
Urban Flows, and the load forecasting using this data will add value in the context of Urban Computing, an
This paper uses smart meter data collected at a frequency of 1 Hz from two residential buildings [30].
This research proposes a hybrid method using EMD in tandem with TCN to improve residential load forecast
accuracy. EMD, a technique proposed by N. Huang, decomposes a time series of non-linear and
non-stationary nature into several stationary time series called intrinsic mode functions (IMFs) and one
residue [31]. Since the residential load signal is non-linear and non-stationary, EMD can transform the
residential load signal into many IMFs and one residue. Then any ML algorithm can be used to model the
individual IMFs and the residue. The sum of predicted values obtained from the individual models will
produce the final forecast result. In this work, EMD with TCN, LSTM, and GRU are experimented with to
1) By using EMD to decompose the time series into IMFs and residue, the model captures both high-
2) The TCN component leverages dilated convolutions and hierarchical temporal blocks to model
long-range dependencies in the data, addressing the limitations of traditional recurrent neural
networks (RNNs) and LSTMs in capturing temporal dynamics over extended periods.
lower error metrics, namely root mean square error (RMSE), mean square error (MSE), and mean absolute
percentage error (MAPE) compared to standalone models. This accuracy is crucial for
Empirical mode decomposition (EMD) is an adaptive signal processing technique developed by N.E.
Huang and his colleagues in the late 1990s to analyze non-linear and non-stationary time series
data, which makes it a powerful tool for modern signal
processing challenges. Unlike traditional methods such as the Fourier and wavelet transforms, which rely on
predefined basis functions, EMD operates directly on the data to decompose it into a set of Intrinsic Mode
Functions (IMFs). These IMFs are simple oscillatory modes, each capturing a specific frequency component
of the original signal. The primary advantage of EMD is its data-driven approach, which makes it particularly
effective for handling complex signals with time-varying characteristics. The workflow of the EMD process
is depicted in Fig. 1. The flowchart represents the steps involved in decomposing a power consumption time
series 𝑥(𝑡) into IMFs and a residue. The procedure begins with inputting the time
series 𝑥(𝑡), which represents historical power consumption data. The first step involves identifying all
local maxima and minima points in the time series data. These points are then used to create upper 𝑢(𝑡) and
lower 𝑙(𝑡) envelopes through cubic spline interpolation, ensuring that the data's peaks and troughs are
accurately captured. Next, the mean envelope 𝑚(𝑡) is computed by averaging the upper and lower envelopes.
$$m(t) = \frac{u(t) + l(t)}{2} \qquad (1)$$
The difference between the original time series data and the mean envelope, denoted as $d(t)$, is then computed as
$$d(t) = x(t) - m(t) \qquad (2)$$
The process checks whether $d(t)$ satisfies the two conditions of an IMF: the number of extrema and the
number of zero-crossings must be equal or differ by at most one, and the mean of the upper and lower
envelopes must be zero. If $d(t)$ qualifies as an IMF, it is extracted, and the residue, $r(t)$, is computed.
The algorithm then assesses whether the residue is monotonic, meaning it has no more oscillatory
components that can be decomposed further. If the residue is not monotonic, the process repeats with the
residue as the new input time series data. This iterative process continues until all IMFs are extracted and the
This systematic approach of decomposing the time series data into IMFs and a residue is crucial for
power forecasting, as it allows for the separation of different frequency components, making it easier to
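The sifting procedure described in this section can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the function names, the pinning of the envelopes at the series endpoints, the iteration cap, and the tolerance-based stopping rule (used here in place of the strict IMF check on extrema and zero-crossings) are all assumptions made for the sketch.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema


def mean_envelope(x):
    """Return m(t) = (u(t) + l(t)) / 2, or None if x has too few extrema."""
    t = np.arange(len(x))
    maxima = argrelextrema(x, np.greater)[0]
    minima = argrelextrema(x, np.less)[0]
    if len(maxima) < 2 or len(minima) < 2:
        return None  # treated as "no oscillation left" in this sketch
    # Assumption: pin both envelopes at the endpoints so the splines
    # cover the whole series.
    up_idx = np.concatenate(([0], maxima, [len(x) - 1]))
    lo_idx = np.concatenate(([0], minima, [len(x) - 1]))
    upper = CubicSpline(up_idx, x[up_idx])(t)
    lower = CubicSpline(lo_idx, x[lo_idx])(t)
    return (upper + lower) / 2.0


def sift(x, max_iter=50, tol=0.05):
    """Repeat d(t) = x(t) - m(t) until the candidate changes little."""
    h = x.copy()
    for _ in range(max_iter):
        m = mean_envelope(h)
        if m is None:
            break
        # Relative size of the removed mean envelope (assumed stopping rule).
        change = np.sum(m ** 2) / (np.sum(h ** 2) + 1e-12)
        h = h - m
        if change < tol:
            break
    return h


def emd(x, max_imfs=10):
    """Decompose x into IMFs and a residue such that sum(IMFs) + residue = x."""
    residue = np.asarray(x, dtype=float).copy()
    imfs = []
    for _ in range(max_imfs):
        if mean_envelope(residue) is None:
            break
        imf = sift(residue)
        imfs.append(imf)
        residue = residue - imf
    return np.array(imfs), residue
```

By construction, the extracted IMFs and the residue add back up to the input series, which is the property the hybrid forecasting scheme relies on.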
This section presents a brief introduction to the computational technique TCN used in energy
consumption forecasts. TCNs are a type of convolutional neural network with a specific design that makes
them suitable for handling time series. TCNs satisfy two main principles: the network’s output has the same
length as the input sequence (similarly to LSTM networks), and they prevent leakage of information from
the future to the past by using causal convolutions [32]. Causal convolution is different from standard
convolution because the convolution operation performed to get the output at t does not accept future values
as inputs. This implies that, using a kernel size $k$, the output $O_t$ is obtained using the values of
$X_{t-(k-1)}, X_{t-(k-2)}, \dots, X_{t-1}, X_t$. Zero-padding of length $k-1$ is used at every layer to maintain the same
length as the input sequence. This ensures consistency in the length of the input and output sequences across
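A causal convolution with left zero-padding can be illustrated with a few lines of NumPy. The sketch below is only an illustration under simple assumptions (a single channel, a hypothetical function name `causal_conv1d`, and weights indexed so that `w[j]` multiplies the input `j` steps in the past).

```python
import numpy as np


def causal_conv1d(x, w):
    """Compute O_t = sum_j w[j] * X_{t-j}, treating X_t as 0 for t < 0."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    k = len(w)
    # Left zero-padding of length k - 1 keeps the output as long as the input.
    padded = np.concatenate((np.zeros(k - 1), x))
    # padded[t + k - 1 - j] is X_{t-j}, so no future value is ever read.
    return np.array([
        sum(w[j] * padded[t + k - 1 - j] for j in range(k))
        for t in range(len(x))
    ])
```

Changing the last input value leaves all earlier outputs unchanged, which is the "no leakage from the future" property stated above.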
Fig.2. Differences between (a) standard convolutional network, (b) causal convolutional network,
use one-dimensional dilated convolutions. In the context of convolutional operations, dilation refers to
skipping specific values between the inputs, effectively increasing the receptive field of the network without
resorting to pooling operations. This approach prevents the loss of resolution, a common issue with
traditional pooling methods [34]. Dilated convolution involves inserting gaps between the values in the
convolutional kernel. As illustrated in Fig.2 (c), a dilation factor $d$ spaces the inputs of the
convolutional operation $d$ steps apart, skipping $d-1$ values between them. This spatial configuration allows the network to capture a broader context of the
input sequence. The complete dilated causal convolution operation over consecutive layers can be formulated
as follows [35]:
$$x_l^{t} = g\left(\sum_{k=0}^{K-1} w_l^{k}\, x_{l-1}^{(t-(k \times d))} + b_l\right) \qquad (4)$$
where $x_l^{t}$ is the output of the neuron at position $t$ in the $l$-th layer; $K$ is the width of the convolutional
kernel; $w_l^{k}$ stands for the weight of position $k$; $d$ is the dilation factor of the convolution; and $b_l$ is the bias
term. Rectified linear unit (ReLU) layers are used as activation functions [36], mathematically expressed
as
$$g(x) = \max(0, x) \qquad (5)$$
In simple terms, $g(x)$ represents the output of the ReLU activation function for a given input $x$, and
$\max(0, x)$ takes the maximum value between 0 and the input $x$. So, the ReLU
function essentially activates a neuron if the input is positive, allowing information to pass through, and
deactivates it (sets the output to 0) if the input is negative. This helps introduce non-linearity to the neural
network, enabling it to learn complex patterns and relationships in the data. ReLU is popular due to its
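Equations (4) and (5) can be combined into a single-channel layer as sketched below. This is an illustrative NumPy version under stated assumptions: one input channel, one output channel, zero values before the start of the sequence, and a hypothetical function name `dilated_causal_layer`.

```python
import numpy as np


def relu(z):
    """ReLU activation of Eq. (5): g(x) = max(0, x)."""
    return np.maximum(0.0, z)


def dilated_causal_layer(x_prev, w, b, d):
    """Eq. (4): output at t is g(sum_k w[k] * x_prev[t - k*d] + b)."""
    x_prev = np.asarray(x_prev, dtype=float)
    K = len(w)
    pad = (K - 1) * d  # left zero-padding keeps the output length equal to the input
    padded = np.concatenate((np.zeros(pad), x_prev))
    out = np.empty(len(x_prev))
    for t in range(len(x_prev)):
        acc = b
        for k in range(K):
            acc += w[k] * padded[pad + t - k * d]  # x_prev[t - k*d], or 0 before the start
        out[t] = acc
    return relu(out)
```

With `w = [1, -1]` and `d = 2`, for example, each output is the ReLU of the difference between the current input and the input two steps earlier.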
Another common approach to further increase the network’s receptive field is to concatenate several
TCN blocks, as can be seen in Fig.3 [37]. However, this leads to deeper architecture with a significant
increase in parameters, which complicates the learning procedure. For this reason, a residual connection is
added to the output of each TCN block. Residual connections were proposed by [38] to improve performance
in very deep architectures, and consist of adding the input of a TCN block to its output:
$$o = g(x + F(x)) \qquad (6)$$
where $x$ is the input to the TCN block, $F(x)$ represents the function implemented by the layers within
the TCN block, $g(\cdot)$ is the ReLU activation function, and $o$ is the output of the TCN block.
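The residual connection can be written as a small wrapper around any shape-preserving block function. The sketch below is an assumption-laden illustration: `residual_block` is a hypothetical name, and `F` stands in for the stacked layers of a TCN block.

```python
import numpy as np


def relu(z):
    """ReLU activation g(x) = max(0, x)."""
    return np.maximum(0.0, z)


def residual_block(x, F):
    """Return o = g(x + F(x)); F must return an array shaped like x."""
    x = np.asarray(x, dtype=float)
    fx = np.asarray(F(x), dtype=float)
    if fx.shape != x.shape:
        # Assumption of this sketch: shapes must already match.
        raise ValueError("F(x) must have the same shape as x")
    return relu(x + fx)
```

When the block changes the number of channels, implementations typically project the input first so that the addition is well defined; that case is left out here for brevity.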
A strong forecasting model requires each output entry to depend on all previous input entries. This
implies a receptive field equal to the input length. Conventional convolutional layers, with a specified kernel
size, limit the dependency to a subset of input entries. To ensure a broader influence and capture long-term
dependencies, especially when stacking layers, it's crucial to extend the receptive field for each output entry
to encompass the entire input sequence. More generally, a 1D convolutional network with 𝑛 layers and a
$$r = 1 + n(k-1) \qquad (7)$$
To calculate the number of layers $n$, set the size of the receptive field to the input length $l$ and then
solve for $n$.
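Requiring the receptive field of Eq. (7) to cover an input of length l and rounding up to a whole number of layers gives n = ceil((l - 1) / (k - 1)). A small helper makes the arithmetic concrete; the function name and the example input length of 24 are assumptions chosen for illustration.

```python
def layers_needed(input_length, kernel_size):
    """Smallest n with 1 + n * (kernel_size - 1) >= input_length (no dilation)."""
    if kernel_size < 2:
        raise ValueError("kernel_size must be at least 2")
    # Integer ceiling division avoids floating-point rounding.
    return -(-(input_length - 1) // (kernel_size - 1))


# Example: a hypothetical input window of 24 hourly values.
n = layers_needed(24, 3)
receptive_field = 1 + n * (3 - 1)
```

For a window of 24 values and kernel size 3, this gives 12 layers and a receptive field of 25.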
This means that, given a fixed kernel size, the number of layers required is linear in the length of the
input tensor, which will result in networks that become very deep very fast, leading to models with a very
large number of parameters that take longer to train. Furthermore, a high number of layers has been shown
to lead to degradation problems related to the gradient of the loss function. One way to increase the receptive
field size while still keeping the number of layers relatively small is to introduce dilation to the convolutional
network.
Likewise, every additional layer adds a value of $d(k-1)$ to the current receptive field width,
where $d$ is computed as $d = b^{i}$, with $i$ representing the number of layers below the new layer.
Consequently, the width of the receptive field $w$ of a TCN with exponential dilation of base $b$, kernel size $k$,
and $n$ layers is
$$w = 1 + \sum_{i=0}^{n-1} (k-1)\, b^{i} = 1 + (k-1)\,\frac{b^{n}-1}{b-1} \qquad (9)$$
Fig.3. Temporal Convolutional Networks (TCN) model with 3 stacked blocks. Each block has 3
convolutional layers with kernel size 2 and dilations [1, 2, 4]. Source [33]
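The receptive field formula of Eq. (9) can be checked numerically against the configuration in Fig. 3. The sketch below uses hypothetical function names and assumes a dilation base greater than 1 for the closed form.

```python
def receptive_field(n_layers, kernel_size, base=2):
    """Sum form of Eq. (9): 1 + sum over layers of (k - 1) * base**i."""
    return 1 + sum((kernel_size - 1) * base ** i for i in range(n_layers))


def receptive_field_closed_form(n_layers, kernel_size, base=2):
    """Closed form of Eq. (9); requires base > 1."""
    return 1 + (kernel_size - 1) * (base ** n_layers - 1) // (base - 1)


def min_layers(input_length, kernel_size, base=2):
    """Smallest number of dilated layers whose receptive field covers the input."""
    n = 0
    while receptive_field(n, kernel_size, base) < input_length:
        n += 1
    return n
```

For one block of Fig. 3 (kernel size 2, dilations 1, 2, 4) both forms give a receptive field of 8. Covering a hypothetical 24-step window with kernel size 2 needs 5 dilated layers, compared with 23 layers without dilation.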
All these characteristics make TCNs a very suitable deep learning architecture for complex time series
problems. The main advantage of TCNs is that, similarly to RNNs, they can handle variable-length inputs
by sliding the one-dimensional causal convolutional kernel. Furthermore, TCNs are more memory efficient
than recurrent networks due to the shared convolution architecture which allows them to process long
sequences in parallel. In RNNs, the input sequences are processed sequentially, which results in higher
computation time. Moreover, TCNs are trained with the standard backpropagation algorithm, hence avoiding
the gradient problems of the backpropagation-through-time algorithm (BPTT) used in RNN [39].
This section outlines the steps involved in building the proposed hybrid model. The block diagram in Fig. 4
illustrates a hybrid EMD and TCN technique for predicting power consumption. The process begins with the
input data, which consists of historical power consumption data. This data is then subjected to EMD
decomposition to decompose the data into IMFs and a residual component. The decomposition process
allows for the separation of the signal into multiple IMFs and one residue, capturing different frequency
components of the data. Next, the IMFs and the residual component undergo data preprocessing. This
involves normalizing and reshaping the data to ensure it is in a suitable format for further processing.
Normalization scales the data to a standard range while reshaping adjusts the data structure to match the
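The preprocessing step can be sketched as follows for a single component (one IMF or the residue). Min-max scaling and a sliding-window layout are assumptions made for this illustration, since the scaling method and window length are not specified above; the function names are hypothetical.

```python
import numpy as np


def min_max_scale(series):
    """Scale a series to [0, 1]; also return (offset, span) for inverting later."""
    series = np.asarray(series, dtype=float)
    lo, hi = series.min(), series.max()
    span = hi - lo if hi > lo else 1.0  # guard against a constant series
    return (series - lo) / span, (lo, span)


def make_windows(series, window):
    """Build inputs of shape (samples, window, 1) and next-step targets."""
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X[..., np.newaxis], y
```

Each row of `X` holds `window` consecutive values and the matching entry of `y` is the value that follows them; predictions can be mapped back to the original units with `pred * span + lo`.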
Each IMF and the residue are then processed individually using a TCN. The TCN is a deep learning
architecture specifically designed for sequence modeling tasks, making it ideal for time series data such as
power consumption. The TCN layers apply convolutional operations to capture temporal dependencies in the
data, extracting relevant features from each IMF and the residue. After processing through the TCN layers,
the feature maps for each IMF and the residue are flattened. Flattening converts the multi-dimensional feature
maps into one-dimensional vectors, facilitating their combination in the next step. These flattened features
are then concatenated to form a comprehensive feature vector that includes temporal features from all IMFs
and the residue. This concatenated feature vector is then passed through a fully connected layer, also known
as a dense layer. Fig. 4 depicts the framework of the proposed hybrid model.
Fig.4. Framework of hybrid EMD-TCN model
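The data flow of Fig. 4 (one TCN branch per component, flattening, concatenation, and a dense output layer) can be traced with a small untrained forward pass. The sketch below only illustrates shapes and wiring under strong simplifications: single-channel branches, three layers with assumed dilations 1, 2, 4, random weights, and hypothetical function names. It is not the authors' model.

```python
import numpy as np


def relu(z):
    return np.maximum(0.0, z)


def causal_dilated_conv(x, w, d):
    """Single-channel causal convolution with dilation d and left zero-padding."""
    K = len(w)
    pad = (K - 1) * d
    padded = np.concatenate((np.zeros(pad), x))
    return np.array([
        sum(w[k] * padded[pad + t - k * d] for k in range(K))
        for t in range(len(x))
    ])


def tcn_branch(x, weights, dilations=(1, 2, 4)):
    """Stack of dilated causal layers applied to one IMF (or the residue)."""
    h = np.asarray(x, dtype=float)
    for w, d in zip(weights, dilations):
        h = relu(causal_dilated_conv(h, w, d))
    return h


def hybrid_forward(components, branch_weights, dense_w, dense_b):
    """Flatten each branch output, concatenate, and apply one dense unit."""
    features = [tcn_branch(c, w).ravel() for c, w in zip(components, branch_weights)]
    z = np.concatenate(features)
    return float(dense_w @ z + dense_b)


# Usage with random data: 3 components, each a 24-step window.
rng = np.random.default_rng(0)
components = [rng.normal(size=24) for _ in range(3)]
branch_weights = [[rng.normal(size=2) for _ in range(3)] for _ in components]
dense_w = rng.normal(size=3 * 24)
forecast = hybrid_forward(components, branch_weights, dense_w, dense_b=0.1)
```

In a trained model the branch and dense weights would be learned jointly; here they are random, so the output value itself carries no meaning.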
The fully connected layer transforms the features by learning complex relationships between them,
ultimately enhancing the predictive capability of the model. The output of the fully connected layer is used
to generate power forecasts. These forecasts represent the predicted power consumption values based on the
input historical data. The final output block displays the predicted power values, providing valuable insights
This hybrid approach leverages the strengths of both EMD and TCN, combining the signal
decomposition capability of EMD with the sequence modeling prowess of TCN. By decomposing the data
into IMFs and processing each component separately, the model can effectively capture both high-frequency
and low-frequency patterns in the data. The TCN ensures that temporal dependencies are accurately modeled,
The root mean squared error (RMSE), the mean square error (MSE), the mean absolute error (MAE),
the mean absolute percentage error (MAPE), and the mean bias error (MBE) were chosen as model
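$$MSE = \frac{1}{N}\sum_{k=1}^{N} (y_k - \hat{y}_k)^2 \qquad (11)$$
$$MAE = \frac{1}{N}\sum_{k=1}^{N} |y_k - \hat{y}_k| \qquad (12)$$
$$MAPE = \frac{1}{N}\sum_{k=1}^{N} \left|\frac{y_k - \hat{y}_k}{y_k}\right| \times 100\% \qquad (13)$$
$$MBE = \frac{1}{N}\sum_{k=1}^{N} (y_k - \hat{y}_k) \qquad (14)$$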
samples.
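The evaluation metrics can be computed with a few lines of NumPy. In this sketch the function name is hypothetical, RMSE is taken as the square root of the MSE (its standard definition), and MAPE is assumed to be evaluated only on series with no zero-valued actual loads.

```python
import numpy as np


def forecast_errors(y_true, y_pred):
    """Return RMSE, MSE, MAE, MAPE (in percent), and MBE for two series."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = np.mean(err ** 2)
    return {
        "RMSE": float(np.sqrt(mse)),
        "MSE": float(mse),
        "MAE": float(np.mean(np.abs(err))),
        # Undefined if any actual value is zero.
        "MAPE": float(np.mean(np.abs(err / y_true)) * 100.0),
        "MBE": float(np.mean(err)),
    }
```

Because positive and negative errors cancel in the MBE, a value near zero indicates little systematic bias but says nothing about the size of individual errors.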
The smart meter data collected as a time series from the UMass Trace Repository [40] is processed and
aggregated into hourly average load data. Two individual residences, Home1 and Home2, are chosen for the
experiments. Each house is modeled independently. The number of measurements is not uniform across the two
homes: the duration of data collected from Home2 is shorter than that from Home1. To maintain consistency, the average
hourly consumption of both residences for one month is considered. Hence a total of 31 × 24 = 744
data points for each house is used in this research for hourly load prediction. Among these, 80% of the data is
used for training, and the remaining 20% is used for testing. The load forecasting is carried out using the
proposed hybrid EMD-TCN algorithm and EMD with the traditional DL algorithms LSTM and GRU. Figs.
5-6 show the prediction result in graphical form for different forecasting models. The error values of the
proposed EMD-TCN method are generally lower than those of the other hybrid deep learning network
architectures.
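The data preparation described above (hourly averaging followed by an 80/20 split) can be sketched as follows. The sketch assumes gap-free 1 Hz samples and a chronological split with the last 20% held out; neither detail is stated above, and the function names are hypothetical.

```python
import numpy as np


def hourly_mean(samples_1hz):
    """Average 1 Hz samples into hourly values, dropping any incomplete last hour."""
    samples = np.asarray(samples_1hz, dtype=float)
    n_hours = len(samples) // 3600
    return samples[: n_hours * 3600].reshape(n_hours, 3600).mean(axis=1)


def chronological_split(series, train_fraction=0.8):
    """Split a series into an earlier training part and a later test part."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


# One month of hourly data: 31 * 24 = 744 points.
hourly = np.arange(31 * 24, dtype=float)
train, test = chronological_split(hourly)
```

With 744 hourly points, this split gives 595 training points and 149 test points.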
Fig.5. Model performance with (a) TCN (b) EMD-TCN for Home1
Fig.6. Model performance with (a) TCN (b) EMD-TCN for Home2
Fig.7. Model performance with (a) LSTM (b) EMD-LSTM for Home1
Fig.8. Model performance with (a) LSTM (b) EMD-LSTM for Home2
Fig.9. Model performance with (a) GRU (b) EMD-GRU for Home1
Fig.10. Model performance with (a) GRU (b) EMD-GRU for Home2
Figs. 11-16 depict the power forecasting results of the various DL models. The graphs show
the power consumption (in kW) over time (in hours) for each model. The blue line represents the original power
consumption data, while the red line represents the forecasted data by the model. The original data (blue line)
displays significant variability with pronounced peaks at regular intervals, indicating periods of high power
consumption. The forecasted data (red line) aligns closely with the original data towards the later part of the
time series, indicating that the LSTM model effectively captures the overall trend and seasonality of the
power consumption data. Among these, Fig. 11(b) and Fig. 12(b) show that the EMD-TCN model appears to capture
the timing and magnitude of the peaks reasonably well, which is crucial for applications that rely on peak
Fig.11. Power consumption forecasting results using (a) TCN (b) EMD-TCN model for Home1
Fig.12. Power consumption forecasting results using (a) TCN (b) EMD-TCN model for Home2
Fig.13. Power consumption forecasting results using (a) LSTM (b) EMD-LSTM model for Home1
Fig.14. Power consumption forecasting results using (a) LSTM (b) EMD-LSTM model for Home2
Fig.15. Power consumption forecasting results using (a) GRU (b) EMD-GRU model for Home1
Fig.16. Power consumption forecasting results using (a) GRU (b) EMD-GRU model for Home2
Table 1 presents the error values of Home1 for various forecasting models, including TCN, EMD-TCN,
LSTM, EMD-LSTM, GRU, and EMD-GRU, evaluating their performance on power consumption
data. The error metrics include RMSE, MSE, MAE, MAPE, and MBE. The hybrid models (EMD-TCN,
EMD-LSTM, and EMD-GRU) generally show improved performance over their non-hybrid counterparts
(TCN, LSTM, and GRU) across all metrics. Specifically, EMD-TCN achieves the lowest error rates in RMSE
(0.069), MSE (0.004), MAE (0.009), MAPE (0.032), and MBE (0.001), indicating its superior accuracy and
reliability for power consumption forecasting. EMD-LSTM also shows significant improvements compared
to the standard LSTM model, with lower errors in all metrics. Similarly, EMD-GRU outperforms the GRU
model, particularly in MAPE and MBE. These results highlight the effectiveness of incorporating EMD into
forecasting models, significantly enhancing their accuracy and robustness. Similarly, Table 2 shows the error
for Home2. Here also, EMD-TCN achieves the lowest error rates. Hence, the hybrid EMD and TCN model
combines the strengths of both techniques to enhance the accuracy and robustness of time series forecasting.
Table 1. Model performance statistics for Home1
TCN 20 0.1008
LSTM 35 21
GRU 40 25
EMD-TCN 53 27
EMD-LSTM 59 28
EMD-GRU 65 30
The integration of hybrid methods like EMD-TCN is anticipated to increase computational complexity.
To evaluate this, simulations were conducted on the Home1 dataset on an Intel Core i5 processor with 64
GB of RAM in MATLAB R2023a. The training and testing times for both direct methods and hybrid
EMD methods are summarized in Table 3. The results indicate that hybrid EMD methods generally require
more time for training compared to their direct counterparts. However, EMD-TCN is significantly more
efficient, being 6 times faster in training than EMD-LSTM. In terms of testing time, EMD methods perform
comparably to their direct method equivalents. Notably, the testing time for EMD-TCN is considerably lower
than that for other EMD methods. This efficiency can be attributed to the inherent speed of TCN over basic
5. Conclusion
Recent advancements in power sector infrastructure have highlighted the importance of accurate
residential load forecasting. The unpredictable consumption patterns of residential customers often lead to
random peaks in load signals, complicating accurate forecasting. Experimental evaluations with three state-
of-the-art DL models have demonstrated poor forecast accuracy during these peak periods. To address this,
a novel hybrid method combining EMD with TCN is proposed. This approach specifically targets peak
modeling to enhance forecast accuracy. A high-resolution smart meter dataset from two residential customers
was used to evaluate the proposed algorithm. Experimental results indicate that the EMD-TCN hybrid
method outperforms the other tested models, particularly in scenarios with frequent peaks in the residential
load signal. However, the method's performance diminishes in the absence of peaks, suggesting that future
enhancements could involve an ensemble of models. This research is timely and significant, as accurate peak
predictions impact various aspects of the energy sector, including demand response, battery control, electric
vehicle charge scheduling, and real-time electricity pricing. Accurate peak forecasts can also help utilities
References
[1] Khuntia SR, Rueda JL, van der Meijden MAMM. Forecasting the load of electrical power systems in mid- and long-term
horizons: a review. IET Gener Transm Distrib 2016;10(16):3971–7. http://dx.doi.org/10.1049/iet-gtd.2016.0340.
[2] Jacob M, Neves C, Vukadinović Greetham D. Forecasting and assessing risk of individual electricity peaks. Cham: Springer
International Publishing; 2020, p. 15–37. http://dx.doi.org/10.1007/978-3-030-28669-9_2.
[3] Singh RP, Gao PX, Lizotte DJ. On hourly home peak load prediction. In: 2012 IEEE third international conference on smart
grid communications. 2012, p. 163–8. http://dx.doi.org/10.1109/SmartGridComm.2012.6485977.
[4] Smart*. UMass Trace Repository online. 2012, URL http://traces.cs.umass.edu/index.php/Smart/Smart.
[5] Singh, P.; Dwivedi, P. Integration of new evolutionary approach with artificial neural network for solving
short term load forecast problem. Appl. Energy 2018, 217, 537–549.
[6] Haida, T. Regression based peak load forecasting using a transformation technique. IEEE Trans. Power Syst.
1994, 9, 1788–1794.
[7] Khashei, M.; Bijari, M. A novel hybridization of artificial neural networks and ARIMA models for time series
forecasting. Appl. Soft Comput. J. 2011, 11, 2664–2675.
[8] Holt, C. Forecasting seasonals and trends by exponentially weighted moving averages. Int. J. Forecast. 2004,
20, 5–10.
[9] Srivastava, A.K.; Pandey, A.S.; Singh, D. Short-term load forecasting methods: A review. In Proceedings of
the International Conference on Emerging Trends in Electrical Electronics & Sustainable Energy Systems
(ICETEESES), Sultanpur, India, 11–12 March 2016; pp. 130–138.
[10] Wang, Y.; Niu, D.; Ji, L. Short-term power load forecasting based on IVL-BP neural network technology. Syst. Eng. Procedia
2012, 4, 168–174.
[11] Fu, Y.; Li, Z.; Zhang, H.; Xu, P. Using Support Vector Machine to Predict Next Day Electricity Load of Public Buildings
with Sub-metering Devices. Procedia Eng. 2015, 121, 1016–1022.
[12] Lahouar, A.; Ben, H.S.J. Day-ahead load forecast using random forest and expert input selection. Energy Convers. Manag.
2015, 103, 1040–1051.
[13] Tong, C.; Li, J.; Lang, C. An efficient deep model for day-ahead electricity load forecasting with stacked denoising auto-
encoders. J. Parallel Distrib. Comput. 2018, 117, 267–273.
[14] Heigold, G.; Vanhoucke, V. Multilingual acoustic models using distributed deep neural networks. In Proceedings of the IEEE
International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada, 26–31 May 2013; pp. 8619–8623.
[15] He, K.; Zhang, X.; Ren, S. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 26 June–1 July 2016; pp. 770–778.
[16] Mocanu, E.; Nguyen, P.H.; Gibescu, M.; Kling, W.L. Deep learning for estimating building energy consumption. Sustain.
Energy Grids Netw. 2016, 6, 91–99.
[17] Kong, W.; Dong, Z.Y.; Jia, Y. Short-Term Residential Load Forecasting based on LSTM Recurrent Neural
Network. IEEE Trans. Smart Grid 2017, 10, 49–53.
[18] Rahman, A.; Srikumar, V.; Smith, A.D. Predicting electricity consumption for commercial and residential
buildings using deep recurrent neural networks. Appl. Energy 2018, 201, 372–385.
[19] Mohamed, M. Parsimonious Memory Unit for Recurrent Neural Networks with Application to Natural
Language Processing. Neurocomputing 2018, 314, 48–64.
[20] Qiu, X.; Ren, Y.; Suganthan, P.N. Empirical Mode Decomposition based ensemble deep learning for load
demand time series forecasting. Appl. Soft Comput. 2017, 54, 246–255.
[21] Rana, M.; Koprinska, I. Forecasting electricity load with advanced wavelet neural networks. Neurocomputing
2016, 182, 118–132.
[22] Huang, N.E.; Shen, Z.; Long, S.R. The Empirical Mode Decomposition and the Hilbert Spectrum for
Nonlinear and Non-Stationary Time Series Analysis. Proc. R. Soc. Lond. A Math. Phys. Eng. Sci. 1998,
454, 903–995.
[23] Fan, G.F.; Peng, L.; Hong, W.C. Electric load forecasting by the SVR model with differential empirical mode
decomposition and auto regression. Neurocomputing 2016, 173, 958–970.
[24] Bedi, J.; Toshniwal, D. Empirical Mode Decomposition Based Deep Learning for Electricity Demand
Forecasting. IEEE Access 2018, 6, 49144–49156.
[25] Imani M, Ghassemian H. Residential load forecasting using wavelet and collaborative representation transforms. Appl Energy
2019;253:113505. http://dx.doi.org/10.1016/j.apenergy.2019.113505.
[26] Kong W, Dong ZY, Jia Y, Hill DJ, Xu Y, Zhang Y. Short-term residential load forecasting based on LSTM recurrent neural
network. IEEE Trans Smart Grid 2019;10(1):841–51.
[27] Ouyang K, Liang Y, Liu Y, Tong Z, Ruan S, Rosenblum D, et al. Fine-grained urban flow inference. IEEE Trans Knowl Data
Eng 2020;1. http://dx.doi.org/10.1109/TKDE.2020.3017104.
[28] Liu Y, Liang Y, Ouyang K, Liu S, Rosenblum D, Zheng Y. Predicting urban water quality with ubiquitous data - A data-driven
approach. IEEE Trans Big Data 2020;1. http://dx.doi.org/10.1109/TBDATA.2020.2972564.
[29] Zheng Y, Capra L, Wolfson O, Yang H. Urban computing: Concepts, methodologies, and applications. ACM Trans Intell
Syst Technol 2014;5(3). http://dx.doi.org/10.1145/2629592.
[30] Barker S, Mishra A, Irwin D, Cecchet E, Shenoy P, Albrecht J. Smart* an open data set and tools for enabling research in
sustainable homes. In: Proceedings of the 2012 workshop on data mining applications in sustainability. SustKDD ’12, ACM; 2012.
[31] Huang NE, Shen Z, Long SR, Wu MC, Shih HH, Zheng Q, et al. The empirical mode decomposition and the Hilbert spectrum
for nonlinear and non-stationary time series analysis. Proc R Soc Lond Ser A 1998;454(1971):903–98.
http://dx.doi.org/10.1098/rspa.1998.0193.
[32] Bai, S.; Kolter, J.Z.; Koltun, V. An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence
Modeling. arXiv, arXiv:1803.01271, 2018.
[33] Lara-Benítez, P.; Carranza-García, M.; Luna-Romera, J.M.; Riquelme, J.C. Temporal Convolutional Networks Applied to
Energy-Related Time Series Forecasting. Appl. Sci. (MDPI) 2020.
[34] Yu, F.; Koltun, V. Multi-Scale Context Aggregation by Dilated Convolutions. In Proceedings of the 4th International
Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, 2016.
[35] Lara-Benítez, P.; Carranza-García, M.; García-Gutiérrez, J.; Riquelme, J. Asynchronous dual-pipeline deep learning
framework for online data stream classification. Integr. Comput.-Aided Eng. 2020.
[36] Nair, V.; Hinton, G. Rectified linear units improve Restricted Boltzmann machines. In Proceedings of the ICML 2010—27th
International Conference on Machine Learning, Haifa, Israel, pp. 807–814, 2010.
[37] Van den Oord, A.; Dieleman, S.; Zen, H.; Simonyan, K.; Vinyals, O.; Graves, A.; Kalchbrenner, N.; Senior, A.W.;
Kavukcuoglu, K. WaveNet: A Generative Model for Raw Audio. arXiv, arXiv:1609.03499, 2016.
[38] He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference
on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, pp. 770–778, 2016.
[39] Pascanu, R.; Mikolov, T.; Bengio, Y. On the difficulty of training Recurrent Neural Networks. arXiv, arXiv:cs.LG/1211.5063,
2012.
[40] Smart*. UMass repository online. 2012, URL http://traces.cs.umass.edu/index.php/Smart/Smart.