Big Data Analytics in Smart Grids
1. Abstract:
In the age of big data, the power system can benefit greatly from the massive volumes of
data produced by the electricity network, meteorological information systems, geographic
information systems, and other sources. Using these data can also improve consumer service
and societal welfare. However, several challenges must still be overcome before big data
analytics can be applied in real smart grids. These challenges relate to methodology,
awareness, synergies, and other factors.
2. Introduction:
As digital technology and cloud computing rapidly evolve, an increasing amount of data is
generated by digital devices and sensors, including smart phones, computers, cutting-edge
measuring infrastructures, etc., as well as through human activity and communications. For
instance, exabytes (10^18 bytes) and zettabytes (10^21 bytes) are now used to describe the quantity of data
on the internet (Emani et al., 2015). The rational, effective, and efficient examination of these
data has a significant positive impact on both our personal and professional lives.
However, the amount of data being gathered is growing exponentially, and their structure is
also becoming far more intricate.
This paper first explores the data that can be gathered through the improved metering of
smart grid infrastructure. It then briefly reviews the fundamentals of data analytics and
some of the most common methods. It concludes by presenting data analytics applications in
smart grids in depth.
Fig. 1
4. Data analysis techniques:
Data analysis is the most crucial stage of the big data processing system because it serves as
the foundation for finding vital information and assisting in decision-making (Fan et al.,
2018; Cheng et al., 2018).
Broadly, data analytics, also known as data mining, is a computational process that uses
tools such as databases, statistics, pattern recognition, and machine learning to identify
probable relationships between variables. However, because the data come from a variety of
sources, the gathered data sets may vary in quality in terms of noise, redundancy, and
consistency.
4.1. Data preprocessing:
As Fig. 3 shows, data pre-processing techniques are required to increase data quality. The
goal of data integration techniques is to combine data from several sources effectively
into a single view (Roya et al., 2018). For instance, the attribute "date time" would appear
twice if the datasets for weather condition records and power system interruption
occurrences were combined. The subsequent data analytics procedure, however, requires only
one "date time" attribute, so the redundant copy should be removed during integration.
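The integration step described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the cited work: the records, the field names (date_time, temperature_c, wind_speed_ms, feeder, duration_min), and the integrate helper are all hypothetical, and the sketch assumes at most one interruption per timestamp.

```python
# Hypothetical records; field names and values are invented for illustration.
weather = [
    {"date_time": "2018-07-01 14:00", "temperature_c": 35.2, "wind_speed_ms": 12.4},
    {"date_time": "2018-07-01 15:00", "temperature_c": 34.8, "wind_speed_ms": 15.1},
    {"date_time": "2018-07-01 16:00", "temperature_c": 33.9, "wind_speed_ms": 9.7},
]
interruptions = [
    {"date_time": "2018-07-01 15:00", "feeder": "F12", "duration_min": 42},
]


def integrate(weather_rows, interruption_rows, key="date_time"):
    """Left-join interruption records onto weather records by the shared key.

    The key appears once in each output row instead of once per source.
    Assumes at most one interruption record per key value.
    """
    by_key = {row[key]: row for row in interruption_rows}
    merged = []
    for w in weather_rows:
        event = by_key.get(w[key], {})
        row = dict(w)  # copy so the input rows are not mutated
        # Add the interruption attributes, skipping the duplicate key column.
        row.update({k: v for k, v in event.items() if k != key})
        row["interrupted"] = bool(event)
        merged.append(row)
    return merged


combined = integrate(weather, interruptions)
for row in combined:
    print(row)
```

Each combined row carries a single "date_time" attribute, and the added "interrupted" flag records whether an interruption was matched to that weather record.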
Fig. 2
Because some data analytics methods are sensitive to imbalanced data, a logarithmic
transformation can be used to "correct" the distribution of highly skewed data. New
attributes can also be derived in the pre-processing step: if the initial dataset contains
only the maximum and minimum temperature values, the temperature difference can be computed
from them. Such newly created features often help increase the accuracy of data analytics
findings.
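Both transformations can be illustrated with a short Python sketch. The outage durations, the temperature records, and the names skewness, t_max, t_min, and t_range are assumptions made for this example only. The choice of log(1 + x) rather than log(x) is also an assumption, made so that zero values would not break the transform.

```python
import math
from statistics import mean, pstdev


def skewness(values):
    """Population skewness: mean of the cubed standardized deviations."""
    mu = mean(values)
    sigma = pstdev(values)
    return mean([((v - mu) / sigma) ** 3 for v in values])


# Hypothetical right-skewed outage durations in minutes.
durations = [3, 5, 6, 8, 10, 12, 15, 20, 45, 240]
log_durations = [math.log1p(d) for d in durations]  # log(1 + d) tolerates zeros

# The log-transformed values should be noticeably less skewed.
print(skewness(durations), skewness(log_durations))

# Hypothetical daily records holding only max and min temperature (deg C).
days = [{"t_max": 31.0, "t_min": 19.5}, {"t_max": 24.0, "t_min": 18.0}]
for day in days:
    day["t_range"] = day["t_max"] - day["t_min"]  # derived feature

print(days)
```

The logarithm compresses the long right tail, so the single very long outage no longer dominates the distribution, while the derived t_range attribute gives later analysis a feature that neither original column holds on its own.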
4.2. Data analytics techniques:
Depending on whether each item in a dataset has a label, the most popular data mining or
machine learning methods are typically characterized as supervised or unsupervised learning,
as illustrated in Table 5. In supervised learning, the data analytics model is trained on
the provided data to determine the relationship between data attributes and the related
categories or values. For items without labels, the data analytics model is typically built
to identify potential categories among all the items (Di Zhua & Zhang, 2018).
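The distinction can be made concrete with a small Python sketch that uses only the standard library. It is an illustration under stated assumptions, not the method of the cited works: a nearest-centroid classifier stands in for supervised learning, a basic k-means loop stands in for unsupervised learning, and the customer features and labels are hypothetical.

```python
from math import dist
from statistics import mean


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    return tuple(mean(c) for c in zip(*points))


def nearest(point, centroids):
    """Key of the centroid closest to point (Euclidean distance)."""
    return min(centroids, key=lambda k: dist(point, centroids[k]))


# Supervised: labels are given, so learn one centroid per label.
def fit_nearest_centroid(samples, labels):
    groups = {}
    for x, y in zip(samples, labels):
        groups.setdefault(y, []).append(x)
    return {y: centroid(xs) for y, xs in groups.items()}


# Unsupervised: no labels, so k-means has to discover k groups on its own.
def kmeans(samples, k, iterations=10):
    centroids = {i: samples[i] for i in range(k)}  # naive init: first k samples
    for _ in range(iterations):
        groups = {i: [] for i in centroids}
        for x in samples:
            groups[nearest(x, centroids)].append(x)
        # Keep the old centroid if a cluster ends up empty.
        centroids = {i: centroid(xs) if xs else centroids[i]
                     for i, xs in groups.items()}
    return [nearest(x, centroids) for x in samples]


# Hypothetical features: (daily energy in kWh, peak demand in kW).
samples = [(8.0, 1.5), (30.0, 6.0), (9.5, 1.8), (28.0, 5.5), (7.2, 1.2), (32.0, 6.4)]
labels = ["low", "high", "low", "high", "low", "high"]

model = fit_nearest_centroid(samples, labels)
print(nearest((10.0, 2.0), model))  # supervised prediction for a new customer
print(kmeans(samples, k=2))         # cluster ids found without any labels
```

The supervised model returns one of the labels it was trained on, whereas k-means returns only arbitrary cluster ids, which an analyst must still interpret as categories.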
6. Conclusion:
This paper has examined big data in smart grids and the accompanying cutting-edge analysis
techniques. Data that may contain important information are gathered from smart meters
deployed in the power system, the electricity market, GIS, meteorological information
systems, social media, and other sources. Advanced ICT in the power system links the
traditional physical characteristics of the power system to external variables in order to
discover upcoming regulations and scientific difficulties. Smart grids can benefit from the
eleven data analytics applications described in the paper, including operations,
maintenance, load forecasting, protection, fault detection, and fault location. Because the
use of data analytics in smart grids is a broad and complex topic that involves ICT,
electrical engineering, computer science, and other disciplines, it requires collaboration
among specialists in various domains as well as strategic visions for the best designs.