The Design and Implementation of the Enterprise Level Data Platform and Big Data Driven Applications and Analytics

Hesen Liu1, Jiahui Guo1, Student Member, IEEE, Wenpeng Yu1, Lin Zhu1, Member, IEEE, Yilu Liu1,2, Fellow, IEEE
1 The University of Tennessee, Knoxville, TN
2 Oak Ridge National Laboratory
{hliu24, jguo7, wyu10, lzhu12, liu}@utk.edu

Tao Xia, Senior Member, IEEE, Rui Sun, Member, IEEE, R. Matthew Gardner, Senior Member, IEEE
Dominion Virginia Power, Richmond, VA
{tao.xia, rui.sun, matthew.gardner}@dom.com

Abstract-- To improve the use of big data and business intelligence in the power industry, this paper presents a comprehensive solution: an enterprise-level data platform built on the OSIsoft PI system to support big data driven applications and analytics. The platform features scalability, real-time operation, a service-oriented architecture and high reliability. Compared to traditional platforms in the power industry, the significant benefit of the platform is that end users can use the data with the global model to drive self-customized services rather than depend on IT professionals to deploy them. The paper also describes how to implement data integration, global model construction and big data driven analytics, which are difficult to achieve with traditional solutions. Finally, the paper presents preliminary visualization results from data analysis in real scenarios.

Index Terms-- asset framework (AF), big data, common information model (CIM), data driven application, data integration, data platform, global model construction, time series database, visualization.

978-1-5090-2157-4/16/$31.00 ©2016 IEEE

I. INTRODUCTION

With the expansion of the scale of power systems and the wide use of advanced information technologies, the quantities and categories of data and information at various resolutions are increasing dramatically. New phenomena and issues in power systems need to be recognized and analyzed. Therefore, data mining and analytics are critical for the power industry. However, several challenges are ubiquitous: many information silos without cross-system integration, the lack of a global data description and data models, and insufficient common applications and services. They impede efficient data mining, waste valuable information and obstruct advanced data analytics in electric utilities under the big data environment. These challenges also have a negative effect on the ever-growing business and refined management of electric utilities. Dominion Virginia Power (DVP), which is one of the nation’s largest producers and transporters of electrical energy, has also suffered from these problems for a long time.

To address these challenges, academics have suggested some potential solutions [1], [2]. Moreover, some European enterprises have been applying big data analytic strategies to enhance customer management and operational capability. In addition, commercial IT giants such as IBM have proposed enterprise-level integration solutions based on cloud computing [3]. However, the majority of these solutions are conceptual and not easy to put into practice. As reported in [4], electric utilities in China have been developing integrated information platforms and have achieved preliminary results, but those platforms mainly serve applications and services in control centers rather than the entire enterprise.

Since the existing solutions are not appropriate for large electric utilities like DVP, the approach presented in this paper is to implement a robust data platform that improves the capability of big data management and analytics. In order to manage data and models flexibly and provide applications easily, the platform should have four features: 1) Scalability: Although the amount and variety of data are increasing continuously, the platform can handle a growing amount of work and be enlarged to accommodate the growth. 2) Real time: Huge amounts of high resolution real time data are integrated in the platform. Therefore, the platform, with its real time database, uses real-time processing to handle workloads and enable access to historical and current data. 3) Service-oriented: Encapsulating services to hide trivial details is critical for users. In the platform, the numerous adapters and interfaces of third party applications or systems are exposed as common services through standard protocols. For end users, the platform offers flexible and lightweight tools for data query, visualization, and analytics. 4) High reliability: The platform will be a core part of the IT systems within the enterprise and will provide services for different departments through the enterprise network. Given that the system can consist of a large number of hardware components, partial failures are unavoidable. Therefore, the platform is designed to use cluster servers with load balancers to handle partial failures gracefully without causing service interruption.
The remainder of this paper is organized as follows. Section II describes the methodology of data integration. The design of the hierarchical data model is presented in Section III. In Section IV, the applications and visualizations for the data platform are introduced. Section V concludes this paper.

II. THE METHODOLOGY OF DATA INTEGRATION

With the rapid growth of both structured and unstructured data from multiple sources, the current IT infrastructure needs to be reorganized to optimize the flow of big data for intensive analytic applications. The implementation utilizes the OSIsoft PI system [5] to build a highly reliable and flexible common data repository.

A. Types of Big Data
Big data sets for the enterprise-level data platform are depicted in Fig. 1. The majority of real time data still depend on the Supervisory Control and Data Acquisition-Energy Management System (SCADA-EMS), since remote terminal units are widely deployed and give operators the ability to monitor the operation status of the entire system. Meanwhile, the historical data from SCADA-EMS contain abundant raw information for situational awareness and system planning. On the other hand, with the increasing number of phasor measurement units (PMUs), high resolution PMU data can provide more adequate dynamic responses and instantaneous values with accurate timestamps. In the distribution network, with the introduction of intelligent distribution automation equipment and distributed generation into the grid, the need to monitor, analyze, optimize and control the distribution system in real time is greater than ever, and the data from the Distribution Management System (DMS) play an important role in fulfilling these requirements. For protection technicians, comprehensive information from Digital Fault Recorder (DFR) data, relay settings and circuit calculations is critical for detecting and analyzing faults. In addition, traditional planning mainly focuses on off-line limit calculation and the design of the substation and network topology. If more statistical information from the data platform can be involved easily, planning decisions can be better informed. Moreover, many electric utilities maintain and collect auxiliary information and substation information such as asset information, weather information, field test data, and so on. Such information can contribute more to management and operation in power utilities when it is integrated with data from other sources.

Figure 1. Types of Big data for the Enterprise-level Data Platform

B. Implementation of Big Data Integration
Big data integration (BDI) is fundamental and critical to implementing the vision of big data in terms of modeling, application and analytics. The value of data can be exhibited by data mining only when disparate data can be linked and seamlessly interwoven with other data to derive a unified and global representation. The authors of [6] note that BDI differs from traditional data integration in several dimensions. The IT infrastructure of big data integration is presented in Fig. 2.

Figure 2. High level architecture of big data integration

Meanwhile, several core requirements such as scalability, real time, service orientation and high reliability need to be fulfilled. To tackle these challenges, one practical approach is provided below: 1) Integrate the data through unified interface services: Unified interface services support connecting the platform to disparate data sources. Some interfaces enable history recovery, and some simply access the history stored in third-party historians. These data sources are seamlessly interwoven into the platform independent of source, protocol or vendor. The interface services enable buffering to multiple servers, intelligent data reduction, single PI tag definition and point-by-point security. Redundancy and automatic point synchronization are also available in the interface services. 2) Archive into the data collective layer and build the data driven model: Data is instantly stored in the archive servers of the platform and made available to users in real time. Meanwhile, hierarchical data modeling creates a consistent representation of assets or processes and associates data in the proper context, providing information related to the data itself. The model may provide the easiest



way for users to find the information they need. 3) Provide the applications and services to end users: To reduce barriers to using data and models, the platform supports popular tools such as Microsoft Office Excel and mobile phones, so that end users can work on the data more easily and carry out analytics rather than waste time on data collection. 4) Backup strategy: The platform has a redundant design in which two groups of systems with the same structure serve as backups for each other, which guarantees reliability during operation.

III. HIERARCHICAL DATA MODELING IN DATA PLATFORM

A. Common Information Model and its Extension
A hierarchical structure is used in PI Asset Framework (AF) to store and manage the data mentioned in Section II. In order to represent the global data model in AF, the feasible approach is to utilize the hierarchical structure of the Common Information Model (CIM) in AF with customized extension. This helps guarantee data integration and interoperation across different systems.

CIM defines a common vocabulary and basic ontology for aspects of the power industry. Various CIM packages describe the basic classes and attributes for network, energy management, metering, outage management and so on. However, since CIM does not contain all classes and attributes of a specific application, CIM needs to be customized and extended based on business requirements. For example, self-contained equipment containers, which are extensible and have a flexible structure, are used to build the hierarchical structure to manage the data at DVP. However, the standard CIM cannot serve the demand of building the hierarchical structure while maintaining the styles and preferences of the data model at DVP. The existing types of equipment containers, such as Substation, VoltageLevel (VL), and Bay, which are depicted in Fig. 3, are not self-contained and cannot satisfy the requirement of building a hierarchical structure.

Thus, a new class called “FunctionLocation (FL)”, derived from EquipmentContainer (EC), is defined in Fig. 3. An association is also created between FL and EC, by which an FL may have sub-level ECs.

B. AF Implementation Based on Hierarchical Structure
The AF implementation consists of two steps: creating the global AF template on the ontology layer and forming the association of data for creating the hierarchical tree.

For the first step, a CIM profile, which is a subset model of CIM, needs to be created for the AF implementation since not all the packages or classes of CIM are needed. Then an AF adaptor, built on the AF SDK, is developed to create AF templates according to the CIM profile, as shown in Fig. 4.

For the second step, links between data objects need to be established. There are two types of links between objects in AF at the data level: Parent-Child and Reference.

Figure 4. Flowchart of generating the AF templates based on CIM (CIM UML, through extension, to Extended CIM UML; through CIMTool to CIM Profile UML; through the AF adaptor and AF SDK to AF Templates)

The ways of creating links in AF based on CIM are listed below: 1) Aggregation between CIM classes is converted to a Parent-Child link between objects of AF. Fig. 5 exhibits the aggregation between EC and Equipment. As subclasses of EC, Substation and VL are aggregates of equipment, and a Substation consists of VLs. In AF, equipment such as buses and breakers are children of a VL, while VLs are children of a Substation, as shown in the right portion of Fig. 5. 2) Association between CIM classes is converted to a Parent-Child link or a Reference between elements of AF.
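The two link types can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: it does not use the AF SDK, and the `Element` class, its `add_child`, `add_reference` and `path` helpers, and all sample names (`SUB_A`, `VL_230kV`, `BUS_1`, `CB_12`, `CB_12.T1`, `CN_7`) are hypothetical stand-ins for AF elements. An aggregation is stored as a Parent-Child link, which gives each child exactly one parent and a position in the tree, while an association that should not imply containment is stored as a Reference that leaves the tree unchanged.

```python
from typing import List, Optional


class Element:
    """Hypothetical in-memory stand-in for an AF element (not the AF SDK)."""

    def __init__(self, name: str, cim_class: str) -> None:
        self.name = name
        self.cim_class = cim_class
        self.parent: Optional["Element"] = None
        self.children: List["Element"] = []
        self.references: List["Element"] = []

    def add_child(self, child: "Element") -> "Element":
        # Parent-Child link: the child is owned by exactly one parent.
        if child.parent is not None:
            raise ValueError(f"{child.name} already has a parent")
        child.parent = self
        self.children.append(child)
        return child

    def add_reference(self, other: "Element") -> None:
        # Reference link: points at another element without owning it.
        self.references.append(other)

    def path(self) -> str:
        # Walk up the Parent-Child links; "/" is an arbitrary separator here.
        parts = []
        node: Optional["Element"] = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))


# Aggregation: a Substation consists of VLs, and a VL aggregates equipment.
substation = Element("SUB_A", "Substation")
vl = substation.add_child(Element("VL_230kV", "VoltageLevel"))
bus = vl.add_child(Element("BUS_1", "BusbarSection"))
breaker = vl.add_child(Element("CB_12", "Breaker"))

# Association converted to Parent-Child: a Terminal under its equipment.
terminal = breaker.add_child(Element("CB_12.T1", "Terminal"))

# Association converted to Reference: Terminal <-> ConnectivityNode.
# Where the node itself sits in the tree is left open in this sketch.
node = Element("CN_7", "ConnectivityNode")
terminal.add_reference(node)
node.add_reference(terminal)

print(terminal.path())  # SUB_A/VL_230kV/CB_12/CB_12.T1
```

Keeping references separate from children means the hierarchy stays a strict tree that can be browsed top-down, while topology information is still reachable from either end of a reference.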

Figure 3. Chart for the example of CIM extension

Figure 5. Aggregation between Equipment and EquipmentContainer

Here is an example of how to create links. ConnectivityNode (CN), ConductingEquipment (CE) and Terminal are used in CIM to describe the topology of the network, as shown in Fig. 6. Considering that most CEs have two Terminals, with the exception of BusbarSection and TransformerWinding, which have only one Terminal, the association between CE and Terminal is converted to a Parent-Child link, while the association between CN and Terminal is converted to a Reference. As shown in the right



portion of Fig. 6, Terminals are children of Buses, Breakers or other CEs; each Terminal has a reference link to a CN, and CNs have several reference links to Terminals.

Once a hierarchical structure is created in AF, applications and visualizations can be built based on AF and the archived data.

Figure 6. Associations between CN, Terminal and CE

IV. DATA DRIVEN ANALYTICS, APPLICATIONS AND VISUALIZATIONS

A. Infrastructure of Data Analytics
1. Data Preprocessing
After collecting data from various data sources, the following steps are applied for data preprocessing.
• Exception: Utilizing a simple deadband algorithm, the exception step uses some statistical values to decide whether to report a certain value to the archive, so as to reduce network traffic and storage memory.
• Compression: This step aims at ensuring that the archive stores just enough data to accurately reproduce the original signal. It is intended not only to save storage space, but also to remove noise effectively, and it may improve system performance for end users without sacrificing the potential value of the data.
• Outlier Screening: Due to unexpected sensor malfunctions or communication errors, some outliers are mixed with the original signals. This step ensures that good quality data are fed to the data archive.

2. Data Mining and Pattern Discovery
With structured historical and real-time data integrated in the common data repository, a unified data source interface is provided. Moreover, the platform provides different levels of analytics tools, which vary from simple calculation using built-in math and statistical libraries to high-level programming language development interfaces, such as C#, Visual Basic and Matlab. Leveraging these tools, massive data mining is used to find patterns from historical events as empirical information to support future predictive asset management, to provide insights from typical scenarios for decision-making, etc.

3. Delivering Business Intelligence
The ultimate goal of this data platform and data driven analytics is to benefit end users, including system operators, circuit analysts, maintenance technicians, etc. With a common data repository, all end users can work with cleaner, easier and smarter data, and can even build self-customized services and visualizations rather than depend completely on IT professionals to deploy the services. Meanwhile, since the data are provided by the global model, it is feasible to enhance asset management, improve situational awareness, reduce forced outages, extend equipment life, etc. A key feature of the robust data platform is that the common data repository provides the data service with high security, so end users can manipulate the data without corrupting or tampering with the raw data.

B. Preliminary Applications and Visualizations
1. Asset Condition-Based Maintenance (CBM)
To improve asset performance, reliability and lifecycle management, it is critical to change the maintenance strategy from calendar-based to condition-based in order to reduce unnecessary maintenance cost. The platform enables asset monitoring and analytics and, with predefined thresholds or conditions, can generate notifications, email alerts and work orders automatically. Among the assets managed in a utility, power transformers constitute one of the largest investments; therefore, it is a high priority to have an effective diagnostic tool for condition assessment. Dissolved gas analysis (DGA) of insulating oil is considered the single best indicator of a transformer’s overall health condition. With real-time monitoring data streaming, online DGA monitoring is built to visualize the trend of 8 critical gas concentrations for each transformer, as shown in Fig. 7.

Figure 7. DGA visualization dashboard for transformer TX3

2. Comprehensive Diagram View of EMS
As client software of the system, PI ProcessBook can import existing EMS information and automatically sketch a oneline diagram, as shown in Fig. 8. All the measurements, including voltage magnitude, current magnitude, active and reactive power of each line, and switch and breaker status, are



updated using real-time measurements fed by the platform as an intermediate interface.

Figure 8. EMS oneline diagram in PI ProcessBook

3. Wide Area Monitoring in Transmission Systems
In a transmission system, frequency and voltage are key indicators of operational condition. The integration of synchrophasor data in this platform enables monitoring of frequency, voltage and power information in real time, as shown in Fig. 9, and building situational awareness applications, such as islanding detection, generation trip detection and event detection, so as to aid system operation. Also, the utilization of these high resolution data could benefit post-event analysis. Fig. 10 shows the frequency contour map during an event.

Figure 9. Frequency and voltage magnitude snapshot

Figure 10. Frequency contour map

4. Extensible Visualization Platform
With the common information data model and the uniform interfaces provided, third-party software can communicate with the data repository and retrieve information to achieve innovative visualization. As one example, CIMSpy [7], which is a CIM-based exploratory tool, can be leveraged to visualize the power exchange between utilities, as shown in Fig. 11.

Figure 11. CIMSpy visualization for power exchanges

V. CONCLUSION
Through the ongoing project at DVP, this paper presents an entire solution for integrating, managing and analyzing big data in the power industry. The ultimate target is to build a highly reliable and flexible common data platform with the features of scalability, real time, high reliability and security within the enterprise. It is foreseeable that the business intelligence and management capability of DVP will be improved and reinforced significantly in the future. Meanwhile, it is convenient and efficient for individuals to initiate self-customized applications to discover and analyze issues with fewer restrictions, subject to permissions. Furthermore, this paper may provide a useful example for peers who intend to enhance their capability of managing and utilizing big data so as to improve business intelligence.

REFERENCES
[1] A. Bose, “Smart Transmission Grid Applications and Their Supporting Infrastructure,” IEEE Transactions on Smart Grid, vol. 1, pp. 11-19, Apr. 2010.
[2] M. Kezunovic, L. Xie and S. Grijalva, “The role of big data in improving power system operation and protection,” presented at Bulk Power System Dynamics and Control - IX Optimization, Security and Control of the Emerging Power Grid (IREP), 2013 IREP Symposium, Rethymno, 2013.
[3] IBM Software, “Managing big data for smart grids and smart,” May 20
[4] J. Liu, X. Li, D. Liu, H. Liu and P. Mao, “Study on Data Management of Fundamental Model in Control Center for Smart Grid Operation,” IEEE Transactions on Smart Grid, vol. 2, pp. 573-579, Aug. 2011.
[5] OSIsoft [Online]. Available: http://www.osisoft.com/
[6] X. Dong and D. Srivastava, “Big data integration,” in Proc. 2013 IEEE 29th International Conference on Data Engineering (ICDE), 2013, pp. 1245-1248.
[7] CIMSpy EE [Online]. Available: http://www.powerinfo.us/CIMSpyEE.html
