EKG Events
Introduction:
Delivering large-scale real-time AI applications across highly distributed data sets is currently
impractical. For enterprises to gain real-time, holistic knowledge of customers, patients,
companies, or other important entities at massive scale, a new data model that leverages an
Entity-Event Knowledge Graph approach is necessary.
Let’s take a step back and consider the main focus of any given business or organization. In
evaluating the business focus of most enterprises, there is usually one absolute core entity and
perhaps one less critical secondary entity, around which all the business revolves. For example,
a hospital’s core entity is the patient and a secondary entity would be providers. Banks have the
customer as their core entity, and a B2B call center is focused on the companies it is
contacting to uncover sales opportunities. In nearly all cases, this core entity is found scattered
in the numerous siloed databases that have been developed by different groups, departments,
and divisions due to ever changing needs.
The challenge these enterprises fail to solve is that information about the core entity is
hidden in thousands of separate and disconnected relational databases. Relatively few
enterprises can provide a comprehensive 360-degree view of their core entity, and those that
come close have invested in scores of IT teams to build yet another bespoke data mart on top of
silos, enterprise data warehouses (EDWs), and data lakes in order to provide the answers and
reports necessary for a successful business. Even worse is the new trend of quickly building new apps
on top of the underlying silos, as this cements the underlying data infrastructure. No one dares
to touch the data in case one of the hundreds or thousands of apps will break. For a great
description of the situation created by this approach, see Dave McComb’s book Software
Wasteland, which offers insight and solutions that data architects should consider.
Thought-leading architects in big enterprises have always dreamt of a data-centric organization
that has one semantic and perfectly unified representation of all of its core entities. This
representation would then drive all reporting, analytics, and applications. In pursuit of this dream,
enterprises have spent hundreds of billions of dollars over the past decades on data integration,
master data management (MDM), and data lakes. These efforts have had limited success because
they fail to approach data from the entity viewpoint.
Recent market trends show that the approach gaining momentum for solving this problem is to
build “Knowledge Graphs.” For the last two years, Gartner has recognized this technology
approach in its reports and given Knowledge Graphs increasing visibility in its hype cycle
research. One of the primary reasons for this increased visibility is the fact that nearly every
Silicon Valley giant and most Fortune 500 companies have begun investing in Knowledge
Graphs for advanced analytics and AI.
This white paper provides a summary of the main characteristics of Franz Inc.’s approach to
Entity-Event Knowledge Graphs.
As the title of this section states, core entities and their events are organized in a hierarchical tree
that consists of two levels and an associated knowledge base (“KB”). In the following, we use
examples that come from production Entity-Event Knowledge Graph (“EEKG”)
implementations in the domains of healthcare and telecommunications.
Level one of every tree is the core entity and it serves as the root of the tree. In a hospital this
entity is the patient and in a telco it is a customer. Each of these objects would have the obvious
and expected demographic attributes.
Level two of the tree consists of the events and sub-events. In a hospital, events are hospital visits with
sub-events such as diagnostics, tests, medication orders, procedures, and vital signs. In a telco,
events are phone calls, SMS messages, calls to the customer care department, bill payments,
phone activations, and the like.
Note that these event objects are very regular and just what you might expect: they always have
a main type and a start and end-time and then a few other properties that distinguish them from
other events. The first thing we want to stress is that this is where the unification and
simplification happens. Everything that happens to a patient, everything that a telco customer
does, and every financial transaction for a customer of a bank has roughly the same simple
event structure. The second thing we want to stress is that this approach makes the EEKG design
uniquely future-proof: if you decide to add yet another type of event, you don't have to make any
changes to earlier data. You just define a new event type and start adding new events
of that type.
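
The two-level tree described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not the EEKG implementation: the class names (Entity, Event), the field names, and the sample event types are all hypothetical. It shows that every event shares the same shape (a type, a start and end time, and a few properties) and that a new event type can be added without touching earlier data.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, List


@dataclass
class Event:
    """Level two: a type, a time span, a few properties, optional sub-events."""
    event_type: str
    start: datetime
    end: datetime
    properties: Dict[str, Any] = field(default_factory=dict)
    sub_events: List["Event"] = field(default_factory=list)


@dataclass
class Entity:
    """Level one: the core entity (patient, customer, ...) at the root of the tree."""
    entity_id: str
    events: List[Event] = field(default_factory=list)

    def events_of_type(self, event_type: str) -> List[Event]:
        return [e for e in self.events if e.event_type == event_type]


# Hypothetical sample data: one hospital visit with one sub-event.
patient = Entity("patient-001")
visit = Event("encounter", datetime(2020, 3, 1, 9, 0), datetime(2020, 3, 1, 17, 30))
visit.sub_events.append(
    Event("medication_order",
          datetime(2020, 3, 1, 10, 15), datetime(2020, 3, 1, 10, 15),
          {"drug": "acetaminophen"})
)
patient.events.append(visit)

# Adding a brand-new event type requires no change to the classes or to earlier events.
patient.events.append(
    Event("telehealth_call", datetime(2020, 4, 2, 14, 0), datetime(2020, 4, 2, 14, 20))
)
```

Because the event type is plain data rather than a separate table or class per type, the earlier encounter record is untouched when the new type appears.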
The associated KBs are not part of the tree but they are crucial to the entire operation of the
EEKG. The KBs contain all the domain dependent data needed to make the data in the tree
meaningful. For healthcare, a knowledge base contains ontologies, taxonomies, and thesauri for
drugs, diseases, procedures, clinical pathways, and everything else important for medicine (such as
the facts that dropsy and congestive heart failure are two names for the same disease and that
paracetamol and acetaminophen are two names for the same drug). For a telco, the KBs would
contain information about devices, location data, GIS data, etc.
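
As a rough illustration of how a knowledge base gives meaning to event data, the sketch below resolves alternate labels to one preferred concept, using the two synonym pairs mentioned above. The dictionary layout and the function name canonical are our own assumptions; a real KB would hold full ontologies and taxonomies rather than a flat table.

```python
# Hypothetical mini knowledge base: each preferred concept lists its alternate labels.
SYNONYMS = {
    "congestive heart failure": {"dropsy"},
    "acetaminophen": {"paracetamol"},
}

# Invert once so that a lookup by any label is a single dictionary access.
LABEL_TO_CONCEPT = {}
for concept, alt_labels in SYNONYMS.items():
    LABEL_TO_CONCEPT[concept] = concept
    for label in alt_labels:
        LABEL_TO_CONCEPT[label] = concept


def canonical(label: str) -> str:
    """Return the preferred concept name; unknown labels come back normalized."""
    key = label.strip().lower()
    return LABEL_TO_CONCEPT.get(key, key)
```

With such a lookup, a query for patients on acetaminophen would also match medication events recorded as paracetamol.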
Here are some very simplified examples for healthcare and a telco. Note that in Figure 1 we see
that the core Entity, on the left, is a patient, and Events appear in the middle. Even demographic
data are events with a start time and an end-time given that people change addresses,
telephone numbers, and sometimes even gender. The main events in the figure are encounters
with sub-events like diagnostics, medications, and procedures. On the right we have an artistic
expression of the knowledge bases. In this case we show some core diagnostic concepts and
their relationships in the taxonomy space.
Figure 1.
If we then look at Figure 2 we see the same tri-partition: the customer is a core entity in our
telecom database. Communication events are in the middle, and the knowledge base (showing
place names in the USA) is on the right. The place name KB is linked to the open data graph
from geonames.org.
Figure 2.
With our entity-event approach, extracting data from this EDW became a one-time mapping
process. Potentially every table/column is mapped to a particular event type and event property,
and this mapping is stored declaratively. From that moment on, this mapping can be used to
populate the entity-event tree.
The most important insight here is that this needs to be done only once. All future reporting,
analytics, and feature extraction for machine learning can be done on top of this knowledge
graph infrastructure without ever creating separate data marts.
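
A declarative table/column mapping of the kind described above might look like the following sketch. All table names, column names, and event types here are invented for illustration, and the plain dictionaries stand in for whatever storage format the real mapping uses.

```python
from datetime import datetime

# Hypothetical declarative mapping: (table, column) -> (event type, event property).
MAPPING = {
    ("lab_results", "test_code"): ("lab_test", "code"),
    ("lab_results", "result_value"): ("lab_test", "value"),
    ("rx_orders", "drug_name"): ("medication_order", "drug"),
}

# Hypothetical per-table columns that hold the entity key and the event time span.
TABLE_KEYS = {
    "lab_results": {"entity": "patient_id", "start": "taken_at", "end": "taken_at"},
    "rx_orders": {"entity": "patient_id", "start": "ordered_at", "end": "ordered_at"},
}


def row_to_events(table, row):
    """Turn one source row into events by following the mapping; unmapped columns are skipped."""
    keys = TABLE_KEYS[table]
    events = {}
    for column, value in row.items():
        target = MAPPING.get((table, column))
        if target is None:
            continue
        event_type, prop = target
        event = events.setdefault(event_type, {
            "entity": row[keys["entity"]],
            "type": event_type,
            "start": row[keys["start"]],
            "end": row[keys["end"]],
            "properties": {},
        })
        event["properties"][prop] = value
    return list(events.values())


# Usage with an invented source row.
row = {
    "patient_id": "patient-001",
    "taken_at": datetime(2020, 3, 1, 11, 0),
    "test_code": "HbA1c",
    "result_value": 6.1,
    "internal_flag": 1,  # not in MAPPING, so it is ignored
}
events = row_to_events("lab_results", row)
```

Because the mapping is data rather than code, supporting another source table means adding entries to the two dictionaries, not writing a new extraction routine.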
G. Security
In some use cases, especially in healthcare and finance, it is very important to make sure that
only the right people can see the right data. AllegroGraph has a mechanism called triple
attributes for enforcing strict security policies. It was built according to the requirements of the DOD
and intelligence agencies and is described in this tutorial. With this mechanism we can protect
every fact in the Knowledge Graph. AllegroGraph, and thereby Knowledge Graphs built in
AllegroGraph, are the only ones that can provide this level of security.
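
The general idea of protecting individual facts can be illustrated with a small, generic sketch. This is not AllegroGraph's API and does not reproduce how triple attributes are actually defined or enforced; the Fact type, the required_roles field, and the visible_facts function are hypothetical names used only to show per-fact filtering.

```python
from typing import FrozenSet, List, NamedTuple


class Fact(NamedTuple):
    subject: str
    predicate: str
    obj: str
    required_roles: FrozenSet[str]  # hypothetical attribute attached to each fact


def visible_facts(facts: List[Fact], user_roles: FrozenSet[str]) -> List[Fact]:
    """Keep only the facts whose required roles are all held by the user."""
    return [f for f in facts if f.required_roles <= user_roles]


# Invented sample facts: the diagnosis needs a clinician role, the city does not.
facts = [
    Fact("patient-001", "livesIn", "Oakland", frozenset()),
    Fact("patient-001", "hasDiagnosis", "congestive heart failure", frozenset({"clinician"})),
]

billing_view = visible_facts(facts, frozenset({"billing"}))
clinician_view = visible_facts(facts, frozenset({"clinician"}))
```

In this sketch the billing user sees only the address fact, while the clinician sees both.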