
Stefan Armbruster Data Modelling With Graphs

A graph database stores data that is structured as a graph, with nodes connected by relationships. It allows data to be accessed in different ways by slicing and dicing the data through relationships. In a graph database, data is modeled using nodes connected by relationships with properties, rather than tables joined by foreign keys. This allows the schema to be more flexible and the relationships to change without affecting the data model.


Data Modeling with Neo4j

Stefan Armbruster, Neo Technology


(slides from Michael Hunger)

Neo4j is a
NOSQL
Graph Database

A graph database...

NO: not for charts & diagrams, or vector artwork


YES: for storing data that is structured as a graph
remember linked lists, trees?
graphs are the general-purpose data structure
“A relational database may tell you the average age of everyone in this place, but a graph database will tell you who is most likely to buy you a beer.”

You know relational

foo foo_bar bar

now consider relationships...

We're talking about a
Property Graph

Properties (each a key+value)

+ Indexes (for easy look-ups)
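
As a rough sketch of the property graph idea (this is not Neo4j's API; the class names, the `since` property and the name index are assumptions made up for illustration), nodes and relationships can each carry key+value properties, and an index maps a property value back to nodes for easy look-ups:

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node
    rel_type: str
    end: Node
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Hypothetical in-memory property graph, for illustration only."""

    def __init__(self):
        self.nodes = []
        self.relationships = []
        self.name_index = {}  # index: value of the "name" property -> nodes

    def add_node(self, **properties):
        node = Node(len(self.nodes), dict(properties))
        self.nodes.append(node)
        if "name" in properties:
            self.name_index.setdefault(properties["name"], []).append(node)
        return node

    def relate(self, start, rel_type, end, **properties):
        rel = Relationship(start, rel_type, end, dict(properties))
        self.relationships.append(rel)
        return rel


graph = PropertyGraph()
andreas = graph.add_node(name="Andreas")
peter = graph.add_node(name="Peter")
# "since" is an invented example property on the relationship.
knows = graph.relate(andreas, "KNOWS", peter, since=2010)

# Look up a node through the index instead of scanning all nodes.
found = graph.name_index["Andreas"][0]
print(found.properties, knows.rel_type, knows.properties)
```

The index is only used to find a node by name; everything else is reached by following relationships.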


Aggregate vs. Connected Data-Model

NOSQL Databases

Aggregate Oriented Model
“There is a significant downside - the whole approach works really well when data access is aligned with the aggregates, but what if you want to look at the data in a different way? Order entry naturally stores orders as aggregates, but analyzing product sales cuts across the aggregate structure. The advantage of not using an aggregate structure in the database is that it allows you to slice and dice your data different ways for different audiences.

This is why aggregate-oriented stores talk so much about map-reduce.”
Martin Fowler

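
To make the point about aggregates concrete, here is a small sketch with invented order data (the customers, products and quantities are assumptions, not taken from the talk). Reading one order is a single key lookup because it matches the aggregate boundary. Totalling sales per product has to visit every aggregate, which is the kind of job map-reduce is used for.

```python
from collections import defaultdict

# Each order is stored as one aggregate (hypothetical sample data).
orders = {
    "order-1": {
        "customer": "Ann",
        "lines": [{"product": "book", "qty": 2}, {"product": "pen", "qty": 5}],
    },
    "order-2": {
        "customer": "Bob",
        "lines": [{"product": "book", "qty": 1}],
    },
}

# Access aligned with the aggregate: one key lookup returns the whole order.
order = orders["order-1"]


def sales_per_product(all_orders):
    """Access that cuts across aggregates: every order must be visited."""
    totals = defaultdict(int)
    for aggregate in all_orders.values():           # "map" over all aggregates
        for line in aggregate["lines"]:
            totals[line["product"]] += line["qty"]  # "reduce" by product
    return dict(totals)


product_totals = sales_per_product(orders)
print(order["customer"], product_totals)
```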
Connected Data Model
The connected data model is based on fine-grained elements that are richly connected; the emphasis is on extracting many dimensions and attributes as elements. Connections are cheap and can be used not only for the domain-level relationships but also for additional structures that allow efficient access for different use-cases. The fine-grained model requires an external scope for mutating operations that ensures Atomicity, Consistency, Isolation and Durability (ACID), also known as transactions.

Michael Hunger

Data Modeling

Why Data Modeling
๏ What is modeling?
๏ Aren't we schema free?
๏ How does it work in a graph?
๏ Where should modeling happen? DB or Application

Data Models

Model mis-match
Real World Model
Model mis-match

Application Model Database Model


Trinity of models
Whiteboard --> Data

[Figure: whiteboard graph of the people Andreas, Peter, Allison and Emil, connected by "knows" relationships]

// Cypher query - friend of a friend
start n=node(0)
match (n)--()--(foaf)
return foaf

You traverse the graph
// lookup starting point in an index
// then traverse to find results
START me=node:People(name = 'Andreas')
MATCH (me)-[:FRIEND]-(friend)-[:FRIEND]-(friend2)
RETURN friend2
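
The two steps in that query (index lookup, then traversal) can be sketched in plain Python. This is only an approximation of what the database does: the adjacency map, the index dictionary and the particular FRIEND connections are assumptions chosen for illustration.

```python
# Hypothetical data: an undirected FRIEND adjacency map keyed by person name.
friends = {
    "Andreas": {"Peter", "Emil"},
    "Peter": {"Andreas", "Allison"},
    "Emil": {"Andreas", "Allison"},
    "Allison": {"Peter", "Emil"},
}

# The "index": maps a name to the starting node (here simply the key itself).
people_index = {name: name for name in friends}


def friends_of_friends(start_name):
    me = people_index[start_name]        # 1. look up the starting point
    result = set()
    for friend in friends[me]:           # 2. first FRIEND hop
        for friend2 in friends[friend]:  # 3. second FRIEND hop
            if friend2 != me:            # do not walk straight back to the start
                result.add(friend2)
    return result


foaf = friends_of_friends("Andreas")
print(sorted(foaf))
```

The index is touched exactly once; the rest of the work is following relationships from node to node.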

// Cypher
START user = node(1)
MATCH user -[user_skill]-> skill
RETURN skill, user_skill

-- equivalent SQL
SELECT skills.*, user_skill.*
FROM users
JOIN user_skill ON users.id = user_skill.user_id
JOIN skills ON user_skill.skill_id = skills.id
WHERE users.id = 1

An Example

What language do they speak here?

Language Country
Tables

Language           Country
language_code      country_code
language_name      country_name
word_count         flag_uri
Need to model the relationship

Language           Country
language_code      country_code
language_name      country_name
word_count         flag_uri
                   language_code
What if the cardinality changes?

Language           Country
language_code      country_code
language_name      country_name
word_count         flag_uri
country_code
Or we go many-to-many?

Language         LanguageCountry    Country
language_code    language_code      country_code
language_name    country_code       country_name
word_count                          flag_uri
Or we want to qualify the relationship?

Language         LanguageCountry    Country
language_code    language_code      country_code
language_name    country_code       country_name
word_count       primary            flag_uri
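
For comparison with the graph version, here is a sketch of the qualified many-to-many table design using Python's built-in `sqlite3` module. The table and column names follow the slide; the column types and the sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript('''
CREATE TABLE Language (
    language_code TEXT PRIMARY KEY,
    language_name TEXT,
    word_count    INTEGER
);
CREATE TABLE Country (
    country_code TEXT PRIMARY KEY,
    country_name TEXT,
    flag_uri     TEXT
);
-- The join table exists only to hold the relationship and its qualifier.
CREATE TABLE LanguageCountry (
    language_code TEXT REFERENCES Language(language_code),
    country_code  TEXT REFERENCES Country(country_code),
    "primary"     INTEGER,  -- quoted because PRIMARY is an SQL keyword
    PRIMARY KEY (language_code, country_code)
);
''')

# Sample rows (assumed for illustration).
conn.executemany(
    "INSERT INTO Language (language_code, language_name) VALUES (?, ?)",
    [("en", "English"), ("fr", "French")],
)
conn.executemany(
    "INSERT INTO Country (country_code, country_name) VALUES (?, ?)",
    [("CA", "Canada"), ("FR", "France")],
)
conn.executemany(
    "INSERT INTO LanguageCountry VALUES (?, ?, ?)",
    [("en", "CA", 1), ("fr", "CA", 1), ("fr", "FR", 1)],
)

# "What language do they speak here?" needs two joins through the join table.
rows = conn.execute('''
    SELECT l.language_name
    FROM Country c
    JOIN LanguageCountry lc ON lc.country_code = c.country_code
    JOIN Language l ON l.language_code = lc.language_code
    WHERE c.country_name = ? AND lc."primary" = 1
    ORDER BY l.language_name
''', ("Canada",)).fetchall()
spoken_in_canada = [name for (name,) in rows]
print(spoken_in_canada)
```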
Start talking about Graphs
Explicit Relationship

Language           Country
name               name
word_count         flag_uri

Relationship between Language and Country: IS_SPOKEN_IN
Relationship Properties

Language           Country
name               name
word_count         flag_uri

Relationship between Language and Country: IS_SPOKEN_IN, with property as_primary
What’s different?

                 IS_SPOKEN_IN
Language         LanguageCountry    Country
language_code    language_code      country_code
language_name    country_code       country_name
word_count       primary            flag_uri
What’s different?
๏ Implementation of maintaining relationships is left up to the database
๏ Artificial keys disappear or are unnecessary
๏ Relationships get an explicit name
• can be navigated in both directions
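
A minimal sketch of what "an explicit name" and "navigated in both directions" can mean in practice. The tuple layout, the two lookup dictionaries and the sample relationships are assumptions for illustration, not how Neo4j stores data. Each relationship is recorded once, with no artificial key, and can be followed from either end.

```python
from collections import defaultdict

# Each relationship is stored once: (start, type, end, properties).
relationships = [
    ("English", "IS_SPOKEN_IN", "Canada", {"as_primary": True}),
    ("French", "IS_SPOKEN_IN", "Canada", {"as_primary": True}),
    ("French", "IS_SPOKEN_IN", "France", {"as_primary": True}),
]

outgoing = defaultdict(list)  # start node -> relationships leaving it
incoming = defaultdict(list)  # end node   -> relationships arriving at it
for rel in relationships:
    start, _rel_type, end, _props = rel
    outgoing[start].append(rel)
    incoming[end].append(rel)


def countries_speaking(language):
    """Follow IS_SPOKEN_IN from the Language side."""
    return sorted(end for _, t, end, _ in outgoing[language] if t == "IS_SPOKEN_IN")


def languages_spoken_in(country):
    """Follow the same relationships from the Country side."""
    return sorted(start for start, t, _, _ in incoming[country] if t == "IS_SPOKEN_IN")


print(countries_speaking("French"))
print(languages_spoken_in("Canada"))
```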
Relationship specialisation

Language           Country
name               name
word_count         flag_uri

Relationship between Language and Country: IS_SPOKEN_IN, with property as_primary
Bidirectional relationships

Language           Country
name               name
word_count         flag_uri

Relationships between Language and Country: IS_SPOKEN_IN, PRIMARY_LANGUAGE
Weighted relationships

Language           Country
name               name
word_count         flag_uri

Relationship between Language and Country: POPULATION_SPEAKS, with property population_fraction
Keep on adding relationships

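Language           Country
name               name
word_count         flag_uri

Relationship between Language and Country: POPULATION_SPEAKS, with property population_fraction
Further relationships: SIMILAR_TO (Language to Language), ADJACENT_TO (Country to Country)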
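
A weighted relationship is simply a relationship whose property is used for ranking or filtering. In the sketch below the `population_fraction` values are placeholder numbers, not real statistics, and the tuple layout is an assumption for illustration. Other relationship types live alongside it without changing the query.

```python
# (start, relationship type, end, properties)
# The fractions are invented placeholder values, not real statistics.
relationships = [
    ("English", "POPULATION_SPEAKS", "Canada", {"population_fraction": 0.75}),
    ("French", "POPULATION_SPEAKS", "Canada", {"population_fraction": 0.25}),
    ("English", "SIMILAR_TO", "French", {}),
    ("Canada", "ADJACENT_TO", "USA", {}),
]


def most_spoken_language(country):
    """Pick the language with the highest weight on POPULATION_SPEAKS."""
    candidates = [
        (props["population_fraction"], language)
        for language, rel_type, target, props in relationships
        if rel_type == "POPULATION_SPEAKS" and target == country
    ]
    if not candidates:
        return None
    return max(candidates)[1]


print(most_spoken_language("Canada"))
```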
EMBRACE the paradigm
Use the building blocks

๏ Nodes

๏ Relationships RELATIONSHIP_NAME

๏ Properties name: value


Anti-pattern: rich properties

name: “Canada”
languages_spoken: “[ ‘English’, ‘French’ ]”
Normalize Nodes
Anti-Pattern: Node represents multiple concepts
Country
name
flag_uri
language_name
number_of_words
yes_in_language
no_in_language
currency_code
currency_name
Split up in separate concepts

Country            Language
name               name
flag_uri           number_of_words
                   yes
                   no

Currency
currency_code
currency_name

Relationships: Country SPEAKS Language, Country USES_CURRENCY Currency
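
The split can be sketched as a small function that takes one overloaded record and returns separate nodes plus the relationships between them. The dictionary layout, the `label` key and the sample values (including the `flag_uri` path) are assumptions for illustration.

```python
def normalize_country(record):
    """Split one overloaded record into separate concept nodes and relationships."""
    country = {
        "label": "Country",
        "name": record["name"],
        "flag_uri": record["flag_uri"],
    }
    language = {
        "label": "Language",
        "name": record["language_name"],
        "number_of_words": record["number_of_words"],
        "yes": record["yes_in_language"],
        "no": record["no_in_language"],
    }
    currency = {
        "label": "Currency",
        "currency_code": record["currency_code"],
        "currency_name": record["currency_name"],
    }
    relationships = [
        (country["name"], "SPEAKS", language["name"]),
        (country["name"], "USES_CURRENCY", currency["currency_code"]),
    ]
    return [country, language, currency], relationships


# Hypothetical overloaded node, shaped like the anti-pattern.
overloaded = {
    "name": "France",
    "flag_uri": "flags/fr.png",  # made-up path
    "language_name": "French",
    "number_of_words": None,     # unknown here, left empty on purpose
    "yes_in_language": "oui",
    "no_in_language": "non",
    "currency_code": "EUR",
    "currency_name": "Euro",
}

nodes, rels = normalize_country(overloaded)
print([node["label"] for node in nodes], rels)
```

Once Language and Currency are nodes of their own, other countries can point at the same node instead of repeating its properties.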
Challenge: Property or Relationship?
๏ Can every property be replaced by a relationship?
๏ Should all entities with the same property values be connected?
Object Mapping
๏ Similar to how you would map objects to a relational database, using an ORM such as Hibernate
๏ Generally simpler and easier to reason about
๏ Examples
• Java: Spring Data Graph
• Ruby: Active Model
๏ Why Map?
• Do you use mapping because you are scared of SQL?
• Following DDD, could you write your repositories directly against the graph API?
CONNECT for fast access
In-Graph Indices
Relationships for querying
๏ like in other databases
• same structure for different use-cases (OLTP and OLAP) doesn't work
• graph allows: add more structures
๏ Relationships should be the primary means to access nodes in the database
๏ Traversing relationships is cheap – that’s the whole design goal of a graph database
๏ Use lookups only to find starting nodes for a query

Data Modeling examples in Manual


Anti-pattern: unconnected graph

[Figure: many separate nodes, each with name: “Jones”, none of them connected to another]
Pattern: Linked List

Pattern: Multiple Relationships

Pattern-Trees: Tags and Categories

Pattern-Tree: Multi-Level-Tree

Pattern-Trees: R-Tree (spatial)

Example: Activity Stream

Graph Evolution

Evolution: Relationship to Node

[Figure: the SENT_EMAIL relationship is replaced by an email node connected through EMAIL_FROM, EMAIL_TO, EMAIL_CC and TAGGED relationships]

see Hyperedges
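
A sketch of evolving a relationship into a node: a single SENT_EMAIL relationship can only connect two nodes, so once an email needs a CC list or tags it becomes a node of its own. The function name, the generated email identifiers, the direction of the new relationships and the sample people are assumptions for illustration.

```python
import itertools

_email_ids = itertools.count(1)  # hypothetical id generator for email nodes


def reify_sent_email(sender, recipient, cc=(), tags=()):
    """Replace one SENT_EMAIL relationship with an email node and its relationships."""
    email = f"email-{next(_email_ids)}"
    rels = [
        (email, "EMAIL_FROM", sender),
        (email, "EMAIL_TO", recipient),
    ]
    rels += [(email, "EMAIL_CC", person) for person in cc]
    rels += [(email, "TAGGED", tag) for tag in tags]
    return email, rels


# Before: one relationship, exactly two participants.
before = [("Andreas", "SENT_EMAIL", "Peter")]

# After: the email is a node, so it can connect to any number of other nodes.
email, after = reify_sent_email("Andreas", "Peter", cc=["Emil"], tags=["work"])
print(email, after)
```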
Combine multiple Domains in a Graph
๏ you start with a single domain
๏ add more connected domains as your system evolves
๏ more domains allow you to ask different queries
๏ one domain "indexes" the other
๏ Example Facebook Graph Search
• social graph
• location graph
• activity graph
• favorite graph
• ...
Notes on the Graph Data Model
๏ Schema free, but constraints
๏ Model your graph with a whiteboard and a wise man
๏ Nodes as main entities but useless without connections
๏ Relationships are first-class citizens in the model and database
๏ Normalize more than in a relational database
๏ use meaningful relationship-types, not generic ones like IS_
๏ use in-graph structures to allow different access paths
๏ evolve your graph to your needs, incremental growth

How to get started?
๏ Documentation
• neo4j.org
‣ http://www.neo4j.org/learn/nosql
• docs.neo4j.org - tutorials+reference
‣ Data Modeling Examples
• http://console.neo4j.org
• Neo4j in Action
• Good Relationships
๏ Worldwide one-day Neo4j Trainings
๏ Get Neo4j
• http://neo4j.org/download
• http://addons.heroku.com/neo4j/
๏ Participate
Really, once you start
thinking in graphs
it's hard to stop

What will you build?


Recommendations, MDM, Business intelligence, Geospatial, catalogs, Systems Management, access control, Social computing, your brain, Biotechnology, routing, genealogy, linguistics, Making Sense of all that data, compensation, market vectors
Thank You!
Questions ?
