Stefan Armbruster Data Modelling With Graphs
is a NOSQL Graph Database
A graph database...
You know relational
now consider relationships...
We're talking about a Property Graph
NOSQL Databases
Aggregate Oriented Model
“There is a significant downside - the whole approach works really well when data access is aligned with the aggregates, but what if you want to look at the data in a different way? Order entry naturally stores orders as aggregates, but analyzing product sales cuts across the aggregate structure. The advantage of not using an aggregate structure in the database is that it allows you to slice and dice your data different ways for different audiences.”
Michael Hunger
Data Modeling
Why Data Modeling
๏ What is modeling?
๏ Aren’t we schema free?
๏ How does it work in a graph?
๏ Where should modeling happen? DB or Application
Data Models
Model mis-match
Real World Model
[Figure: graph of people (Andreas, Peter, Allison, Emil) connected by "knows" relationships]
You traverse the graph
// lookup starting point in an index
START me=node:People(name = 'Andreas')
// then traverse to find results
MATCH (me)-[:FRIEND]-(friend)-[:FRIEND]-(friend2)
RETURN friend2
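The two steps of this query, an index lookup followed by a traversal, can be sketched in plain Python. This is an illustration only and says nothing about how Neo4j works internally. The names come from the figure, but the FRIEND pairs, the `people_index` dictionary and the `friends_of_friends` helper are assumptions made for this sketch.

```python
from collections import defaultdict

# Hypothetical sample data: undirected FRIEND relationships between people.
FRIEND_PAIRS = [
    ("Andreas", "Peter"),
    ("Andreas", "Allison"),
    ("Peter", "Emil"),
]

# Adjacency sets stand in for the relationships stored in the graph.
friends = defaultdict(set)
for a, b in FRIEND_PAIRS:
    friends[a].add(b)
    friends[b].add(a)

# A dictionary stands in for the People index used to find the start node.
people_index = {name: name for name in friends}


def friends_of_friends(name):
    """Look up the start node once, then follow FRIEND relationships two hops."""
    me = people_index[name]  # lookup starting point in an index
    result = set()
    for friend in friends[me]:  # first hop
        for friend2 in friends[friend]:  # second hop
            if friend2 != me:  # do not walk back over the same relationship
                result.add(friend2)
    return result


print(sorted(friends_of_friends("Andreas")))
```

The index is touched exactly once; everything after that is navigation along relationships.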
START user = node(1)
MATCH user -[user_skill]-> skill
RETURN skill, user_skill
SELECT skills.*, user_skill.*
FROM users
JOIN user_skill ON users.id = user_skill.user_id
JOIN skills ON user_skill.skill_id = skills.id
WHERE users.id = 1
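To try the relational side of this comparison, the join can be run against an in-memory SQLite database. This is a minimal sketch: the table names follow the slide, while the `name` and `level` columns and all sample rows are assumptions added for illustration.

```python
import sqlite3

# In-memory database with the three tables the query on the slide joins.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE skills (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_skill (user_id INTEGER, skill_id INTEGER, level INTEGER);
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO skills VALUES (10, 'cypher'), (20, 'sql');
    INSERT INTO user_skill VALUES (1, 10, 3), (1, 20, 5), (2, 20, 1);
    """
)

# Same join shape as on the slide, selecting two hypothetical columns.
rows = conn.execute(
    """
    SELECT skills.name, user_skill.level
    FROM users
    JOIN user_skill ON users.id = user_skill.user_id
    JOIN skills ON user_skill.skill_id = skills.id
    WHERE users.id = 1
    ORDER BY skills.name
    """
).fetchall()
print(rows)
```

The `user_skill` table exists only to connect the other two; in the Cypher version that role is played by the relationship itself.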
An Example
What language do they speak here?
Language Country
Tables
Language          Country
language_code     country_code
language_name     country_name
word_count        flag_uri
Need to model the relationship
Language          Country
language_code     country_code
language_name     country_name
word_count        flag_uri
                  language_code
What if the cardinality changes?
Language          Country
language_code     country_code
language_name     country_name
word_count        flag_uri
country_code
Or we go many-to-many?
Language          LanguageCountry     Country
language_code     language_code       country_code
language_name     country_code        country_name
word_count                            flag_uri
Or we want to qualify the relationship?
Language          LanguageCountry     Country
language_code     language_code       country_code
language_name     country_code        country_name
word_count        primary             flag_uri
Start talking about Graphs
Explicit Relationship
Language Country
name name
word_count flag_uri
IS_SPOKEN_IN
Relationship Properties
Language Country
name name
word_count flag_uri
IS_SPOKEN_IN
as_primary
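A relationship that carries its own properties can be sketched as a small data structure. This is only an illustration of the property graph idea, not Neo4j's API; the `Node` and `Relationship` classes, the `primary_languages` helper and the placeholder names and values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    label: str
    name: str


@dataclass
class Relationship:
    start: Node
    type: str
    end: Node
    # Properties live on the relationship itself, e.g. as_primary.
    properties: dict = field(default_factory=dict)


# Placeholder data; the as_primary values are invented for the example.
language_a = Node("Language", "Language A")
language_b = Node("Language", "Language B")
country_x = Node("Country", "Country X")

relationships = [
    Relationship(language_a, "IS_SPOKEN_IN", country_x, {"as_primary": True}),
    Relationship(language_b, "IS_SPOKEN_IN", country_x, {"as_primary": False}),
]


def primary_languages(country):
    """Return names of languages whose IS_SPOKEN_IN relationship is marked primary."""
    return [
        r.start.name
        for r in relationships
        if r.type == "IS_SPOKEN_IN"
        and r.end == country
        and r.properties.get("as_primary")
    ]


print(primary_languages(country_x))
```

The qualifier that needed an extra `primary` column on a join table is here just a property on the relationship.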
What’s different?
IS_SPOKEN_IN
Language          LanguageCountry     Country
language_code     language_code       country_code
language_name     country_code        country_name
word_count        primary             flag_uri
๏ Implementation of maintaining relationships is left up to the database
๏ Artificial keys disappear or are unnecessary
๏ Relationships get an explicit name
• can be navigated in both directions
Relationship specialisation
Language Country
name name
word_count flag_uri
IS_SPOKEN_IN
as_primary
Bidirectional relationships
Language Country
name name
IS_SPOKEN_IN
word_count flag_uri
PRIMARY_LANGUAGE
Weighted relationships
Language Country
name name
word_count flag_uri
POPULATION_SPEAKS
population_fraction
Keep on adding relationships
Language Country
name name
word_count flag_uri
POPULATION_SPEAKS
population_fraction
SIMILAR_TO ADJACENT_TO
EMBRACE the paradigm
Use the building blocks
๏ Nodes
๏ Relationships RELATIONSHIP_NAME
name: “Canada”
languages_spoken: “[ ‘English’, ‘French’ ]”
Normalize Nodes
Anti-Pattern: Node represents multiple concepts
Country
name
flag_uri
language_name
number_of_words
yes_in_language
no_in_language
currency_code
currency_name
Split up in separate concepts
Country: name, flag_uri
Language: name, number_of_words, yes, no
Currency: currency_code, currency_name
Country -[SPEAKS]-> Language
Country -[USES_CURRENCY]-> Currency
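Splitting the overloaded node into separate concepts can be sketched as a small transformation. This assumes the second node on the slide is a Language node and that both relationships point away from the country. The `split_country_record` helper, the dictionary representation and all sample values are hypothetical.

```python
def split_country_record(record):
    """Split one wide Country record into three nodes plus two relationships."""
    country = {
        "label": "Country",
        "name": record["name"],
        "flag_uri": record["flag_uri"],
    }
    language = {
        "label": "Language",
        "name": record["language_name"],
        "number_of_words": record["number_of_words"],
        "yes": record["yes_in_language"],
        "no": record["no_in_language"],
    }
    currency = {
        "label": "Currency",
        "currency_code": record["currency_code"],
        "currency_name": record["currency_name"],
    }
    # Relationships as (start, type, end) triples keyed by identifying values.
    relationships = [
        (country["name"], "SPEAKS", language["name"]),
        (country["name"], "USES_CURRENCY", currency["currency_code"]),
    ]
    return [country, language, currency], relationships


# Invented sample record in the shape of the anti-pattern node.
wide = {
    "name": "Exampleland",
    "flag_uri": "flags/exampleland.png",
    "language_name": "Examplish",
    "number_of_words": 1000,
    "yes_in_language": "ya",
    "no_in_language": "na",
    "currency_code": "EXD",
    "currency_name": "Example dollar",
}
nodes, rels = split_country_record(wide)
print(rels)
```

Once the language and currency are nodes of their own, a second country can point at the same node instead of repeating its properties.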
Challenge: Property or Relationship?
๏ Can every property be replaced by a relationship?
๏ Should all entities with the same property values be connected?
Object Mapping
๏ Similar to how you would map objects to a relational database, using an ORM such as Hibernate
๏ Generally simpler and easier to reason about
๏ Examples
• Java: Spring Data Graph
• Ruby: Active Model
๏ Why Map?
• Do you use mapping because you are scared of SQL?
• Following DDD, could you write your repositories directly against the graph API?
CONNECT for fast access
In-Graph Indices
Relationships for querying
๏ like in other databases
• same structure for different use-cases (OLTP and OLAP) doesn’t work
• graph allows: add more structures
๏ Relationships should be the primary means to access nodes in the database
๏ Traversing relationships is cheap – that’s the whole design goal of a graph database
๏ Use lookups only to find starting nodes for a query
Pattern: Linked List
Pattern: Multiple Relationships
Pattern-Trees: Tags and Categories
Pattern-Trees: Multi-Level-Tree
Pattern-Trees: R-Tree (spatial)
Example: Activity Stream
Graph Evolution
Evolution: Relationship to Node
[Figure: a SENT_EMAIL relationship is replaced by a node connected through EMAIL_FROM, EMAIL_TO, CC and TAGGED relationships]
see Hyperedges
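A relationship connects exactly two nodes, so once an email also needs CC recipients or tags it has to become a node of its own. The sketch below shows that evolution under stated assumptions: the `sent_email_to_node` helper, the dictionary and triple representation, the `Email` label and the direction of the new relationships are all hypothetical, since the slide does not spell them out.

```python
def sent_email_to_node(email_id, sender, recipient, properties, cc=(), tags=()):
    """Turn one SENT_EMAIL relationship into an Email node with typed relationships."""
    # The former relationship properties move onto the new node.
    node = {"id": email_id, "label": "Email", **properties}
    # Each participant is now attached through its own relationship.
    rels = [
        (email_id, "EMAIL_FROM", sender),
        (email_id, "EMAIL_TO", recipient),
    ]
    rels += [(email_id, "CC", person) for person in cc]
    rels += [(email_id, "TAGGED", tag) for tag in tags]
    return node, rels


# Before: ("alice", "SENT_EMAIL", "bob") with a subject property.
node, rels = sent_email_to_node(
    "email-1", "alice", "bob", {"subject": "hello"}, cc=["carol"], tags=["work"]
)
print(node)
print(rels)
```

The original two-party relationship could hold neither the CC recipient nor the tag; the node can take any number of further connections.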
Combine multiple Domains in a Graph
๏ you start with a single domain
๏ add more connected domains as your system evolves
๏ more domains allow you to ask different queries
๏ one domain “indexes” the other
๏ Example: Facebook Graph Search
• social graph
• location graph
• activity graph
• favorite graph
• ...
Notes on the Graph Data Model
๏ Schema free, but constraints
๏ Model your graph with a whiteboard and a wise man
๏ Nodes as main entities but useless without connections
๏ Relationships are first-class citizens in the model and database
๏ Normalize more than in a relational database
๏ use meaningful relationship-types, not generic ones like IS_
๏ use in-graph structures to allow different access paths
๏ evolve your graph to your needs, incremental growth
How to get started?
๏ Documentation
• neo4j.org
‣ http://www.neo4j.org/learn/nosql
• docs.neo4j.org - tutorials+reference
‣Data Modeling Examples
• http://console.neo4j.org
• Neo4j in Action
• Good Relationships
๏ Worldwide one-day Neo4j Trainings
๏ Get Neo4j
• http://neo4j.org/download
• http://addons.heroku.com/neo4j/
๏ Participate
Really, once you start thinking in graphs it's hard to stop