Graph Neo4j

Uploaded by

Sahil Suvagiya
NoSQL: Graph Databases

Why NoSQL Databases?


Trends in Data
Data is getting bigger:
“Every 2 days we create as much information as we did up to 2003”
– Eric Schmidt, Google


Data is more connected:
• Text
• HyperText
• RSS
• Blogs
• Tagging
• RDF
Trend 2: Connectedness
[Figure: information connectivity, rising from text documents through hypertext, feeds, blogs, UGC, wikis, tagging, folksonomies, RDFa and ontologies to the GGG]
Data is more Semi-Structured:
• If you tried to collect all the data of every
movie ever made, how would you model it?
• Actors, Characters, Locations, Dates, Costs,
Ratings, Showings, Ticket Sales, etc.
Architecture Changes Over Time
1980s: Single Application

Application

DB
Architecture Changes Over Time
1990s: Integration Database Antipattern

Application Application Application

DB
Architecture Changes Over Time
2000s: SOA

RESTful, hypermedia, composite apps

Application Application Application

DB DB DB
Side note: RDBMS performance
[Figure: RDBMS performance across use cases: salary list, most Web apps, social network, location-based services]
NOSQL
Not Only SQL
Less than 10% of the NOSQL Vendors
Key Value Stores
• Came from a research article written by
Amazon (Dynamo)
– Global Distributed Hash Table
• Global collection of key value pairs
Four NOSQL Categories
Key Value Stores
• Most are based on Dynamo: Amazon’s Highly
Available Key-Value Store
• Data Model:
– Global key-value mapping
– Big scalable HashMap
– Highly fault tolerant (typically)
• Examples:
– Redis, Riak, Voldemort
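
The "big scalable HashMap" idea above can be sketched in a few lines. This is only a toy illustration, not how Dynamo, Redis, Riak or Voldemort are implemented: the class name TinyKeyValueStore and the partition count are hypothetical, and real systems add replication, consistent hashing and failure handling.

```python
import hashlib


class TinyKeyValueStore:
    """Toy key-value store: keys are hashed onto a fixed set of partitions."""

    def __init__(self, partitions=4):
        # Each partition stands in for one machine in a cluster (assumption).
        self._partitions = [{} for _ in range(partitions)]

    def _partition_for(self, key):
        # A stable hash, so the same key always lands on the same partition.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self._partitions[int(digest, 16) % len(self._partitions)]

    def put(self, key, value):
        self._partition_for(key)[key] = value

    def get(self, key, default=None):
        return self._partition_for(key).get(key, default)


store = TinyKeyValueStore()
store.put("user:42", {"name": "Neo"})
print(store.get("user:42"))
```

The only operations are put and get by key, which is why the model is simple and easy to spread across machines, and also why it is a poor fit for complex data.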
Key Value Stores: Pros and Cons
• Pros:
– Simple data model
– Scalable
• Cons
– Poor for complex data
Column Family
• Most are based on BigTable: Google’s distributed
storage system for structured data
• Data Model:
– A big table, with column families
• Every row can have its own schema
• Helps capture more “messy” data
– Map Reduce for querying/processing
• Examples:
– HBase, HyperTable, Cassandra
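
A rough sketch of the column family data model, assuming a plain in-memory nested mapping (row key, then column family, then column). The row keys, family names and values are made up for illustration and do not reflect the storage layout of HBase, HyperTable or Cassandra.

```python
from collections import defaultdict

# table[row_key][column_family][column] = value
table = defaultdict(lambda: defaultdict(dict))

table["movie:1"]["info"]["title"] = "The Matrix"
table["movie:1"]["info"]["year"] = 1999
table["movie:1"]["sales"]["tickets"] = 1000  # made-up number for illustration

# A second row with a different set of columns in the same family:
# every row can have its own schema.
table["movie:2"]["info"]["title"] = "Example Film"
table["movie:2"]["info"]["director"] = "Unknown"


def columns_of(row_key, family):
    """Return the sorted column names stored for one row and column family."""
    return sorted(table[row_key][family])


print(columns_of("movie:1", "info"))
print(columns_of("movie:2", "info"))
```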
Column Family: Pros and Cons
• Pros:
– Supports Semi-Structured Data
– Naturally Indexed (columns)
– Scalable
• Cons
– Poor for interconnected data
Document Databases
• Inspired by Lotus Notes
– A collection of key-value collections (called
Documents)
Document Databases
• Data Model:
– A collection of documents
– A document is a key value collection
– Index-centric, lots of map-reduce
• Examples:
– CouchDB, MongoDB
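
The document model above can be illustrated with a small in-memory collection. This is a sketch under stated assumptions: the class TinyCollection and its methods are hypothetical and are not the CouchDB or MongoDB API. It shows why the model is index-centric: a lookup is only cheap on a field that has an index.

```python
from collections import defaultdict


class TinyCollection:
    """Toy document collection with one secondary index per indexed field."""

    def __init__(self, indexed_fields):
        self._docs = {}  # document id -> document (a dict)
        self._indexes = {f: defaultdict(set) for f in indexed_fields}

    def insert(self, doc_id, doc):
        self._docs[doc_id] = doc
        # Update each index, but only for fields this document actually has.
        for field, index in self._indexes.items():
            if field in doc:
                index[doc[field]].add(doc_id)

    def find(self, field, value):
        """Look up documents through the index on `field`."""
        ids = self._indexes[field].get(value, set())
        return [self._docs[i] for i in sorted(ids)]


movies = TinyCollection(indexed_fields=["year"])
movies.insert("m1", {"title": "The Matrix", "year": 1999})
movies.insert("m2", {"title": "Untitled", "cast": ["A", "B"]})  # no "year" field
print(movies.find("year", 1999))
```

The two documents have different keys, which the collection accepts without any schema change.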
Document Databases: Pros and Cons
• Pros:
– Simple, powerful data model
– Scalable
• Cons
– Poor for interconnected data
– Query model limited to keys and indexes
– Map reduce for larger queries
Graph Databases
• Data Model:
– Nodes and Relationships
• Examples:
– Neo4j, OrientDB, InfiniteGraph, AllegroGraph
Graph Databases: Pros and Cons
• Pros:
– Powerful data model, as general as RDBMS
– Connected data locally indexed
– Easy to query
• Cons
– Sharding (lots of people are working on this)
• Scales UP reasonably well
– Requires rewiring your brain
What are graphs good for?
• Recommendations
• Business intelligence
• Social computing
• Geospatial
• Systems management
• Web of things
• Genealogy
• Time series data
• Product catalogue
• Web analytics
• Scientific computing (especially bioinformatics)
• Indexing your slow RDBMS
• And much more!
What is a Graph?
• An abstract representation of a set of objects
where some pairs are connected by links.

Object (Vertex, Node)

Link (Edge, Arc, Relationship)


Different Kinds of Graphs
• Undirected Graph
• Directed Graph

• Pseudo Graph
• Multi Graph

• Hyper Graph
More Kinds of Graphs
• Weighted Graph

• Labeled Graph

• Property Graph
What is a Graph Database?
• A database with an explicit graph structure
• Each node knows its adjacent nodes
• As the number of nodes increases, the cost of
a local step (or hop) remains the same
• Plus an Index for lookups
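
A minimal sketch of the points above, assuming a toy in-memory model: each node holds direct references to its neighbours, and an index (here a plain dict) is used only to find the starting node. The Node class, the KNOWS relationship type and the names are illustrative assumptions, not Neo4j's storage format or API.

```python
class Node:
    """A node that holds direct references to its neighbours."""

    def __init__(self, name):
        self.name = name
        self.outgoing = []  # list of (relationship type, target Node)

    def connect(self, rel_type, other):
        self.outgoing.append((rel_type, other))

    def neighbours(self, rel_type):
        # One hop only touches this node's own relationship list,
        # regardless of how many nodes exist in the whole graph.
        return [n for t, n in self.outgoing if t == rel_type]


# The index is only used to find a starting node by name.
index = {}
for name in ("Neo", "Morpheus", "Trinity"):
    index[name] = Node(name)

index["Neo"].connect("KNOWS", index["Morpheus"])
index["Morpheus"].connect("KNOWS", index["Trinity"])

start = index["Neo"]  # index lookup
friends_of_friends = [
    fof.name
    for friend in start.neighbours("KNOWS")
    for fof in friend.neighbours("KNOWS")
]
print(friends_of_friends)
```

Adding a million unrelated nodes to the index would not change the work done by the two hops, since each hop reads only one node's relationship list.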
Relational Databases
Graph Databases
Neo4j Tips
• Each entity table is represented by a label on
nodes
• Each row in an entity table is a node
• Columns on those tables become node
properties.
• Join tables are transformed into relationships,
columns on those tables become relationship
properties
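
The mapping rules above can be tried out on a tiny example. The tables, column names and the ACTED_IN relationship type below are hypothetical, and the nodes and relationships are plain Python dicts rather than anything created through Neo4j itself.

```python
# Hypothetical relational data: two entity tables and one join table.
person_rows = [{"id": 1, "name": "Keanu Reeves"}]
movie_rows = [{"id": 10, "title": "The Matrix"}]
acted_in_rows = [{"person_id": 1, "movie_id": 10, "role": "Neo"}]


def rows_to_nodes(label, rows):
    """Each row becomes a node: the table name is the label, columns are properties."""
    return {
        (label, row["id"]): {"label": label, "properties": dict(row)}
        for row in rows
    }


nodes = {}
nodes.update(rows_to_nodes("Person", person_rows))
nodes.update(rows_to_nodes("Movie", movie_rows))

# Each join-table row becomes a relationship; its non-key columns
# become relationship properties.
relationships = [
    {
        "type": "ACTED_IN",
        "start": ("Person", row["person_id"]),
        "end": ("Movie", row["movie_id"]),
        "properties": {"role": row["role"]},
    }
    for row in acted_in_rows
]

print(nodes)
print(relationships)
```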
Node in Neo4j
Relationships in Neo4j
• Relationships between nodes are a key part of
Neo4j.
Relationships in Neo4j
Twitter and relationships
Properties
• Both nodes and relationships can have
properties.
• Properties are key-value pairs where the key is
a string.
• Property values can be either a primitive or an
array of one primitive type.
For example, String, int and int[] values are
valid for properties.
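
The property rule can be expressed as a small check. This is a sketch that maps the rule onto Python types as an assumption: bool, int, float and str stand in for the primitives, and a list stands in for an array. Rejecting an empty list is a choice of this sketch, since its element type cannot be determined.

```python
PRIMITIVES = (bool, int, float, str)


def is_valid_property(key, value):
    """Check a key/value pair: string key, primitive value or
    a list holding primitives of one single type."""
    if not isinstance(key, str):
        return False
    if isinstance(value, list):
        # An array must hold primitives of exactly one type.
        types = {type(item) for item in value}
        return len(types) == 1 and types.pop() in PRIMITIVES
    return type(value) in PRIMITIVES


print(is_valid_property("name", "Neo"))
print(is_valid_property("scores", [1, 2, 3]))
print(is_valid_property("mixed", [1, "a"]))
```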
Properties
Paths in Neo4j
• A path is one or more nodes with connecting
relationships, typically retrieved as a query or
traversal result.
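
As an illustration of a path being the result of a traversal, the sketch below runs a breadth-first search over a small hypothetical graph and returns the path as alternating nodes and relationship types. The graph contents and the function name shortest_path are assumptions for this example, and breadth-first search is just one possible traversal, not the one Neo4j necessarily uses.

```python
from collections import deque

# Hypothetical graph: node -> list of (relationship type, neighbour).
graph = {
    "Neo": [("KNOWS", "Morpheus")],
    "Morpheus": [("KNOWS", "Trinity")],
    "Trinity": [],
}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns alternating nodes and relationship types."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]  # the last element of a path is always a node
        if node == goal:
            return path
        for rel_type, neighbour in graph[node]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [rel_type, neighbour])
    return None  # no path exists


print(shortest_path(graph, "Neo", "Trinity"))
```

A single node with no relationships is also a valid path, which is what the function returns when start and goal are the same.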
Starting and Stopping
Creating a small graph
Print the data
Remove the data
The Matrix Graph Database
