Module 5 NoSQL
MODULE 5
Graph Databases
A graph database is a specialized database optimized for storing and querying data
represented as graphs. It consists of two main components: nodes (entities)
and edges (relationships), along with their associated properties.
Nodes
Each node can have properties stored as key-value pairs, providing descriptive
information.
Nodes can be thought of as the fundamental building blocks of the graph structure.
Edges
Relationships are directional and can carry meaning based on their direction. For
instance, a "likes" relationship implies one-way affinity, while a "friend" relationship
might be bidirectional.
Edges can also have properties, enabling richer metadata to be stored about the
relationship.
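The node and edge model described above can be sketched in a few lines of Python. This is a minimal illustration only: the `Node` and `Edge` classes, the `outgoing` helper, and the sample data are assumptions made for this example and do not correspond to the API of any particular graph database.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An entity with descriptive key-value properties."""
    name: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    """A directed, typed relationship that can carry its own properties."""
    start: Node
    end: Node
    rel_type: str
    properties: dict = field(default_factory=dict)


# Hypothetical sample data.
anna = Node("Anna", {"city": "Bengaluru"})
barbara = Node("Barbara", {"city": "Mysuru"})

edges = [
    # "likes" is one-way: Anna likes Barbara, not necessarily the reverse.
    Edge(anna, barbara, "LIKES", {"since": 2021}),
    # A mutual "friend" relationship is modeled here as two directed edges.
    Edge(anna, barbara, "FRIEND"),
    Edge(barbara, anna, "FRIEND"),
]


def outgoing(node, rel_type):
    """Names of nodes reached from `node` over outgoing edges of one type."""
    return [e.end.name for e in edges
            if e.start is node and e.rel_type == rel_type]


print(outgoing(anna, "LIKES"))      # ['Barbara']
print(outgoing(barbara, "LIKES"))   # []
print(outgoing(barbara, "FRIEND"))  # ['Anna']
```

Representing the bidirectional "friend" relationship as two directed edges is one possible design choice; some databases instead let a query ignore edge direction.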
2. Organizational Advantages
Graph databases allow for flexible organization of data. The relationships between nodes are
explicitly stored, enabling the discovery of complex patterns. This explicit storage facilitates
efficient queries without the need for extensive computation or schema changes.
Nodes and edges are stored once, but they can be interpreted and queried in various
ways.
This flexibility supports evolving data models, unlike rigid schemas in relational
databases.
Koustav Biswas, Dept. Of CSE, DSATM
NOSQL Database 21CS745
Graph traversal is efficient because relationships are stored persistently rather than
being computed dynamically during a query.
Join Operations: Relational databases use joins to connect data across tables, which
can be slow for complex queries. In graph databases, relationships are explicitly
stored, making traversal fast and efficient.
Model Flexibility: Graph databases are not limited to a single type of relationship.
Nodes can have diverse and numerous connections, allowing for richer
representations of complex domains.
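The contrast between join-based lookup and stored relationships can be illustrated with a small Python sketch. The table contents and the function names (`friends_via_join`, `friends_via_traversal`, `friends_of_friends`) are hypothetical and chosen only for this example; real relational and graph engines use indexes and storage layouts that are far more sophisticated.

```python
# Relational style: relationships live in a join table searched per query.
friendship_rows = [("anna", "barbara"), ("barbara", "carol"), ("carol", "dan")]


def friends_via_join(person):
    # The match between person and friend is computed at query time
    # by scanning the rows of the join table.
    return [b for (a, b) in friendship_rows if a == person]


# Graph style: each node keeps direct references to its neighbours.
adjacency = {
    "anna": ["barbara"],
    "barbara": ["carol"],
    "carol": ["dan"],
    "dan": [],
}


def friends_via_traversal(person):
    # Following a stored relationship is a direct lookup.
    return adjacency[person]


def friends_of_friends(person, friends_of):
    """Two hops out, using whichever lookup strategy is passed in."""
    return [fof for f in friends_of(person) for fof in friends_of(f)]


print(friends_of_friends("anna", friends_via_join))       # ['carol']
print(friends_of_friends("anna", friends_via_traversal))  # ['carol']
```

Both strategies return the same answer. The difference is that each extra hop in the join version repeats a search over the table, while the traversal version only follows references that are already stored.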
5. Applications
Graph databases are well-suited for domains with intricate, interconnected data, such
as social networks, recommendation systems, fraud detection, knowledge graphs, and
supply chain management.
6. Performance Benefits
Graph databases, exemplified by tools such as Neo4j, OrientDB, and FlockDB, offer a robust
way to model and analyze interconnected data.
Consistency in graph databases is crucial due to their reliance on tightly interconnected nodes
and relationships.
Single-Server Consistency
Most graph databases do not distribute nodes across multiple servers, focusing instead
on maintaining data consistency within a single server.
Cluster Consistency
Some solutions, like InfiniteGraph, support node distribution across a server cluster.
Neo4j also supports clustering with specific behaviors:
Slave nodes are always available for reads, even if data propagation is
delayed.
Write operations on slave nodes are synchronized to the master, but other
slaves only update when the master propagates the data.
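The clustering behaviour described above can be simulated with a toy model in Python. The `Server` and `Cluster` classes and their methods are assumptions made for illustration and are not Neo4j's API. The sketch also assumes that the slave accepting a write sees that write immediately, which the text implies by singling out the "other" slaves.

```python
class Server:
    def __init__(self, name):
        self.name = name
        self.data = {}

    def read(self, key):
        # Reads are always served locally, even if the copy is stale.
        return self.data.get(key)


class Cluster:
    def __init__(self, slave_names):
        self.master = Server("master")
        self.slaves = {n: Server(n) for n in slave_names}

    def write_via_slave(self, slave_name, key, value):
        # A write accepted by a slave is synchronized to the master first...
        self.master.data[key] = value
        # ...and (assumption) the slave that took the write sees it as well.
        self.slaves[slave_name].data[key] = value

    def propagate(self):
        # Other slaves catch up only when the master pushes its state.
        for slave in self.slaves.values():
            slave.data.update(self.master.data)


cluster = Cluster(["s1", "s2"])
cluster.write_via_slave("s1", "barbara.city", "Mysuru")
stale = cluster.slaves["s2"].read("barbara.city")  # None: not propagated yet
cluster.propagate()
fresh = cluster.slaves["s2"].read("barbara.city")  # 'Mysuru'
print(stale, fresh)
```

The window between the write and `propagate()` is where a read from another slave can return out-of-date data.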
Dangling Relationships
2. Transaction Workflow:
Operations (e.g., creating nodes, setting properties) are performed within the
transaction.
If a transaction is not marked as successful, Neo4j assumes a failure and rolls back
the changes when finish() is called.
Merely marking a transaction as successful without finishing it does not commit the
changes.
This explicit transaction management differs from traditional RDBMSs, where
commit and rollback mechanisms are more implicit.
This approach provides a clear mechanism for handling success and failure, giving
developers fine-grained control.
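The success-then-finish protocol described above can be modeled with a small Python class. This is a toy sketch of the semantics only: the `Transaction` class, its `set` method, and the dictionary-based store are hypothetical and are not the Neo4j API.

```python
class Transaction:
    """Toy model of the success()/finish() protocol; not the Neo4j API."""

    def __init__(self, store):
        self.store = store           # committed data
        self.pending = {}            # changes made inside the transaction
        self.marked_success = False

    def set(self, key, value):
        self.pending[key] = value

    def success(self):
        # Only records intent; nothing is committed yet.
        self.marked_success = True

    def finish(self):
        # Commit if marked successful, otherwise roll back.
        if self.marked_success:
            self.store.update(self.pending)
        self.pending = {}


store = {}

tx = Transaction(store)
tx.set("name", "Martin")
tx.finish()                    # never marked successful -> rolled back
after_rollback = dict(store)   # {}

tx = Transaction(store)
tx.set("name", "Martin")
tx.success()
before_finish = dict(store)    # {}: success() alone does not commit
tx.finish()                    # now the change is committed
print(after_rollback, before_finish, store)
```

The second transaction shows both rules at once: marking success changes nothing in the store, and the commit happens only when `finish()` runs.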
3. Availability
Write synchronization:
4. Query Features
1. Query Languages:
3. Traversals:
Order.BREADTH_FIRST,
StopEvaluator.END_OF_GRAPH,
ReturnableEvaluator.ALL_BUT_START_NODE,
EdgeType.FRIEND,
Direction.OUTGOING
);
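The traversal arguments above (breadth-first order, continue to the end of the graph, return everything except the start node, follow outgoing FRIEND relationships) can be mirrored in plain Python. This is a sketch of the traversal logic only, not of the Neo4j traversal API; the `friends` dictionary and the function name `traverse_breadth_first` are assumptions made for this example.

```python
from collections import deque

# Hypothetical graph: outgoing FRIEND relationships only.
friends = {
    "barbara": ["carol", "dan"],
    "carol": ["elizabeth"],
    "dan": ["elizabeth"],
    "elizabeth": [],
}


def traverse_breadth_first(start):
    """Visit everything reachable from `start`, nearest nodes first,
    and return all visited nodes except the start node."""
    visited = {start}
    queue = deque([start])
    order = []
    while queue:  # stop only when the reachable graph is exhausted
        current = queue.popleft()
        for neighbour in friends[current]:
            if neighbour not in visited:
                visited.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
    return order


print(traverse_breadth_first("barbara"))  # ['carol', 'dan', 'elizabeth']
```

The `visited` set ensures that a node reachable along several paths, such as `elizabeth` here, is returned only once.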
4. Pathfinding:
All paths:
);
WHERE <conditions>
RETURN <results>
ORDER BY <ordering>
SKIP <records>
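The clause order above can be read as a pipeline applied to matched rows. The following Python analogue is a rough sketch of that reading, not Cypher itself and not how a query engine actually executes: the sample rows and the `run_query` function are hypothetical, and for simplicity the ordering is applied to the returned values.

```python
# Hypothetical rows, as if produced by matching a pattern in the graph.
rows = [
    {"name": "Carol", "age": 31},
    {"name": "Dan", "age": 24},
    {"name": "Elizabeth", "age": 45},
    {"name": "Frank", "age": 38},
]


def run_query(rows, where, returns, order_by, skip):
    matched = [r for r in rows if where(r)]     # WHERE <conditions>
    projected = [returns(r) for r in matched]   # RETURN <results>
    ordered = sorted(projected, key=order_by)   # ORDER BY <ordering>
    return ordered[skip:]                       # SKIP <records>


result = run_query(
    rows,
    where=lambda r: r["age"] > 30,
    returns=lambda r: r["name"],
    order_by=lambda name: name,
    skip=1,
)
print(result)  # ['Elizabeth', 'Frank']
```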
5. Scaling
Increase RAM:
Use a master node for writes and multiple slave nodes for reads.
This is practical for datasets that cannot fit into a single machine’s memory
but are small enough to replicate across machines.
For example:
1. Connected Data:
Examples:
Applications:
3. Recommendation Engines:
Advantages:
Mass Updates:
Graph databases are not suitable for operations that update all or many entities
at once (e.g., analytics requiring global property changes).
Graph Database