NoSQL Databases
Principles
Ecole d’Ingénierie Digitale et d’Intelligence Artificielle (EIDIA)
Cycle Préparatoire formation Ingénieur
Khawla TADIST
Academic Year 2023-2024
Outline
Different aspects of data distribution
Scaling
Vertical vs. horizontal
Distribution models
Sharding
Replication
Master-slave vs. peer-to-peer architectures
CAP properties
Consistency, Availability and Partition tolerance
ACID vs. BASE
Scalability
What is scalability?
Capability of a system to handle growing amounts of data and/or queries without
losing performance,
Or its potential to be enlarged in order to accommodate such growth.
Two general approaches
Vertical scaling
Horizontal scaling
Scalability - Vertical Scalability
Vertical scaling (scaling up/down)
Adding resources to a single node in a system
Increasing the number of CPUs,
Extending system memory,
Using larger disk arrays,
…
i.e. larger and more powerful machines are involved
Scalability - Vertical Scalability
Vertical scaling (scaling up/down)
Traditional choice
In favor of strong consistency
Easy to implement and deploy
No issues caused by data distribution
…
Works well in many cases but …
Scalability - Vertical Scalability Drawbacks
Performance limits
Even the most powerful machine has a limit
Everything tends to work well… until we start approaching these limits
Higher costs
The cost of expansion increases exponentially
In particular, it exceeds the total cost of equivalent commodity hardware
Scalability - Vertical Scalability Drawbacks
Proactive provisioning
New projects/applications might evolve rapidly
Upfront budget is needed when deploying new machines
Flexibility is therefore seriously limited
Scalability - Vertical Scalability Drawbacks
Vendor lock-in
There are only a few manufacturers of large machines
Customer is made dependent on a single vendor
Their products, services, but also implementation details, proprietary formats,
interfaces, …
i.e. it is difficult or impossible to switch to another vendor
Deployment downtime
Downtime is often unavoidable when scaling up
Scalability - Horizontal Scalability
Horizontal scaling (scaling out/in)
Adding more nodes to a system
i.e. the system is distributed across multiple nodes in a cluster
Choice of many NoSQL systems
Scalability - Horizontal Scalability
Horizontal scaling (scaling out/in)
Advantages
Commodity hardware, cost effective
Flexible deployment and maintenance
Often outperforms vertical scaling
…
Unfortunately, there are also plenty of false assumptions…
Scalability - Horizontal Scalability
Drawbacks
False assumptions (the classic fallacies of distributed computing)
Network is reliable
Network is secure
Latency is zero
Bandwidth is infinite
Topology does not change
There is one administrator
Transport cost is zero
Scalability - Horizontal Scalability
Consequences
Significantly increases complexity
Complexity of management,
Programming model, …
Introduces new issues and problems
Synchronization of nodes
Data distribution
Data consistency
Recovery from failures
…
Scalability - Horizontal Scalability
A standalone node might still be a better option in certain cases
e.g. for graph databases
Simply because it is difficult to split and distribute graphs
In other words
It can make sense to run even a NoSQL database system on a single node
No distribution at all is the simplest (and often the preferred) scenario
But in general, horizontal scaling does open new possibilities
Scalability - Horizontal Scalability
Architecture
What is a cluster?
A collection of mutually interconnected commodity nodes
Based on the shared-nothing architecture
Nodes do not share their CPUs, memory, hard drives,…
Each node runs its own operating system instance
Nodes send messages to interact with each other
Nodes of a cluster can be heterogeneous
Data, queries, computation, workload, …
This is all distributed among the nodes within a cluster
Distribution Models
Generic techniques of data distribution
Sharding
Different data on different nodes
Motivation: increasing volume of data, increasing performance
Replication
Copies of the same data on different nodes
Motivation: increasing performance, increasing fault tolerance
Distribution Models
The two techniques are orthogonal
i.e. we can use either of them, or combine them both
NoSQL systems often offer automatic sharding and replication
Distribution Models - Sharding
Sharding (horizontal partitioning)
Placement of different data on different nodes
What does different data mean? Different aggregates
– E.g. key-value pairs, documents, …
Distribution Models - Sharding
Sharding (horizontal partitioning)
Placement of different data on different nodes
Related pieces of data that are accessed together should also be kept together
– Specifically, operations involving data on multiple shards should be
avoided
Distribution Models - Sharding
Sharding (horizontal partitioning)
The questions are…
How to design aggregate structures?
How to actually distribute these aggregates?
Distribution Models - Sharding
Sharding (horizontal partitioning)
Distribution Models - Sharding
Objectives
Uniformly distributed data (volume of data)
Balanced workload (read and write requests)
Respecting physical locations
e.g. different data centers for users around the world
…
Unfortunately, these objectives…
May contradict each other
May change over time
Distribution Models - Sharding
Sharding (horizontal partitioning)
Source: Sadalage, Pramod J. - Fowler, Martin: NoSQL Distilled. Pearson Education, Inc., 2013.
Distribution Models - Sharding
How to actually determine shards for aggregates?
We not only need to be able to place new data when handling write requests,
But also to find the data when handling read requests,
i.e. when a given search criterion is provided (e.g. key, id, …)
Distribution Models - Sharding
How to actually determine shards for aggregates?
We must be able to determine the shard corresponding to a given key
So that the requested data can be accessed and returned,
Or failure can be correctly detected when the data is missing
Distribution Models - Sharding
Sharding strategies
Based on mapping structures
Data is placed on shards in an arbitrary fashion (e.g. round-robin), and a lookup
structure records where each aggregate lives (not suitable at scale)
Based on general rules:
Hash partitioning
Range partitioning
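To make the two rule-based strategies concrete, here is a minimal sketch in Python. The shard count, the split points, and the key format are illustrative assumptions, not taken from any particular system.

import hashlib

NUM_SHARDS = 4

def hash_shard(key: str) -> int:
    # Hash partitioning: a stable digest of the key, taken modulo the
    # number of shards, spreads keys roughly uniformly across shards.
    # (Python's built-in hash() is randomized per process for strings,
    # so a digest is used instead.)
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# Range partitioning: keys are assigned by ordered split points,
# so adjacent keys stay together (here: a-f, g-m, n-s, t-z).
SPLIT_POINTS = ["g", "n", "t"]

def range_shard(key: str) -> int:
    for shard, upper_bound in enumerate(SPLIT_POINTS):
        if key < upper_bound:
            return shard
    return len(SPLIT_POINTS)

print(hash_shard("alice"), range_shard("alice"))  # e.g. 2 0

Note the trade-off: hash partitioning balances data volume well but scatters adjacent keys, while range partitioning keeps them together at the risk of hotspots.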
Distribution Models - Replication
Replication
Placement of multiple copies – replicas – of the same data on different nodes
Replication factor = the number of copies
Two approaches:
Master-slave architecture
Peer-to-peer architecture
Distribution Models - Replication - Master-Slave
Master-Slave Architecture
Source: Sadalage, Pramod J. - Fowler, Martin: NoSQL Distilled. Pearson Education, Inc., 2013.
Distribution Models - Replication - Master-Slave
Architecture
One node is primary (master), all the others are secondary (slaves)
Master node bears all the management responsibility
All the nodes contain identical data
Distribution Models - Replication - Master-Slave
Architecture
Read requests can be handled by either the master or the slaves
Suitable for read-intensive applications
More read requests to deal with → more slaves to deploy
When the master fails, read operations can still be handled
Distribution Models - Replication - Master-Slave
Write requests can only be handled by the master
Newly written data is propagated to all the slaves
Consistency issue
Fortunately, the master serializes writes, so at most one write request is handled at a
time
But the propagation still takes some time, during which obsolete reads might happen
Hence certain synchronization is required to avoid conflicts
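A minimal sketch of master-slave request routing in Python, assuming in-memory dictionaries as nodes; the class and method names are illustrative, not a real API.

import random

class MasterSlaveCluster:
    def __init__(self, num_slaves: int = 2):
        self.master = {}
        self.slaves = [dict() for _ in range(num_slaves)]

    def write(self, key, value):
        # Only the master handles writes...
        self.master[key] = value
        # ...and propagates them to every slave. Propagation is synchronous
        # here; real systems often propagate asynchronously, which is
        # exactly when obsolete reads can happen.
        for slave in self.slaves:
            slave[key] = value

    def read(self, key):
        # Reads may be served by the master or by any slave.
        node = random.choice([self.master] + self.slaves)
        return node.get(key)

cluster = MasterSlaveCluster()
cluster.write("user:1", "Alice")
print(cluster.read("user:1"))  # 'Alice'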
Distribution Models - Replication - Master-Slave
In case of master failure, a new one needs to be appointed
Manually (user-defined)
Automatically (cluster-elected)
Since the nodes contain identical data, the appointment can be fast
The master nevertheless represents a potential bottleneck (in terms of both
performance and failures)
Distribution Models - Replication - Peer-to-Peer
Peer-to-Peer Architecture
Source: Sadalage, Pramod J. - Fowler, Martin: NoSQL Distilled. Pearson Education, Inc., 2013.
Distribution Models - Replication - Peer-to-Peer
Architecture
All the nodes have equal roles and responsibilities
All the nodes contain identical data once again
Distribution Models - Replication - Peer-to-Peer
Both read and write requests can be handled by any node
No bottleneck, no single point of failure
More requests to deal with → more nodes to deploy
Distribution Models - Replication - Peer-to-Peer
Both read and write requests can be handled by any node
Consistency issues
Unfortunately, multiple write requests can be initiated independently and handled at
the same time
Hence synchronization is required to avoid conflicts
Distribution Models - Sharding and Replication
Observations with respect to replication:
Does the replication factor really need to correspond to the number of nodes?
No, a replication factor of 3 is often the right choice
Consequences
– Nodes will no longer contain identical data
– Replica placement strategy will be needed
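One possible replica placement strategy is sketched below in Python. It is an assumption for illustration only (the source does not prescribe a strategy): hash the key onto a fixed node list and store copies on the next consecutive nodes.

import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d", "node-e"]
REPLICATION_FACTOR = 3  # 3 copies, fewer than the 5 nodes

def replica_nodes(key: str) -> list:
    # Map the key to a starting node, then walk the node list
    # circularly to pick the remaining replicas.
    start = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16) % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(REPLICATION_FACTOR)]

print(replica_nodes("order:1001"))  # e.g. ['node-c', 'node-d', 'node-e']

As the bullet above anticipates, with such a strategy the nodes no longer hold identical data.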
Sharding and replication can be combined… but how?
Distribution Models - Sharding and Replication
Combinations of sharding and replication
Sharding + master-slave replication
Multiple masters, each for different data
Roles of the nodes can overlap
– Each node can be master for some data and/or slave for other data
Distribution Models - Sharding and Replication
Combinations of sharding and replication
Sharding + peer-to-peer replication
Placement of anything anywhere
CAP Theorem
Assumptions
System with sharding and replication
Read and write operations on a single aggregate
CAP properties = properties of a distributed system
Consistency
Availability
Partition tolerance
CAP Theorem
CAP theorem
It is not possible to have a distributed system that would guarantee consistency,
availability, and partition tolerance at the same time.
Only 2 of these 3 properties can be enforced.
But, what do these properties actually mean?
CAP Theorem - Properties
Consistency
Read and write operations must be executed atomically
There must exist a total order on all operations such that each operation looks as if it
was completed at a single instant,
i.e. as if all the operations were executed one by one on a single standalone node
CAP Theorem - Properties
Consistency
Practical consequence: After a write operation, all readers see the same data
Since any node can be used for handling read requests, atomicity of write
operations means that changes must be propagated to all the replicas
CAP Theorem - Properties
Availability
If a node is working, it must respond to user requests
Every read or write request received by a non-failing node in the system must result in a
response
CAP Theorem - Properties
Partition tolerance
System continues to operate even when two or more sets of nodes get isolated
i.e. a connection failure MUST NOT shut the whole system down
CAP Theorem - Consequences
At most two properties can be guaranteed
CA = Consistency + Availability
CP = Consistency + Partition tolerance
AP = Availability + Partition tolerance
CAP Theorem - Consequences
If at most two properties can be guaranteed…
CA = Consistency + Availability
Traditional ACID properties are easy to achieve
Examples: traditional RDBMS deployments
In general, any single-node system
However, should a network partition happen, all the nodes must be forced to stop
accepting user requests
CAP Theorem - Consequences
If at most two properties can be guaranteed…
CP = Consistency + Partition tolerance
Examples: MongoDB, HBase
CAP Theorem - Consequences
If at most two properties can be guaranteed…
AP = Availability + Partition tolerance
New concept of BASE properties
Examples: Apache Cassandra, Apache CouchDB
Other examples: web caching, DNS
CAP Theorem - Consequences
Partition tolerance is necessary in clusters
Why?
Because network failures cannot be ruled out and are difficult to detect
Does it mean that only purely CP and AP systems are possible?
No…
CAP Theorem - Consequences
The real meaning of the CAP theorem:
Partition tolerance is a MUST,
But we can trade off consistency versus availability
Even slightly relaxed consistency can bring a lot of extra availability
Such trade-offs are not only possible, but often work very well in practice
ACID Properties
Traditional ACID properties
Atomicity
Partial execution of transactions is not allowed (all or nothing)
Consistency
Transactions bring the database from one consistent (valid) state to another
Isolation
Although multiple transactions may execute in parallel, each must behave as if it
were executing alone, i.e. transactions must not interfere with each other
Durability
Effects of committed transactions must remain durable
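A minimal sketch of atomicity in practice, using SQLite from Python's standard library (an illustrative choice; any ACID-compliant RDBMS behaves the same way): either both updates of a transfer commit, or neither does.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 0)")
conn.commit()

try:
    # The connection as a context manager commits on success
    # and rolls back if an exception is raised.
    with conn:
        conn.execute("UPDATE accounts SET balance = balance - 30 WHERE id = 1")
        conn.execute("UPDATE accounts SET balance = balance + 30 WHERE id = 2")
        raise RuntimeError("simulated failure before commit")
except RuntimeError:
    pass

# Both updates were rolled back: all or nothing.
print(conn.execute("SELECT balance FROM accounts ORDER BY id").fetchall())
# -> [(100,), (0,)]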
BASE Properties
New concept of BASE properties
Basically Available
The system works basically all the time
Partial failures can occur, but without total system failure
Soft State
The system is in flux (unstable)
Changes occur all the time
Eventual Consistency
Sooner or later, the system will reach a consistent state
ACID and BASE
ACID
Choose consistency over availability
Pessimistic approach
Implemented by traditional relational databases
BASE
Choose availability over consistency
Optimistic approach
Common in NoSQL databases
Allows levels of scalability that cannot be achieved with ACID
Current trend in NoSQL:
Strong consistency → eventual consistency
Consistency
Consistency in general…
Consistency is the absence of contradictions in the database
Strong consistency is achievable even in clusters, but eventual consistency is often
sufficient
Even if an already occupied hotel room gets booked again, the situation can be
resolved in the real world
…
Consistency
Write consistency (update consistency)
Problem: write-write conflict
Two or more write requests on the same aggregate are initiated concurrently
Issue: lost update
Question: Do we need to solve the problem in the first place?
Consistency
Write consistency (update consistency)
Question: Do we need to solve the problem in the first place?
If yes, then there are two general solutions
Pessimistic approaches
Preventing conflicts from occurring
Techniques: write locks, …
Optimistic approaches
Conflicts may occur, but are detected and resolved later on
Techniques: version stamps, …
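A minimal sketch of the optimistic approach with version stamps in Python (illustrative names, not a real API): each write must present the version it read, and a stale version means a write-write conflict was detected.

class VersionedStore:
    def __init__(self):
        self.data = {}  # key -> (version, value)

    def read(self, key):
        # Returns the current version together with the value.
        return self.data.get(key, (0, None))

    def write(self, key, value, expected_version):
        current_version, _ = self.data.get(key, (0, None))
        if expected_version != current_version:
            # Someone else wrote in the meantime: conflict detected,
            # to be resolved later (retry, merge, ...).
            raise RuntimeError("write-write conflict on %r" % key)
        self.data[key] = (current_version + 1, value)

store = VersionedStore()
version, _ = store.read("cart:7")
store.write("cart:7", ["book"], version)  # succeeds, version becomes 1
try:
    store.write("cart:7", ["pen"], version)  # version 0 is now stale
except RuntimeError as conflict:
    print(conflict)

A pessimistic approach would instead acquire a write lock before the read-modify-write cycle, preventing the conflict rather than detecting it.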
Consistency
Read consistency (replication consistency)
Problem: read-write conflict
Write and read requests on the same aggregate are initiated concurrently
Issue: inconsistent read
When not treated, an inconsistency window will exist
Propagation of changes to all the replicas takes some time
Until this process is finished, inconsistent reads may happen; even the initiator of the
write request may read stale data!
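A toy illustration of the inconsistency window in Python (illustrative code, assuming a primary that propagates to one replica asynchronously): until propagation completes, a reader hitting the replica sees stale data.

primary, replica = {}, {}
pending = []  # changes written but not yet propagated

def write(key, value):
    primary[key] = value
    pending.append((key, value))  # propagation is deferred

def propagate():
    # In a real system this runs in the background after some delay.
    while pending:
        key, value = pending.pop(0)
        replica[key] = value

write("room:42", "booked")
print(primary.get("room:42"))  # 'booked' on the primary
print(replica.get("room:42"))  # None: stale read inside the window
propagate()
print(replica.get("room:42"))  # 'booked' once propagation completes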
Conclusion
There is a wide range of options influencing…
Scalability
– how well does the system scale (data and requests)?
Availability
– when may nodes refuse to handle user requests?
Consistency
– what level of consistency is required?
Latency
– how complicated is it to handle user requests?
Durability
– is committed data written reliably?
Resilience
– can the data be recovered in case of failures?
It’s good to know these properties and choose the right trade-off