
NoSQL Database (21CS745)

Module -2 : Introduction to NoSQL

Dr. Rama Satish K V


Associate Professor
Department of AI & ML
RNSIT, Bengaluru
Syllabus
Introduction to Distribution Models

• NoSQL databases are increasingly popular due to their capability to run on large
clusters, making them suitable for handling growing data volumes.
• Scaling Options: Instead of scaling up with larger servers, NoSQL allows for
scaling out by distributing databases across clusters of servers.
• Aggregate Orientation: The use of aggregates aligns well with scaling out, serving
as a natural unit for data distribution.
• Data Distribution Benefits: Effective distribution models can enhance data storage
capacity, improve read/write performance, and increase availability during network
issues.
• Distribution Techniques: Two main methods for data distribution are replication
(copying the same data to multiple nodes) and sharding (placing different data on
different nodes).
• Replication Types: Replication comes in two forms: master-slave and peer-to-peer.
The discussion covers single-server setups, master-slave replication, sharding, and
peer-to-peer replication.
Single-Server Database Distribution in NoSQL

• Simplicity of Single-Server Model: The simplest, and often recommended,
approach is to run the database on a single machine, which simplifies management
and development.
• Elimination of Complexities: A single-server setup avoids the complexities
associated with distributed systems, making it easier for operations teams and
application developers to work with.
• Suitability for Certain Applications: While many NoSQL databases are designed
for clusters, some data models, like graph databases, perform better in a single-
server configuration.
• Efficiency in Aggregate Processing: For applications focused on processing
aggregates, single-server document or key-value stores can be more efficient and
user-friendly for developers.
• Preference for Single-Server Approach: Despite exploring more complex
distribution schemes, the authors prefer the simplicity of a single-server model
whenever possible.
Understanding Sharding for NoSQL (1)

• Definition of Sharding: Sharding is a technique for horizontal scalability in which
different parts of a dataset are distributed across multiple servers, allowing
simultaneous access by multiple users.
Understanding Sharding for NoSQL (2)

• Ideal User-Server Interaction: In an ideal scenario, each user interacts with a
different server, leading to balanced load distribution and faster response times,
with each server handling an equal share of requests.
• Clumping Data: To approach the ideal distribution, related data should be
grouped together on the same server, which can be achieved through aggregate
orientation that combines frequently accessed data.
• Optimizing Data Placement: Placing data closer to its point of access, such as
location-based storage, can enhance performance, especially when certain
aggregates are accessed frequently.
• Maintaining Load Balance: It's important to evenly distribute aggregates across
nodes to prevent load imbalance, which can change over time based on user access
patterns.
Understanding Sharding for NoSQL (3)

• Sequential Data Access: Aggregates that are likely to be read in sequence can be
arranged to improve processing efficiency, as illustrated by the organization of
data in the Bigtable paper.
• Challenges of Manual Sharding: Historically, sharding has been managed
through application logic, complicating programming and requiring code changes
for rebalancing data across shards.
• Benefits of Auto-Sharding: Many NoSQL databases offer auto-sharding,
simplifying the distribution of data and allowing for more efficient application
development.
• Performance Enhancement: Sharding improves both read and write performance
by horizontally scaling the database, making it valuable for applications with high
write demands.
• Cautions with Sharding: While sharding can enhance performance, it can also
decrease resilience if not implemented carefully, and transitioning from a single-
server to a sharded configuration should be done proactively to avoid issues in
production.
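The routing idea behind sharding can be sketched in a few lines. This is a simplified assumption, not how any particular database implements it: real stores use consistent hashing or range partitioning, but the principle — mapping each aggregate key to one owning node — is the same. The node names and the `shard_for` helper are illustrative.

```python
import hashlib

# Hypothetical sketch: assign each aggregate to a shard by hashing its key.
NODES = ["node-a", "node-b", "node-c"]

def shard_for(aggregate_key: str) -> str:
    """Map an aggregate key to the node that owns its shard."""
    digest = hashlib.md5(aggregate_key.encode()).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]

# All data for one customer (one aggregate) lands on the same node,
# so a single request touches a single server.
print(shard_for("customer:1234"))
print(shard_for("customer:1234") == shard_for("customer:1234"))  # True: stable
```

Because the mapping is deterministic, clients can compute the target node themselves; auto-sharding databases hide this routing behind the driver.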
Master-Slave Replication (1)

• Definition: This method involves one primary node (the master) that holds the
authoritative data, while secondary nodes (slaves) replicate that data.
Master-Slave Replication (2)

• Update Processing: The master is responsible for all data updates, and the slaves
are synchronized with it to ensure they reflect the latest data.
• Read Scalability: Master-slave replication is beneficial for read-intensive
datasets, allowing horizontal scaling by adding more slave nodes to handle
increased read requests.
• Limitations on Write Traffic: The master’s ability to process updates can become
a bottleneck in write-heavy environments, as it handles all write operations.
• Read Resilience: If the master fails, slaves can still process read requests,
providing continuity for read-heavy applications.
• Recovery Speed: In the event of a master failure, a slave can be quickly
promoted to master, facilitating faster recovery.
Master-Slave Replication (3)

• Hot Backup Functionality: The system can function like a single-server setup with a
hot backup, improving resilience without needing complex scaling.
• Master Appointment Methods: Masters can be appointed manually during
configuration or automatically through a cluster election process, which enhances uptime.
• Separate Read and Write Paths: To achieve read resilience, applications should
have distinct paths for read and write operations, which may require specialized
database connections.
• Testing for Resilience: It's essential to conduct tests to ensure that reads can occur
even when writes are disabled, verifying the system's read resilience.
• Data Consistency Challenges: A major drawback of master-slave replication is the
potential for inconsistency; different clients might read different values if updates
haven’t fully propagated to the slaves.
• Risk of Data Loss: If the master fails before updates are replicated to the slaves,
those changes can be lost, emphasizing the importance of consistency and recovery
strategies.
Peer-to-Peer Replication (1)

• Limitations of Master-Slave Replication: While master-slave replication
enhances read scalability, it does not improve write scalability and leaves a
single point of failure at the master node.
• Introduction to Peer-to-Peer Replication: Peer-to-peer (P2P) replication
eliminates the master node; all replicas have equal status, and any of them
can accept writes.
Peer-to-Peer Replication (2)

• Node Failure Resilience: In a P2P setup, the loss of any single node does not disrupt
access to the data store, enhancing overall data availability.
• Performance Enhancement: Adding more nodes in a P2P cluster can improve
performance without the bottleneck of a single master.
• Consistency Challenges: A major complication with P2P replication is maintaining
consistency, especially during simultaneous writes to the same record, leading to
write-write conflicts.
• Impact of Inconsistent Writes: While read inconsistencies are typically transient,
write inconsistencies can have permanent effects, complicating data integrity.
• Coordinating Writes for Consistency: One approach to handle write inconsistencies
involves coordinating writes across replicas, requiring majority agreement for a
valid update.
Peer-to-Peer Replication (3)

• Network Traffic Trade-off: Ensuring coordinated writes can increase network
traffic, which might impact performance.
• Coping with Inconsistent Writes: In some scenarios, it may be acceptable to allow
inconsistent writes and develop policies to merge them later, maximizing write
performance.
• Performance vs. Consistency: Choosing between strict coordination for consistency
or accepting inconsistencies involves balancing performance benefits against the
risks of data integrity issues.
• Operational Flexibility: P2P replication offers flexibility in managing nodes and
handling failures, making it a robust alternative to traditional master-slave
configurations.
• Future Considerations: Strategies for dealing with inconsistencies will be explored
further, highlighting the importance of developing effective conflict resolution
policies in P2P systems.
Combining Sharding and Replication (1)

• Combination of Strategies: Replication and sharding can be effectively
combined to enhance data management in databases.
• Multiple Masters: Using both master-slave replication and sharding allows for
multiple master nodes, improving scalability and performance.
• Single Master for Data Items: Each data item is assigned a single master,
ensuring clear ownership and update responsibility.
• Flexible Node Roles: Nodes can be configured flexibly, serving as masters for
certain data while acting as slaves for others.
• Dedicated Roles: Alternatively, nodes can be designated specifically for master
or slave roles, optimizing resource allocation and management.
Figure: Using master-slave replication together with sharding
Peer-to-Peer Replication Together with Sharding

• Common Strategy: Peer-to-peer replication combined with sharding is
frequently used in column-family databases for enhanced data management.
• Cluster Size: This approach can involve tens or even hundreds of nodes within a
cluster, allowing for scalable data distribution.
• Replication Factor: A recommended starting point for peer-to-peer replication is
a replication factor of 3, ensuring that each shard is stored on three different
nodes.
• Node Failure Resilience: If a node fails, the shards that were on that node can be
rebuilt using data from the other nodes in the cluster.
• Data Availability: This replication strategy enhances data availability and fault
tolerance, ensuring that data remains accessible even during node failures.
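The placement rule described above — each shard stored on three nodes — can be sketched as follows. The node-selection scheme is a deliberate simplification (taking consecutive nodes); real column-family stores walk a consistent-hashing ring, and the node names are illustrative.

```python
# Hypothetical sketch: placing each shard on 3 of the cluster's nodes
# (replication factor 3). Real systems use a consistent-hashing ring;
# here we simply take 3 consecutive nodes for clarity.
NODES = ["n0", "n1", "n2", "n3", "n4", "n5"]
REPLICATION_FACTOR = 3

def replicas_for(shard_id: int) -> list[str]:
    """Return the nodes holding copies of this shard."""
    start = shard_id % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(REPLICATION_FACTOR)]

print(replicas_for(4))  # ['n4', 'n5', 'n0']
# If n4 fails, shard 4 can be rebuilt from the copies on n5 and n0.
```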
Figure 2.5. Using peer-to-peer replication together with sharding
Next Class

• Module -2 : Chapter 5
Consistency

1. Transitioning from relational databases to NoSQL changes the approach to
consistency.
2. Relational databases prioritize strong
consistency, while NoSQL introduces concepts
like the CAP theorem and eventual consistency.
3. Understanding consistency is crucial when
building NoSQL systems, as it impacts design
choices.
4. Consistency encompasses various forms, each
potentially leading to different types of errors.
5. The discussion will include reasons for
relaxing consistency and durability in NoSQL
databases.
CAP Theorem

• The CAP theorem, proposed by Eric Brewer, states that in a distributed data
store it is impossible to simultaneously achieve all three of the following goals:
• 1. Consistency: All nodes see the same data at
the same time, ensuring that every read
receives the most recent write.
• 2. Availability: Every request receives a
response, regardless of whether it was
successful or not, ensuring that the system is
always operational.
• 3. Partition Tolerance: The system continues to
operate despite network partitions or
communication failures between nodes.
According to the theorem, a distributed system
can only guarantee two of these three properties
at any given time. This means that when
designing a system, trade-offs must be made
based on the specific needs of the application.
Update Consistency (1)

• 1. Concurrency Approaches: Consistency maintenance is categorized as
pessimistic or optimistic.
• 2. Pessimistic Approach: Prevents conflicts by using mechanisms like write
locks, allowing only one client to change a value at a time.
• 3. Example of Pessimistic: In a scenario, only the first client (Martin) acquires the
lock and can update, while the second (Pramod) must wait.
• 4. Optimistic Approach: Allows conflicts to occur but detects and resolves them
after the fact, often through conditional updates.
• 5. Example of Optimistic: Martin’s update succeeds while Pramod’s fails,
prompting him to check the value again before retrying.
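The pessimistic approach can be sketched with an ordinary write lock. This is a minimal in-process illustration, not a database implementation; the client names follow the slide's example, and the value being updated is a hypothetical phone number.

```python
import threading

# Sketch of pessimistic concurrency: a write lock ensures only one
# client at a time can change the value, so updates are serialized.
phone_number = "555-0100"
write_lock = threading.Lock()

def update_phone(client: str, new_value: str) -> None:
    global phone_number
    with write_lock:  # the second client blocks here until the lock is free
        print(f"{client} holds the lock, writing {new_value}")
        phone_number = new_value

t1 = threading.Thread(target=update_phone, args=("Martin", "555-0101"))
t2 = threading.Thread(target=update_phone, args=("Pramod", "555-0102"))
t1.start(); t2.start()
t1.join(); t2.join()
# Updates are serialized: the final value is whichever write ran last,
# but no write is ever lost mid-flight to interleaving.
```

The optimistic alternative skips the lock and instead checks, at write time, that the value has not changed since it was read — the conditional-update pattern discussed later under version stamps.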
Update Consistency (2)

• 6. Serialization of Updates: Both approaches depend on consistent serialization
of updates across systems.
• 7. Multiple Servers: With multiple servers, different update orders can lead to
discrepancies in data (e.g., different phone numbers).
• 8. Sequential Consistency: A common requirement in distributed systems
ensuring all nodes apply operations in the same order.
• 9. Handling Write-Write Conflicts: An alternative optimistic method saves both
updates and marks them as conflicting.
• 10. Version Control Analogy: This conflict resolution is similar to processes in
distributed version control systems.
Update Consistency (3)

11. Merging Updates: Conflicting updates may require user intervention to merge,
or automatic handling based on specific rules.

12. Tradeoffs in Concurrency: Pessimistic concurrency may be preferred to avoid
conflicts but can lead to reduced responsiveness.

13. Safety vs. Liveness: There is a fundamental tradeoff between avoiding errors
(safety) and maintaining quick responses (liveness).

14. Deadlocks: Pessimistic approaches can lead to deadlocks, which are challenging
to prevent and debug.

15. Replication Challenges: Increased replication leads to more write-write
conflicts unless measures are taken to ensure consistency.
Read Consistency (1)

1. Update Consistency vs. Read Consistency: A data store can maintain update
consistency, but readers may not always receive consistent data.
2. Logical Consistency:
- Ensures related data items (e.g., order line items and shipping charges) are
consistent.
- Transactions in relational databases prevent read-write conflicts.
3. NoSQL Transactions:
- Claims that NoSQL databases lack transaction support are misleading.
- Aggregate-oriented NoSQL databases allow atomic updates within single
aggregates, not across multiple aggregates.
4. Inconsistency Window:
- Time during which inconsistent reads can occur when updates affect multiple
aggregates.
- Example: Amazon’s SimpleDB has a short inconsistency window (usually <1 second).
A read-write conflict in logical consistency
Read Consistency (2)

5. Replication Consistency:
- Different replicas may return inconsistent data (e.g., hotel room booking).
- Eventually consistent: all nodes will update to the same value eventually.
6. Impact of Replication on Consistency: Replication can extend logical
inconsistency windows, especially if updates happen rapidly.
7. Configurable Consistency Levels: Applications can specify the desired
consistency level per request (weak or strong).
8. User Experience and Inconsistency: Inconsistencies can confuse users, especially
during simultaneous actions (e.g., booking hotel rooms).
9. Read-Your-Writes Consistency: Guarantees users will see their updates after
they write, often implemented via session consistency.
Replication Consistency Example
Read Consistency (3)

10. Session Consistency Techniques:
- Sticky Sessions: Ties user sessions to one node for consistency.
- Version Stamps: Ensures data interaction includes the latest version for
consistency.
11. Handling Write Operations:
- Writes can be sent to slaves which forward them to the master while
maintaining session consistency.
- Temporary switches to the master may be necessary during writes.
12. Application Design Considerations:
- Transactions should not remain open during user interactions to avoid conflicts.

Relaxing Consistency

1. Trade-offs in System Design: Consistency is valuable, but achieving it often
requires sacrificing other system characteristics, necessitating trade-offs.

2. Domain-Specific Tolerances: Different domains have varying tolerances for
inconsistency, which must be considered during system design.

3. Transactions and Isolation Levels: Transactions provide strong consistency
guarantees, but most applications relax isolation levels (e.g., using read-committed)
to enhance performance.

4. Forgoing Transactions for Performance: Many systems avoid transactions due to
their performance costs, as seen with MySQL’s early popularity and eBay’s
architecture choices.
5. Interacting with Remote Systems: In enterprise applications, updates often occur
outside of transaction boundaries due to interactions with remote systems that
cannot be included in transactions.
The CAP Theorem

1. The CAP theorem states that in distributed systems you can only achieve two
out of three properties: Consistency, Availability, and Partition tolerance, leading to
necessary trade-offs.
2. Definitions:
- Consistency: All nodes see the same data at the same time.
- Availability: If a node is reachable, it responds to requests.
- Partition Tolerance: The system continues to operate despite communication
breakdowns.
3. Single-Server vs. Cluster Systems: Single-server systems are naturally CA
(Consistency and Availability) but cannot tolerate partitions. In contrast, cluster
systems must often prioritize Partition tolerance, leading to compromises on
Consistency.
4. Practical Trade-offs: Systems may allow inconsistent writes to enhance
availability, such as in hotel bookings or shopping carts, where some level of
overbooking or merging data may be acceptable.
Partition Tolerance

5. BASE vs. ACID: NoSQL systems are often described as following BASE
(Basically Available, Soft state, Eventual consistency), but this is seen as a
spectrum rather than a strict alternative to the ACID properties of relational
databases. The focus should be on the trade-off between consistency and latency
rather than just availability.
Relaxing Durability

1. ACID Properties and Consistency: Consistency in databases involves serializing
requests into atomic, isolated work units, which is central to ACID properties.
2. Durability Trade-offs: While durability is crucial for data integrity, there are
scenarios where sacrificing some durability can enhance performance, such as
running databases primarily in memory and periodically flushing to disk.
3. Use Cases for Nondurable Writes: User-session state management is a prime
example, where losing session data is less critical than maintaining a responsive
user experience. Durability needs can often be specified on a call-by-call basis.
4. Telemetry Data Collection: For applications like telemetric data capture,
prioritizing speed over durability may be acceptable, accepting the risk of losing the
most recent updates during a server crash.
5. Replication Durability Challenges: Issues arise when a master node fails before
updates are replicated, leading to potential data loss and conflicts upon recovery.
To enhance durability, the master can wait for replicas to acknowledge updates,
though this may slow down processing and impact availability.
Quorums

1. Partial Trade-offs: Consistency and durability can be adjusted; involving more
nodes in a request increases the chance of achieving consistency.
2. Write Quorum: To ensure strong consistency, a majority of the replicas (more
than half) must acknowledge a write. This is expressed as W > N/2, where W is the
number of nodes confirming a write and N is the number of nodes the data is
replicated to (the replication factor).
3. Read Quorum: The number of nodes needed to confirm reads (R) depends on the
write quorum (W). For strong consistency, the relationship R + W > N must hold,
allowing detection of conflicts during reads.
4. Replication Factor: A replication factor of 3 is commonly sufficient for resilience,
allowing one node to fail while maintaining the ability to achieve quorums for
reads and writes.
5. Flexible Strategy: Operations can vary in their quorum requirements based on
consistency and availability needs. This flexibility enables choosing optimal
configurations based on specific use cases, demonstrating that the relationship
between consistency and availability is more nuanced than a simple trade-off.
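The quorum inequalities above can be checked mechanically. This is a minimal sketch of the two conditions from the slide, with `n` standing for the replication factor; the function names are illustrative.

```python
# Quorum conditions from the slide:
#   write quorum:  W > N/2   (a majority of replicas confirm each write)
#   strong reads:  R + W > N (read and write sets must overlap)

def is_write_quorum(w: int, n: int) -> bool:
    return w > n / 2

def is_strongly_consistent_read(r: int, w: int, n: int) -> bool:
    return r + w > n

N = 3  # replication factor of 3, the common starting point
print(is_write_quorum(2, N))                 # True: 2 of 3 is a majority
print(is_strongly_consistent_read(2, 2, N))  # True: 2 + 2 > 3
print(is_strongly_consistent_read(1, 2, N))  # False: a read could miss the write
```

With N = 3, the usual choices are W = 2, R = 2 for strong consistency, or W = 1 / R = 1 when availability and latency matter more than consistency.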
Next Class

• Module -2 : Chapter 3
Version Stamps

• NoSQL databases are often criticized for lack of transaction support.
• Transactions help programmers maintain consistency.
• NoSQL proponents argue that aggregate-oriented NoSQL databases support
atomic updates within an aggregate.
• Aggregates are designed to form a natural unit of update.
• Transactional needs should be considered when choosing a database.
• Transactions have limitations, even in transactional systems.
• Some updates require human intervention and can't be run within transactions.
• Long-running transactions are problematic.
• Version stamps can be used to cope with updates requiring human intervention.
• Version stamps are useful in other situations, especially in distributed systems.
• The single-server distribution model is becoming less common.
Business and System Transactions

Business Transactions and System Transactions
Challenges with Data Consistency
Optimistic Offline Lock
Version Stamps
Conditional Updates
Additional Uses of Version Stamps

Business Transactions and System Transactions

• Business transactions (e.g., browsing a product, filling in credit card info,
confirming an order) usually do not occur within a single system transaction
provided by the database.
• System transactions are typically begun
at the end of the user interaction to
avoid long lock periods.
Challenges with Data Consistency

• Calculations and decisions may be based on data that has changed during the
business transaction (e.g., an updated price list, a changed customer address).
• This requires techniques to handle
offline concurrency and ensure data
consistency.
Optimistic Offline Lock

• A form of conditional update where the client rereads information relied on by
the business transaction and checks whether it has changed since it was
originally read.

• Uses version stamps: a field that changes every time the underlying data in the
record changes.
Version Stamps

- Ensure records in the database contain a version stamp.
- When reading data, note the version stamp to check for changes during updates.
- Techniques for creating version stamps include:
• Counters: Increment on update; easy to compare recentness; require a single
master.
• GUIDs: Large random numbers; globally unique; can’t compare recentness.
• Content Hashes: Deterministic; globally unique; can’t compare recentness.
• Timestamps: Short; can compare recentness; require synchronized clocks;
potential duplicates.
• Composite Stamps: Combining multiple schemes (e.g., CouchDB uses a
combination of counter and content hash) to blend advantages.
Conditional Updates

• Use version stamps to perform conditional updates, ensuring updates are not
based on stale data.
• Similar mechanisms are used in HTTP with etags for resource updates.
• Compare-and-set (CAS) operations can also be used, comparing a version stamp
before setting the new value.
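A conditional update with a counter version stamp can be sketched as follows. The in-memory dictionary is a stand-in for a database record; the record name, field layout, and `conditional_update` helper are illustrative assumptions, not any particular store's API.

```python
# Sketch of a conditional update (compare-and-set) using a counter
# version stamp: the write succeeds only if the stamp is unchanged.
store = {"price": {"value": 100, "version": 7}}

def conditional_update(key: str, new_value, expected_version: int) -> bool:
    """Apply the update only if the version stamp has not changed."""
    record = store[key]
    if record["version"] != expected_version:
        return False          # someone else updated first; caller must reread
    record["value"] = new_value
    record["version"] += 1    # the stamp changes with the data
    return True

v = store["price"]["version"]                # read the value, note the stamp
print(conditional_update("price", 120, v))   # True: first writer wins
print(conditional_update("price", 130, v))   # False: stale stamp, must reread
```

This is the same shape as an HTTP update guarded by an etag, or a CAS instruction: compare the stamp, then set the value, as a single atomic step.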
Additional Uses of Version Stamps

• Helpful in avoiding update conflicts and providing session consistency.
• Useful in peer-to-peer replication scenarios to spot conflicts if multiple peers
update simultaneously.
Version Stamps on Multiple Nodes

Challenges:
• No single authoritative source for version stamps.
• Multiple nodes may provide different answers.
• Difficulty in determining the latest version.
• Potential for inconsistent updates.

Simple Approaches:
• Counters: Increment on update; works in master-slave scenarios.
• Timestamps: Difficult to maintain consistent time across nodes; prone to issues
with clock synchronization.
Advanced Approaches for Peer-to-Peer

 Version Histories:
 Requires each node to track version stamp history.
 Clients or server nodes must store and share histories.
 Can detect inconsistencies by checking if one history is an ancestor of another.
 Not commonly used in NoSQL databases.
 Vector Stamps:
 A set of counters, one for each node.
 Each node updates its own counter on internal updates.
 Nodes synchronize their vector stamps during communication.
 Allows for determining newer versions: all counters in the newer stamp are greater than or equal to
those in the older stamp.
 Detects write-write conflicts when both stamps have counters greater than the other.
 Missing values are treated as 0, allowing easy addition of new nodes.
 Helps spot inconsistencies but does not resolve them.
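The comparison rule for vector stamps described above can be sketched directly. Each stamp maps a node name to its counter; missing entries count as 0, which is what lets new nodes join without rewriting old stamps. The node names and the `compare` helper are illustrative.

```python
# Sketch of vector-stamp comparison between two stamps a and b.
def compare(a: dict, b: dict) -> str:
    """Return 'newer', 'older', 'equal', or 'conflict' for stamp a vs. b."""
    nodes = set(a) | set(b)
    a_ge = all(a.get(n, 0) >= b.get(n, 0) for n in nodes)
    b_ge = all(b.get(n, 0) >= a.get(n, 0) for n in nodes)
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "newer"      # every counter in a >= the one in b
    if b_ge:
        return "older"
    return "conflict"        # write-write conflict: each stamp is ahead somewhere

print(compare({"blue": 4, "green": 2}, {"blue": 3, "green": 2}))  # newer
print(compare({"blue": 1, "green": 2}, {"blue": 2, "green": 1}))  # conflict
print(compare({"blue": 1}, {"blue": 1, "green": 1}))              # older
```

Note that `compare` only detects the conflict; resolving it (merging, last-write-wins, or user intervention) is a separate policy decision, as the slide points out.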
Next Class

• Module 1 and Module -2
