Module-2 NOSQL
A) Version stamps are tools used to track changes in data and to detect whether multiple people or
systems are trying to modify the same data simultaneously (concurrency).
1. Counter:
● You can use a counter, always incrementing it when you update the resource.
Counters are useful since they make it easy to tell if one version is more recent
than another.
2. GUID (Globally Unique Identifier):
● Another approach is to create a GUID, a large random number that’s guaranteed
to be unique.
3. Hash:
● A third approach is to make a hash of the contents of the resource. With a big
enough hash key size, a content hash can be globally unique like a GUID and
can also be generated by anyone.
4. Timestamp:
● A fourth approach is to use the timestamp of the last update. Like counters, timestamps
are reasonably short and can be directly compared for recentness, yet have the
advantage of not needing a single master.
5. Composite Stamp:
● You can blend the advantages of these different version stamp schemes by using
more than one of them to create a composite stamp.
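The version stamp schemes listed above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the function names (counter_stamp, guid_stamp, hash_stamp, timestamp_stamp, composite_stamp) are hypothetical, and pairing a counter with a content hash is only one possible way to build a composite stamp.

```python
import hashlib
import time
import uuid


def counter_stamp(previous: int) -> int:
    """Counter: increment on every update, so a larger value means a newer version."""
    return previous + 1


def guid_stamp() -> str:
    """GUID: a large random identifier; unique, but says nothing about recentness."""
    return str(uuid.uuid4())


def hash_stamp(content: bytes) -> str:
    """Hash: derived from the content alone, so any node can compute the same stamp."""
    return hashlib.sha256(content).hexdigest()


def timestamp_stamp() -> float:
    """Timestamp: time of the last update (assumes reasonably synchronized clocks)."""
    return time.time()


def composite_stamp(previous_counter: int, content: bytes) -> tuple:
    """Composite: here a (counter, content hash) pair, chosen only for illustration."""
    return (counter_stamp(previous_counter), hash_stamp(content))
```

A reader that holds an old stamp can compare it with the current one: if the two differ, the resource has changed since it was read.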
2) What are distribution models? Briefly explain the two paths of data distribution.
A distribution model describes how a data store scales up or scales out. Depending on the
distribution model, we can get a data store that is able to handle larger quantities of data.
The two paths of data distribution are:
1. Sharding
2. Replication
1) Sharding: Often, a busy data store is busy because different people are accessing
different parts of the dataset. In these circumstances we can support horizontal
scalability by putting different parts of the data onto different servers—a technique that’s
called sharding.
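2) Replication: Replication copies the same data onto multiple servers, so that each piece
of data can be found in more than one place.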
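The two distribution paths can be contrasted with a small in-memory sketch. It assumes hash-based routing for sharding (one of several possible schemes) and treats replication as copying every write to every server. The server names and the classes ShardedStore and ReplicatedStore are hypothetical and exist only for this illustration.

```python
import hashlib

SERVERS = ["server-a", "server-b", "server-c"]  # hypothetical node names


def shard_for(key: str, servers=SERVERS) -> str:
    """Pick the one server responsible for a key by hashing the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]


class ShardedStore:
    """Sharding: each key lives on exactly one server."""

    def __init__(self, servers=SERVERS):
        self.servers = list(servers)
        self.data = {name: {} for name in self.servers}

    def put(self, key, value):
        self.data[shard_for(key, self.servers)][key] = value

    def get(self, key):
        return self.data[shard_for(key, self.servers)].get(key)


class ReplicatedStore:
    """Replication: every server holds a copy of every key."""

    def __init__(self, servers=SERVERS):
        self.data = {name: {} for name in servers}

    def put(self, key, value):
        for replica in self.data.values():
            replica[key] = value

    def get(self, key, server):
        return self.data[server].get(key)
```

With sharding, the load for different keys is spread over different servers; with replication, any server can answer a read for any key.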
A)
Update Consistency: This deals with problems that arise when multiple users or processes try
to update the same piece of data simultaneously.
● Write-write conflict: This occurs when two or more updates happen at the same time,
potentially overwriting each other's changes.
● Serialize: The process of ordering the updates so they happen one after another,
preventing direct conflicts.
● Lost update: A situation where one update is overwritten by another, resulting in the first
update being lost.
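● Pessimistic & optimistic approach: These are the two main strategies for handling
concurrent updates. A pessimistic approach prevents conflicts from occurring; an
optimistic approach lets conflicts occur, then detects them and takes action to resolve them.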
● Conditional update: A type of optimistic approach where an update is only applied if the
data hasn't changed since it was last read.
● Write locks: Locks used in pessimistic concurrency to prevent multiple processes from
writing to the same data simultaneously.
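A conditional update can be sketched as a compare-and-set on a version counter. This is a minimal single-process illustration, not how any particular database implements it: the Record class and VersionConflict exception are hypothetical names, and the internal lock only makes the check-and-write step atomic.

```python
import threading


class VersionConflict(Exception):
    """Raised when the record changed since the caller last read it."""


class Record:
    def __init__(self, value):
        self.value = value
        self.version = 0
        self._lock = threading.Lock()  # makes check-and-write atomic

    def read(self):
        """Return the value together with the version it was read at."""
        with self._lock:
            return self.value, self.version

    def conditional_update(self, new_value, expected_version):
        """Apply the update only if nobody else has written since the read."""
        with self._lock:
            if self.version != expected_version:
                raise VersionConflict(
                    f"expected version {expected_version}, found {self.version}"
                )
            self.value = new_value
            self.version += 1
            return self.version


# Two clients read the same version, then both try to write.
record = Record(100)
value_a, version_a = record.read()
value_b, version_b = record.read()

record.conditional_update(value_a + 10, version_a)  # first writer succeeds
try:
    record.conditional_update(value_b + 5, version_b)  # second writer is rejected
except VersionConflict:
    # Instead of silently overwriting (a lost update), re-read and retry.
    value_b, version_b = record.read()
    record.conditional_update(value_b + 5, version_b)
```

The second client's first attempt fails rather than overwriting the first client's change, which is how the optimistic approach avoids a lost update.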
Read Consistency: This focuses on ensuring that readers see consistent data, even when
updates are happening concurrently.
● Inconsistent read (or read-write conflict): This occurs when a reader sees data that is
in the middle of being updated, resulting in inconsistent or partially updated information.
● Logical consistency: Ensuring that related pieces of data are consistent with each
other. For example, an order total should match the sum of its items.
● Inconsistency window: The time period during which data might be inconsistent due to
concurrent updates.
● Replication consistency: In distributed systems with data replicated across multiple
servers, this ensures that all replicas have the same data.
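The inconsistency window for replicated data can be illustrated with a toy primary and replica. This sketch assumes updates are written to a primary first and copied to the replica later; the Primary and Replica classes and the propagate function are hypothetical names used only here.

```python
class Primary:
    """Accepts writes and remembers which updates the replica has not seen yet."""

    def __init__(self):
        self.data = {}
        self.pending = []  # updates not yet propagated

    def write(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))


class Replica:
    """Serves reads from its own copy, which may lag behind the primary."""

    def __init__(self):
        self.data = {}

    def apply(self, updates):
        for key, value in updates:
            self.data[key] = value

    def read(self, key):
        return self.data.get(key)


def propagate(primary, replica):
    """Copy all pending updates to the replica, closing the inconsistency window."""
    replica.apply(primary.pending)
    primary.pending.clear()


primary, replica = Primary(), Replica()
primary.write("room-42", "booked")

stale = replica.read("room-42")   # read inside the inconsistency window
propagate(primary, replica)
fresh = replica.read("room-42")   # read after the replicas agree
```

Between the write and the call to propagate, a reader on the replica sees older data than a reader on the primary; once propagation finishes, both copies hold the same value.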
5) CAP Theorem