NoSQL Lecture Notes Compilation
Lecture 11: NoSQL Databases and Document-Oriented Databases
Topics Covered:
Definition: NoSQL stands for "Non-SQL" or "Not Only SQL." It represents an alternative
approach to database design that moves away from the traditional relational model. NoSQL
databases are designed to handle large volumes of data and to scale horizontally across
distributed systems.
Traditional relational databases can struggle to meet the demands of modern applications such as
social media platforms, e-commerce websites, and real-time analytics systems. The key
motivations for adopting NoSQL databases include:
Big Data: NoSQL databases can handle massive amounts of data distributed across
different systems.
Scalability: NoSQL databases allow for horizontal scaling by adding more nodes to the
system.
Cost Efficiency: Many NoSQL databases are open-source and can be deployed on
commodity hardware.
Flexibility: NoSQL databases can handle data with varying structures, which is essential
for modern applications.
Availability: NoSQL databases are designed to be highly available, ensuring continuous
operation even if some nodes go down.
3. Distributed Databases
Interconnected Nodes: The databases are stored on different nodes (servers) that are
interconnected by a network.
Logical Interrelation: The data across these nodes must be logically related.
Heterogeneity: The nodes may differ in terms of hardware, software, and data structures.
4.1 Scalability
NoSQL databases can scale horizontally by adding more nodes to the system without
interrupting operations. This allows them to handle growing datasets efficiently.
4.2 High Availability
NoSQL databases provide high availability by replicating data across multiple nodes. If one node
fails, other nodes can take over, ensuring uninterrupted service.
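The failover behavior described above can be illustrated with a minimal sketch in plain JavaScript. The ReplicatedStore class, the node names, and the markDown helper are hypothetical and exist only for this illustration. Real systems replicate over a network and must deal with consistency between replicas, which this sketch ignores.

```javascript
// Hypothetical in-memory model of replication: every write is copied to all
// live nodes, and a read is served by the first node that is still up.
class ReplicatedStore {
  constructor(nodeNames) {
    // Each node holds its own full copy of the data.
    this.nodes = nodeNames.map((name) => ({ name, up: true, data: new Map() }));
  }

  // Copy the value to every node that is currently up.
  write(key, value) {
    for (const node of this.nodes) {
      if (node.up) node.data.set(key, value);
    }
  }

  // Serve the read from the first live node that holds the key.
  read(key) {
    for (const node of this.nodes) {
      if (node.up && node.data.has(key)) {
        return { value: node.data.get(key), servedBy: node.name };
      }
    }
    throw new Error(`no live replica holds key ${key}`);
  }

  // Simulate a node failure.
  markDown(name) {
    const node = this.nodes.find((n) => n.name === name);
    if (node) node.up = false;
  }
}

const store = new ReplicatedStore(["node-a", "node-b", "node-c"]);
store.write("P001", { ProjectName: "Database Migration" });
store.markDown("node-a");
const result = store.read("P001");
console.log(result.servedBy); // node-b
```

The read still succeeds after node-a fails because the write had already been copied to the other two nodes.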
4.3 Sharding
Sharding involves splitting a large database into smaller, more manageable parts called shards.
Each shard is stored on a separate node, distributing the load and improving performance.
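One way to picture the split is range-based sharding, where each shard owns a contiguous range of key values. This is only one of several strategies (hash-based sharding is another), and the notes do not say which one is meant. In the sketch below, the boundary values, the function names, and every project except P001 are made up for illustration.

```javascript
// Range-based sharding: each shard owns a contiguous range of ProjectID values.
// The boundaries are arbitrary values chosen for this illustration.
// shard 0: IDs below "P100", shard 1: IDs below "P200", shard 2: everything else.
const shardUpperBounds = ["P100", "P200"];

// Returns the index of the shard responsible for a given ProjectID.
// Fixed-width IDs compare correctly as plain strings.
function shardFor(projectId) {
  const index = shardUpperBounds.findIndex((bound) => projectId < bound);
  return index === -1 ? shardUpperBounds.length : index;
}

// Splits a list of records into one array per shard.
function splitIntoShards(records) {
  const shards = Array.from({ length: shardUpperBounds.length + 1 }, () => []);
  for (const record of records) {
    shards[shardFor(record.ProjectID)].push(record);
  }
  return shards;
}

const records = [
  { ProjectID: "P001", ProjectName: "Database Migration" },
  { ProjectID: "P120", ProjectName: "Sample Project A" }, // made-up sample data
  { ProjectID: "P150", ProjectName: "Sample Project B" }, // made-up sample data
  { ProjectID: "P250", ProjectName: "Sample Project C" }, // made-up sample data
];

const shards = splitIntoShards(records);
const idsPerShard = shards.map((shard) => shard.map((r) => r.ProjectID));
console.log(idsPerShard); // [ [ 'P001' ], [ 'P120', 'P150' ], [ 'P250' ] ]
```

In a real deployment each of the three arrays would live on a different node, so no single node has to store or serve the whole dataset.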
4.4 High-performance Data Access
NoSQL databases use partition keys to quickly locate and retrieve data from distributed nodes,
optimizing read and write operations.
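A minimal sketch of partition-key routing, assuming a hash of the key picks the node. The PartitionedStore class, the hashKey function, and the nodesTouched counter are hypothetical names for this illustration, and the hash is a simple teaching choice rather than what any particular database uses.

```javascript
// Simple deterministic string hash (illustrative only, not a production hash).
function hashKey(key) {
  let h = 0;
  for (const ch of key) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0; // keep the value an unsigned 32-bit integer
  }
  return h;
}

// Hypothetical store: each node is a Map, and the partition key alone decides
// which single node holds a given record.
class PartitionedStore {
  constructor(nodeCount) {
    this.nodes = Array.from({ length: nodeCount }, () => new Map());
    this.nodesTouched = 0; // counts node lookups performed by get()
  }

  nodeIndexFor(partitionKey) {
    return hashKey(partitionKey) % this.nodes.length;
  }

  put(partitionKey, record) {
    this.nodes[this.nodeIndexFor(partitionKey)].set(partitionKey, record);
  }

  // A read goes straight to the one node the key maps to; no other node is scanned.
  get(partitionKey) {
    this.nodesTouched += 1;
    return this.nodes[this.nodeIndexFor(partitionKey)].get(partitionKey);
  }
}

const store = new PartitionedStore(4);
store.put("P001", { ProjectName: "Database Migration" });
store.put("P002", { ProjectName: "Sample Project" }); // made-up sample data
const found = store.get("P001");
console.log(found.ProjectName, "| nodes touched:", store.nodesTouched);
// Database Migration | nodes touched: 1
```

Because the same key always hashes to the same node, both the write and the later read cost one node lookup regardless of how many nodes exist.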
5. Document-Oriented Databases
Key Concepts:
Data Formats:
JSON is an open standard format used for data exchange. It is language-independent and widely
used in web applications.
JSON Syntax:
{
  "ProjectID": "P001",
  "ProjectName": "Database Migration",
  "Workers": [
    {"WorkerID": "W001", "Name": "John Doe"},
    {"WorkerID": "W002", "Name": "Jane Smith"}
  ]
}
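As a quick check that the document above is valid JSON, it can be parsed with the standard JSON.parse function and read back with ordinary property access. The variable names here are chosen only for this sketch.

```javascript
// The same document as above, held as a JSON text string.
const text = `{
  "ProjectID": "P001",
  "ProjectName": "Database Migration",
  "Workers": [
    {"WorkerID": "W001", "Name": "John Doe"},
    {"WorkerID": "W002", "Name": "Jane Smith"}
  ]
}`;

// JSON.parse turns the text into a regular JavaScript object.
const project = JSON.parse(text);

console.log(project.ProjectName);     // Database Migration
console.log(project.Workers.length);  // 2
console.log(project.Workers[1].Name); // Jane Smith

// JSON.stringify converts the object back into JSON text for exchange.
const roundTrip = JSON.stringify(project);
```

Objects use curly braces with quoted keys, arrays use square brackets, and values nest freely, which is what lets one document hold a project together with its workers.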
8.1 Create:
8.2 Read:
8.3 Update:
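// Set the "name" field of the document whose _id is "P001".
// updateOne replaces the deprecated update() helper and changes at most one document.
db.collection_name.updateOne(
  { "_id": "P001" },
  { $set: { "name": "Project Beta" } }
);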
8.4 Delete:
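Taken together, the four operations in this section (create, read, update, delete) can be sketched without a database server. The InMemoryCollection class below is a hypothetical stand-in written for this illustration. Its method names (insertOne, find, updateOne, deleteOne) mirror MongoDB shell methods, but it is not the MongoDB driver, it only supports equality filters and $set, and its return values are simplified to plain counts. "Project Alpha" is made-up sample data.

```javascript
// Hypothetical in-memory stand-in for a document collection (NOT a real driver).
class InMemoryCollection {
  constructor() {
    this.docs = [];
  }

  // True when every field in the filter equals the same field in the document.
  static matches(doc, filter) {
    return Object.entries(filter).every(([field, value]) => doc[field] === value);
  }

  // Create: store a copy of the document, rejecting duplicate _id values.
  insertOne(doc) {
    if (this.docs.some((d) => d._id === doc._id)) {
      throw new Error(`duplicate _id: ${doc._id}`);
    }
    this.docs.push({ ...doc });
  }

  // Read: return all documents matching the filter (all documents if no filter).
  find(filter = {}) {
    return this.docs.filter((d) => InMemoryCollection.matches(d, filter));
  }

  // Update: apply the $set fields to the first matching document.
  updateOne(filter, update) {
    const doc = this.docs.find((d) => InMemoryCollection.matches(d, filter));
    if (!doc) return 0;
    Object.assign(doc, update.$set);
    return 1;
  }

  // Delete: remove the first matching document.
  deleteOne(filter) {
    const index = this.docs.findIndex((d) => InMemoryCollection.matches(d, filter));
    if (index === -1) return 0;
    this.docs.splice(index, 1);
    return 1;
  }
}

const projects = new InMemoryCollection();
projects.insertOne({ _id: "P001", name: "Project Alpha" });                // create
const beforeUpdate = projects.find({ _id: "P001" })[0].name;               // read
projects.updateOne({ _id: "P001" }, { $set: { name: "Project Beta" } });   // update
const afterUpdate = projects.find({ _id: "P001" })[0].name;
projects.deleteOne({ _id: "P001" });                                       // delete
const remaining = projects.find().length;
console.log(beforeUpdate, afterUpdate, remaining); // Project Alpha Project Beta 0
```

Note that $set changes only the named field: the _id of the document is untouched by the update.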
In the embedded design, each project document contains an array of workers within it. This is a
denormalized approach that reduces the need for joins.
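A minimal sketch of the embedding described above, in plain JavaScript objects. Project P002, the new worker name, and the renameWorker helper are made up for illustration. The sketch shows both sides of denormalization: a single read returns the project with its workers, but a worker who appears in several projects is stored several times.

```javascript
// Each project document embeds its workers directly.
const projects = [
  {
    ProjectID: "P001",
    ProjectName: "Database Migration",
    Workers: [
      { WorkerID: "W001", Name: "John Doe" },
      { WorkerID: "W002", Name: "Jane Smith" },
    ],
  },
  {
    ProjectID: "P002", // made-up second project
    ProjectName: "Sample Project",
    Workers: [{ WorkerID: "W002", Name: "Jane Smith" }],
  },
];

// Reading: one document already contains everything, so no join is needed.
const p001WorkerNames = projects[0].Workers.map((w) => w.Name);
console.log(p001WorkerNames); // [ 'John Doe', 'Jane Smith' ]

// Trade-off: W002 is copied into both documents, so a change to that worker
// has to be applied to every embedded copy.
function renameWorker(allProjects, workerId, newName) {
  let copiesUpdated = 0;
  for (const project of allProjects) {
    for (const worker of project.Workers) {
      if (worker.WorkerID === workerId) {
        worker.Name = newName;
        copiesUpdated += 1;
      }
    }
  }
  return copiesUpdated;
}

const copies = renameWorker(projects, "W002", "Jane S. Smith");
console.log(copies); // 2
```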
In the referencing design, the project document contains an array of worker IDs, with separate
documents for each worker in another collection. This approach is more normalized.
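The array-of-IDs layout described above can be sketched with two plain arrays standing in for the two collections. The field name WorkerIDs and the workersForProject helper are assumptions made for this illustration. Since the project holds only IDs, the application has to resolve them in a second step.

```javascript
// "projects" collection: each project stores only the IDs of its workers.
const projects = [
  { ProjectID: "P001", ProjectName: "Database Migration", WorkerIDs: ["W001", "W002"] },
];

// "workers" collection: each worker is stored exactly once.
const workers = [
  { WorkerID: "W001", Name: "John Doe" },
  { WorkerID: "W002", Name: "Jane Smith" },
];

// Index the workers by ID so each reference can be resolved quickly.
const workersById = new Map(workers.map((w) => [w.WorkerID, w]));

// Resolve a project's worker IDs into full worker documents (an application-side join).
function workersForProject(project) {
  return project.WorkerIDs.map((id) => workersById.get(id));
}

const names = workersForProject(projects[0]).map((w) => w.Name);
console.log(names); // [ 'John Doe', 'Jane Smith' ]
```

The extra lookup is the cost of normalization. The benefit is that changing a worker's name means editing one worker document, no matter how many projects reference it.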
In the fully separated design, projects and workers are stored in separate collections, with
references to link them. This is similar to a many-to-many relationship in relational databases.
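The notes do not say where the linking references live. One possible reading, sketched below, keeps them in a third collection of (project, worker) pairs, much like a junction table in a relational schema. The assignments collection, project P002, and both helper functions are assumptions made for this illustration.

```javascript
// Separate collections for projects and workers.
const projects = [
  { ProjectID: "P001", ProjectName: "Database Migration" },
  { ProjectID: "P002", ProjectName: "Sample Project" }, // made-up sample data
];

const workers = [
  { WorkerID: "W001", Name: "John Doe" },
  { WorkerID: "W002", Name: "Jane Smith" },
];

// Hypothetical linking collection: one entry per (project, worker) pair.
const assignments = [
  { ProjectID: "P001", WorkerID: "W001" },
  { ProjectID: "P001", WorkerID: "W002" },
  { ProjectID: "P002", WorkerID: "W002" },
];

// All workers assigned to a given project.
function workersOnProject(projectId) {
  const ids = assignments
    .filter((a) => a.ProjectID === projectId)
    .map((a) => a.WorkerID);
  return workers.filter((w) => ids.includes(w.WorkerID));
}

// All projects a given worker is assigned to.
function projectsOfWorker(workerId) {
  const ids = assignments
    .filter((a) => a.WorkerID === workerId)
    .map((a) => a.ProjectID);
  return projects.filter((p) => ids.includes(p.ProjectID));
}

const onP001 = workersOnProject("P001").map((w) => w.Name);
const ofW002 = projectsOfWorker("W002").map((p) => p.ProjectName);
console.log(onP001); // [ 'John Doe', 'Jane Smith' ]
console.log(ofW002); // [ 'Database Migration', 'Sample Project' ]
```

The relationship can be followed in both directions, which is what makes it many-to-many: a project has many workers and a worker can belong to many projects.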