NoSQL Lecture Notes Compilation (Detailed Version)
Lecture 11: NoSQL Databases and Document-oriented Databases
Topics Covered:
The NoSQL Paradigm: Definition, Motivations, and Need for NoSQL
Distributed Databases
Features of NoSQL Databases
Document-oriented Databases
JSON and MongoDB
CRUD Operations in MongoDB
1. The NoSQL Database Paradigm
Definition: NoSQL stands for "Non-SQL" or "Not Only SQL." It represents an alternative
approach to database design that moves away from the traditional relational model. NoSQL
databases are designed to handle large volumes of data and to scale horizontally across
distributed systems.
Key Features of NoSQL Databases:
Non-relational: Data is stored in a format other than traditional relational tables.
Distributed: Data is replicated and distributed across multiple servers.
Scalable: The system can handle growing amounts of data and users by adding more
servers (nodes).
Flexible: No fixed schema is required, making it suitable for semi-structured or
unstructured data.
2. Motivations for NoSQL Databases
Traditional relational databases can struggle to meet the demands of modern applications such as
social media platforms, e-commerce websites, and real-time analytics systems. The key
motivations for adopting NoSQL databases include:
Big Data: NoSQL databases can handle massive amounts of data distributed across
different systems.
Scalability: NoSQL databases allow for horizontal scaling by adding more nodes to the
system.
Cost Efficiency: Many NoSQL databases are open-source and can be deployed on
commodity hardware.
Flexibility: NoSQL databases can handle data with varying structures, which is essential
for modern applications.
Availability: NoSQL databases are designed to be highly available, ensuring continuous
operation even if some nodes go down.
3. Distributed Databases
A distributed database is a collection of multiple logically interrelated databases distributed
across a computer network. In a distributed database system:
Interconnected Nodes: The databases are stored on different nodes (servers) that are
interconnected by a network.
Logical Interrelation: The data across these nodes must be logically related.
Heterogeneity: The nodes may differ in terms of hardware, software, and data structures.
Advantages of Distributed Databases:
Improved availability and fault tolerance.
Faster query processing by distributing the load.
Scalability to handle large datasets.
4. Features of NoSQL DBMS
4.1 Scalability
NoSQL databases can scale horizontally by adding more nodes to the system without
interrupting operations. This allows them to handle growing datasets efficiently.
4.2 Availability and Replication
NoSQL databases provide high availability by replicating data across multiple nodes. If one node
fails, other nodes can take over, ensuring uninterrupted service.
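The failover behaviour described above can be sketched in plain JavaScript. This is a toy in-memory model under simplifying assumptions (synchronous copies, no conflict handling), not the replication protocol of any real database; the names ReplicaSet, write, read, and fail are hypothetical and chosen only for illustration.

```javascript
// Toy model of replication: every write is copied to all live nodes,
// and a read is served by the first node that is still up.
class ReplicaSet {
  constructor(nodeNames) {
    // Each node keeps its own copy of the data in a Map.
    this.nodes = nodeNames.map((name) => ({ name, up: true, data: new Map() }));
  }

  // Replicate a write to every node that is currently up.
  write(key, value) {
    for (const node of this.nodes) {
      if (node.up) node.data.set(key, value);
    }
  }

  // Mark a node as failed (hypothetical helper for this demo).
  fail(name) {
    const node = this.nodes.find((n) => n.name === name);
    if (node) node.up = false;
  }

  // Serve the read from the first live node; throw if none is left.
  read(key) {
    const node = this.nodes.find((n) => n.up);
    if (!node) throw new Error("no replica available");
    return { servedBy: node.name, value: node.data.get(key) };
  }
}

const rs = new ReplicaSet(["node-a", "node-b", "node-c"]);
rs.write("P001", { name: "Project Alpha" });
rs.fail("node-a"); // simulate a crash of the first node
const result = rs.read("P001");
console.log(result.servedBy, result.value.name);
```

Because the write reached all three nodes before node-a failed, the read is still answered, this time by node-b.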
4.3 Sharding
Sharding involves splitting a large database into smaller, more manageable parts called shards.
Each shard is stored on a separate node, distributing the load and improving performance.
4.4 High-performance Data Access
Many NoSQL databases use a partition key to determine which node holds a given record, so a
lookup by key can be routed directly to that node instead of being sent to every node. This
speeds up both read and write operations.
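A minimal sketch of how sharding and partition keys fit together, assuming hash-based partitioning over a fixed number of shards. The hash function, the shard count, and the helper names (hashKey, shardFor, put, get) are illustrative assumptions; real systems use more robust hashing and can rebalance shards.

```javascript
// Toy hash-based sharding: the partition key decides which shard holds a record.
const SHARD_COUNT = 3; // assumed number of shards for this sketch

// Simple deterministic string hash (illustrative only, not production-grade).
function hashKey(key) {
  let h = 0;
  for (const ch of key) {
    h = (h * 31 + ch.charCodeAt(0)) >>> 0; // keep it an unsigned 32-bit integer
  }
  return h;
}

// Map a partition key to a shard index in the range [0, SHARD_COUNT).
function shardFor(key) {
  return hashKey(key) % SHARD_COUNT;
}

// One Map per shard stands in for one node's storage.
const shards = Array.from({ length: SHARD_COUNT }, () => new Map());

function put(key, doc) {
  shards[shardFor(key)].set(key, doc);
}

function get(key) {
  // Only the shard that owns the key is consulted, not all of them.
  return shards[shardFor(key)].get(key);
}

put("P001", { name: "Project Alpha" });
put("P002", { name: "Project Beta" });
console.log(get("P001").name, "is stored on shard", shardFor("P001"));
```

The same key always hashes to the same shard, which is what lets a read go straight to the node that owns the data.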
5. Document-Oriented Databases
Document-oriented databases store data as collections of documents. Each document is a
self-contained unit that includes both the data and its structure. This approach offers
flexibility in handling different types of data.
Key Concepts:
Documents: Self-describing units that contain data and metadata.
Collections: Groups of similar documents.
Name-Value Pairs: The basic structure of a document, where each attribute (name) is
associated with a value.
Data Formats:
JSON (JavaScript Object Notation): A popular format for representing documents.
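BSON (Binary JSON): A binary-encoded serialization of JSON-like documents used by MongoDB
for storage and faster processing. It also adds data types that JSON lacks, such as dates
and binary data.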
6. JSON (JavaScript Object Notation)
JSON is an open standard format used for data exchange. It is language-independent and widely
used in web applications.
JSON Syntax:
Documents consist of name-value pairs.
Names are strings enclosed in double quotes.
Values can be numbers, strings, booleans (true or false), arrays, objects, or null.
Example JSON Document:
{
"ProjectID": "P001",
"ProjectName": "Database Migration",
"Workers": [
{"WorkerID": "W001", "Name": "John Doe"},
{"WorkerID": "W002", "Name": "Jane Smith"}
]
}
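As a brief illustration of working with such a document, the sketch below parses the example above with the standard JSON.parse function and reads the nested array. The variable names (text, project, workerNames, roundTrip) are arbitrary.

```javascript
// The example document above, held as a JSON string.
const text = `{
  "ProjectID": "P001",
  "ProjectName": "Database Migration",
  "Workers": [
    {"WorkerID": "W001", "Name": "John Doe"},
    {"WorkerID": "W002", "Name": "Jane Smith"}
  ]
}`;

// Parse the string into an ordinary JavaScript object.
const project = JSON.parse(text);

// Name-value pairs become properties; the array of objects stays nested.
const workerNames = project.Workers.map((w) => w.Name);
console.log(project.ProjectName);
console.log(workerNames.join(", "));

// Serialize the object back to JSON text for storage or exchange.
const roundTrip = JSON.stringify(project);
```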
7. MongoDB: A Document-Based DBMS
MongoDB is a cross-platform, document-oriented database management system. It uses BSON
to store data and offers flexibility in handling varying data structures.
Key Features of MongoDB:
Schema-less: Fields can vary between documents.
Distributed: Data is replicated across multiple servers.
JSON-like Documents: Data is stored in collections of BSON documents.
Querying: MongoDB allows querying nested documents by specific keys.
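Querying by nested keys is usually written with dot notation, for example a filter on "Workers.Name". The sketch below imitates that idea in plain JavaScript for equality tests only; valuesAt and matches are hypothetical helpers, and MongoDB's actual matching rules cover many more cases than this.

```javascript
// Collect every value reachable by a dotted path, descending into arrays.
function valuesAt(doc, path) {
  let current = [doc];
  for (const key of path.split(".")) {
    const next = [];
    for (const item of current) {
      // An array is searched element by element.
      const candidates = Array.isArray(item) ? item : [item];
      for (const c of candidates) {
        if (c !== null && typeof c === "object" && key in c) next.push(c[key]);
      }
    }
    current = next;
  }
  return current;
}

// A document matches when any value found at the path equals the expected one.
function matches(doc, path, expected) {
  return valuesAt(doc, path).some((v) => v === expected);
}

const project = {
  ProjectID: "P001",
  ProjectName: "Database Migration",
  Workers: [
    { WorkerID: "W001", Name: "John Doe" },
    { WorkerID: "W002", Name: "Jane Smith" },
  ],
};

// Comparable shell query, assuming a collection named projects:
//   db.projects.find({ "Workers.Name": "Jane Smith" })
console.log(matches(project, "Workers.Name", "Jane Smith"));
```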
8. CRUD Operations in MongoDB
MongoDB supports basic CRUD operations:
8.1 Create:
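The insertOne operation adds a new document to a collection, and insertMany adds several at
once. The older insert method is deprecated in current versions of the MongoDB shell.
db.collection_name.insertOne({ "_id": "P001", "name": "Project Alpha" });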
8.2 Read:
The find operation retrieves documents from a collection based on a condition.
db.collection_name.find({ "name": "Project Alpha" });
8.3 Update:
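The updateOne operation modifies the first document that matches a condition, and updateMany
modifies every match. The older update method is deprecated in current versions of the
MongoDB shell.
db.collection_name.updateOne(
{ "_id": "P001" },
{ $set: { "name": "Project Beta" } }
);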
8.4 Delete:
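The deleteOne operation deletes the first document that matches a condition, and deleteMany
deletes every match. The older remove method is deprecated in current versions of the
MongoDB shell.
db.collection_name.deleteOne({ "_id": "P001" });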
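The four operations can be tried without a running server using a small in-memory stand-in. This is only a sketch of the semantics: ToyCollection is a hypothetical class, its method names follow the insertOne, find, updateOne, and deleteOne shell methods, it assumes every document supplies its own _id, and it supports only equality filters on top-level fields and the $set operator.

```javascript
// In-memory stand-in for a collection (illustrative, not MongoDB itself).
class ToyCollection {
  constructor() {
    this.docs = [];
  }

  // True when every field in the filter equals the same field in the document.
  static matches(doc, filter) {
    return Object.entries(filter).every(([k, v]) => doc[k] === v);
  }

  // Create: reject a duplicate _id, otherwise store a copy of the document.
  insertOne(doc) {
    if (this.docs.some((d) => d._id === doc._id)) {
      throw new Error(`duplicate _id: ${doc._id}`);
    }
    this.docs.push({ ...doc });
    return { insertedId: doc._id };
  }

  // Read: return copies of all documents that match the filter.
  find(filter = {}) {
    return this.docs
      .filter((d) => ToyCollection.matches(d, filter))
      .map((d) => ({ ...d }));
  }

  // Update: apply a { $set: {...} } change to the first matching document.
  updateOne(filter, update) {
    const doc = this.docs.find((d) => ToyCollection.matches(d, filter));
    if (!doc) return { matchedCount: 0 };
    Object.assign(doc, update.$set);
    return { matchedCount: 1 };
  }

  // Delete: remove the first matching document.
  deleteOne(filter) {
    const i = this.docs.findIndex((d) => ToyCollection.matches(d, filter));
    if (i === -1) return { deletedCount: 0 };
    this.docs.splice(i, 1);
    return { deletedCount: 1 };
  }
}

const projects = new ToyCollection();
projects.insertOne({ _id: "P001", name: "Project Alpha" });
const found = projects.find({ name: "Project Alpha" });
projects.updateOne({ _id: "P001" }, { $set: { name: "Project Beta" } });
const renamed = projects.find({ _id: "P001" });
const removed = projects.deleteOne({ _id: "P001" });
console.log(found.length, renamed[0].name, removed.deletedCount, projects.find().length);
```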
9. MongoDB Design Examples
Design 1: Project Document with Embedded Workers
In this design, each project document contains an array of workers within it. This is a
denormalized approach that reduces the need for joins.
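A possible shape for such a document, reusing the field names from the JSON example earlier in these notes (the _id value and the variable names are assumptions):

```javascript
// Design 1 sketch: workers are embedded inside the project document.
const projectEmbedded = {
  _id: "P001",
  ProjectName: "Database Migration",
  Workers: [
    { WorkerID: "W001", Name: "John Doe" },
    { WorkerID: "W002", Name: "Jane Smith" },
  ],
};

// A single document read yields the project and all of its workers,
// so no second lookup (join) is needed.
const embeddedNames = projectEmbedded.Workers.map((w) => w.Name);
console.log(embeddedNames.join(", "));
```

The trade-off is duplication: a worker assigned to several projects would be copied into each project document.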
Design 2: Project Document with Embedded Worker IDs
Here, the project document contains an array of worker IDs, with separate documents for each
worker in another collection. This approach is more normalized.
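A sketch of this layout using plain arrays in place of collections. The field name WorkerIDs and the lookup helper are assumptions made for illustration; in this design the application (or a database-side lookup) must resolve the IDs in a second step.

```javascript
// Design 2 sketch: the project stores only worker IDs.
const project = {
  _id: "P001",
  ProjectName: "Database Migration",
  WorkerIDs: ["W001", "W002"],
};

// Worker details live in their own collection, one document per worker.
const workers = [
  { _id: "W001", Name: "John Doe" },
  { _id: "W002", Name: "Jane Smith" },
  { _id: "W003", Name: "Sam Lee" }, // hypothetical worker not on this project
];

// Index the workers by _id, then resolve the project's references.
const workersById = new Map(workers.map((w) => [w._id, w]));
const resolved = project.WorkerIDs.map((id) => workersById.get(id));
console.log(resolved.map((w) => w.Name).join(", "));
```

Each worker is stored once, at the cost of an extra lookup whenever a project's workers are needed.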
Design 3: Normalized Design
In this approach, projects and workers are stored in separate collections, with references to link
them. This is similar to a many-to-many relationship in relational databases.
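One possible reading of this design, sketched with plain arrays: the references are kept in a hypothetical linking collection named assignments, analogous to a join table in a relational schema. The second project, the assignments collection, and the helper functions are assumptions, not part of the original notes.

```javascript
// Design 3 sketch: projects and workers live in separate collections.
const projects = [
  { _id: "P001", ProjectName: "Database Migration" },
  { _id: "P002", ProjectName: "Reporting Dashboard" }, // hypothetical second project
];
const workers = [
  { _id: "W001", Name: "John Doe" },
  { _id: "W002", Name: "Jane Smith" },
];

// Hypothetical linking collection: one document per (project, worker) pair.
const assignments = [
  { ProjectID: "P001", WorkerID: "W001" },
  { ProjectID: "P001", WorkerID: "W002" },
  { ProjectID: "P002", WorkerID: "W002" },
];

// Names of all workers assigned to a project.
function workersOn(projectId) {
  const ids = assignments
    .filter((a) => a.ProjectID === projectId)
    .map((a) => a.WorkerID);
  return workers.filter((w) => ids.includes(w._id)).map((w) => w.Name);
}

// Names of all projects a worker is assigned to.
function projectsOf(workerId) {
  const ids = assignments
    .filter((a) => a.WorkerID === workerId)
    .map((a) => a.ProjectID);
  return projects.filter((p) => ids.includes(p._id)).map((p) => p.ProjectName);
}

console.log(workersOn("P001").join(", "));
console.log(projectsOf("W002").join(", "));
```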
10. References and Essential Readings
1. Elmasri, R. & Navathe, S. (2017). Fundamentals of Database Systems. 7th Edition.
Pearson Education.
2. IBM Cloud Education (2019). NoSQL Databases. Available at:
https://www.ibm.com/cloud/learn/nosql-databases
3. MongoDB (2021). MongoDB Manual. Available at:
https://docs.mongodb.com/manual/introduction/
4. Sullivan, D. (2015). NoSQL for Mere Mortals. Addison-Wesley Professional.