Nosql Mod4

Document databases store and retrieve documents in formats like XML and JSON, allowing for schema flexibility and dynamic data representation. Key features include self-describing hierarchical structures, handling missing data, and support for embedding child documents. Popular document databases include MongoDB, CouchDB, and RavenDB, with MongoDB serving as a representative example for its features, consistency, transactions, and scaling capabilities.

Uploaded by

Prerana S A

MODULE 4

DOCUMENT DATABASES

1. Documents:

o The primary concept in document databases.

o Stores and retrieves documents in formats like XML, JSON, BSON, etc.

o Documents are self-describing, hierarchical tree structures consisting of maps, collections, and scalar values.

o Documents are similar to each other but not required to be identical.

2. Storage:

o Documents are stored in the value part of a key-value store.

o Document databases can be viewed as key-value stores where the value (the document) is examinable.

3. Terminology Comparison (Oracle vs MongoDB):

o _id in MongoDB:

▪ A special field found in all documents.

▪ Similar to ROWID in Oracle.

▪ _id can be user-assigned, as long as it remains unique.

o ROWID in Oracle:

▪ Serves a similar function as MongoDB’s _id field.

9.1 WHAT IS A DOCUMENT DATABASE?

A document can be considered the counterpart of a row in a traditional RDBMS, except that documents in the same collection do not all need the same attributes.

1. Key Features:

o Schema Flexibility:

▪ Documents in a collection can have different attribute names and structures.

▪ There is no fixed schema as in a traditional RDBMS, where every row in a table must follow the same schema.

▪ Example: One document has an addresses field, while another has a likes field.

o Data Representation:

▪ Attributes can vary between documents, e.g., some documents may have a likes field, while others may not.

▪ Embedding:

▪ Child documents (e.g., addresses) can be embedded inside the main document for easier access and better performance.

o Handling Missing Data:

▪ A missing attribute is simply absent from the document: it is assumed to be not set or not relevant. In an RDBMS, by contrast, every column is defined and columns without data are stored as null or empty.

o Dynamic Schema:

▪ New attributes can be added to a document without defining them in advance or modifying existing documents.

2. Popular Document Databases:

MongoDB, CouchDB, Terrastore, OrientDB, RavenDB, Lotus Notes (uses document storage)

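
The key features above can be made concrete with a small sketch. The two documents below are hypothetical (their field names and values are invented for illustration, not taken from the module), and the helper functions `hasAttribute` and `attributeNames` are assumptions of this sketch rather than any database API.

```javascript
// Two hypothetical documents from the same collection; all values are invented.
const customers = [
  {
    _id: 1,
    firstname: "Asha",
    likes: ["Cycling", "Chess"],
    lastcity: "Pune",
  },
  {
    _id: 2,
    firstname: "Ravi",
    // Child documents embedded directly inside the parent document.
    addresses: [
      { city: "Mysuru", type: "home" },
      { city: "Bengaluru", type: "work" },
    ],
    lastcity: "Bengaluru",
  },
];

// A missing attribute is simply absent; there is no null placeholder.
function hasAttribute(doc, name) {
  return Object.prototype.hasOwnProperty.call(doc, name);
}

// Collect the attribute names actually used across the collection.
function attributeNames(docs) {
  const names = new Set();
  for (const doc of docs) {
    Object.keys(doc).forEach((key) => names.add(key));
  }
  return [...names].sort();
}

// Dynamic schema: a new attribute touches only the document that needs it.
customers[0].nickname = "Ash";

console.log(hasAttribute(customers[0], "addresses"));
console.log(attributeNames(customers));
```

Neither document had to declare its shape in advance, and adding `nickname` to one document left the other unchanged.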
9.2 FEATURES

• MongoDB as a representative of document databases: Although many specialized document databases exist, MongoDB is used here as the representative to explain their features.

• MongoDB structure:

o A MongoDB instance can have multiple databases.

o Each database can contain multiple collections.

• Comparison with RDBMS:

o An RDBMS instance is analogous to a MongoDB instance.

o Schemas in RDBMS are similar to MongoDB databases.

o RDBMS tables are equivalent to MongoDB collections.

• Storing documents in MongoDB:

o When storing a document, you need to specify which database and collection it belongs to.

o Example: database.collection.insert(document) or db.coll.insert(document).
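
The instance, database, and collection hierarchy described above can be pictured with a toy in-memory model. This is only a sketch: `ToyInstance` and its methods are hypothetical names invented here and are not the MongoDB API.

```javascript
// Toy model of the hierarchy: instance -> databases -> collections -> documents.
// Not the MongoDB API; every name here is made up for illustration.
class ToyInstance {
  constructor() {
    this.databases = new Map(); // database name -> Map of collections
  }

  // Return the named collection, creating database and collection on first use.
  collection(dbName, collName) {
    if (!this.databases.has(dbName)) {
      this.databases.set(dbName, new Map());
    }
    const db = this.databases.get(dbName);
    if (!db.has(collName)) {
      db.set(collName, []);
    }
    return db.get(collName);
  }

  // Storing a document always names both the database and the collection.
  insert(dbName, collName, doc) {
    this.collection(dbName, collName).push(doc);
  }
}

const instance = new ToyInstance();
instance.insert("ecommerce", "order", { orderId: 99, customerId: "883c2c5b4e5b" });
instance.insert("ecommerce", "customer", { firstname: "Martin" });

console.log(instance.collection("ecommerce", "order").length);
```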


9.2.1 Consistency in MongoDB:

• Replica Sets for Consistency:

o MongoDB achieves consistency by using replica sets, which replicate writes to multiple servers.

o The number of servers to which a write must be propagated is configurable.

o A write is considered successful only after it has been propagated to the configured number of servers.

• Example Command for Consistency:

o Command: db.runCommand({ getlasterror : 1 , w : "majority" })

o The w parameter specifies how many nodes must confirm the write before it is reported as successful.

▪ For example:

▪ If there is one node and w is "majority", the write succeeds as soon as that single node has applied it.

▪ If there are three nodes and w is "majority", the write must complete on at least two nodes.
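
The arithmetic behind "majority" can be sketched as follows. This is a minimal illustration that assumes every node counts toward the majority; the function names `majority` and `isWriteSuccessful` are hypothetical and not part of MongoDB.

```javascript
// Smallest number of nodes that forms a strict majority of the set.
function majority(nodeCount) {
  if (!Number.isInteger(nodeCount) || nodeCount < 1) {
    throw new RangeError("nodeCount must be a positive integer");
  }
  return Math.floor(nodeCount / 2) + 1;
}

// A write is reported successful once enough nodes have confirmed it.
function isWriteSuccessful(confirmations, nodeCount) {
  return confirmations >= majority(nodeCount);
}

for (const n of [1, 3, 5]) {
  console.log(`${n} node(s): majority is ${majority(n)}`);
}
```

With one node the majority is one, and with three nodes it is two, matching the two cases listed above.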

• Impact of Consistency Settings on Write Performance:

o Stronger consistency (higher w value) leads to slower write performance as more nodes need to confirm the write.

• Increasing Read Performance:

o MongoDB allows reading from secondary (slave) nodes by setting the slaveOk parameter.

o The slaveOk parameter can be set at the connection, database, collection, or operation level.

• Example Code for Read Consistency:

    // Connect, then allow reads from secondaries on this connection
    Mongo mongo = new Mongo("localhost:27017");
    mongo.slaveOk();

o This sets slaveOk for the MongoDB connection.

• Example Code for Query with Slave Read:

    DBCollection collection = getOrderCollection();
    BasicDBObject query = new BasicDBObject();
    query.put("name", "Martin");
    // slaveOk() on the cursor lets this one query be served by a secondary
    DBCursor cursor = collection.find(query).slaveOk();

o In this example:

▪ A query is created to find documents with the name "Martin."

▪ The query uses slaveOk() to allow reading from a slave node.

• WriteConcern for Write Consistency:

o WriteConcern controls the consistency level for write operations.

o By default, a write is considered successful once the database receives it.

o You can configure WriteConcern to wait for writes to sync to disk or propagate to multiple nodes.

• Example Code for Setting WriteConcern:

    DBCollection shopping = database.getCollection("shopping");
    shopping.setWriteConcern(WriteConcern.REPLICAS_SAFE);

o This sets the WriteConcern for the collection to REPLICAS_SAFE, ensuring that writes are propagated to both the master and at least one slave.

• Setting WriteConcern per Operation:

    WriteResult result = shopping.insert(order, WriteConcern.REPLICAS_SAFE);

o This applies REPLICAS_SAFE to this one insert, so success is reported only after the write has reached the primary and at least one secondary.

• Trade-offs in Consistency Settings:

o Reading from secondaries (slaveOk) improves read performance but may return stale data, while a stricter WriteConcern improves write safety but slows writes. Choose both settings based on application needs and business requirements.
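
The trade-off can be seen in a toy simulation. The sketch below is not MongoDB's replication protocol: `ToyReplicaSet` is a hypothetical class in which a write reaches the primary plus `w - 1` secondaries before returning, and the remaining secondaries catch up only when `replicate()` is called.

```javascript
// Toy model of one primary and several secondaries with lagging replication.
class ToyReplicaSet {
  constructor(secondaryCount) {
    this.primary = new Map();
    this.secondaries = Array.from({ length: secondaryCount }, () => new Map());
    this.pending = []; // writes that some secondaries may not have seen yet
  }

  // Apply the write on the primary and on (w - 1) secondaries before returning.
  write(key, value, w = 1) {
    const nodeCount = this.secondaries.length + 1;
    if (!Number.isInteger(w) || w < 1 || w > nodeCount) {
      throw new RangeError(`w must be between 1 and ${nodeCount}`);
    }
    this.primary.set(key, value);
    this.secondaries.slice(0, w - 1).forEach((node) => node.set(key, value));
    this.pending.push([key, value]);
    return w; // number of nodes that confirmed before the call returned
  }

  // Background replication: bring every secondary up to date.
  replicate() {
    for (const [key, value] of this.pending) {
      this.secondaries.forEach((node) => node.set(key, value));
    }
    this.pending = [];
  }

  // Read from the primary by default, or from a chosen secondary.
  read(key, secondaryIndex = null) {
    const node =
      secondaryIndex === null ? this.primary : this.secondaries[secondaryIndex];
    return node.get(key);
  }
}

const replicaSet = new ToyReplicaSet(2);
replicaSet.write("order:1", "placed"); // w = 1: only the primary has it
console.log(replicaSet.read("order:1")); // primary read
console.log(replicaSet.read("order:1", 1)); // secondary read before replication
replicaSet.replicate();
console.log(replicaSet.read("order:1", 1)); // secondary read after replication
```

A secondary read issued before replication completes sees nothing for the key, which is the staleness a slaveOk read accepts. Raising `w` narrows that window at the cost of waiting for more nodes.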

9.2.2 Transactions in MongoDB:

• Traditional RDBMS Transactions:

o In traditional RDBMS, transactions allow modifications to multiple tables using commands like insert, update, or delete.

o After making changes, you can decide to either commit (keep) or rollback (discard) the changes.

• Transactions in NoSQL (MongoDB):

o In NoSQL systems like MongoDB, as described in this module, traditional transactions spanning multiple operations are not available. (MongoDB added multi-document transactions in version 4.0; the points below describe the behavior without them.)

o MongoDB supports atomic transactions only at the single-document level. This means:

▪ A write either succeeds or fails at the document level.

▪ There is no concept of commit or rollback for operations spanning multiple documents or collections.

▪ However, some NoSQL products, like RavenDB, do support transactions across multiple operations.

• Write Concern for Fine Control:

o MongoDB provides a way to control write operations' success using the WriteConcern parameter.

o By default, all writes are considered successful as soon as they are received by the database.

o You can configure WriteConcern to ensure the write is propagated to more than one node before being reported as successful.

▪ For example, setting WriteConcern.REPLICAS_SAFE ensures the write is propagated to the primary and at least one secondary node before reporting success.

▪ Different levels of WriteConcern provide varying safety levels:

▪ WriteConcern.NONE is used for the lowest safety level, suitable for less critical operations like writing log entries.

• Example Code for Transactions with Write Concern:

    final Mongo mongo = new Mongo(mongoURI);
    mongo.setWriteConcern(WriteConcern.REPLICAS_SAFE);
    DBCollection shopping = mongo.getDB(orderDatabase)
                                 .getCollection(shoppingCollection);
    try {
        WriteResult result = shopping.insert(order, WriteConcern.REPLICAS_SAFE);
        // Write made it to the primary and at least one secondary
    } catch (MongoException writeException) {
        // Write did not reach the minimum of two nodes, including the primary
        dealWithWriteFailure(order, writeException);
    }

o In this code:

▪ The MongoDB connection is configured with WriteConcern.REPLICAS_SAFE, so a write is reported successful only when it reaches the primary and at least one secondary node.

▪ If the write operation fails to propagate to the required nodes, a MongoException is caught, and the failure is handled by the dealWithWriteFailure() method.

9.2.3 Availability in MongoDB:

• CAP Theorem:

o The CAP theorem states that a distributed database can provide at most two of three properties at the same time: Consistency, Availability, and Partition Tolerance.

o MongoDB improves availability through data replication, so data remains accessible even when the primary node is down.

• Replica Sets in MongoDB:

o MongoDB uses replica sets for high availability. A replica set consists of
multiple nodes where one is the primary (master) node, and others are
secondary (slave) nodes.

o Master-Slave Replication: The primary node handles all write requests, and
the data is asynchronously replicated to the secondary nodes.

o If the primary node fails, the secondary nodes automatically elect a new
primary. Future requests are routed to the newly elected primary.

o When the failed node comes back online, it rejoins the replica set as a
secondary and catches up with the rest of the nodes by pulling the missing
data.

• Priority Assignment:

o Nodes in the replica set can have different voting rights. Nodes can be
assigned a priority (a number between 0 and 1000) to influence the election
of the primary node.

o For example, nodes in the primary data center can be assigned a higher
priority to ensure they are elected as the primary node.

• Automatic Node Discovery:

o When an application connects to a replica set, it only needs to connect to one node (whether primary or secondary).

o The application automatically discovers the other nodes in the replica set.

o If the primary node fails, the MongoDB driver will automatically connect to
the newly elected primary node, and the application does not need to handle
node selection or failure recovery.

• Use Cases for Replica Sets:

o Data Redundancy: Ensures that data is available on multiple nodes, preventing data loss.

o Automated Failover: Automatically elects a new primary if the current primary node fails.

o Read Scaling: Distributes read requests across secondary nodes to reduce the load on the primary node.

o Server Maintenance Without Downtime: Allows for maintenance of servers without interrupting service, as secondary nodes can handle requests during maintenance.

o Disaster Recovery: Ensures data remains accessible and recoverable in case of disasters.

• Comparison with Other Products:

o Similar availability setups using replication and failover mechanisms are found
in other products like CouchDB, RavenDB, and Terrastore.
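
The effect of priorities on failover can be sketched with a deliberately simplified election. The function below is an assumption-laden illustration, not MongoDB's election protocol: it only requires that a majority of nodes is reachable and then picks the reachable node with the highest non-zero priority, ignoring factors such as how up to date each node is. The host names and `electPrimary` are hypothetical.

```javascript
// Simplified election: needs a reachable majority, then highest priority wins.
// Nodes with priority 0 are never eligible to become primary.
function electPrimary(nodes) {
  const reachable = nodes.filter((node) => node.up);
  // Without a majority of the set reachable, no primary can be elected.
  if (reachable.length <= nodes.length / 2) {
    return null;
  }
  const candidates = reachable.filter((node) => node.priority > 0);
  if (candidates.length === 0) {
    return null;
  }
  return candidates.reduce((best, node) =>
    node.priority > best.priority ? node : best
  );
}

// Hypothetical three-node set; the primary data center gets higher priorities.
const nodes = [
  { host: "dc1-a", priority: 10, up: true },
  { host: "dc1-b", priority: 5, up: true },
  { host: "dc2-a", priority: 1, up: true },
];

console.log(electPrimary(nodes).host); // all nodes up
nodes[0].up = false; // the current primary fails
console.log(electPrimary(nodes).host); // failover to the next-highest priority
```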

9.2.4 Query Features in Document Databases:

• CouchDB Querying:

o Views: CouchDB uses views for querying documents. Views can be:

▪ Materialized Views: Results are precomputed and stored in the database, so values are not recalculated for every request. A materialized view is updated when it is queried, so it reflects any changes in the data.

▪ Dynamic Views: Computed at runtime using map-reduce functions.

o Example: To aggregate the reviews for a product and calculate the average rating, you can create a view that performs the count and average calculations.
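
The review example can be sketched as a map step followed by a reduce step. This is plain JavaScript written for illustration, not CouchDB's view API; the review documents and the names `mapReview`, `reduceRatings`, and `buildView` are hypothetical.

```javascript
// Hypothetical review documents.
const reviews = [
  { product: "book-1", rating: 4 },
  { product: "book-1", rating: 5 },
  { product: "book-2", rating: 3 },
];

// Map step: emit one (key, value) pair per review document.
function mapReview(doc) {
  return [doc.product, { count: 1, sum: doc.rating }];
}

// Reduce step: combine the partial results collected for one key.
function reduceRatings(values) {
  return values.reduce(
    (acc, v) => ({ count: acc.count + v.count, sum: acc.sum + v.sum }),
    { count: 0, sum: 0 }
  );
}

// Build the view: group mapped values by key, then reduce each group.
function buildView(docs) {
  const grouped = new Map();
  for (const doc of docs) {
    const [key, value] = mapReview(doc);
    if (!grouped.has(key)) {
      grouped.set(key, []);
    }
    grouped.get(key).push(value);
  }
  const view = new Map();
  for (const [key, values] of grouped) {
    const { count, sum } = reduceRatings(values);
    view.set(key, { count, average: sum / count });
  }
  return view;
}

const ratingView = buildView(reviews);
console.log(ratingView.get("book-1"));
```

Storing the result of `buildView` and reusing it corresponds to the materialized case; calling it on every request corresponds to the dynamic case.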

• Advantages of Document Databases Over Key-Value Stores:

o Unlike key-value stores, document databases allow querying the data within
the document without retrieving the entire document by its key.

o This brings document databases closer to the relational database query model.

• MongoDB Query Language:

o MongoDB’s query language is expressed using JSON.

o Some common constructs in MongoDB queries:

▪ $query: For the WHERE clause.

▪ $orderby: For sorting data.

▪ $explain: To view the execution plan of the query.

o MongoDB provides many other constructs that can be combined for creating
complex queries.

• MongoDB Query Examples:

o Fetching All Documents:

▪ SQL: SELECT * FROM order

▪ MongoDB: db.order.find()

o Fetching Orders for a Specific Customer:

▪ SQL: SELECT * FROM order WHERE customerId = "883c2c5b4e5b"

▪ MongoDB: db.order.find({"customerId":"883c2c5b4e5b"})

o Selecting Specific Fields for a Customer:

▪ SQL: SELECT orderId, orderDate FROM order WHERE customerId = "883c2c5b4e5b"

▪ MongoDB: db.order.find({customerId:"883c2c5b4e5b"},{orderId:1, orderDate:1})

o Querying Embedded Documents:

▪ Example: Querying orders where an item has a product name like "Refactoring".

▪ SQL:

    SELECT * FROM customerOrder, orderItem, product
    WHERE customerOrder.orderId = orderItem.customerOrderId
    AND orderItem.productId = product.productId
    AND product.name LIKE '%Refactoring%'

▪ MongoDB: db.orders.find({"items.product.name":/Refactoring/})

▪ Advantage: MongoDB queries are simpler because data is embedded in documents, allowing direct querying of child objects.
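
How a dotted path such as "items.product.name" reaches into embedded documents can be sketched in plain JavaScript. This is an illustrative matcher, not MongoDB's query engine; `valuesAtPath`, `matches`, and the sample orders are hypothetical.

```javascript
// Resolve a dotted path against a document, fanning out across arrays.
// Returns every value reachable through the path.
function valuesAtPath(value, parts) {
  if (Array.isArray(value)) {
    return value.flatMap((item) => valuesAtPath(item, parts));
  }
  if (parts.length === 0) {
    return [value];
  }
  if (value === null || typeof value !== "object") {
    return [];
  }
  const [head, ...rest] = parts;
  return head in value ? valuesAtPath(value[head], rest) : [];
}

// True if any string found at the path matches the regular expression.
function matches(doc, path, pattern) {
  return valuesAtPath(doc, path.split(".")).some(
    (v) => typeof v === "string" && pattern.test(v)
  );
}

// Hypothetical orders with embedded items; the third has no items at all.
const orders = [
  {
    orderId: 1,
    items: [
      { product: { name: "Refactoring" }, price: 40 },
      { product: { name: "Clean Design" }, price: 35 },
    ],
  },
  { orderId: 2, items: [{ product: { name: "Domain Models" }, price: 30 }] },
  { orderId: 3 },
];

const found = orders.filter((o) => matches(o, "items.product.name", /Refactoring/));
console.log(found.map((o) => o.orderId));
```

No join is needed because the items live inside each order, and an order without an `items` attribute is skipped rather than causing an error.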

9.2.5 SCALING IN DOCUMENT DATABASES:

• Scaling Concept:

o Scaling involves adding nodes or changing data storage to handle more load,
without migrating the database to a larger server.

o The focus is on database features that support increased load rather than
modifying the application itself.

• Scaling for Heavy-Read Loads:

o Horizontal Scaling for Reads:

▪ Achieved by adding more read slaves (secondary nodes) to a replica set.

▪ For a 3-node replica set, more slave nodes can be added as the read load increases.

▪ The slaveOk flag allows read operations to be directed to the slave nodes.

▪ Adding a Node:

▪ New node is added with rs.add("mongod:27017").

▪ The new node syncs with existing nodes, joins as a secondary node, and starts serving read requests.

▪ Advantages:

▪ No need to restart other nodes.

▪ No downtime for the application.

• Scaling for Writes:

o Sharding:

▪ Sharding splits data based on a certain field (e.g., state or year), and the data is moved across different Mongo nodes.

▪ This allows for horizontal scaling for writes.

▪ Sharding Command:

▪ db.runCommand({ shardcollection: "ecommerce.customer", key: { firstname: 1 } })

▪ The data is split based on the specified key (e.g., first name) to ensure balanced distribution across the shards.

▪ As more nodes are added, the number of writable nodes increases, providing better scalability for write operations.

• Sharding and Replica Sets:

o Each shard can be a replica set to improve read performance within the
shard.

o As new shards are added, data is rebalanced across the shards.

o Zero Downtime: The application does not experience downtime, although performance may temporarily decrease while data is being moved to rebalance shards.

• Shard Key Importance:

o The choice of shard key is critical to data distribution and performance.

o Geographical Sharding:

▪ Sharding can be done based on user location, such as East Coast or West Coast, ensuring that data is served from the closest shards for faster access.
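
Why the shard key matters can be sketched with a toy router that assigns documents to shards by key range. The chunk boundaries, shard names, and the functions `shardFor` and `distribute` below are hypothetical; a real cluster manages and rebalances chunk ranges itself.

```javascript
// Hypothetical chunk ranges on the firstname key: each shard owns [min, max).
const chunks = [
  { min: "", max: "h", shard: "shard-a" },
  { min: "h", max: "p", shard: "shard-b" },
  { min: "p", max: "\uffff", shard: "shard-c" },
];

// Route a document to the shard whose range contains its shard key value.
function shardFor(doc, keyField = "firstname") {
  const key = String(doc[keyField]).toLowerCase();
  const chunk = chunks.find((c) => key >= c.min && key < c.max);
  if (!chunk) {
    throw new Error(`no chunk covers key "${key}"`);
  }
  return chunk.shard;
}

// Count how many documents land on each shard.
function distribute(docs) {
  const counts = {};
  for (const doc of docs) {
    const shard = shardFor(doc);
    counts[shard] = (counts[shard] || 0) + 1;
  }
  return counts;
}

const customersToShard = [
  { firstname: "Alice" },
  { firstname: "Martin" },
  { firstname: "Pramod" },
  { firstname: "Peter" },
];

console.log(distribute(customersToShard));
```

If most key values fall into one range, most writes hit one shard, which is why a key with an even spread of values distributes load better.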

9.3 SUITABLE USE CASES FOR DOCUMENT DATABASES:

• 9.3.1 Event Logging:

o Document databases are ideal for storing various types of event data across
different applications.

o They serve as a central data store for event logging, especially when the data
captured by events keeps changing.

o Events can be sharded by the application name or event type (e.g., order_processed, customer_logged).

• 9.3.2 Content Management Systems, Blogging Platforms:

o Document databases are suitable for content management systems and
blogging platforms because:

▪ They don’t have predefined schemas, allowing flexibility in storing various types of data (e.g., user comments, profiles).

▪ They typically support JSON documents, which align well with web-based content management and publishing.

• 9.3.3 Web Analytics or Real-Time Analytics:

o Document databases are effective for storing real-time analytics data.

o They support easy updates to parts of a document, which makes them well suited to tracking metrics like page views or unique visitors.

o New metrics can be added easily without the need for schema changes.

• 9.3.4 E-Commerce Applications:

o E-commerce applications benefit from document databases due to their flexible schema.

o They are useful for storing product and order information, allowing data
models to evolve without expensive database refactoring or data migration.
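
The analytics use case in 9.3.3 relies on updating one part of a document and on adding new metrics without a schema change. The sketch below models that in plain JavaScript, loosely in the spirit of MongoDB's $inc update operator; the `increment` helper and the page document are hypothetical.

```javascript
// Increment a counter at a dotted path, creating missing parts on the way.
function increment(doc, path, amount = 1) {
  const parts = path.split(".");
  let node = doc;
  for (const part of parts.slice(0, -1)) {
    if (typeof node[part] !== "object" || node[part] === null) {
      node[part] = {};
    }
    node = node[part];
  }
  const leaf = parts[parts.length - 1];
  node[leaf] = (node[leaf] || 0) + amount;
  return doc;
}

// Hypothetical analytics document for one page.
const page = { url: "/home", metrics: { pageViews: 10 } };

increment(page, "metrics.pageViews"); // update an existing metric in place
increment(page, "metrics.uniqueVisitors"); // a new metric, no schema change

console.log(page.metrics);
```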

9.4 WHEN NOT TO USE DOCUMENT DATABASES:

• 9.4.1 Complex Transactions Spanning Different Operations:

o Document databases are not ideal for scenarios requiring atomic operations
across multiple documents.

o If cross-document transactions are necessary, document databases may not be suitable.

o However, some document databases, like RavenDB, support these types of operations.

• 9.4.2 Queries against Varying Aggregate Structure:

o Document databases offer flexible schemas, meaning they do not enforce schema restrictions.

o Because data is stored as aggregates, this flexibility causes problems when queries are ad hoc or when the structure being queried keeps changing.

o If the design of aggregates is constantly changing, the data has to be stored at the lowest level of granularity (that is, normalized), and in that scenario a document database may not be a good fit.
