
▪ Unlike tables in SQL databases, collections do not require their documents to share the same schema, i.e., the following properties may vary from document to document:
▪ the set of fields and
▪ the data type of the same field

▪ In practice, however, documents in a collection share a similar structure


▪ Which is the best document structure?
▪ Are there patterns to address common applications?

▪ It is possible to enforce document validation rules for a collection during update and insert operations

▪ A write operation is atomic on the level of a single document, even if the
operation modifies multiple embedded documents within a single document
▪ When a single write operation (e.g. db.collection.updateMany()) modifies multiple
documents, the modification of each document is atomic, but the operation as a
whole is not atomic
▪ For situations requiring atomicity of reads and writes to multiple documents (in a
single or multiple collections), MongoDB supports multi-document transactions:
▪ in version 4.0, MongoDB supports multi-document transactions on replica sets
▪ in version 4.2, MongoDB introduces distributed transactions, which adds support for
multi-document transactions on sharded clusters and incorporates the existing support
for multi-document transactions on replica sets

MongoDB can perform schema validation during updates and insertions. Existing
documents do not undergo validation checks until modification.
▪ validator: specifies validation rules or expressions for the collection
▪ validationLevel: determines how strictly MongoDB applies validation rules to existing documents during an update
▪ strict, the default, applies validation rules to all inserts and updates
▪ moderate applies validation rules to inserts and to updates of existing documents that already fulfill the validation criteria; updates to existing documents that do not fulfill the criteria are not checked
▪ validationAction: determines whether MongoDB should raise an error and reject documents that violate the validation rules, or warn about the violations in the log but allow invalid documents

db.createCollection( <name>,
{validator: <document>,
validationLevel: <string>,
validationAction: <string>,
})
▪ Starting in version 3.6, MongoDB supports JSON Schema validation (recommended)

▪ To specify JSON Schema validation, use the $jsonSchema operator


db.createCollection("students",
{ validator: {
$jsonSchema: {
bsonType: "object",
required: [ "name", "year" ],
properties: {
name: {
bsonType: "string",
description: "must be a string and is required"
},
year: {
bsonType: "int",
minimum: 2000,
maximum: 2099,
description: "must be an integer in [2000, 2099] and is required»
}
}
}
}
})
In addition to JSON Schema validation that uses the $jsonSchema query operator,
MongoDB supports validation with other query operators, except for:
▪ $near, $nearSphere, $text, and $where operators
▪ Note: users can bypass document validation with the bypassDocumentValidation option.

db.createCollection( "contacts",
{ validator: {
$or: [
{ phone: { $type: "string" } },
{ email: { $regex: /@mongodb\.com$/ } },
{ status: { $in: [ "Unknown", "Incomplete" ] } }
]
}
})
▪ Atomicity
▪ Embedded Data Model vs Multi-Document Transaction

▪ Sharding
▪ selecting the proper shard key has significant implications for performance, and can
enable or prevent query isolation and increased write capacity
▪ Indexes
▪ each index requires at least 8 kB of data space.
▪ adding an index has some negative performance impact for write operations
▪ collections with high read-to-write ratio often benefit from additional indexes
▪ when active, each index consumes disk space and memory

▪ Data Lifecycle Management
▪ the Time to Live (TTL) feature of collections expires documents after a period of time, as sketched below
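A minimal TTL sketch, assuming a hypothetical eventlog collection: documents are removed automatically about 3600 seconds after their createdAt timestamp.

// TTL index: MongoDB deletes documents once createdAt is older than expireAfterSeconds.
db.eventlog.createIndex(
  { createdAt: 1 },
  { expireAfterSeconds: 3600 }
)
db.eventlog.insertOne({ createdAt: new Date(), event: "login" })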

▪ Approximation
▪ Attribute
▪ Bucket
▪ Computed
▪ Document Versioning
▪ Extended Reference
▪ Outlier
▪ Pre-allocation
▪ Polymorphic
▪ Schema Versioning
▪ Subset
▪ Tree
source: https://www.mongodb.com/blog/post/building-with-patterns-the-extended-reference-pattern
▪ Let's say that our city planning strategy is based on needing one fire engine per 10,000 people.
▪ Instead of updating the population in the database with every change, we could build in a counter and only update by 100, 1% of the time.
▪ Another option might be to have a function that returns a random number. If, for example, that function returns a number from 0 to 100, it will return 0 around 1% of the time. When that condition is met, we increase the counter by 100.
▪ Our writes are significantly reduced here, in this example by 99%.
▪ When working with large amounts of data, the impact on performance of write operations is large too.
Examples
▪ population counter
▪ movie website counter
source: https://www.mongodb.com/blog/post/building-with-patterns-the-approximation-pattern
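A minimal sketch of the random-increment idea in mongosh-style JavaScript, assuming a hypothetical cities collection with a population counter (all names are illustrative):

// Instead of $inc: { population: 1 } on every change,
// increment by 100 roughly 1% of the time.
function recordPersonAdded(cityId) {
  // Math.floor(Math.random() * 100) === 0 holds about 1% of the time.
  if (Math.floor(Math.random() * 100) === 0) {
    db.cities.updateOne({ _id: cityId }, { $inc: { population: 100 } });
  }
}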

▪ Useful when
▪ expensive calculations are
frequently done
▪ the precision of those calculations
is not the highest priority
▪ Pros
▪ fewer writes to the database
▪ no schema change required

▪ Cons
▪ exact numbers aren’t being represented
▪ implementation must be done in the application
▪ Examples
▪ population counter
▪ movie website counter
source: https://www.mongodb.com/blog/post/building-with-patterns-the-approximation-pattern

▪ Let’s think about a collection of
movies.
▪ The documents will likely have
similar fields involved across all
the documents:
▪ title, director, producer, cast, etc.

▪ Let’s say we want to search on the release date: which release date? Movies are often released on different dates in different countries.
▪ A search for a release date will require looking across many fields at once; we’d need several indexes on our movies collection.
▪ Move this subset of information into an array of key-value pairs and reduce the indexing needs, as sketched below.
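A minimal sketch of the transformation; the releases, location, and date field names are illustrative assumptions following the MongoDB blog post:

// Before: one field per country (release_USA, release_France, ...),
// each needing its own index.
// After: a single array of key-value pairs.
db.movies.insertOne({
  title: "Star Wars",
  releases: [
    { location: "USA", date: ISODate("1977-05-25") },
    { location: "France", date: ISODate("1977-10-19") }
  ]
})
// One compound index now covers every release date.
db.movies.createIndex({ "releases.location": 1, "releases.date": 1 })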

▪ Useful when
▪ there is a subset of fields that
share common characteristics
▪ the fields we need to sort on are
only found in a small subset of
documents
▪ Pros
▪ fewer indexes are needed, e.g.,
{"releases.location": 1,
"releases.date": 1}
▪ queries become simpler to write
and are generally faster
▪ Example
▪ product catalog

Source: https://www.mongodb.com/blog/post/building-with-patterns-the-attribute-pattern
▪ With data coming in as a stream over a period
of time (time series data) we may be inclined
to store each measurement in its own
document, as if we were using a relational
database.
▪ We could end up having to index sensor_id
and timestamp for every single measurement
to enable rapid access.
▪ We can "bucket" this data, by time, into
documents that hold the measurements from
a particular time span. We can also
programmatically add additional information
to each of these "buckets".
▪ This brings benefits in terms of index size savings, potential query simplification, and the ability to use that pre-aggregated data in our documents, as sketched below.
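A minimal sketch of a bucketed time-series document; the sensor_readings collection and its field names are illustrative assumptions following the MongoDB blog post:

// One document holds an hour of measurements for one sensor,
// plus pre-aggregated fields maintained on each insert.
db.sensor_readings.insertOne({
  sensor_id: 12345,
  start_date: ISODate("2019-01-31T10:00:00Z"),
  end_date: ISODate("2019-01-31T11:00:00Z"),
  measurements: [
    { timestamp: ISODate("2019-01-31T10:00:00Z"), temperature: 40 },
    { timestamp: ISODate("2019-01-31T10:01:00Z"), temperature: 40 }
  ],
  transaction_count: 2, // pre-aggregated: number of measurements
  sum_temperature: 80   // pre-aggregated: average = sum_temperature / transaction_count
})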

▪ Useful when
▪ needing to manage streaming data
▪ time-series
▪ real-time analytics
▪ Internet of Things (IoT)

▪ Pros
▪ reduces the overall number of
documents in a collection
▪ improves index performance
▪ can simplify data access by leveraging
pre-aggregation, e.g., average
temperature = sum/count
▪ Example
▪ IoT, time series

▪ The usefulness of data becomes much
more apparent when we can compute
values from it.
▪ What's the total sales revenue of …?
▪ How many viewers watched …?

▪ These types of questions can be answered from data stored in a database, but the answers must be computed.
▪ Running these computations every time they are requested, though, becomes a highly resource-intensive process, especially on huge datasets.
▪ Example: a movie review website; every time we visit a movie webpage, it provides information about the number of cinemas the movie has played in, the total number of people who've watched it, and the overall revenue.
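A minimal sketch of the pattern for this movie example; the screenings and movies collections and their field names are illustrative assumptions:

// Record one screening, then fold its numbers into the movie's running totals.
db.screenings.insertOne({
  movie_id: "tt0076759",
  theater: "Odeon",
  num_viewers: 1200,
  revenue: 12000
})
db.movies.updateOne(
  { _id: "tt0076759" },
  { $inc: { num_screenings: 1, total_viewers: 1200, total_revenue: 12000 } }
)
// Rendering the movie page is now a single document fetch, no aggregation needed.
db.movies.findOne({ _id: "tt0076759" })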
▪ Useful when
▪ very read-intensive data access patterns
▪ data needs to be repeatedly computed by the
application
▪ computations are done in conjunction with updates or at defined intervals, e.g., every hour
▪ Pros
▪ reduction in CPU workload for frequent
computations
▪ Cons
▪ it may be difficult to identify the need for this
pattern
▪ Examples
▪ revenue or viewer
▪ time series data
▪ product catalogs
▪ In most cases we query only the latest
state of the data.
▪ What about situations in which we need to
query previous states of the data?
▪ What if we need to have some functionality
of version control of our documents?

▪ Goal: keep the version history of documents available and usable
▪ The pattern makes some assumptions about the data in the database and the data access patterns of the application:
▪ Limited number of revisions
▪ Limited number of versioned documents
▪ Most of the queries performed are done on
the most recent version of the document
▪ An insurance company might make use of this
pattern.
▪ Each customer has a “standard” policy and a
second portion that is specific to that customer.
▪ This second portion would contain a list of policy
add-ons and a list of specific items that are being
insured.

▪ As the customer changes which specific items are insured, this information needs to be updated while the historical information needs to remain available as well.
▪ When a customer purchases a new item and
wants it added to their policy, a new
policy_revision document is created using the
current_policy document.
▪ A version field in the document is then incremented to identify it as the latest revision, and the customer's changes are added.
18
The newest revision will be stored in the
current_policies collection and the old version
will be written to the policy_revisions collection.
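A minimal sketch of this two-collection flow; the policy_id, revision, and insured_items fields are illustrative assumptions:

// 1. Copy the current document into the history collection.
const current = db.current_policies.findOne({ policy_id: "P-1234" });
db.policy_revisions.insertOne(current);
// 2. Apply the customer's change and bump the revision number in place.
db.current_policies.updateOne(
  { policy_id: "P-1234" },
  {
    $inc: { revision: 1 },
    $push: { insured_items: { item: "golf clubs", value: 1500 } }
  }
);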
▪ Pros
▪ easy to implement, even on existing
systems
▪ no performance impact on queries on the
latest revision
▪ Cons
▪ doubles the number of writes
▪ queries need to target the correct
collection
▪ Examples
▪ financial industries
▪ healthcare industries
source: https://www.mongodb.com/blog/post/building-with-patterns-the-document-versioning-pattern

In an e-commerce application
▪ the order
▪ the customer
▪ the inventory
are separate logical entities.
▪ However, the full retrieval of an order requires joining data from different entities.
▪ A customer can have N orders, creating a 1-N relationship.
▪ Embedding all the customer information inside each order
▪ reduces the JOIN operations
▪ results in a lot of duplicated information
▪ not all the customer data may actually be needed
Instead of embedding (i.e., duplicating) all the data of an external entity (i.e., another document), we copy only the fields we access frequently. Instead of including just a reference and joining the information at query time, we embed the highest-priority, most frequently accessed fields.
▪ Useful when
▪ your application is experiencing lots of JOIN operations to bring together frequently accessed data

▪ Pros
▪ improves performance when there are
a lot of join operations
▪ faster reads and a reduction in the
complexity of data fetching
▪ Cons
▪ data duplication; it works best if such data rarely changes (e.g., user id, name)
▪ sometimes duplication of data is better, because it preserves the historical values (e.g., the shipping address of an order)
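A minimal sketch of an order embedding only a frequently accessed subset of the customer; all collection and field names are illustrative assumptions:

// The full customer document lives in its own collection.
db.customers.insertOne({
  _id: "C-42",
  name: "Ada Lovelace",
  email: "ada@example.com",
  loyalty_points: 730
})
// The order embeds only the high-priority customer fields (the "extended
// reference"), plus the id for the rare cases that need the rest.
db.orders.insertOne({
  _id: "O-1001",
  date: ISODate("2024-03-01"),
  customer: {
    customer_id: "C-42",
    name: "Ada Lovelace",
    shipping_address: "12 Analytical St, London"
  },
  items: [ { sku: "BOOK-7", qty: 1, price: 29.90 } ]
})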
For further information on the content of these slides, please refer to the book
“Design with MongoDB”
Best Models for Applications
by Alessandro Fiori

https://flowygo.com/en/projects/design-with-mongodb/
