04 Chapter Pattern in Mongodb1
PATTERNS IN MONGODB
Contents
˗ What are Patterns?
˗ Significance of Data Modeling Patterns
˗ Patterns in NoSQL Data Modeling
25/09/2022
Handling Duplication
˗ Duplication may cause inconsistency when you change one
copy of a piece of data while the other copies are not changed.
˗ Cause of duplication: embedding information in a given
document for faster access.
˗ Concern:
Duplication makes handling changes to the duplicated information
a challenge of correctness and consistency, because multiple
documents in different collections may need to be updated.
Handling Duplication
˗ Duplication is the solution:
In some cases, duplication is better than no duplication.
˗ Example:
Let's link orders of products to the address of the customer who
placed the order by using a reference to a customer document.
Handling Duplication
˗ Duplication is the solution:
˗ Example (cont.):
Updating the address for this customer also changes the information
for shipments that were already fulfilled, that is, orders that have
already been delivered to the customer.
This is not the desired behavior.
Handling Duplication
˗ Duplication is the solution
˗ Example (cont.):
Embedding a copy of the address within the shipment document ensures
that we keep the correct value.
When the customer moves, we add another shipping address on file.
Using this new address for new orders does not affect the orders that
have already shipped.
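The difference between referencing and embedding a copy can be sketched with plain in-memory objects. This is only an illustration under assumptions: the document shapes, field names (`customer_id`, `shipping_address`), and the helper `shippingAddress` are hypothetical and do not come from the slides.

```javascript
// Hypothetical in-memory documents; all names and values are illustrative assumptions.
const customer = {
  _id: "c1",
  name: "Alice",
  address: { street: "1 Old Street", city: "Springfield" },
};
const customers = new Map([[customer._id, customer]]);

// Order that only references the customer document.
const orderByReference = { _id: "o1", customer_id: "c1" };

// Order that embeds a copy of the address as it was when the order shipped.
const orderWithCopy = {
  _id: "o2",
  customer_id: "c1",
  shipping_address: { ...customer.address },
};

// Use the embedded copy if present, otherwise follow the reference.
function shippingAddress(order) {
  return order.shipping_address ?? customers.get(order.customer_id).address;
}

// The customer moves.
customer.address = { street: "9 New Avenue", city: "Shelbyville" };

console.log(shippingAddress(orderByReference).street); // 9 New Avenue
console.log(shippingAddress(orderWithCopy).street); // 1 Old Street
```

The referencing order now reports an address the shipment never went to, while the order with the embedded copy still shows where it was delivered.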
Handling Duplication
˗ Duplication has minimal effect: another situation to consider
is when the copied data never changes.
˗ Example:
Let's say we want to model movies and actors.
Movies have many actors and actors play in many movies, so this
is a typical many-to-many relationship.
Avoiding duplication in a many-to-many relationship requires us to
keep two collections and create references between the documents
in the two collections.
Handling Duplication
˗ Duplication has minimal effect:
˗ Example (cont.)
If we list the actors in a given movie document, we are creating
duplication.
However, once the movie is released, the list of actors does not change,
so duplication of this unchanging information is perfectly
acceptable.
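The two modeling options for movies and actors can be sketched as follows. The documents, ids, and field names (`actor_ids`, `cast`) are hypothetical and chosen only for illustration.

```javascript
// Hypothetical documents; names, ids, and fields are illustrative assumptions.

// Option 1: no duplication, two collections linked by references.
const actors = [
  { _id: "a1", name: "Actor One" },
  { _id: "a2", name: "Actor Two" },
];
const movieWithReferences = { _id: "m1", title: "Some Movie", actor_ids: ["a1", "a2"] };

// Reading the cast requires a second lookup in the actors collection.
function castByReference(movie, actorCollection) {
  return movie.actor_ids.map((id) => actorCollection.find((a) => a._id === id).name);
}

// Option 2: duplicate the actor names inside the movie document.
// The cast of a released movie does not change, so the copy does not go out of date.
const movieWithEmbeddedCast = {
  _id: "m1",
  title: "Some Movie",
  cast: ["Actor One", "Actor Two"],
};

console.log(castByReference(movieWithReferences, actors)); // [ 'Actor One', 'Actor Two' ]
console.log(movieWithEmbeddedCast.cast); // [ 'Actor One', 'Actor Two' ]
```

Both options return the same cast; the embedded version answers the question from a single document.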
Handling Duplication
˗ Duplication should be handled: this is the duplication of a piece of
information that needs to, or may, change over time.
˗ Example:
The revenues for a given movie, which are stored within the movie
document, and the revenues earned per screening.
In this case, we have duplication between the sum stored in the movie
document and the revenues stored in the screening documents that are
used to compute the total.
Handling Duplication
˗ Duplication should be handled:
˗ Example (cont.):
In this situation we must keep multiple values in sync over time, which
raises the question: does the benefit of having this sum precomputed
outweigh the cost and trouble of keeping it in sync?
If yes, then use the computed pattern.
If we want the sum to be synchronized, it may be the responsibility of
the application to keep it in sync. That is, whenever the application
writes a new document to the collection or updates the value of an
existing document, it must also update the sum.
But how often should we actually recalculate the sum?
This brings us to the next concern we must consider when using
patterns: staleness.
Handling Duplication
1. Initial state: the sum is stored in the movie document.
{ // movie
  _id: "tt0076759",
  title: "Star Wars – IV",
  Gross_revenues: 775000000
}
{ // screenings
  date: "1977-05-30",
  revenues: 15554475
}
…
{ // screenings
  date: "2019-05-30",
  revenues: 5000
}

2. A new screening is written; the sum has not been updated yet.
{ // movie
  _id: "tt0076759",
  title: "Star Wars – IV",
  Gross_revenues: 775000000
}
{ // screenings
  date: "1977-05-30",
  revenues: 15554475
}
…
{ date: "2019-05-30", revenues: 5000 }
{ date: "2019-05-30", revenues: 25000 }

3. The sum in the movie document is updated.
{ // movie
  _id: "tt0076759",
  title: "Star Wars – IV",
  Gross_revenues: 775025000
}
{ // screenings
  date: "1977-05-30",
  revenues: 15554475
}
…
{ date: "2019-05-30", revenues: 5000 }
{ date: "2019-05-30", revenues: 25000 }
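A minimal sketch of an application keeping the precomputed sum in sync on every write, using in-memory stand-ins for the two collections. The figures are those of the example above; the variable names, the function `addScreening`, and the `movie_id` field are assumptions made for illustration.

```javascript
// In-memory stand-ins for the movie document and the screenings collection.
// Names are hypothetical; the revenue figures come from the example above.
const movieDoc = { _id: "tt0076759", Gross_revenues: 775000000 };
const screeningDocs = [];

// Every write of a screening also updates the precomputed sum in the movie document.
function addScreening(date, revenues) {
  screeningDocs.push({ movie_id: movieDoc._id, date, revenues });
  movieDoc.Gross_revenues += revenues;
}

addScreening("2019-05-30", 25000);
console.log(movieDoc.Gross_revenues); // 775025000
```

Updating the sum inside the same function that writes the screening is what keeps the two values from drifting apart; the cost is one extra update per write.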
Handling Staleness
˗ Due to globalization and a flatter world, systems are now
accessed by millions of concurrent users, which makes it more
challenging to display up-to-the-second data to all of these
users.
˗ Example:
The availability of a product shown to a user may still have to be
confirmed at checkout time.
The prices of plane tickets or hotel rooms may change right before you
book them.
Handling Staleness
˗ Why do we get this staleness?
New events arrive at such a fast rate that constantly updating the data
can cause performance issues.
˗ The main concern when solving this issue is data quality
and reliability.
˗ The issues:
How long can the user tolerate not seeing the most up-to-date value for
a specific field?
Analytic queries are often run on secondary nodes, which may have
stale data.
Handling Staleness
˗ Resolving staleness:
In the world of big data, the way to resolve staleness is to batch
updates.
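Batching can be sketched as collecting incoming values and applying them to the stored sum in one step. This is an in-memory illustration under assumptions: `recordScreening`, `applyBatch`, and the pending queue are hypothetical names, and how often the batch runs depends on the staleness the application can tolerate.

```javascript
// In-memory sketch of batched updates; all names are hypothetical.
const movieTotals = { _id: "tt0076759", Gross_revenues: 775000000 };
const pendingRevenues = [];

// New events are only queued, so each write stays cheap.
function recordScreening(revenues) {
  pendingRevenues.push(revenues);
}

// Run periodically: apply all queued revenues to the sum in a single update.
function applyBatch() {
  const batchTotal = pendingRevenues.reduce((sum, r) => sum + r, 0);
  movieTotals.Gross_revenues += batchTotal;
  pendingRevenues.length = 0; // empty the queue
  return batchTotal;
}

recordScreening(5000);
recordScreening(25000);
const staleValue = movieTotals.Gross_revenues; // the sum is stale until the batch runs
applyBatch();
console.log(staleValue, movieTotals.Gross_revenues); // 775000000 775030000
```

Between two batch runs the stored sum is stale by design; the trade-off is fewer updates to the movie document.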
Recap
˗ For a given piece of data
Should or could the information be duplicated or not?
• Resolve with bulk updates
What is the tolerated or acceptable staleness?
• Resolve with updates based on change streams
Which pieces of data require referential integrity?
• Resolve or prevent the inconsistencies with change streams or
transactions
Attribute Pattern
˗ The attribute pattern is orthogonal to the polymorphic pattern.
It helps to organize fields that have common characteristics you
want to search across, fields that are rare, or an influx of
unpredictable properties that you need to manage.
˗ The attribute pattern potentially reduces the number of indexes.
˗ To use the attribute pattern, transpose the keys/values of the
desired properties into an array of documents.
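The transposition can be sketched in plain JavaScript. The helper names, the sample products, and the array name `add_specs` are assumptions made for illustration; the `{ k, v }` shape of the sub-documents is the one used by this pattern. The `findByAttribute` helper only mimics, in memory, the kind of match a MongoDB `$elemMatch` query would perform on such an array.

```javascript
// Move the listed fields of a document into an array of { k, v } sub-documents.
// Function names, sample data, and the array name "add_specs" are hypothetical.
function toAttributePattern(doc, fieldNames, arrayName) {
  const result = {};
  const attributes = [];
  for (const [key, value] of Object.entries(doc)) {
    if (fieldNames.includes(key)) {
      attributes.push({ k: key, v: value });
    } else {
      result[key] = value;
    }
  }
  result[arrayName] = attributes;
  return result;
}

// In-memory search: keep documents having one sub-document with both k and v matching.
function findByAttribute(docs, arrayName, key, value) {
  return docs.filter((d) => d[arrayName].some((a) => a.k === key && a.v === value));
}

const battery = { _id: "p1", brand: "Acme", capacity: 5000, voltage: 5 };
const drink = { _id: "p2", brand: "Acme", sweetener: "sugar" };
const products = [
  toAttributePattern(battery, ["capacity", "voltage"], "add_specs"),
  toAttributePattern(drink, ["sweetener"], "add_specs"),
];

console.log(JSON.stringify(products[0]));
// {"_id":"p1","brand":"Acme","add_specs":[{"k":"capacity","v":5000},{"k":"voltage","v":5}]}
console.log(findByAttribute(products, "add_specs", "voltage", 5).map((d) => d._id)); // [ 'p1' ]
```

After the transposition, rare fields such as `sweetener` and `voltage` live in the same array, so one search path covers all of them.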
Attribute Pattern
˗ Example:
Products have identifying fields such as brand, manufacturer, sub-brand,
and enterprise that are common across the majority of products.
Products also have additional fields that are common across many
products, such as color and size; these values may have different units
and mean different things for different products.
Attribute Pattern
˗ Example (cont.)
The size of a beverage made in the US may be measured in ounces,
while the same drink in Europe will be measured in milliliters.
For the MongoDB charger, the size is measured according to its three
dimensions.
The size of a Cherry Coke six-pack could be 12 ounces for a single can,
or six times 12 ounces (72 ounces) for the full six-pack.
We could also list the physical dimensions, or report the amount of
liquid in that field.
Attribute Pattern
˗ Example (cont.)
The third list of fields is the set of fields that will not exist in all
the products. They may appear in a new description that your supplier
provides.
For a sugary drink, you may want to know the type of sweetener,
while for a battery, you are more interested in its specifications, such
as the amount of electricity it provides.
Schema and indexing issues may appear with the third list of fields.
To search effectively on one of those fields, you need an index.
Attribute Pattern
˗ Example (cont.)
Searching on the capacity for my battery would require an index.
Searching on the voltage output of my battery would also require an
index.
If you have tons of fields, you may have a lot of indexes.
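The growth in index count can be illustrated by building the index key specifications for both designs. The block below only constructs plain objects of the form a MongoDB `createIndex` call accepts; it does not talk to a database. The field names and the array name `add_specs` are hypothetical.

```javascript
// One single-field index specification per searchable field.
function perFieldIndexes(fieldNames) {
  return fieldNames.map((name) => ({ [name]: 1 }));
}

// With the attribute pattern, one compound index on the k/v sub-document fields
// covers every attribute stored in the array.
function attributePatternIndexes(arrayName) {
  return [{ [`${arrayName}.k`]: 1, [`${arrayName}.v`]: 1 }];
}

// Hypothetical rare fields.
const rareFields = ["capacity", "voltage", "sweetener"];

console.log(perFieldIndexes(rareFields).length); // 3
console.log(JSON.stringify(attributePatternIndexes("add_specs")));
// [{"add_specs.k":1,"add_specs.v":1}]
```

The per-field count grows with every new searchable field, while the attribute pattern version stays at one specification.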
Summary
˗ The attribute pattern
Is orthogonal to the polymorphic pattern
Adds organization for
• Common characteristics
• Rare/unpredictable fields
Reduces the number of indexes
Transposes keys/values as
• An array of sub-documents of the form:
• {"k": "key", "v": "value"}
{ "events.prado" : 1 }