04 Chapter Pattern in Mongodb1

Mongo

Uploaded by

tai43464

25/09/2022

PATTERNS IN MONGODB

Contents
˗ What are Patterns?
˗ Significance of Data Modeling Patterns
˗ Patterns in NOSQL Data Modeling


What are Patterns?

˗ Building Blocks
 Identified by our Consulting
Engineers helping customers
for the last 12 years.
˗ Common Language
 Data Architects and
Engineers can easily
reference the same things

What are Patterns?

˗ Patterns are the most powerful tool for designing schemas
for MongoDB and NoSQL.
˗ Patterns are not full solutions to problems; they are smaller
building blocks of those solutions.
˗ Patterns are reusable units of knowledge.
˗ Just as design patterns serve software architecture, these
patterns serve data modeling and schema design for document
databases.


What can patterns do for you?

˗ Improve Performance
 By using no more
resources than you should
˗ Simplify the access to the
data
 By grouping and pre-
arranging data in a simpler
form

Patterns in Schema Design - MongoDB

˗ Benefits of Patterns
 Optimize large documents with the subset pattern.
 Avoid repeated calculations with the computed pattern.
 Handle changes to the system implementation with less effort.
 Patterns serve as a common language for teams working on schema
designs.
 Having clear patterns and understanding when and how to use
them eliminates errors in the data model for MongoDB and makes the
process more predictable.


Handling Duplication, Staleness and Integrity

˗ Some concerns that the use of patterns may raise:
 Duplication
• Duplicating data across documents
 Data Staleness
• Accepting staleness in some pieces of data
 Data Integrity Issues:
• Writing extra application side logic to ensure referential integrity

Handling Duplication
˗ Duplication may cause inconsistency when you change one
copy of a piece of data while the other copies remain unchanged.
˗ Cause of duplication: embedding information in a
given document for faster access.
˗ Concern:
 Keeping duplicated information correct and consistent is a
challenge, because multiple documents in different
collections may need to be updated.


Handling Duplication
˗ Duplication is the solution:
In some cases, duplication is
better than no duplication.
˗ Example:
 Let's link orders of products
to the address of the
customer that placed the
order by using a reference to
a customer document.

Handling Duplication
˗ Duplication is the solution:
Example:
 Updating the address for this
customer also changes the address
on already fulfilled shipments,
that is, orders that have
already been delivered to the
customer.
 This is not the desired
behavior.


Handling Duplication
˗ Duplication is the solution
˗ Example (cont.):
 Embedding a copy of the address within the shipment document will
ensure we keep the correct value.
 When the customer moves, we add another shipping address on file.
 Using this new address for new orders does not affect the already
shipped orders.
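
The embedding step above can be sketched with plain Python dictionaries standing in for documents. This is a minimal illustration, not a prescribed schema: the field names (`addresses`, `shipping_address`, `customer_id`), the helper `create_shipment`, and the sample values are all assumptions made for the example.

```python
import copy

# Hypothetical customer document; all field names and values are assumptions.
customer = {
    "_id": "c1",
    "name": "Ada",
    "addresses": [{"street": "1 Old Road", "city": "Springfield"}],
}


def create_shipment(order_id, customer, address_index):
    """Build a shipment document that embeds its own copy of the address."""
    return {
        "order_id": order_id,
        "customer_id": customer["_id"],
        # Deep copy: later edits to the customer must not reach this shipment.
        "shipping_address": copy.deepcopy(customer["addresses"][address_index]),
    }


old_shipment = create_shipment("o1", customer, 0)

# The customer moves: add another address on file and use it for new orders.
customer["addresses"].append({"street": "9 New Avenue", "city": "Shelbyville"})
new_shipment = create_shipment("o2", customer, 1)

print(old_shipment["shipping_address"]["street"])  # 1 Old Road
print(new_shipment["shipping_address"]["street"])  # 9 New Avenue
```

The already shipped order keeps the address it was sent to, which is the behavior a reference to the customer document would not give.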


Handling Duplication
˗ Duplication has minimal effect: another situation to
consider is when the copied data never changes.
˗ Example:
 Let's say we want to model movies and actors.
 Movies have many actors and actors play in many movies. So this
is a typical many-to-many relationship
 Avoiding duplication in a many-to-many relationship requires us to
keep two collections and create references between the documents
in the two collections.

Handling Duplication
˗ Duplication has minimal effect:
˗ Example (cont.)
 If we list the actors in a given movie document, we are creating
duplication.
 However, once the movie is released, the list of actors does not change.
 So duplicating this unchanging information is perfectly
acceptable.


Handling Duplication
˗ Duplication should be handled: when a duplicated piece of
information needs to, or may, change over time.
˗ Example:
 The revenues for a given movie, which are stored within the movie document, and
the revenues earned per screening.
 In this case, we have duplication between the sum stored in the movie
document and the revenues stored in the screening documents used
to compute that sum.


Handling Duplication
˗ Duplication should be handled:
˗ Example (cont.):
 In this situation we must keep multiple values in sync over time, which makes
us ask: does the benefit of having this sum precomputed
outweigh the cost and trouble of keeping it in sync?
 If yes, then use the computed pattern.
 Here, we want the sum to stay synchronized: whenever the
application writes a new document to the collection or updates the
value of an existing document, it must also update the sum.


Handling Duplication
˗ Duplication should be handled:
˗ Example (cont.):
 Keeping the sum synchronized may be the responsibility of
the application: whenever the application
writes a new document to the collection or updates the value of an
existing document, it must update the sum.
 But how often should we actually recalculate the sum?
 This brings us to the next concern we must consider when using
patterns, staleness.

Handling Duplication
˗ State 1: the sum stored in the movie matches the screenings
{ // movie
  _id: "tt0076759",
  title: "Star Wars – IV",
  Gross_revenues: 775000000
}
{ // screenings
  date: "1977-05-30",
  revenues: 15554475
}
…
{ // screenings
  date: "2019-05-30",
  revenues: 5000
}

˗ State 2: a new screening is written, the sum is not yet updated
{ // movie
  _id: "tt0076759",
  title: "Star Wars – IV",
  Gross_revenues: 775000000
}
{ // screenings
  date: "1977-05-30",
  revenues: 15554475
}
…
{ date: "2019-05-30", revenues: 5000 }
{ date: "2019-05-30", revenues: 25000 }

˗ State 3: the sum in the movie has been updated
{ // movie
  _id: "tt0076759",
  title: "Star Wars – IV",
  Gross_revenues: 775025000
}
{ // screenings
  date: "1977-05-30",
  revenues: 15554475
}
…
{ date: "2019-05-30", revenues: 5000 }
{ date: "2019-05-30", revenues: 25000 }
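
The synchronized version of the computed pattern can be sketched as follows, with plain Python data standing in for the two collections. This is only an illustration under assumptions: the helper `add_screening` and the `movie_id` field are hypothetical, while `Gross_revenues`, `date`, `revenues` and the numbers come from the slide. Against a real MongoDB deployment the second step would typically be an update with the `$inc` operator, which is not shown here.

```python
# The movie document holds the precomputed sum (computed pattern).
movie = {
    "_id": "tt0076759",
    "title": "Star Wars - IV",
    "Gross_revenues": 775000000,
}

# Screening documents hold the per-screening revenues.
# "movie_id" is a hypothetical reference field added for this sketch.
screenings = [
    {"movie_id": "tt0076759", "date": "1977-05-30", "revenues": 15554475},
    {"movie_id": "tt0076759", "date": "2019-05-30", "revenues": 5000},
]


def add_screening(movie, screenings, date, revenues):
    """Write a screening and keep the precomputed sum in sync."""
    screenings.append(
        {"movie_id": movie["_id"], "date": date, "revenues": revenues}
    )
    # Apply only the delta instead of re-summing every screening.
    movie["Gross_revenues"] += revenues


add_screening(movie, screenings, "2019-05-30", 25000)
print(movie["Gross_revenues"])  # 775025000
```

Applying the delta on every write keeps reads cheap, at the cost of one extra update per screening.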


Handling Staleness

Handling Staleness
˗ Due to globalization and a flatter world, systems are
now accessed by millions of concurrent users, which makes
displaying up-to-the-second data to all these users
more challenging.
˗ Example:
 The availability of a product that is shown to a user may still have to be
confirmed at checkout time.
 The prices of plane tickets or hotel rooms may change right before you
book them.


Handling Staleness
˗ Why do we get this staleness?
 New events come along at such a fast rate that updating data
constantly can cause performance issues.
˗ The main concern when solving this issue is data quality
and reliability.
˗ The issues
 How long can the user tolerate not seeing the most up-to-date value for
a specific field?
 Analytic queries are often run on a secondary node, which may
have stale data.

Handling Staleness
˗ Resolve Staleness:
 In the world of big data, the solution to staleness is to batch
updates.
 Change streams let an application access and respond to data
changes, either in real time or in a delayed mode.
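
Batching can be illustrated with a small in-memory sketch. The class `BatchedSum`, its `batch_size` parameter and the sample amounts are hypothetical and only model the idea: incoming changes are queued, and the stored sum is refreshed once per batch, so it is allowed to be stale in between.

```python
from collections import defaultdict


class BatchedSum:
    """Queue revenue events and apply them to stored sums in batches."""

    def __init__(self, totals, batch_size):
        self.totals = totals          # movie_id -> stored sum (may be stale)
        self.batch_size = batch_size  # number of events tolerated before a flush
        self.pending = []             # events not yet applied to the sums

    def record(self, movie_id, revenues):
        self.pending.append((movie_id, revenues))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        # Collapse the queued events into one delta per movie,
        # then write each sum once.
        deltas = defaultdict(int)
        for movie_id, revenues in self.pending:
            deltas[movie_id] += revenues
        for movie_id, delta in deltas.items():
            self.totals[movie_id] = self.totals.get(movie_id, 0) + delta
        self.pending.clear()


totals = {"tt0076759": 775000000}
batch = BatchedSum(totals, batch_size=3)

batch.record("tt0076759", 5000)
batch.record("tt0076759", 25000)
stale = totals["tt0076759"]   # still the old value: two events are pending

batch.record("tt0076759", 10000)  # third event triggers the flush
fresh = totals["tt0076759"]

print(stale, fresh)  # 775000000 775040000
```

The batch size is the knob that trades write load against how long a reader may see an outdated value.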


Handling Referential Integrity

˗ Referential integrity has some similarities to staleness.
˗ Why?
 Linking information between documents or tables
 No support for cascading deletes
˗ Concern?
 Challenge for correctness and consistency

Handling Referential Integrity

˗ Resolve Referential Integrity
 Change Stream
• For delayed referential integrity, we can rely on
change streams.
 Single Document
• We can avoid using references by embedding
information in a single document instead of linking
to it.
 Multi-Document Transactions
• We can use MongoDB multi-document
transactions to update multiple documents at once.
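
Because cascading deletes are not provided, the application has to remove dependent documents itself. The sketch below models that application-side logic on plain Python data; the collections `customers` and `orders`, the field `customer_id`, and both helper functions are assumptions for illustration. In a real deployment the two deletions would be grouped in a multi-document transaction so that they succeed or fail together.

```python
# Hypothetical collections: customers keyed by _id, orders referencing them.
customers = {"c1": {"name": "Ada"}, "c2": {"name": "Grace"}}
orders = [
    {"_id": "o1", "customer_id": "c1"},
    {"_id": "o2", "customer_id": "c1"},
    {"_id": "o3", "customer_id": "c2"},
]


def dangling_orders(customers, orders):
    """Return the ids of orders whose customer no longer exists."""
    return [o["_id"] for o in orders if o["customer_id"] not in customers]


def delete_customer_cascade(customers, orders, customer_id):
    """Delete a customer together with every order that references it."""
    customers.pop(customer_id, None)
    orders[:] = [o for o in orders if o["customer_id"] != customer_id]


# Deleting only the customer would leave broken references behind.
naive_customers = {k: v for k, v in customers.items() if k != "c1"}
dangling_before = dangling_orders(naive_customers, orders)

# The cascading version removes the dependent orders as well.
delete_customer_cascade(customers, orders, "c1")
dangling_after = dangling_orders(customers, orders)

print(dangling_before, dangling_after)  # ['o1', 'o2'] []
```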


Recap
˗ For a given piece of data
 Should or could the information be duplicated or not?
• Resolve with bulk updates
 What is the tolerated or acceptable staleness?
• Resolve with updates based on change streams
 Which pieces of data require referential integrity?
• Resolve or prevent the inconsistencies with change stream or
transactions

Patterns in NOSQL Data Modeling


Patterns in NOSQL Data Modeling

1. Attribute Pattern
2. Extended Reference Pattern
3. Subset Pattern
4. Computed Pattern
5. Bucket Pattern
6. Schema Versioning Pattern
7. Tree Patterns
8. Polymorphic Pattern
9. Other Patterns

Attribute Pattern
˗ The attribute pattern is orthogonal to the polymorphic pattern. It helps to
organize fields that share common characteristics you
want to search across, fields that are rare, or
an influx of unpredictable properties.
˗ Attribute pattern potentially reduces the number of indexes.
˗ To use attribute pattern, transpose the key/values of the
desired properties into an array of documents.


Attribute Pattern
˗ Example:
 Products have identification fields such as brand, manufacturer, sub-brand, and
enterprise that are common across the majority of products.
 Products also have additional fields that are common across many products, like
color and size. These values may have different units and
mean different things for different products.


Attribute Pattern
˗ Example (cont.)
 The size of a beverage made in the US may be measured in ounces,
while the same drink in Europe will be measured in milliliters.
 For the MongoDB charger, the size is measured according to its three
dimensions.
 The size of a Cherry Coke six-pack could be 12 ounces for a single can, or six
times 12 ounces, that is 72 ounces, for the full six-pack.
 We could also list the physical dimensions and report the amount of
liquid in that field.

Attribute Pattern
˗ Example (cont.)
 The third list of fields is the set of fields that will not exist in all
the products. They may appear in a new description that your supplier
provides.
 For a sugary drink, you may want to know the type of sweetener,
while for a battery, you are more interested in its specifications, like the
amount of electricity it provides.
 Schema and indexing issues may arise with this third list of fields.
 To search effectively on one of those fields, you need an index.


Attribute Pattern
˗ Example (cont.)
 Searching on the capacity for my battery would require an index.
 Searching on the voltage output of my battery would also require an
index.
 If you have tons of fields, you may have a lot of indexes.

Using the attribute pattern

˗ How to use attribute
pattern
 Identify the list of fields
you want to transpose.
 For each field and its associated
value, create a key/value pair.
˗ Example:
 We transpose the fields input,
output, and capacity.


Using the attribute pattern

˗ Example (cont.)
 For consistency, let's use K for key and V for value, as some of our
aggregation functions do.
 Under the field name K, we put the name of the original field as the
value
 For the first one, the field was named "input," so that became the value
for K.
 Then the value for input was five volts or 1,300 milliamps, so this is the
value for the field V


Using the attribute pattern

˗ Example (cont.)
 Repeating the same thing for the original fields output and capacity,
we get three documents, each containing a K and a V.

Using the attribute pattern

˗ Example (cont.)
 Because of their similar shape, it is easy to place them together in
an array named "add_specs", for additional specs.
 Note that the third field is not only transposed to a key/value
pair; it also gets a third field called U to store the units
separately.
 This third field qualifies the relationship between K and V.
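
The transposition described above can be sketched in a few lines of Python. The helper `to_attribute_pattern` is hypothetical, and the sample values for `output` and `capacity` (and the unit `mAh`) are assumptions, since the slide only states the value of `input`; the names `add_specs`, `k`, `v` and `u` follow the slides.

```python
def to_attribute_pattern(doc, fields, units=None, array_name="add_specs"):
    """Move the given fields into an array of {"k", "v"[, "u"]} sub-documents."""
    units = units or {}
    # Keep every field that is not being transposed.
    result = {key: value for key, value in doc.items() if key not in fields}
    specs = []
    for name in fields:
        if name in doc:
            entry = {"k": name, "v": doc[name]}
            if name in units:
                entry["u"] = units[name]  # optional unit qualifying k and v
            specs.append(entry)
    result[array_name] = specs
    return result


charger = {
    "name": "MongoDB charger",
    "input": "5V/1300 mA",   # five volts / 1,300 milliamps, as on the slide
    "output": "5V/1A",       # assumed value for illustration
    "capacity": 4200,        # assumed value for illustration
}

transposed = to_attribute_pattern(
    charger, ["input", "output", "capacity"], units={"capacity": "mAh"}
)
for spec in transposed["add_specs"]:
    print(spec)
```

After the transformation the three original fields no longer exist at the top level, so a single index on the array covers all of them.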


Using the attribute pattern

˗ Example (cont.)
 The last thing to do is to create an index for all that info.
 This is done by creating an index on "add_specs.k" and
"add_specs.v."

Fields that share Common Characteristics

˗ Another scenario: we have a document representing a movie
 In the document, there are several fields to keep track of when the
movie was released.
 In this case, we keep track of the dates when a movie was released in
the USA, in Mexico, and in France, and when it appeared at the San Jose
movie festival.
 One thing to observe about those fields is that they share the same type
of value: a release date.


Fields that share Common Characteristics

˗ Question: What if we want to find all the movies released
between two dates across all countries?
 We would have to list every country and festival field,
 run a separate range query on each of them, and aggregate all
the results.


Fields that share Common Characteristics

˗ Using the attribute pattern and transforming those release
dates into an array of key/value pairs, we can answer the question with a single query on that array.
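
A rough sketch of such a single query, using plain Python data in place of a collection. The array name `releases`, the key names and the sample dates are assumptions; the function `released_between` only emulates the matching locally. The dictionary `mongo_filter` shows what an equivalent MongoDB filter could look like with the `$elemMatch` operator; it is built here but not executed.

```python
from datetime import datetime

# Hypothetical movie documents after applying the attribute pattern.
movies = [
    {
        "title": "Movie A",
        "releases": [
            {"k": "release_USA", "v": datetime(1977, 5, 25)},
            {"k": "release_Mexico", "v": datetime(1977, 12, 20)},
        ],
    },
    {
        "title": "Movie B",
        "releases": [
            {"k": "release_France", "v": datetime(1980, 8, 20)},
        ],
    },
]


def released_between(movies, start, end):
    """Return titles with at least one release date in [start, end)."""
    return [
        m["title"]
        for m in movies
        if any(start <= r["v"] < end for r in m["releases"])
    ]


start, end = datetime(1977, 1, 1), datetime(1978, 1, 1)

# Equivalent filter document for MongoDB (not executed in this sketch).
mongo_filter = {"releases": {"$elemMatch": {"v": {"$gte": start, "$lt": end}}}}

titles = released_between(movies, start, end)
print(titles)  # ['Movie A']
```

No country or festival has to be named in the query, so a newly added location needs no query or index change.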

Fields that share Common Characteristics

˗ Problem
 The attribute pattern addresses the problem of having a lot of similar fields in a document.
 Search across many fields at once
 Documents have many similar fields, some present in only a subset of the documents.
˗ Solution
 Break the field/value into a sub-document with:
• fieldA: field
• fieldB: value
 Example:
• Before: {"color": "Blue", "size": "large"}
• After: [{"k": "color", "v": "Blue"}, {"k": "size", "v": "large"}]


Fields that share Common Characteristics

˗ Use case examples
 Characteristics of a product
 Set of fields all having the same value type
• List of dates
 With movies, where a different location can have different release dates
˗ Benefits and trade-offs
 Easier to index
 Allows for non-deterministic field names
 Ability to qualify the relationship of the original field and value

Summary
˗ The attribute pattern
 Orthogonal Pattern to polymorphism
 Add organization for
• Common characteristics
• Rare/unpredictable fields
 Reduces number of indexes
 Transpose keys/values as
• Array of sub-documents of the form:
• {"k": "key", "v": "value"}


Lab: Apply the Attribute Pattern

Problem: User Story
 The museum we work at has grown from a local attraction to one that
is seen as having very popular items.
 For this reason, other museums around the world have started exchanging
pieces of art with our museum.
 Our database was tracking whether our pieces are on display and where they
are in the museum.
 To track the pieces we started exchanging with other museums, we
added an array called events, in which we created an entry for each
date a piece was loaned and the museum it was loaned to.


Lab: Apply the Attribute Pattern

˗ Problem: User Story
 The problem with this design is that we need to build a new index
every time there is a new museum with which we start exchanging
pieces.
 For example, when we started working with The Prado in Madrid, we
needed to add this index:

{ "events.prado" : 1 }

Lab: Apply the Attribute Pattern

˗ Task: To address this issue, you've decided to change the
schema to:
 Use a single index on all event dates.
 Transform the field that tracks the date when a piece was acquired,
date_acquisition, so that it is also indexed with the values above.
 To ensure the validator can verify your solution, use "k" and "v" as field
names if needed.
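
One possible shape of this transformation, sketched on a sample document rather than on the lab schema itself. It assumes, based on the `events.prado` index above, that `events` maps a museum name to a loan date; the helper `apply_attribute_pattern`, the museum key `moma`, the title and the dates are hypothetical. The expected lab answer may differ in details.

```python
def apply_attribute_pattern(piece):
    """Fold date_acquisition and per-museum event dates into one k/v array."""
    new_piece = {
        key: value
        for key, value in piece.items()
        if key not in ("events", "date_acquisition")
    }
    events = []
    if "date_acquisition" in piece:
        events.append({"k": "date_acquisition", "v": piece["date_acquisition"]})
    # Assumption: events maps a museum name to the date of the loan.
    for museum, loan_date in piece.get("events", {}).items():
        events.append({"k": museum, "v": loan_date})
    new_piece["events"] = events
    return new_piece


piece = {
    "title": "Sample piece",            # hypothetical
    "date_acquisition": "1999-12-25",   # hypothetical date
    "events": {"moma": "2005-03-01", "prado": "2019-01-31"},
}

transformed = apply_attribute_pattern(piece)
for event in transformed["events"]:
    print(event)
```

With every date stored under the same `v` field, a new partner museum adds entries to the array but no longer requires a new index.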


Lab: Apply the Attribute Pattern

˗ To complete this lab:

Modify the following
schema to incorporate
the above changes:

Lab: Apply the Attribute Pattern

˗ Save your new schema to a file named pattern_attribute.json.
˗ Validate your answer on Windows by running in the CMD shell:

validate_m320 pattern_attribute --file pattern_attribute.json
