04 Chapter: Introduction to Patterns
1. Introduction to Patterns
2. Handling Duplication
3. Handling Staleness
4. Handling Referential Integrity
Introduction to Patterns
In the previous chapters, we went over the basics of data modeling for MongoDB, from
defining a flexible methodology to identifying the different representations for modeling
relationships.
You should now have all the knowledge needed to tackle this week's chapter on patterns.
Patterns are very exciting because they are the most powerful tool for designing schemas
for MongoDB and NoSQL.
This chapter is going to help you produce a solution that can scale and perform under
stress.
The patterns we'll go over in this chapter will become a cookbook of powerful transformations
that unleash the power of MongoDB schemas.
Those patterns will also serve as a common language for your teams working on schema
designs.
Introduction to Patterns
1. Duplication
• Duplicating data across documents
2. Data Staleness
• Accepting staleness in some pieces of data
3. Referential Integrity
• Writing extra application-side logic to ensure referential integrity
Patterns are a way to get the best out of your data model. Often, the main goal is to optimize
your schema for a specific performance-critical operation, or for a given use case or access
pattern. However, like many things in life, good things come at a cost: many patterns lead to
situations that require additional actions.
For example, you may end up duplicating data across documents, accepting staleness in some
pieces of data, or writing extra application-side logic to ensure referential integrity.
Choosing a pattern to apply to your schema requires taking these three concerns into account.
If these concerns outweigh the simplicity or performance gains provided by the pattern, you
should not use it.
Handling Duplication
- Concern?
• Challenge for correctness and consistency
The first duplication situation to consider is when duplicating a piece of data is the right
solution, for example the shipping address on an order. The shipments were made to the
customer's address at that point in time, either when the order was made or before the
customer changed their address. So the address recorded in a given order should not change
when the customer later updates their address on file.
Embedding a copy of the address within the shipment document will ensure we keep the
correct value. When the customer moves, we add another shipping address on file. Using
this new address for new orders does not affect the already shipped orders.
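To illustrate, here is a minimal sketch in Python with PyMongo; the collection and field names (customers, orders, shipping_address) are illustrative, not taken from the course material:

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["shop"]

# The customer document always holds the current address on file.
customer = db.customers.find_one({"_id": 42})

# The order embeds a copy of the address as it was at shipping time.
# Later changes to the customer's address do not touch this copy,
# which is exactly the behavior we want here.
db.orders.insert_one({
    "customer_id": customer["_id"],
    "status": "shipped",
    "shipping_address": customer["address"],  # deliberate duplication
})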
The next duplication situation to consider is when the copied data never changes.
Handling Duplication
2. Duplication that has minimal effect
If we list the actors in a given movie document, we are creating duplication. However, once
the movie is released, the list of actors does not change.
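As a sketch (the field names are made up for illustration), such a movie document can simply embed its cast, accepting the duplication because the list is effectively immutable once the movie is released:

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["cinema"]

# The cast is duplicated from wherever actors are stored, but since it
# never changes after release, there is nothing to keep in sync.
db.movies.insert_one({
    "title": "Example Movie",
    "year": 2023,
    "cast": ["Actor One", "Actor Two", "Actor Three"],
})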
The next situation to consider is duplication that must be kept in sync over time, for example
a precomputed total. Suppose we store the total revenue for a movie in the movie document,
computed from the revenue recorded in each of its screening documents. In this case, we have
duplication between the sum stored in the movie document and the revenues stored in the
screening documents used to compute that sum.
This type of situation, where we must keep multiple values in sync over time, makes us ask
the question: does the benefit of having this sum precomputed outweigh the cost and trouble
of keeping it in sync?
If yes, then use the Computed Pattern; if not, don't.
Keeping the values in sync means that whenever the application writes a new screening
document to the collection, or updates the revenue of an existing one, it must also update
the sum.
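A minimal sketch of that write path in Python with PyMongo, assuming illustrative collections named screenings and movies and a total_revenue field:

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["cinema"]

def record_screening(movie_id, revenue):
    # Write the detailed screening document...
    db.screenings.insert_one({"movie_id": movie_id, "revenue": revenue})
    # ...and keep the precomputed sum in the movie document in sync.
    # In production you would ideally wrap both writes in a multi-document
    # transaction (replica set required) so they cannot diverge.
    db.movies.update_one({"_id": movie_id}, {"$inc": {"total_revenue": revenue}})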
Alternatively, we could add another application or job to do it. But how often should we
actually recalculate the sum?
This brings us to the next concern we must consider when using patterns, staleness.
Due to globalization and the world being flatter, systems are now accessed by millions of
concurrent users, making the ability to display up-to-the-second data to all these users more
challenging.
Handling Staleness
- Concern?
• Data quality and reliability
For example, a user's tolerance for staleness is lower when checking whether an item is still
available to buy than when seeing how many people have viewed or purchased a given item.
When performing analytic queries, it is often understood that the data may be stale and that
the data being analyzed is based on some past snapshot.
Analytic queries are often run on a secondary node, which may have stale data; it may be a
fraction of a second or a few seconds out of date.
However, it is enough to break any guarantee that we're looking at the latest data recorded
by the system.
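As a sketch of how an analytics job might opt into that trade-off with PyMongo (the connection string and collection names are illustrative), a secondaryPreferred read preference directs reads to a secondary when one is available:

from pymongo import MongoClient, ReadPreference

client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
db = client["cinema"]

# Analytics can tolerate slightly stale data, so read from a secondary
# and leave the primary free to serve the operational workload.
screenings = db.get_collection(
    "screenings", read_preference=ReadPreference.SECONDARY_PREFERRED
)
total = sum(doc["revenue"] for doc in screenings.find({"movie_id": 1}))
print(total)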
One tool for reducing staleness is Change Streams, which let the application react to data
changes as they happen.
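For instance, here is a hedged sketch (illustrative names, and change streams require a replica set or sharded cluster) of a small worker that watches the screenings collection and folds each newly inserted revenue into the movie's precomputed sum:

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
db = client["cinema"]

# React to every new screening as soon as it is written, keeping the
# precomputed total only marginally stale.
with db.screenings.watch([{"$match": {"operationType": "insert"}}]) as stream:
    for change in stream:
        screening = change["fullDocument"]
        db.movies.update_one(
            {"_id": screening["movie_id"]},
            {"$inc": {"total_revenue": screening["revenue"]}},
        )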
Handling Referential Integrity
- Why?
• Linking information between documents or tables
• No support for cascading deletes
- Concern?
• Challenge for correctness and consistency
It may be OK for the system to have some extra or missing links, as long as they get corrected
within a given period of time.
Why do we get referential integrity issues? Frequently, they are the result of deleting a piece
of information, for example a document, without deleting the references to it.
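As a sketch of the extra application-side logic this implies (the actors and movies collections and the cast_ids field are made up for illustration), deleting a document and cleaning up the references to it might look like this:

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["cinema"]

def delete_actor(actor_id):
    # Delete the referenced document...
    db.actors.delete_one({"_id": actor_id})
    # ...then remove the now-dangling references. MongoDB does not cascade
    # deletes, so the application has to do this itself; until both writes
    # complete, readers may briefly see an extra (broken) link.
    db.movies.update_many(
        {"cast_ids": actor_id},
        {"$pull": {"cast_ids": actor_id}},
    )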
Possible approaches:
A. Change Streams
B. Single Document