MongoDB
MongoDB Modeling:
There are two approaches to data modeling in MongoDB:
1. Embedded Modeling
Embedded modeling works well when related data is closely tied to a single parent, such as in a
one-to-one relationship, and is usually read together with it. In such cases the related data does
not change frequently and is not used independently of its parent. In embedded modeling the related
data is embedded as a sub-document inside the main document. You can also say that embedded
modeling requires denormalization of data. For example, we have a list of customers with their
personal information and addresses. If we go with the embedded approach, we don't have to create a
separate collection for the customer addresses, because each address is only relevant to its own
customer.
{
    "_id" : ObjectId("5d3ffdc8575e43fc45ef05c7"),
    "first_name" : "Customer",
    "last_name" : "One",
    "email" : "[email protected]",
    "work_address" : {
        "street" : 1,
        "state" : "XYZ",
        "country" : "Xyz"
    },
    "home_address" : {
        "street" : 1,
        "state" : "XYZ",
        "country" : "Xyz"
    }
}
In embedded modeling the child data belongs to the parent data, which implies a one-to-one relationship.
Sub-documents should be small and should be arranged in a proper hierarchy.
In most cases embedded modeling performs better, because related data is read in a single query, but
documents become larger because the sub-documents are stored inside the parent document. The size of
a document, including all of its sub-documents, cannot exceed 16 MB.
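As a rough illustration of how embedded sub-documents are addressed, the sketch below builds a customer document as a plain JavaScript object and reads a nested field through a dot-separated path. The helper getByPath is hypothetical and only mimics the idea of MongoDB's dot notation; it is not part of any MongoDB API.

```javascript
// Hypothetical customer document with embedded address sub-documents.
const customer = {
  first_name: "Customer",
  last_name: "One",
  work_address: { street: 1, state: "XYZ", country: "Xyz" },
  home_address: { street: 1, state: "XYZ", country: "Xyz" },
};

// Resolve a dot-separated path such as "home_address.state", mimicking
// how dot notation addresses the fields of embedded documents.
function getByPath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value == null ? undefined : value[key]),
    doc
  );
}

console.log(getByPath(customer, "home_address.state"));    // XYZ
console.log(getByPath(customer, "billing_address.state")); // undefined
```

In the mongo shell the corresponding filter would look something like db.customers.find({ "home_address.state": "XYZ" }), where customers is an assumed collection name.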
2. Reference Modeling
Reference modeling works well when the relationship between the collections is one-to-many.
It is used when the related data changes frequently and is not fully dependent on the other data.
In reference modeling the id of one document is stored as a reference in another document. You can
also say that reference modeling requires normalization of data. For example, we have a list of
students who are enrolled in different courses. In this case we keep students and courses in separate
collections. The students collection holds one document per student, and the courses collection holds
one document per course. Each student document then stores the ids of that student's courses as
references. The documents will look like this:
Student collection and documents
{
    _id: 123456,
    ...
}
{
    _id: 782654,
    ...
}
Course collection and documents
{
    _id: 147852,
    title: "Course 1"
}
{
    _id: 963258,
    title: "Course 2"
}
The above is just one example of reference modeling. One can use reference modeling in many ways
according to the requirements. Reference modeling is mostly used when the relationship between the
data is one-to-many. Reference modeling is less efficient than embedded modeling for reads, but it
saves space because shared data is stored once instead of being repeated in sub-documents. If we use
reference modeling we should take care of performance, because the number of queries can increase,
and we should design our schema in such a way that the query count stays as low as possible.
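The idea of storing course ids inside student documents can be sketched with plain JavaScript objects. The name and course_ids fields below are assumptions made for illustration, as is the resolveCourses helper; only the _id values and titles come from the example above.

```javascript
// Hypothetical data: the "name" and "course_ids" fields are assumptions.
const courses = [
  { _id: 147852, title: "Course 1" },
  { _id: 963258, title: "Course 2" },
];
const students = [
  { _id: 123456, name: "Student A", course_ids: [147852, 963258] },
  { _id: 782654, name: "Student B", course_ids: [963258] },
];

// Index the courses by _id once so each reference resolves without a scan.
const courseById = new Map(courses.map((c) => [c._id, c]));

// Follow each course reference of a student and return the course titles.
function resolveCourses(student) {
  return student.course_ids
    .map((id) => courseById.get(id))
    .filter((course) => course !== undefined)
    .map((course) => course.title);
}

console.log(resolveCourses(students[0])); // [ 'Course 1', 'Course 2' ]
console.log(resolveCourses(students[1])); // [ 'Course 2' ]
```

This resolves the references in application code purely to show what a reference is. Against a real database the same join can be done on the server in one query with the $lookup aggregation stage, which keeps the query count low.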
Datatypes:
1. String
a. This is the most commonly used datatype for storing data. Strings in MongoDB must be
valid UTF-8.
2. Integer
a. This type is used to store a whole-number value. BSON provides both a 32-bit and a
64-bit integer type.
3. Boolean
a. This type is used to store a Boolean (true/ false) value.
4. Double
a. This type is used to store floating point values.
5. Undefined
a. This datatype stores undefined values. It is deprecated.
6. Null
a. This datatype stores a null value.
7. Min/ Max keys
a. This type is used to compare a value against the lowest and highest BSON elements.
8. Arrays
a. This type is used to store an array, a list, or multiple values under one key.
9. Timestamp
a. This type stores a timestamp, which can be handy for recording when a document was added or modified.
10. Object
a. This datatype is used for embedded documents.
11. Symbol
a. This datatype is similar to the string datatype. It is not supported by the shell;
if the shell gets a symbol from the database, it is converted into a string.
12. Date
a. This datatype is used to store a date and time in UNIX time format (milliseconds since
the epoch). You can specify your own date and time by creating a Date object and passing the
year, month, and day into it.
13. Object ID
a. This datatype is used to store the document’s ID.
14. Binary data
a. This datatype is used to store binary data.
15. JavaScript
a. This datatype is used to store JavaScript code into the document.
16. Regular expression
a. This datatype is used to store a regular expression.
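Several of the datatypes listed above have a direct JavaScript counterpart. The sketch below builds one hypothetical document (all field names are illustrative) and prints the JavaScript kind of each value. Which BSON type a value ends up with when it is stored depends on the shell or driver in use, so the comments name the intended type only.

```javascript
// Hypothetical document; the field names are illustrative only.
const sample = {
  name: "Customer One",               // String
  age: 30,                            // Integer
  height_m: 1.75,                     // Double
  is_active: true,                    // Boolean
  middle_name: null,                  // Null
  tags: ["new", "priority"],          // Array
  address: { state: "XYZ" },          // Object (embedded document)
  created_at: new Date(2020, 0, 15),  // Date: year, zero-based month, day
  name_pattern: /^Cust/i,             // Regular expression
};

// Classify a JavaScript value into a short, readable kind name.
function kindOf(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  if (value instanceof Date) return "date";
  if (value instanceof RegExp) return "regex";
  return typeof value;
}

for (const [field, value] of Object.entries(sample)) {
  console.log(`${field}: ${kindOf(value)}`);
}
```

Note that the JavaScript Date constructor takes the year first and counts months from zero, so new Date(2020, 0, 15) is 15 January 2020 in local time.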
Profiling:
The MongoDB profiler collects data about the database operations executed on a MongoDB instance. There
are three profiling levels in MongoDB:
1. 0 - The profiler is disabled and does not collect any data. This is the default level.
2. 1 - The profiler is enabled but only collects data for operations that take
longer than the slowms value.
3. 2 - The profiler is enabled and collects data for all operations.
To check whether the profiler is enabled or disabled, run this command in the mongo shell:
db.getProfilingLevel()
If you get 0, the profiler is disabled. To enable the profiler, run this command
in the mongo shell:
db.setProfilingLevel(1, { slowms: 20 })
The first parameter in the above command is the profiler level, and slowms is
the profiler threshold in milliseconds. The default threshold is 100 milliseconds, but you can
change it by giving your desired value in milliseconds.
The slowms threshold applies to all databases in a mongod instance. It is used by both the database
profiler and the diagnostic log, and should be set to the highest useful value to avoid performance
degradation.
We can also profile a random sample of slow operations by specifying sampleRate when enabling the
profiler. Use the command below to set sampleRate.
db.setProfilingLevel(1, { sampleRate: 0.42 })
The above command sets the profiling level for the current database to 1 and sets the profiler to sample
42% of all slow operations.
The default value of sampleRate is 1.0, meaning all slow operations are profiled. When
sampleRate is set between 0 and 1, databases with profiling level 1 will only profile a randomly sampled
percentage of slow operations according to sampleRate.
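How the profiling level, slowms, and sampleRate interact can be summarised in a small model. This is a simplified sketch of the rules described above, not MongoDB's actual implementation. The function shouldProfile and its parameters are hypothetical, and the random source is passed in so the sampling can be checked deterministically.

```javascript
// Simplified model of the profiler's decision, NOT MongoDB's actual code.
// `random` is injectable so the sampling step can be tested deterministically.
function shouldProfile(op, settings, random = Math.random) {
  const { level, slowms = 100, sampleRate = 1.0 } = settings;
  if (level === 0) return false;          // profiler disabled
  if (level === 2) return true;           // all operations are profiled
  if (op.millis <= slowms) return false;  // level 1: only slow operations
  return random() < sampleRate;           // keep a random share of slow ones
}

const settings = { level: 1, slowms: 20, sampleRate: 0.42 };
console.log(shouldProfile({ millis: 5 }, settings));              // false
console.log(shouldProfile({ millis: 50 }, settings, () => 0.1));  // true
console.log(shouldProfile({ millis: 50 }, settings, () => 0.9));  // false
```

With sampleRate at its default of 1.0 the last check always passes, so every operation slower than slowms is profiled.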
The database profiler logs information about database operations in the system.profile collection.
You can get the profiler data by querying the system.profile collection according to your requirements.
Given below is one example.
db.system.profile.find().limit(10).sort( { ts : -1 } ).pretty()
Aggregation Framework:
By using aggregation we can easily group the documents in a collection by specific conditions. We
can also add computed fields to the output. Because aggregation runs on the server, it is usually
faster than fetching documents with simple queries and processing them in the application.
Aggregation Process:
When working with the aggregation framework we use the documents in a collection as the input data,
perform our desired operations on that data in a series of stages, and get a refined output
that we can also store in a different collection. Each stage is independent of the other stages. The
data output by one stage is passed to the next stage, so the order of the stages is very important.
aggregate() is the method MongoDB provides for running an aggregation. It takes the list of stages
and gives us our desired results. Given below is basic pseudocode for aggregate():
db.<collectionname>.aggregate(
  [
    // stage 1
    { $<stageOperator>: <field> },
    // stage 2
    { $<stageOperator>: <field> }
  ]
)
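To make the stage-by-stage flow concrete, the sketch below defines a small pipeline and then reproduces its effect in plain JavaScript on an in-memory array. The orders collection, its fields, and the values are all hypothetical. The plain JavaScript part only imitates what the server would do and is there to show how the output of one stage becomes the input of the next.

```javascript
// Hypothetical input documents standing in for an "orders" collection.
const orders = [
  { customer: "A", status: "paid", amount: 30 },
  { customer: "B", status: "paid", amount: 80 },
  { customer: "A", status: "paid", amount: 20 },
  { customer: "B", status: "cancelled", amount: 70 },
];

// The pipeline as it could be passed to db.orders.aggregate(pipeline).
const pipeline = [
  { $match: { status: "paid" } },
  { $group: { _id: "$customer", total: { $sum: "$amount" } } },
  { $sort: { total: -1 } },
];

// Plain-JavaScript imitation of the three stages, applied in order.
const matched = orders.filter((o) => o.status === "paid");          // $match
const totals = new Map();
for (const o of matched) {                                          // $group
  totals.set(o.customer, (totals.get(o.customer) || 0) + o.amount);
}
const result = [...totals]
  .map(([_id, total]) => ({ _id, total }))
  .sort((a, b) => b.total - a.total);                               // $sort

console.log(result); // [ { _id: 'B', total: 80 }, { _id: 'A', total: 50 } ]
```

Swapping $match and $group would change the result, because the cancelled order would then be counted in the totals before any filtering happens.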
AllowDiskUse:
Each aggregation stage is allowed to use a maximum of 100 MB of RAM. If a stage exceeds this limit, the
server stops the operation and returns an error.
{allowDiskUse: true} tells the MongoDB server to write the stage's data to temporary files. In
this case the server spills to temporary files when the RAM limit is reached. This is recommended if we
are running aggregation queries on large data. In MongoDB versions before 6.0 the default aggregation
behaviour is to use RAM only.
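The option is passed as a second argument to aggregate(), separate from the pipeline itself. The sketch below only builds the two arguments as plain objects. The orders collection and the created_at field are assumptions, and the actual call is shown in a comment because it needs a running server.

```javascript
// Hypothetical pipeline: sorting a large collection can exceed the RAM limit.
const pipeline = [{ $sort: { created_at: -1 } }];

// Options document passed as the second argument to aggregate().
const options = { allowDiskUse: true };

// In the mongo shell this would be run as:
//   db.orders.aggregate(pipeline, options)
console.log(JSON.stringify({ pipeline, options }));
```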
Stages:
1. $match
2. $group
3. $count
4. $sort
5. $project
6. $limit
7. $unwind
8. Accumulator operators
a. $sum
b. $avg
c. $max
d. $min
9. Expression operators
a. $type
b. $lt
c. $and
d. $or
e. $gt
f. $multiply
These expression operators are used inside the $project stage. They can also be used inside the
$group stage when combined with the accumulator operators.
10. $out
This stage stores the output of the aggregation in a separate collection. It must be the last
stage of the pipeline. If the collection does not exist, $out creates it.
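The stages and operators listed above combine as in the following sketch. The pipeline is written as a plain JavaScript array so it can be inspected without a server. The field names (status, customer, amount), the tax factor, and the output collection name customer_totals are all hypothetical.

```javascript
// Hypothetical pipeline combining stages, accumulators, and expressions.
const pipeline = [
  { $match: { status: "paid" } },
  { $group: {
      _id: "$customer",
      orderCount: { $sum: 1 },          // accumulator: count documents
      total: { $sum: "$amount" },       // accumulator: add up a field
      average: { $avg: "$amount" },
      largest: { $max: "$amount" },
  } },
  { $project: {
      total: 1,
      average: 1,
      totalWithTax: { $multiply: ["$total", 1.1] },  // expression operator
      isBigSpender: { $gt: ["$total", 100] },        // expression operator
  } },
  { $sort: { total: -1 } },
  { $limit: 10 },
  { $out: "customer_totals" },          // must be the last stage
];

// List the stage names in execution order.
const stageNames = pipeline.map((stage) => Object.keys(stage)[0]);
console.log(stageNames.join(" -> "));
// $match -> $group -> $project -> $sort -> $limit -> $out
```

Placing $match first keeps the later stages working on as few documents as possible.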
Indexing:
Indexes let queries find matching documents without scanning the whole collection, so results come back
faster. The key to indexing is knowing what data our queries fetch and which type of index to create
on which fields to get results quickly and improve performance. In aggregation, create an index on the
fields used in $match or $group.
1. Single index
2. Compound index
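The two index types differ only in how many fields the key specification names. The sketch below writes both specifications as plain objects and prints a readable description of each. The customers collection, the chosen fields, and the describeIndex helper are assumptions for illustration; the createIndex calls are shown in comments because they need a running server.

```javascript
// Single-field index: one key, 1 = ascending, -1 = descending.
const singleIndex = { email: 1 };

// Compound index: several keys, applied in the order they are written.
const compoundIndex = { last_name: 1, first_name: -1 };

// In the mongo shell these would be created with:
//   db.customers.createIndex(singleIndex)
//   db.customers.createIndex(compoundIndex)

// Hypothetical helper that renders a key specification as text.
function describeIndex(spec) {
  return Object.entries(spec)
    .map(([field, dir]) => `${field} ${dir === 1 ? "ASC" : "DESC"}`)
    .join(", ");
}

console.log(describeIndex(singleIndex));   // email ASC
console.log(describeIndex(compoundIndex)); // last_name ASC, first_name DESC
```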
Schema Design Considerations:
1. The dataset
2. Generated data
3. Where the data is used
4. Relationships between the data
5. Fewer queries, each returning more data
6. How often the data changes (more important while indexing)
7. Document size must stay under 16 MB.
8. Update only the required fields instead of updating the whole document.
9. Avoid application joins.
If we take care of the above points while designing our database schema, it will help us choose
the modeling approach that is best for our schema.
Best Practices:
1. The application and the database should be hosted on the same network.
2. Create authenticated users for database access.
3. Use appropriate indexes for queries to get better performance.
4. Use a replica set for high availability.
5. Back up data regularly.
6. Keep the number of array elements well below four figures (fewer than 1,000).