U5 01 MongoDB
UNIT-V
MongoDB
MongoDB: https://www.mongodb.com/home
Mongo DB
MongoDB (the name comes from "humongous") is an open-source document database that
provides high performance, high availability, and automatic scaling.
MongoDB internally uses Mozilla's SpiderMonkey JavaScript
engine.
There are two common types of databases, compared below: relational (RDBMS) and document-based (MongoDB).
RDBMS: Not suitable for hierarchical data storage. MongoDB: Suitable for hierarchical data storage.
RDBMS: It is row-based. MongoDB: It is document-based.
RDBMS: It is slower in comparison with MongoDB. MongoDB: It is often claimed to be up to 100 times faster than an RDBMS, depending on the workload.
6. We can store very large amounts of data, so scalability is high.
JSON vs BSON
Type: JSON files are written in text format. BSON files are written in binary.
Speed: JSON is fast to read but slower to build. BSON is slow to read but faster to build and scan.
Space: JSON data is slightly smaller in byte size. BSON data is slightly larger in byte size.
Encode and Decode: We can send JSON through APIs without encoding and decoding. BSON files are encoded before storing and decoded before displaying.
Parse: JSON is a human-readable format that doesn't require parsing. BSON needs to be parsed, as it is machine-generated and not human-readable.
Data Types: JSON has a specific set of data types: string, boolean, number for numeric data types, array, object, and null. Unlike JSON, BSON offers additional data types such as bindata for binary data and decimal128 for numeric.
Usage: JSON is used to send data through the network (mostly through APIs). Databases use BSON to store data.
MongoDB Shell vs MongoDB Server:
C --->Create
R --->Retrieve
U --->Update
D --->Delete
MongoDB Installation:
https://www.mongodb.com/try/download/community
By default, MongoDB listens for connections from clients on port 27017, and stores
Total Characters
•Total Length: 12 bytes × 2 characters per byte = 24 characters in the hexadecimal string.
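The length arithmetic above can be checked with a small sketch. It assumes the 12 bytes are those of an ObjectId (4-byte timestamp, 5-byte random value, 3-byte counter) and builds such a value by hand in plain Node.js. The function name makeIdHex is made up for illustration; this is not how a MongoDB driver generates ids.

```javascript
// Sketch only: builds a 12-byte value laid out like an ObjectId
// (4-byte timestamp, 5-byte random value, 3-byte counter) without any driver.
function makeIdHex(counter) {
  const bytes = Buffer.alloc(12);
  bytes.writeUInt32BE(Math.floor(Date.now() / 1000), 0); // bytes 0-3: seconds since epoch
  for (let i = 4; i < 9; i++) {
    bytes[i] = Math.floor(Math.random() * 256); // bytes 4-8: random value
  }
  bytes.writeUIntBE(counter & 0xffffff, 9, 3); // bytes 9-11: counter
  return bytes.toString("hex"); // each byte becomes 2 hex characters
}

const idHex = makeIdHex(1);
console.log(idHex.length); // 24
```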
Mongo DB Data Types
String: This is the most commonly used data type in MongoDB for storing
data. BSON strings are UTF-8.
The drivers for each programming language therefore convert between the
language's string format and UTF-8 when serializing and deserializing
BSON.
A string must be valid UTF-8.
Integer: In MongoDB, the integer data type is used to store an integer value.
Integers can be stored in two forms: 32-bit signed integer and 64-bit
signed integer.
Double: The double data type is used to store the floating-point values.
Boolean: The Boolean data type is used to store either true or false.
Null: The null data type is used to store the null value.
Array: An array is a list of values. It can hold values of the same or of different
data types.
In MongoDB, an array is written using square brackets ([]).
Object: Object data type stores embedded documents.
Date: The Date data type stores a date. It is a 64-bit integer that represents the
number of milliseconds since the Unix epoch (January 1, 1970).
new ISODate(): This also returns a date object, using the ISODate() wrapper.
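The data types above can be seen together in one document. The following is a sketch as a plain JavaScript object; all field names and values are made up for illustration. Plain JavaScript has a single number type, so the 32-bit/64-bit integer and double distinction only appears once a driver or the shell converts the value to BSON.

```javascript
// Illustrative document; every field name and value here is hypothetical.
const student = {
  name: "Karthick",           // String (stored as UTF-8 in BSON)
  rollno: 101,                // Integer
  average: 92.5,              // Double
  active: true,               // Boolean
  remarks: null,              // Null
  marks: [98, 99, "absent"],  // Array: values of different types are allowed
  address: { city: "Hyd" },   // Object: an embedded document
  joined: new Date(0)         // Date: 0 milliseconds since the Unix epoch
};

console.log(student.joined.toISOString()); // 1970-01-01T00:00:00.000Z
```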
Creation of Database and Collection
A database is not created up front; it is created dynamically.
Whenever we create a collection or insert a document, the database is
created if it does not already exist.
> show dbs
admin 0.000GB
config 0.000GB
local 0.000GB
> use it3
switched to db it3
> show dbs
admin 0.000GB
config 0.000GB
local 0.000GB
How to create collection
db.createCollection("employees")
> use it3
> show dbs
admin 0.000GB
config 0.000GB
local 0.000GB
> db.createCollection("employees")
{ "ok" : 1 }
> show dbs
admin 0.000GB
config 0.000GB
it3 0.000GB
local 0.000GB
> show collections
employees
How to drop collection?
db.collection.drop()
db.students.drop()
> show collections
employees
students
> db.students.drop()
true
> show collections
employees
How to drop database?
db.dropDatabase() - current database will be deleted.
Note: db.getName() to know current database.
> show dbs
admin 0.000GB
config 0.000GB
it3 0.000GB
local 0.000GB
> db.dropDatabase()
{ "dropped" : "it3", "ok" : 1 }
> show dbs
admin 0.000GB
config 0.000GB
local 0.000GB
Basic CRUD Operations
C--->Create|Insert document
db.employees.insertOne({...})
db.employees.insertOne({eno:100,ename:"Sunny",esal:1000,eaddr:"Hyd"})
db.collection.insertMany([{..},{..},{..},{..}])
db.employees.insertOne(emp)
db.employees.insertMany([emp])
db.employees.insert(emp)
db.employees.insert([emp])
Inserting Documents from java script file:
studentsdb --->database
students--->collection
students.js:
db.students.insertOne({name: "Karthick", rollno: 101, marks: 98 })
db.students.insertOne({name: "Ravi", rollno: 102, marks: 99 })
db.students.insertOne({name: "Shiva", rollno: 103, marks: 100 })
db.students.insertOne({name: "Pavan", rollno: 104, marks: 80 })
load("D:\\students.js")
> show collections
> load("D:\\students.js")
true
> show collections
students
R--->Read / Retrieval Operation
db.collection.find() --->To get all documents present in the given collection.
db.collection.findOne() --->To get one document.
eg: db.employees.find()
> db.employees.find()
{ "_id" : ObjectId("5fe16d547789dad6d1278927"), "eno" : 100, "ename" : "Sunny", "esal" : 1000, "eaddr" : "Hyd" }
{ "_id" : ObjectId("5fe16da07789dad6d1278928"), "eno" : 200, "ename" : "Bunny", "esal" : 2000, "eaddr" : "Mumbai" }
{ "_id" : ObjectId("5fe16dc67789dad6d1278929"), "eno" : 300, "ename" : "Chinny", "esal" : 3000, "eaddr" : "Chennai" }
{ "_id" : ObjectId("5fe16ddb7789dad6d127892a"), "eno" : 400, "ename" : "Vinny", "esal" : 4000, "eaddr" : "Delhi" }
> db.employees.find().pretty()
U-->Update Operation
db.collection.updateOne()
db.collection.updateMany()
db.collection.replaceOne()
If the field already exists, its old value is replaced with the new value.
If the field does not exist yet, it is created.
The update document must contain atomic update operators such as $set.
> db.employees.updateOne({ename: "Vinny"},{$set: {esal:10000}})
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }
Note: Anything prefixed with the $ symbol is a predefined word (operator) in MongoDB.
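The replace-or-create behaviour of $set described above can be imitated on a plain JavaScript object. This is a sketch only: applySet is a hypothetical helper, not a MongoDB function, and it handles top-level fields only. The dept field is made up for illustration.

```javascript
// Hypothetical helper that mimics what $set does to one document.
function applySet(doc, fields) {
  return { ...doc, ...fields }; // existing keys are overwritten, new keys are added
}

const emp = { eno: 400, ename: "Vinny", esal: 4000, eaddr: "Delhi" };
const raised = applySet(emp, { esal: 10000 }); // field exists: value replaced
const tagged = applySet(emp, { dept: "IT" });  // field missing: field created

console.log(raised.esal, tagged.dept); // 10000 IT
```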
D -->Delete
db.collection.deleteOne()
db.collection.deleteMany()
db.employees.deleteOne({ename:"Vinny"})
What is capped collection?
If the size limit is exceeded or the maximum number of documents is reached, the oldest entry is
deleted automatically. Collections of this type are called capped collections.
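The "oldest entry is deleted" rule can be sketched in plain JavaScript. The class name CappedList is made up, and only the document-count limit is modelled, not the byte-size limit. In the shell, a real capped collection is created with db.createCollection(name, {capped: true, size: ..., max: ...}).

```javascript
// Sketch of the capped-collection rule using a plain array (no MongoDB involved).
class CappedList {
  constructor(max) {
    this.max = max;   // maximum number of documents
    this.docs = [];
  }
  insert(doc) {
    this.docs.push(doc);
    if (this.docs.length > this.max) {
      this.docs.shift(); // limit reached: the oldest document is removed
    }
  }
}

const capped = new CappedList(3);
for (let i = 1; i <= 4; i++) capped.insert({ n: i });
console.log(capped.docs.map(d => d.n)); // [ 2, 3, 4 ]
```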
CRUD Capped Collections
> use it3
> db.createCollection("employees")
db.createCollection(name)
db.createCollection(name,options)
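//capped
max: 1000 documents ---> inserting the 1001st document removes the oldest document
size: 3736578 bytes only ---> when the space is used up, the oldest documents are removed
If size or max is given without capped: true, the server reports:
"errmsg" : "the 'capped' field needs to be true when either the 'size' or 'max' fields are present"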
Data modeling is the process of creating a clean data model of how you will store
data in a database.
The goal of data modeling is to identify all the data components of a system, how
they are connected, and what are the best ways to represent these relationships.
MongoDB Data Modelling & Schema
Data models consist of the following components:
Tangible Entity : Entities that exist in the real world physically. Example: Person, car,
etc.
Intangible Entity : Entities that exist only logically and have no physical existence.
Example: Bank Account, etc.
In document databases, each document is an entity. In tabular databases, each row is
an entity.
Entity types: The categories used to group entities. For example, the
book entity with the title “Alice in Wonderland” belongs to the entity
type “book.”
Attributes: The characteristics of an entity. For example, the entity
“book” has the attributes ISBN (String) and title (String).
Relationships: The connections between the entities. For
example, one user can borrow many books at a time. The relationship
between the entities "users" and "books" is one to many.
One to one (1-1): In this type, one value is associated with only one document.
For example, each book can have only one ISBN.
One to many (1-N): Here, one value can be associated with more than one document or
value. For example, a user can borrow more than one book at a time.
Many to many (N-N): In this type of model, multiple documents can be associated with
each other. For example, a book can have many authors, and one author can write many
different books.
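The three relationship types above can be sketched with plain JavaScript objects. All ids, names, and ISBN values below are placeholders, and storing arrays of ids is only one possible way to model these relationships.

```javascript
// Placeholder data: ids, names and ISBN strings are made up for illustration.
const authors = [
  { _id: "a1", name: "Author One" },
  { _id: "a2", name: "Author Two" }
];
const books = [
  // 1-1: each book carries exactly one ISBN
  // N-N: a book lists its authors, and an author can appear in many books
  { _id: "b1", title: "Alice in Wonderland", isbn: "ISBN-1", authorIds: ["a1"] },
  { _id: "b2", title: "Book Two", isbn: "ISBN-2", authorIds: ["a1", "a2"] }
];
// 1-N: one user borrows many books
const user = { _id: "u1", name: "User One", borrowedBookIds: ["b1", "b2"] };

const borrowedTitles = user.borrowedBookIds.map(
  id => books.find(b => b._id === id).title
);
const booksByA1 = books.filter(b => b.authorIds.includes("a1")).map(b => b.title);
console.log(borrowedTitles, booksByA1);
```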
Advantages of Data Modeling:
Ensures better database planning, design, and implementation, leading to
improved application performance.
Promotes faster application development through easier object mapping.
Better discovery, standardization, and documentation of multiple data
sources.
Allows organizations to think of long-term solutions and model data
considering not only current projects, but also future requirements of the
application.
What are the (three) different types of data models?
Conceptual Data Model: The conceptual data model explains what the
Logical Data Model: The logical data model will describe how the data will
be structured.
Physical Data Model: The physical data model represents how the data will
Data in MongoDB has a flexible schema for the documents in the same
collection.
They do not need to have the same set of fields or structure Common
In this model, you can have (embed) all the related data in a single
https://www.mongodb.com/try/download/database-tools
C:\Program Files\MongoDB\Server\4.4\bin
Note:
getIndexes()
MongoDB Indexes
Example – 1 (unique)
Example – 2 (background)
Example – 3 (Custom)
Drop Indexes
Deletes the specified index on a collection.
Syntax: db.collection_name.dropIndex("name of the index")
Example: db.collection_name.dropIndex({key:1})
Aggregation pipelines, which are the preferred method for performing aggregations.
Single purpose aggregation methods, which are simple but lack the capabilities of an aggregation
pipeline.
MongoDB Aggregation
Single Purpose Aggregation Methods
The single purpose aggregation methods aggregate documents from a single
collection. The methods are simple but lack the capabilities of an aggregation
pipeline.
Method Description
Aggregation in MongoDB
Aggregation Pipelines
The aggregate() method can process documents and return results in a
customized format.
The find() method, in contrast, always returns the data as it is stored, without
any processing.
To find the total salary of all employees:
> db.employees.aggregate([
{ $group: {_id:null,totalsalary:{$sum:"$esal"}}}
])
$avg: It calculates the average of all given values from all documents.
db.employees.insertOne({eno:200,ename:"Ani",esal:2000,eaddr:"Hyderabad"})
db.employees.insertOne({eno:300,ename:"Varun",esal:3000,eaddr:"Hyderabad"})
db.employees.insertOne({eno:400,ename:"Vickram",esal:4000,eaddr:"Mumbai"})
db.employees.insertOne({eno:500,ename:"Ramesh",esal:5000,eaddr:"Chennai"})
db.employees.insertOne({eno:600,ename:"Kannan",esal:6000,eaddr:"Chennai"})
db.employees.insertOne({eno:700,ename:"Rajesh",esal:7000,eaddr:"Hyderabad"})
Ex-1: To find the total salary of all employees:
> db.employees.aggregate([
{ $group: {_id:null,totalsalary:{$sum:"$esal"}}}
])
{ "_id" : null, "totalsalary" : 28000 }
Ex-2: To find the average salary of all employees:
> db.employees.aggregate([
{ $group: {_id:null,averagesalary:{$avg:"$esal"}}}
])
{ "_id" : null, "averagesalary" : 4000 }
Ex-3: To find the max salary of all employees:
> db.employees.aggregate([
{ $group: {_id:null,maxsalary:{$max:"$esal"}}}
])
{ "_id" : null, "maxsalary" : 7000 }
Ex-5: To find max salary city wise?
>db.employees.aggregate([
{$group: {_id:"$eaddr",maxSalary:{$max:"$esal"}}}
])
o/p:
{ "_id" : "Mumbai", "maxSalary" : 4000 }
{ "_id" : "Hyderabad", "maxSalary" : 7000 }
{ "_id" : "Chennai", "maxSalary" : 6000 }
Ex-6: To find the total number of employees (for every document, add 1 to employeecount):
> db.employees.aggregate([
{$group: {_id:null,employeecount:{$sum:1}}}
])
{ "_id" : null, "employeecount" : 7 }
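What the $sum, $avg, and $max accumulators compute with _id:null can be reproduced on a plain array. This is a sketch in ordinary JavaScript, not MongoDB code. The first employee (eno 100) is not among the inserts listed above; a salary of 1000 in Mumbai is assumed for it because that is what the outputs imply.

```javascript
// Plain-array stand-in for the employees collection.
const employees = [
  { eno: 100, esal: 1000, eaddr: "Mumbai" },   // assumed record, see note above
  { eno: 200, esal: 2000, eaddr: "Hyderabad" },
  { eno: 300, esal: 3000, eaddr: "Hyderabad" },
  { eno: 400, esal: 4000, eaddr: "Mumbai" },
  { eno: 500, esal: 5000, eaddr: "Chennai" },
  { eno: 600, esal: 6000, eaddr: "Chennai" },
  { eno: 700, esal: 7000, eaddr: "Hyderabad" }
];

const salaries = employees.map(e => e.esal);
const totalsalary = salaries.reduce((a, b) => a + b, 0);  // like {$sum: "$esal"}
const averagesalary = totalsalary / salaries.length;      // like {$avg: "$esal"}
const maxsalary = Math.max(...salaries);                  // like {$max: "$esal"}
const employeecount = employees.reduce(n => n + 1, 0);    // like {$sum: 1}

console.log({ totalsalary, averagesalary, maxsalary, employeecount });
```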
MongoDB Aggregation Pipeline
1-Find city wise sum of salaries and print based on descending order of total salary?
>db.employees.aggregate([
{ $group: {_id:"$eaddr",totalSalary:{$sum:"$esal"}}},
{ $sort:{totalSalary: -1}}
])
Output:
{ "_id" : "Hyderabad", "totalSalary" : 12000 }
{ "_id" : "Chennai", "totalSalary" : 11000 }
{ "_id" : "Mumbai", "totalSalary" : 5000 }
The <sort order> can be either 1 or -1.
1 --->Ascending Order
-1 ---> Descending Order
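The $group followed by $sort idea, with sort order 1 or -1, can be sketched in plain JavaScript. The helper names groupByCity and sortBy are made up, and the eno 100 record (salary 1000, Mumbai) is an assumption taken from the outputs shown.

```javascript
// Plain-array stand-in for the employees collection (eno 100 is assumed).
const employees = [
  { eno: 100, esal: 1000, eaddr: "Mumbai" },
  { eno: 200, esal: 2000, eaddr: "Hyderabad" },
  { eno: 300, esal: 3000, eaddr: "Hyderabad" },
  { eno: 400, esal: 4000, eaddr: "Mumbai" },
  { eno: 500, esal: 5000, eaddr: "Chennai" },
  { eno: 600, esal: 6000, eaddr: "Chennai" },
  { eno: 700, esal: 7000, eaddr: "Hyderabad" }
];

// Like { $group: { _id: "$eaddr", totalSalary: { $sum: "$esal" } } }
function groupByCity(docs) {
  const groups = new Map();
  for (const d of docs) {
    const g = groups.get(d.eaddr) || { _id: d.eaddr, totalSalary: 0 };
    g.totalSalary += d.esal;
    groups.set(d.eaddr, g);
  }
  return [...groups.values()];
}

// Like { $sort: { field: order } } with order 1 (ascending) or -1 (descending)
function sortBy(docs, field, order) {
  return [...docs].sort((a, b) =>
    a[field] < b[field] ? -order : a[field] > b[field] ? order : 0
  );
}

const descending = sortBy(groupByCity(employees), "totalSalary", -1);
const ascending = sortBy(groupByCity(employees), "totalSalary", 1);
console.log(descending);
```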
Find city wise number of employees and print based on alphabetical order of city name?
>db.employees.aggregate([
{ $group: {_id:"$eaddr", employeeCount: {$sum:1}}},
{ $sort: {_id:1}}
])
Output:
{ "_id" : "Chennai", "employeeCount" : 2 }
{ "_id" : "Hyderabad", "employeeCount" : 3 }
{ "_id" : "Mumbai", "employeeCount" : 2 }
$project stage: { $project: { field:0|1 } }
0 or false --->To exclude the field
1 or true --->To include the field
To find the total salary of all employees without displaying _id:
db.employees.aggregate([
{ $group: {_id:null,totalSalary:{$sum:"$esal"}}},
{ $project: {_id:0}}
])
To find the number of employees whose salary greater than 1500. Find such employees count city
wise. Display documents in ascending order of employee count.
db.employees.aggregate([
{$match: {esal:{$gt: 1500}}},
{$group: {_id:"$eaddr",employeeCount:{$sum:1}}},
{$sort: {employeeCount:1}}
])
Output:
{ "_id" : "Mumbai", "employeeCount" : 1 }
{ "_id" : "Chennai", "employeeCount" : 2 }
{ "_id" : "Hyderabad", "employeeCount" : 3 }
$limit stage: It limits the number of documents passed to the next stage in the pipeline.
Syntax: { $limit: <positive integer> }
To find the number of employees whose salary greater than 1500. Find such employees count city
wise. Display documents in ascending order of employee count. But only display 2 documents.
>db.employees.aggregate([
{ $match: {esal: {$gt: 1500}}},
{ $group: {_id:"$eaddr", employeeCount: {$sum:1}}},
{ $sort: { employeeCount: 1}},
{ $limit: 2}
])
Output:
{ "_id" : "Mumbai", "employeeCount" : 1 }
{ "_id" : "Chennai", "employeeCount" : 2 }
$skip stage: $skip takes a positive integer that specifies the maximum number of documents to skip.
Syntax: { $skip: <positive integer> }
To find the number of employees whose salary greater than 1500. Find such employees count city
wise. Display documents in ascending order of employee count. Skip first 2 documents and display
only remaining documents?
db.employees.aggregate([
{ $match: {esal: {$gt: 1500}}},
{ $group: {_id:"$eaddr", employeeCount: {$sum:1}}},
{ $sort: { employeeCount: 1}},
{ $skip: 2}
])
Output:
{ "_id" : "Hyderabad", "employeeCount" : 3 }
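The $match, $group, $sort, $limit, and $skip stages used above can be traced step by step on a plain array. This is a sketch in ordinary JavaScript, not MongoDB code; the eno 100 record (salary 1000, Mumbai) is an assumption taken from the outputs shown.

```javascript
// Plain-array stand-in for the employees collection (eno 100 is assumed).
const employees = [
  { eno: 100, esal: 1000, eaddr: "Mumbai" },
  { eno: 200, esal: 2000, eaddr: "Hyderabad" },
  { eno: 300, esal: 3000, eaddr: "Hyderabad" },
  { eno: 400, esal: 4000, eaddr: "Mumbai" },
  { eno: 500, esal: 5000, eaddr: "Chennai" },
  { eno: 600, esal: 6000, eaddr: "Chennai" },
  { eno: 700, esal: 7000, eaddr: "Hyderabad" }
];

// $match: keep only employees with esal > 1500
const matched = employees.filter(e => e.esal > 1500);

// $group: count employees per city
const counts = {};
for (const e of matched) counts[e.eaddr] = (counts[e.eaddr] || 0) + 1;
const grouped = Object.entries(counts).map(
  ([_id, employeeCount]) => ({ _id, employeeCount })
);

// $sort: ascending order of employeeCount
const sorted = [...grouped].sort((a, b) => a.employeeCount - b.employeeCount);

const limited = sorted.slice(0, 2); // $limit: 2 passes on only the first 2 documents
const skipped = sorted.slice(2);    // $skip: 2 drops the first 2 documents

console.log(limited, skipped);
```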
$out stage:
It takes the documents returned by the aggregation pipeline and writes them to a specified
collection.
{ $out: "collectionName" }
To find the number of employees whose salary greater than 1500. Find such employees
count city wise. Rearrange documents in ascending order of employee count. Write result to
cityWiseEmployeeCount collection?
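db.employees.aggregate([
{ $match: {esal: {$gt: 1500}}},
{ $group: {_id:"$eaddr", employeeCount: {$sum:1}}},
{ $sort: { employeeCount: 1}},
{ $out: "cityWiseEmployeeCount"}
])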
Replication in MongoDB
A replica set in MongoDB is a group of mongod processes that maintain the same
data set.
Replica sets provide redundancy and high availability, and are the basis for all
production deployments.
A replica set contains several data bearing nodes and optionally one arbiter node.
Of the data bearing nodes, one and only one member is deemed the primary node,
[Figure: replica set members on ports 27017, 27018, and 27019]
Why do you need Replication?
Replication keeps copies of the data on several servers, which makes the data available all the time.
Keeping copies of the data allows users to recover data from any of the secondary servers if
If there is any server with system failure or downtime for maintenance or indexing
Each mongod command starts a MongoDB instance on a specified port with the same replica
set name m101 and different data directories and log files.
The config object defines the replica set ID and its members (the three MongoDB instances).
The rs.initiate(config) command initializes the replica set with this configuration.
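A config object of the kind described above might look like the following sketch. The set name m101 and the three ports come from these notes; the host name localhost is an assumption. The object is plain JavaScript, and rs.initiate(config) itself only works inside a shell connected to one of the running mongod instances.

```javascript
// Sketch of a replica set configuration object (host names are assumed).
const config = {
  _id: "m101", // replica set name, must match the name the mongod instances were started with
  members: [
    { _id: 0, host: "localhost:27017" },
    { _id: 1, host: "localhost:27018" },
    { _id: 2, host: "localhost:27019" }
  ]
};

// In the mongo shell, connected to one member: rs.initiate(config)
console.log(config.members.map(m => m.host));
```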
Inserting Data:
A document is inserted into the College collection in the primary node. This data will be
replicated to the secondary nodes.
Querying Data on Secondaries:
Checking Status:
The rs.status() command provides the current status of the replica set, showing which node is
To start the mongod instance, specify the port value for your Mongo instance along
with the path to your MongoDB installation on your system.
A Replica Set contains multiple instances that communicate with each other. To establish
communication between them, you need to specify the hostname/localhost along with
their IPs as follows.
Insert and Query Data:
db.College.find();
TERMINAL1:
use admin;
db.shutdownServer();
To check the status of the replication, you can use the status command as follows:
rs.status()
You can test the process by adding a document in the primary node. If replication is
working properly, the document will automatically be copied into the secondary node.
shard: Each shard contains a subset of the sharded data. Each shard must be deployed
as a replica set.
mongos: The mongos acts as a query router, providing an interface between client
config servers: Config servers store metadata and configuration settings for the
nodes also called shards. MongoDB uses sharding to deploy large datasets and
Database servers with large datasets face several challenges: a high query
rate can exhaust the CPU capacity of the server, the working set size can
exceed the physical memory, and the read/write throughput can exceed the I/O capacity.
Sharding in MongoDB
Vertical Scaling involves increasing the capacity of a single server, such as using a
more powerful CPU, adding more RAM, or increasing the amount of storage space.
Horizontal Scaling involves dividing the system dataset and load over multiple
servers, adding additional servers to increase capacity as required. MongoDB
supports horizontal scaling through sharding.
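The idea of dividing a dataset over several servers can be sketched by routing each document to a shard based on a key. The function shardFor and its simple string hash are made up for illustration; MongoDB's own sharding assigns ranges of shard-key values (chunks) to shards and uses its own hash function for hashed shard keys.

```javascript
// Hypothetical routing function: maps a key to one of shardCount shards.
function shardFor(key, shardCount) {
  let h = 0;
  for (const ch of String(key)) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0; // simple 32-bit string hash
  }
  return h % shardCount;
}

// Three in-memory arrays stand in for three shards.
const shards = [[], [], []];
for (let eno = 100; eno <= 700; eno += 100) {
  shards[shardFor(eno, shards.length)].push(eno);
}

console.log(shards); // every eno lands on exactly one shard
```

Adding a server in this sketch means adding another array; a real cluster also has to move existing data between shards, which is not modelled here.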