MongoDB Report
Contents
1 Abstract
2 Objectives
3 Literature Review
    3.1 What is NoSQL Database?
    3.2 Types of NoSQL Databases
    3.3 SQL vs NoSQL
    3.4 When to Use NoSQL?
4 Introduction
    4.1 What is MongoDB?
    4.2 Features of MongoDB
    4.3 Who is using MongoDB?
5 Implementation
    5.1 MongoDB Installation
    5.2 MongoDB Queries and Commands
    5.3 Collection Commands
    5.4 CRUD Operations (Create, Read, Update, Delete)
    5.5 Indexing Commands
    5.6 Aggregation Commands
    5.7 User and Role Management
    5.8 Backup and Restore Commands
    5.9 Server Status & Monitoring
    5.10 MongoDB Compass & Atlas
6 Query Execution
    6.1 Placement Data Analysis
    6.2 COVID-19 Data Analysis
7 Challenges and Limitations of MongoDB
    7.1 Challenges
    7.2 Limitations
8 Conclusion
1 Abstract
MongoDB is a NoSQL database that provides high scalability, flexibility, and
performance for handling unstructured and semi-structured data. Unlike
traditional relational databases, MongoDB uses a document-oriented model
and stores data in BSON (Binary JSON) format. Related data can be embedded
in a single document and retrieved without joins, and collections can be
scaled horizontally through sharding. This project explores the core
architecture of MongoDB, including database creation, collection management,
CRUD operations, indexing, aggregation, replication, and sharding.
The project also highlights the advantages of MongoDB over SQL databases,
particularly in big data applications, real-time analytics, and cloud-based
systems. Challenges such as schema design complexity, high memory
consumption, and limited support for complex transactions are discussed,
along with potential solutions. By implementing practical use cases and
performance analysis, this project demonstrates MongoDB’s effectiveness in
modern database management. Future enhancements could include AI-driven
analytics integration and further optimization techniques. Overall, MongoDB
is a robust solution for applications requiring high availability,
scalability, and flexibility in data storage and management.
2 Objectives
Understanding:
• What is NoSQL?
• Types of NoSQL
• What is MongoDB?
• Features of MongoDB
• MongoDB Installation
• MongoDB Querying
3 Literature Review
3.1 What is NoSQL Database?
NoSQL (’non-relational’ or ’Not only SQL’) databases store and retrieve data
in non-tabular formats. They are designed to handle large volumes of
unstructured, semi-structured, or structured data. Unlike traditional
relational (SQL) databases, which use tables and fixed schemas, NoSQL
databases offer flexible schemas, horizontal scalability, and high
availability, and they typically avoid joins. This makes them well suited to
big data applications, real-time web apps, and distributed systems.
• Caching & Session Storage (e.g., Redis for high-speed key-value storage)
3.2 Types of NoSQL Databases
1. Document Stores (e.g., MongoDB, CouchDB) – Store data as JSON-
like documents, making them suitable for hierarchical and nested data.
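As a minimal sketch of what "hierarchical and nested data" looks like in a document store, the plain JavaScript object below mimics a JSON-like document. The field names (`address`, `courses`, and so on) are hypothetical and chosen only for illustration.

```javascript
// Hypothetical student document; every field name here is illustrative only.
const student = {
  name: "Alice",
  age: 30,
  address: { city: "London", country: "UK" }, // embedded sub-document
  courses: [                                   // array of sub-documents
    { title: "Databases", grade: 82 },
    { title: "Statistics", grade: 76 }
  ]
};

// Nested values are reached by walking the structure; no join is needed.
const city = student.address.city;
const grades = student.courses.map((c) => c.grade);

console.log(city);   // London
console.log(grades); // [ 82, 76 ]
```

In a relational design the address and courses would usually live in separate tables; here they travel with the parent record.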
Feature             | SQL (Relational)               | NoSQL (Non-Relational)
--------------------+--------------------------------+---------------------------------------------------
Data Model          | Table-based with fixed schemas | Document, key-value, column-family, or graph-based
Schema              | Rigid, predefined              | Flexible, dynamic
Scaling             | Vertical (scale-up)            | Horizontal (scale-out)
ACID Compliance     | Full                           | Varies (often sacrificed for performance)
Query Language      | SQL (standardized)             | Database-specific
Consistency         | Strong consistency             | Eventual consistency (typically)
Transaction Support | Multi-row, multi-table         | Limited (though improving)
Performance         | Optimized for complex queries  | Optimized for high-volume, simple operations
Best For            | Complex queries, transactions  | Big data, real-time applications, flexible schemas
Examples            | MySQL, PostgreSQL, Oracle      | MongoDB, Cassandra, Redis, Neo4j
8. Social Media and Recommendation Systems – Graph and document-based
NoSQL databases excel at managing complex relationships and
personalization.
9. Real-Time Analytics and Logging – NoSQL is ideal for fast data ingestion
and analysis in real-time analytics and monitoring applications.
4 Introduction
4.1 What is MongoDB?
MongoDB is a document-oriented NoSQL database that stores data in BSON
(Binary JSON), a JSON-like binary format. Unlike relational databases, it
does not require a fixed schema, which makes it well suited to unstructured
or semi-structured data. MongoDB scales horizontally through sharding and
provides high availability through replication. It is widely used in big
data, real-time applications, IoT, and content management systems.
MongoDB provides a rich query language with support for indexing,
aggregation, and full-text search, which makes data retrieval efficient.
Replica sets provide automatic failover, and sharding distributes data
across nodes, so MongoDB is also a popular choice for modern web
applications, mobile backends, and cloud-based or distributed applications
that handle large-scale, high-velocity data.
4. Replication & High Availability – Uses replica sets to ensure data
redundancy and automatic failover.
7. High Performance – Optimized for fast read and write operations, making
it suitable for real-time applications.
5 Implementation
5.1 MongoDB Installation
5.1.1 Installing MongoDB on Linux (Ubuntu/Debian)
Step 1: Import the MongoDB Repository
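# Open a terminal and import the MongoDB 6.0 public GPG key:
curl -fsSL https://pgp.mongodb.com/server-6.0.asc | sudo gpg --dearmor -o /usr/share/keyrings/mongodb-server-6.0.gpg

# Install MongoDB (the MongoDB apt repository must already be configured and
# `sudo apt update` run; see the official installation guide for the exact entry):
sudo apt install -y mongodb-org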
MongoDB is available in two editions: Community (Free) and Enterprise
(Paid).
Installation links:
• Community Edition: https://www.mongodb.com/docs/manual/tutorial/install-mongodb-on-ubuntu/
• Enterprise Edition: https://www.mongodb.com/docs/manual/tutorial/install-mongodb-enterprise-on-ubuntu/
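// List all databases
show dbs
Example Output:
admin   0.000 GB
config  0.000 GB
local   0.000 GB
// Switch to a database (it is created on the first write)
use myDatabase
// Show the name of the current database
db
// Drop the current database
db.dropDatabase()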
5.3 Collection Commands
5.3.1 Show Collections (Tables in SQL)
show collections
// Drop the users collection
db.users.drop()
Multiple Documents
db.users.insertMany([
  { name: "Alice", age: 30, city: "London" },
  { name: "Bob", age: 27, city: "Paris" }
])
Find with Condition
db.users.find({ age: 25 })
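To make the meaning of a filter document such as `{ age: 25 }` concrete, the sketch below re-implements a small part of the matching logic in plain JavaScript so it can run without a server. The `matches` helper and the sample data are hypothetical; it covers only equality and the `$gt`, `$gte`, `$lt`, `$lte` operators on top-level fields and is not how MongoDB itself evaluates queries.

```javascript
// Minimal stand-in for a query filter: supports equality and the
// $gt, $gte, $lt, $lte operators on top-level fields only.
function matches(doc, filter) {
  return Object.entries(filter).every(([field, cond]) => {
    const value = doc[field];
    if (cond !== null && typeof cond === "object") {
      return Object.entries(cond).every(([op, bound]) => {
        switch (op) {
          case "$gt": return value > bound;
          case "$gte": return value >= bound;
          case "$lt": return value < bound;
          case "$lte": return value <= bound;
          default: throw new Error("unsupported operator: " + op);
        }
      });
    }
    return value === cond; // plain equality, as in { age: 25 }
  });
}

// Hypothetical sample data.
const sampleUsers = [
  { name: "Alice", age: 30, city: "London" },
  { name: "Bob", age: 27, city: "Paris" },
  { name: "Carol", age: 25, city: "London" }
];

const exact = sampleUsers.filter((u) => matches(u, { age: 25 }));
const older = sampleUsers.filter((u) => matches(u, { age: { $gte: 27 }, city: "London" }));

console.log(exact.map((u) => u.name)); // [ 'Carol' ]
console.log(older.map((u) => u.name)); // [ 'Alice' ]
```

Listing several fields in one filter combines them with a logical AND, which is why only Alice satisfies the second filter.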
5.5.2 Show Indexes
db.users.getIndexes()
// Count the documents in the users collection
db.users.countDocuments()
// Count the users in each city
db.users.aggregate([
  { $group: { _id: "$city", totalUsers: { $sum: 1 } } }
])
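The `$group` stage with `{ $sum: 1 }` is a count per distinct key. A rough plain-JavaScript equivalent, using hypothetical sample data and no database connection, might look like this:

```javascript
// Hypothetical sample data standing in for the users collection.
const people = [
  { name: "Alice", city: "London" },
  { name: "Bob", city: "Paris" },
  { name: "Carol", city: "London" }
];

// Equivalent of { $group: { _id: "$city", totalUsers: { $sum: 1 } } }:
// add 1 to the running total of each document's city.
const totals = new Map();
for (const person of people) {
  totals.set(person.city, (totals.get(person.city) || 0) + 1);
}

// Shape the result like the aggregation output documents.
const grouped = [...totals].map(([city, totalUsers]) => ({ _id: city, totalUsers }));

console.log(grouped);
// [ { _id: 'London', totalUsers: 2 }, { _id: 'Paris', totalUsers: 1 } ]
```

The order shown here follows insertion order in the `Map`; MongoDB does not guarantee the order of `$group` output unless a `$sort` stage follows.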
db.createUser({
  user: "adminUser",
  pwd: "password123",  // example only; use a strong password or passwordPrompt()
  roles: [{ role: "readWrite", db: "myDatabase" }]
})
Creates a user with read and write permissions on myDatabase.
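// List the users defined on the current database
show users
// Server-wide status and metrics
db.serverStatus()
// Storage statistics for the current database
db.stats()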
5.9.3 Check Collection Stats
db.users.stats()
6 Query Execution
6.1 Placement Data Analysis
Refer to the Placement Data Set.csv file and answer the following questions.
Data Description:
Figure 1: Results of the query for Science & Technology students with a
degree percentage of at least 75
6.1.2 Female Students with Science and Technology Degree
From the result above, filter the students whose gender is F. How many
observations do you get?
{
  "degree_t": "Sci&Tech",
  "degree_p": { "$gte": 75 },
  "gender": "F"
}
6.1.4 Mean and Median Degree Percentage
Calculate the median of degree_p for all students. What is the mean value?
[
  {
    "$group": {
      "_id": null,
      "meanDegreeP": { "$avg": "$degree_p" },
      "allValues": { "$push": "$degree_p" }
    }
  },
  {
    "$project": {
      "meanDegreeP": 1,
      "sortedValues": {
        "$sortArray": { "input": "$allValues", "sortBy": 1 }
      }
    }
  },
  {
    "$project": {
      "meanDegreeP": 1,
      "medianDegreeP": {
        "$arrayElemAt": [
          "$sortedValues",
          { "$floor": { "$divide": [{ "$size": "$sortedValues" }, 2] } }
        ]
      }
    }
  }
]
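One detail of the pipeline above is worth checking by hand: it takes the element at index floor(n/2) of the sorted array. For an odd number of values that is the median, but for an even number it is the upper of the two middle values rather than their average. The sketch below compares both on hypothetical percentages in plain JavaScript; the helper names are illustrative and not part of any MongoDB API.

```javascript
// Arithmetic mean of a non-empty array of numbers.
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Conventional median: average the two middle values when the count is even.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// What the pipeline computes: the element at index floor(n / 2).
function upperMiddle(values) {
  const sorted = [...values].sort((a, b) => a - b);
  return sorted[Math.floor(sorted.length / 2)];
}

// Hypothetical degree percentages (even count, so the two results differ).
const degreeP = [77.5, 58, 72, 64.5];

console.log(mean(degreeP));        // 68
console.log(median(degreeP));      // 68.25
console.log(upperMiddle(degreeP)); // 72
```

The pipeline also depends on `$sortArray`, which requires MongoDB 5.2 or later. On MongoDB 7.0 and later, the `$median` accumulator may be a simpler option, although it computes an approximate percentile, so results should be checked against a manual calculation.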
[
  {
    "$project": {
      "yearMonth": {
        "$dateToString": { "format": "%Y-%m", "date": { "$toDate": "$Date" } }
      },
      "Confirmed": 1
    }
  },
  {
    "$group": {
      "_id": "$yearMonth",
      "totalCases": { "$sum": "$Confirmed" }
    }
  },
  { "$sort": { "totalCases": -1 } },
  { "$limit": 1 }
]
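The pipeline above can be read as three steps: derive a year-month key from each date, sum `Confirmed` per key, then keep the key with the largest total. A plain-JavaScript sketch of the same steps follows; the rows are hypothetical and only mirror the `Date` and `Confirmed` field names used in the pipeline.

```javascript
// Hypothetical rows mirroring the Date and Confirmed fields of the data set.
const rows = [
  { Date: "2020-04-29", Confirmed: 120 },
  { Date: "2020-04-30", Confirmed: 150 },
  { Date: "2020-05-01", Confirmed: 400 },
  { Date: "2020-05-02", Confirmed: 90 }
];

// Step 1 and 2: build a "YYYY-MM" key per row and sum Confirmed per key.
const totalsByMonth = new Map();
for (const row of rows) {
  const yearMonth = new Date(row.Date).toISOString().slice(0, 7); // UTC year-month
  totalsByMonth.set(yearMonth, (totalsByMonth.get(yearMonth) || 0) + row.Confirmed);
}

// Step 3: sort descending by total and keep the first entry ($sort + $limit: 1).
const [topMonth, topTotal] = [...totalsByMonth].sort((a, b) => b[1] - a[1])[0];

console.log(topMonth, topTotal); // 2020-05 490
```

The sketch assumes dates stored as ISO `YYYY-MM-DD` strings; other date formats would need explicit parsing both here and in `$toDate`.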
6.2.3 States with High Death Rate
Check for states in which the death rate is more than 1%.
[
  {
    "$group": {
      "_id": "$State/UnionTerritory",
      "totalConfirmed": { "$sum": "$Confirmed" },
      "totalDeaths": { "$sum": "$Deaths" }
    }
  },
  {
    "$project": {
      "state": "$_id",
      "deathRate": {
        "$multiply": [{ "$divide": ["$totalDeaths", "$totalConfirmed"] }, 100]
      }
    }
  },
  { "$match": { "deathRate": { "$gt": 1 } } },
  { "$sort": { "deathRate": -1 } }
]
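The death rate in the pipeline above is totalDeaths / totalConfirmed × 100. Because `$divide` raises an error when the divisor is zero, a group with no confirmed cases would make the pipeline fail. The plain-JavaScript sketch below works through the same arithmetic on hypothetical per-state totals and skips such groups first; the state names and numbers are invented for illustration.

```javascript
// Hypothetical per-state totals, as they would look after the $group stage.
const stateTotals = [
  { state: "A", totalConfirmed: 2000, totalDeaths: 50 }, // 2.5%
  { state: "B", totalConfirmed: 1000, totalDeaths: 5 },  // 0.5%
  { state: "C", totalConfirmed: 0, totalDeaths: 0 }      // no cases: skipped
];

const highDeathRate = stateTotals
  .filter((t) => t.totalConfirmed > 0) // guard against division by zero
  .map((t) => ({
    state: t.state,
    deathRate: (t.totalDeaths / t.totalConfirmed) * 100
  }))
  .filter((t) => t.deathRate > 1)             // equivalent of the $match stage
  .sort((a, b) => b.deathRate - a.deathRate); // equivalent of the $sort stage

for (const t of highDeathRate) {
  console.log(t.state, t.deathRate.toFixed(2) + "%");
}
```

In the pipeline itself, the same guard could be expressed as a `$match` on `totalConfirmed` greater than 0 placed before the `$project` stage.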
7 Challenges and Limitations of MongoDB
7.1 Challenges
• Schema Design Complexity – Unlike relational databases, MongoDB does
not enforce a fixed schema, making it challenging to design efficient
document structures, especially for complex relationships.
• Indexing Overhead – While indexing improves query performance,
improper indexing strategies can lead to increased memory usage and
slower writes.
• Memory Consumption – MongoDB performs best when indexes and
frequently accessed data fit in RAM, which can cause high memory
consumption, especially for large datasets.
• Data Redundancy – Due to the denormalized document structure,
data duplication can occur, increasing storage requirements.
• Replication and Failover Management – Setting up replica sets
and ensuring data consistency across multiple nodes requires careful
configuration and monitoring.
• Limited Support for Complex Transactions – Although MongoDB
supports multi-document ACID transactions, they are not as optimized
as those in traditional SQL databases, making complex transactional
workflows challenging.
7.2 Limitations
• Lack of Joins – Unlike SQL databases, MongoDB has no general-purpose
JOIN. Related data is usually embedded, joined on the application side,
or combined with the $lookup aggregation stage (a left outer join), and
each of these approaches can impact performance.
• Write-Heavy Workloads Impact Performance – High-frequency
write operations, especially in large collections, can cause performance
bottlenecks due to journaling and replication overhead.
• Sharding Complexity – Implementing sharding for horizontal scaling
requires careful planning to avoid uneven data distribution (shard key
selection is crucial).
• High Disk Space Usage – BSON repeats field names in every document,
which, together with storage pre-allocation, can lead to higher disk
space consumption than in other databases.
• Limited Analytics and Reporting Features – MongoDB is optimized
for real-time applications but lacks built-in advanced analytics
features compared to traditional data warehouses.
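The point about shard key selection can be illustrated with a toy count. All documents that share one shard key value belong to the same chunk, so a key with few distinct values limits how evenly data can be spread. The sketch below is only a model in plain JavaScript: the `orders` data and field names are hypothetical, and it does not reproduce MongoDB's actual chunking or balancing.

```javascript
// Toy model only: counts documents per distinct value of a candidate shard key.
function countByKey(docs, key) {
  const counts = {};
  for (const doc of docs) {
    counts[doc[key]] = (counts[doc[key]] || 0) + 1;
  }
  return counts;
}

// Hypothetical orders: `country` has two distinct values, `orderId` has many.
const orders = [];
for (let i = 0; i < 1000; i++) {
  orders.push({ orderId: i, country: i % 10 === 0 ? "FR" : "IN" });
}

const byCountry = countByKey(orders, "country");
const distinctOrderIds = Object.keys(countByKey(orders, "orderId")).length;

console.log(byCountry);        // { FR: 100, IN: 900 }
console.log(distinctOrderIds); // 1000
```

With `country` as the key, 90% of the documents share a single value and cannot be split further, whereas `orderId` offers 1000 distinct values that can be divided across shards.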
8 Conclusion
MongoDB has proven to be a powerful and flexible NoSQL database, offering
high scalability, performance, and ease of use for handling unstructured
and semi-structured data. Through this project, we explored the core
architecture of MongoDB, including its document-based data model, indexing
mechanisms, and advanced features such as replication and sharding. The
practical implementation of CRUD operations, aggregation pipelines, and
query optimization demonstrated MongoDB’s efficiency in managing
large-scale applications.