MongoDB Case Study 1
-Pranav Nayak
17070124045
Abstract
More and more information is processed every day, and with data volumes growing so quickly,
storing and maintaining it is difficult. Database management systems are used to solve this
problem. Databases fall mainly into two groups, relational and non-relational. Relational
databases work with structured data, whereas non-relational databases work with
semi-structured data. MongoDB is an example of a non-relational database.
Introduction
Databases are commonly divided into relational and non-relational. Relational databases usually
work with structured data, and non-relational databases work with semi-structured data.
MongoDB is an example of a non-relational database. Before non-relational databases, relational
databases were widely used, but they were not able to provide horizontal scalability (the ability
to distribute data and read/write operations among servers). To overcome this problem,
NoSQL databases like MongoDB came into use. NoSQL is now used in many fields such as social
networks, search engines, data warehousing and caching. Although MongoDB is relatively new, it
is open source and efficient, and it is used in many modern database projects. The structure of
its data can be modified by adding or removing attributes.
1) Data model: Data is stored as key-value pairs, so each value is reached through its key,
which increases query speed. Beyond the fields listed above, NoSQL is also used for
geospatial analysis and molecular modeling.
2) It can read and write data quickly and supports massive storage capacity.
MongoDB
MongoDB is an open-source database that stores data in a JSON-like binary format called BSON.
BSON supports types such as integer, date and string. Unlike relational databases, MongoDB does
not rely on joins. It is a document-oriented database in which documents are arranged into
collections. MongoDB supports replication in a primary-secondary arrangement (historically
called master-slave), in which secondary nodes hold replicas of the primary node's data that can
be used for backups. By default, read and write operations use the data present at the primary
node, which is responsible for serving requests from clients. A collection contains many
documents (records), and each document is uniquely identified by its _id field.
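The document model described above can be pictured with plain JavaScript objects. This is only a sketch: the objects stand in for BSON documents, the array stands in for a collection, and every field name and value is invented for illustration.

```javascript
// Plain JavaScript objects standing in for MongoDB documents.
// All field names and values are made up for illustration.
const students = [
  {
    _id: 1,                                        // unique identifier of the document
    name: "Asha",                                  // string
    enrolledOn: new Date("2020-06-01T00:00:00Z"),  // date
    credits: 24,                                   // integer
    address: { city: "Pune", pin: "411001" },      // embedded document, no join needed
  },
  {
    _id: 2,
    name: "Ravi",
    credits: 18,
    hobbies: ["chess", "cricket"],                 // a field the first document lacks
  },
];

// Look a document up by its _id, much as a collection does through its _id index.
const byId = new Map(students.map((doc) => [doc._id, doc]));

console.log(byId.get(1).address.city);   // embedded data is read directly
console.log("hobbies" in byId.get(1));   // documents need not share the same fields
```

The two documents live side by side even though their fields differ, which is what adding or removing attributes without a fixed schema amounts to.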
Features of MongoDB
● Flexibility: It stores data as JSON-like documents and is schema-less.
● Sharding: It lets us scale a cluster out by adding more machines, which increases
capacity and efficiency.
● High performance: Embedded documents reduce I/O activity on the
database.
● High availability: MongoDB supports a replication facility called a replica set. A replica set
is a group of servers that maintain the same data set.
● Rich query language: It offers RDBMS-like features such as dynamic queries, sorting,
secondary indexes, rich updates, easy aggregation and upserts.
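Of the query features listed above, the upsert (update the matching document, or insert one if nothing matches) is the least self-explanatory. The sketch below imitates that behavior over an in-memory array. It is not the MongoDB driver, and `upsertOne`, `inventory` and the field names are hypothetical.

```javascript
// Sketch of upsert semantics over an in-memory array (not the MongoDB driver).
// `upsertOne`, `inventory` and the field names are hypothetical.
function upsertOne(collection, filter, changes) {
  // A document matches when every filter field equals the document's field.
  const match = collection.find((doc) =>
    Object.entries(filter).every(([key, value]) => doc[key] === value)
  );
  if (match) {
    Object.assign(match, changes);             // update the existing document
    return "updated";
  }
  collection.push({ ...filter, ...changes });  // otherwise insert a new one
  return "inserted";
}

const inventory = [{ sku: "pen", qty: 10 }];
console.log(upsertOne(inventory, { sku: "pen" }, { qty: 12 }));  // sku already present
console.log(upsertOne(inventory, { sku: "ink" }, { qty: 3 }));   // sku not present yet
console.log(inventory);
```

In the MongoDB shell the same effect is requested by passing `{ upsert: true }` as an option to an update command such as `updateOne`.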
Architecture of MongoDB
MongoDB can run as a single instance. Replica sets provide high availability through replication
with automated failover. Sharded clusters divide large data sets over different machines in a
way that is transparent to users.
MongoDB supports sharding through the configuration of a sharded cluster. Sharded cluster has
three components: Shards, Config servers and Query Routers.
● Shards: They store the data. Each shard is typically deployed as a replica set, which
provides high availability and data consistency.
● Config servers: They store the cluster's metadata, which contains the mapping of the
cluster's data to the shards.
● Query routers: They process and target operations to the shards and then return results to
the clients. A sharded cluster can have more than one query router, and most have many.
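How the three components cooperate can be sketched in a few lines. The example below is a deliberate simplification: it picks a shard by hashing the shard-key value modulo the shard count, whereas a real cluster consults the metadata held by the config servers. The shard names, `hashKey`, `targetShard` and the `userId` shard key are all hypothetical.

```javascript
// Sketch of a query router choosing a shard from a shard-key value.
// Simplification: hash modulo the shard count. A real cluster consults the
// config servers' metadata instead. All names here are hypothetical.
const shards = ["shardA", "shardB", "shardC"].map((name) => ({ name, docs: [] }));

// Small deterministic string hash (illustrative, not MongoDB's hash function).
function hashKey(value) {
  let hash = 0;
  for (const ch of String(value)) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;  // keep an unsigned 32-bit integer
  }
  return hash;
}

// The router's job: target the one shard that owns this shard-key value.
function targetShard(shardKeyValue) {
  return shards[hashKey(shardKeyValue) % shards.length];
}

function insert(doc) {
  targetShard(doc.userId).docs.push(doc);
}

function findByUserId(userId) {
  // Only the targeted shard is searched; the client never sees which one.
  return targetShard(userId).docs.filter((doc) => doc.userId === userId);
}

["u1", "u2", "u3", "u4", "u5", "u6"].forEach((userId) => insert({ userId }));
console.log(shards.map((shard) => `${shard.name}: ${shard.docs.length}`));
console.log(findByUserId("u4"));
```

Callers use only `insert` and `findByUserId`, which mirrors the claim that sharding is transparent to users.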
1. Create command
2. Insert command
db.collection_name.insert({document})
3. Delete command
db.collection_name.remove({condition})
4. Select command
db.collection_name.find({condition})
5. Update command
db.collection.updateOne()
db.collection.updateMany()
db.collection.replaceOne()
MongoDB Aggregation Framework
Aggregation operations process data records and return computed results. They group values from
multiple documents and can perform a variety of operations on the grouped data to return a
single combined result. Aggregation can be used to apply a sequence of query operations to the
documents in a collection. In MongoDB, we can pipe a collection into the top of a pipeline and
transform it through a series of operations.
There are 3 ways to perform aggregation in MongoDB: the aggregation pipeline, the map-reduce
function, and single-purpose aggregation methods.
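The pipeline idea (documents enter at the top and each stage transforms the output of the previous one) can be sketched without a server. The stage helpers and the `orders` data below are hypothetical. They imitate the intent of the `$match`, `$group` and `$sort` stages, not MongoDB's implementation.

```javascript
// Sketch of an aggregation pipeline over an in-memory array (no server needed).
// The stage helpers and the `orders` data are hypothetical.
const orders = [
  { customer: "A", status: "paid", amount: 50 },
  { customer: "B", status: "paid", amount: 20 },
  { customer: "A", status: "paid", amount: 30 },
  { customer: "B", status: "open", amount: 99 },
];

// Each stage takes an array of documents and returns a new array.
const match = (predicate) => (docs) => docs.filter(predicate);

const groupSum = (keyField, sumField) => (docs) => {
  const totals = new Map();
  for (const doc of docs) {
    totals.set(doc[keyField], (totals.get(doc[keyField]) || 0) + doc[sumField]);
  }
  return [...totals].map(([_id, total]) => ({ _id, total }));
};

const sortDesc = (field) => (docs) => [...docs].sort((a, b) => b[field] - a[field]);

// Documents enter at the top and flow through the stages in order.
const runPipeline = (docs, stages) =>
  stages.reduce((current, stage) => stage(current), docs);

const result = runPipeline(orders, [
  match((doc) => doc.status === "paid"),   // keep only paid orders
  groupSum("customer", "amount"),          // one document per customer
  sortDesc("total"),                       // largest total first
]);
console.log(result);
```

Against a real collection with these fields, the corresponding shell call would be `db.orders.aggregate([{ $match: { status: "paid" } }, { $group: { _id: "$customer", total: { $sum: "$amount" } } }, { $sort: { total: -1 } }])`.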
Forbes was one of the first business publications in the world to do such an innovative thing:
the original digital transformation.
In the 25 years since, Forbes has only accelerated its efforts and is widely considered to
set the standard for digital innovation in the publishing industry. The 100-year-old
publisher, famous for its business journalism and rich-lists, has become the largest
business media brand in the world. It reaches more than 140 million people worldwide
every month, across a number of online and offline channels.
In just six months, Forbes migrated its platform to Google Cloud and MongoDB Atlas.
Results include:
During the pandemic, the cloud infrastructure has also helped the website scale to an
extraordinary number of users and helped the team stay nimble, introducing and testing
a number of new features.
In February of 2020, the COVID-19 pandemic is the biggest story of a generation and
a crisis for almost every business. Forbes had not been idle in those intervening years.
Vadim had insisted on an 'aggressive timeline'. The first stage of the cloud migration
had already finished in late 2019 and had taken just six months to complete.
The centerpiece of that migration was a move to the cloud database service MongoDB Atlas,
hosted on Google Cloud. But before they pushed everything live, they did something
that not enough companies do: test, test, test.
It was during that load testing and Quality Assurance (QA) phase that Forbes
discovered a critical dependency: there was unacceptably high latency between the
datacenter and the cloud. The round trip for data access would have been so slow that
the resulting multiplier effect would have created a terrible user experience. To solve
this, they architected a phased rollout, breaking down the service transfer so that the
core applications and databases all moved in one shot.
Once in place, the team used the new infrastructure to create an abstraction layer so
that most services don't even directly touch the database. Instead, Forbes makes use of
an intermediate service called the Content API. It provides a stable interface on top of
the more fluid data structures hosted within MongoDB Atlas. This uncouples the format
of the data from the requirements of the services using it. Services are no longer bound
to the data schema. A change made to one data structure in one place doesn't
break anything (or anyone) elsewhere in the stack.
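The decoupling described above can be illustrated with a small sketch. It is not Forbes's Content API: the stored shapes, the `schemaVersion` field, `toApiArticle` and `getArticle` are hypothetical, chosen only to show two stored formats being served through one stable shape.

```javascript
// Sketch of an abstraction layer between services and stored documents.
// Not Forbes's API: every name and field below is hypothetical.
const storedArticles = [
  // Older stored shape: the title is a plain `headline` string.
  { _id: "a1", schemaVersion: 1, headline: "Cloud migration" },
  // Newer stored shape: the title moved into a nested object.
  { _id: "a2", schemaVersion: 2, title: { text: "Testing at scale" } },
];

// Translate whichever stored shape is found into the one shape services rely on.
function toApiArticle(doc) {
  const title = doc.schemaVersion === 1 ? doc.headline : doc.title.text;
  return { id: doc._id, title };
}

// Services call this function and never read the stored documents directly.
function getArticle(id) {
  const doc = storedArticles.find((candidate) => candidate._id === id);
  return doc ? toApiArticle(doc) : null;
}

console.log(getArticle("a1"));
console.log(getArticle("a2"));
```

Both calls return an object with the same `id` and `title` keys, so a change to the stored format only requires a change inside `toApiArticle`.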
Forbes current architecture – June 2020
Conclusion
The demand for NoSQL databases like MongoDB has gone up in recent times. The growth of data
has given importance to NoSQL databases, which can handle structured, unstructured and
semi-structured data. Document data stores are easy to use, and dynamic data can easily be
stored in them. Because documents are independent of one another, performance improves and
concurrency conflicts decrease. MongoDB is an open-source database in this category.