FSD Unit III
UNIT - III
MongoDB:
• The backend data store is responsible for storing everything from user account
information to shopping cart items to blog and comment data.
• Good web applications must store and retrieve data with accuracy, speed, and
reliability. Therefore, the data storage mechanism must perform at a level that
satisfies user demand.
• Several different data storage solutions are available to store and retrieve data
needed by web applications.
• The three most common are direct file system storage in files, relational databases,
and NoSQL databases.
Need of NoSQL
• The concept of NoSQL (Not Only SQL) consists of technologies that provide
storage and retrieval without the tightly constrained models of traditional SQL
relational databases.
• The motivation behind NoSQL is mainly simplified designs, horizontal scaling,
and finer control of the availability of data.
• NoSQL breaks away from the traditional structure of relational databases and
allows developers to implement models in ways that more closely fit the data
flow needs of their systems.
• This allows NoSQL databases to be implemented in ways that traditional
relational databases could never be structured.
• There are several different NoSQL technologies, such as HBase’s column
structure, Redis’s key/value structure, and Neo4j’s graph structure.
• MongoDB and the document model were chosen because of great flexibility and
scalability when it comes to implementing backend storage for web applications
and services.
• MongoDB is one of the most popular and well-supported NoSQL databases.
Understanding MongoDB
• MongoDB is a NoSQL database based on a document model where data objects
are stored as separate documents inside a collection.
• The motivation behind MongoDB is to implement a data store that
provides high performance, high availability, and automatic scaling.
• MongoDB is simple to install and implement.
Understanding Collections
• MongoDB groups data together through collections.
• A collection is simply a grouping of documents that have the same or a similar
purpose.
• A collection acts similarly to a table in a traditional SQL database, with one major
difference.
• In MongoDB, a collection does not enforce a strict schema; instead, documents in
a collection can have a slightly different structure from one another as needed.
This reduces the need to break items in a document into several different tables,
which is often done in SQL implementations.
Understanding Documents
• A document is a representation of a single entity of data in the MongoDB database.
• A collection is made up of one or more related objects. A major difference between
MongoDB and SQL is that documents are different from rows. Row data is flat, meaning
there is one column for each value in the row. In MongoDB, documents can contain
embedded subdocuments, thus providing a much closer inherent data model to your
applications.
• The records in MongoDB that represent documents are stored as BSON, which is a
lightweight binary form of JSON, with field:value pairs corresponding to JavaScript
property:value pairs. These field:value pairs define the values stored in the document.
For example, a document in MongoDB may be structured with the following fields:
{
  name: "New Project",
  version: 1,
  languages: ["JavaScript", "HTML", "CSS"],
  admin: {name: "CSE", password: ""},
  paths: {temp: "/tmp", project: "/opt/project", html: "/opt/project/html"}
}
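Because subdocuments are nested inside the parent, one document gives access to every level. The following is a minimal sketch in plain JavaScript (no database connection) that follows a dot-separated path such as paths.html through the example document. The helper name getField is hypothetical and is not part of MongoDB.

```javascript
// The example document, written as a plain JavaScript object.
const project = {
  name: "New Project",
  version: 1,
  languages: ["JavaScript", "HTML", "CSS"],
  admin: { name: "CSE", password: "" },
  paths: { temp: "/tmp", project: "/opt/project", html: "/opt/project/html" }
};

// Hypothetical helper: follow a dot-separated path into nested subdocuments.
// Returns undefined as soon as any step along the path is missing.
function getField(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value == null ? undefined : value[key]),
    doc
  );
}

console.log(getField(project, "paths.html"));
console.log(getField(project, "admin.name"));
console.log(getField(project, "languages.0"));
```

A flat SQL row would need extra tables (and joins) to represent languages, admin, and paths.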
MongoDB Data Types
The document structure contains fields/properties that are strings, integers, arrays,
and objects.
The field names cannot contain null characters, . (dots), or $ (dollar signs). Also, the
_id field name is reserved for the Object ID. The _id field is a unique ID for the system
that consists of the following parts:
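• A 4-byte value representing the seconds since the Unix epoch
• A 3-byte machine identifier
• A 2-byte process ID
• A 3-byte counter, starting with a random value
Note: newer MongoDB versions replace the machine identifier and process ID with a
single 5-byte random value; the Object ID is still 12 bytes long in total.
The maximum size of a document in MongoDB is 16MB.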
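The byte layout of the Object ID can be checked by hand. The sketch below, in plain JavaScript, splits a 24-character hexadecimal Object ID string into the four parts listed above. The function name parseObjectId and the sample ID are illustrative assumptions, and the split of the middle five bytes follows the older layout described in these notes.

```javascript
// Split a 24-character hex Object ID into its parts.
// Assumed layout: 4-byte timestamp, 3-byte machine ID, 2-byte process ID,
// 3-byte counter. Each byte is two hex characters.
function parseObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("Object ID must be 24 hex characters");
  }
  const seconds = parseInt(hex.slice(0, 8), 16);
  return {
    seconds,                                   // seconds since the Unix epoch
    createdAt: new Date(seconds * 1000),       // same value as a Date
    machineId: hex.slice(8, 14),               // 3 bytes, kept as hex
    processId: parseInt(hex.slice(14, 18), 16),
    counter: parseInt(hex.slice(18, 24), 16)
  };
}

// Sample ID chosen for illustration only.
const parts = parseObjectId("507f1f77bcf86cd799439011");
console.log(parts.createdAt.toISOString());
console.log(parts.counter);
```

Since the timestamp comes first, sorting by _id roughly sorts documents by creation time.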
• The BSON data format provides several different types that are used when storing
the JavaScript objects to binary form. These types match the JavaScript type as
closely as possible.
• MongoDB assigns each of the data types an integer ID number from 1 to 255 that is
used when querying by type.
MongoDB data types and corresponding ID number
Type                       Number
Double                     1
String                     2
Object                     3
Array                      4
Binary data                5
Object id                  7
Boolean                    8
Date                       9
Null                       10
Regular Expression         11
JavaScript                 13
JavaScript (with scope)    15
32-bit integer             16
Timestamp                  17
64-bit integer             18
Decimal128                 19
Min key                    -1
Max key                    127
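To see how JavaScript values line up with the ID numbers in the table, here is a rough sketch in plain JavaScript. The helper bsonTypeOf is hypothetical and covers only a few types. The rule used for numbers (whole numbers in 32-bit range become 32-bit integers, everything else a double) is an assumption about typical driver behaviour, not a guarantee.

```javascript
// Subset of the type table, keyed by a short type name.
const BSON_TYPE = {
  double: 1, string: 2, object: 3, array: 4,
  bool: 8, date: 9, null: 10, regex: 11, int32: 16
};

// Hypothetical helper: guess the type number for a JavaScript value.
function bsonTypeOf(value) {
  if (value === null) return BSON_TYPE.null;
  if (Array.isArray(value)) return BSON_TYPE.array;
  if (value instanceof Date) return BSON_TYPE.date;
  if (value instanceof RegExp) return BSON_TYPE.regex;
  switch (typeof value) {
    case "string": return BSON_TYPE.string;
    case "boolean": return BSON_TYPE.bool;
    case "number":
      // Assumption: whole numbers that fit in 32 bits are stored as integers.
      return Number.isInteger(value) && value >= -2147483648 && value <= 2147483647
        ? BSON_TYPE.int32
        : BSON_TYPE.double;
    case "object": return BSON_TYPE.object;
    default: throw new Error("type not covered by this sketch");
  }
}

// Keep only the values whose type number is 2 (String), like a query by type.
const values = ["CSE", 1.5, 7, ["a"], null, { x: 1 }];
console.log(values.filter((v) => bsonTypeOf(v) === BSON_TYPE.string));
```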
Planning Your Data Model
• Before you begin implementing a MongoDB database, you need to understand the
nature of the data being stored, how that data is going to get stored, and how it is
going to be accessed. Questions to ask include:
• What are the basic objects that my application will be using?
• What is the relationship between the different object types: one-to-one,
one-to-many, or many-to-many?
• How often will new objects be added to the database?
• How often will objects be deleted from the database?
• How often will objects be changed?
• How often will objects be accessed?
• How will objects be accessed: by ID, property values, comparisons, and so on?
• How will groups of object types be accessed: by common ID, common property
value, and so on?
Normalizing Data with Document References
• Data normalization is the process of organizing documents and collections to
minimize redundancy and dependency.
• Typically, this is used for objects that have a one-to-many or many-to-many
relationship with subobjects.
• The advantage of normalizing data is that the database size will be smaller
because only a single copy of an object will exist in its own collection instead of
being duplicated on multiple objects in a single collection.
• Also, if you modify the information in the subobject frequently, you only need to
modify a single instance rather than every record in the object’s collection that
has that subobject.
• Major disadvantage of normalizing data: a performance hit.
• When you normalize data in MongoDB (or any database), you separate
related information into different collections. For example:
• You store user info in the Users collection and store info in the FavoriteStores
collection. The Users collection just holds a reference ID that links to the store.
• If you want to see "Alice's name and her favorite store name and address",
MongoDB has to do two jobs:
• Look up Alice’s record in the Users collection
• Then use the favoriteStore ID to fetch that store’s info from FavoriteStores
This second step is called a lookup (or a join in SQL terms).
If your app needs to show full user info very frequently, these extra lookups
take more time, use more memory, and slow down performance, especially if
there are thousands or millions of users.
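The two jobs can be sketched in plain JavaScript, using arrays as stand-ins for the Users and FavoriteStores collections. The sample data and the function name getUserWithStore are assumptions made for illustration; a real application would run two queries (or a lookup stage) against the database instead.

```javascript
// In-memory stand-ins for the two collections (sample data).
const favoriteStores = [
  { _id: 101, name: "Corner Books", address: "12 Main St" }
];
const users = [
  { _id: 1, name: "Alice", favoriteStore: 101 } // holds only a reference ID
];

function getUserWithStore(userName) {
  // Job 1: look up the user's record.
  const user = users.find((u) => u.name === userName);
  if (!user) return null;
  // Job 2: follow the reference ID to the store's record.
  const store = favoriteStores.find((s) => s._id === user.favoriteStore);
  return {
    name: user.name,
    storeName: store ? store.name : null,
    storeAddress: store ? store.address : null
  };
}

console.log(getUserWithStore("Alice"));
```

The store's details exist only once, so changing the address means updating a single document, at the cost of the second lookup on every read.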
Denormalizing Data with Embedded Documents
Denormalizing data means finding smaller parts of a main object and storing them directly inside that
main object’s document.
This is typically done on objects that have a mostly one-to-one relationship or are relatively small and
do not get updated frequently.
The major advantage of denormalized documents is that you can get the full object back in a single
lookup without the need to do additional lookups to combine subobjects from other collections.
Since all the information is already packed together, MongoDB doesn’t need to do extra lookups. This
means faster performance and fewer queries: you get everything in one go.
Main downside: more space and slower writes.
If many users share the same sub-data (like a company’s contact info), you are copying that data into
every user document.
This uses more disk space and makes insert/update operations slower, because the same data lives in
many places.
Example: suppose a user has both home and work contact info.
In a denormalized design, home and work are both embedded inside the user document.
There are no separate collections and no reference IDs. Everything is right there.
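A denormalized user document of this kind could look like the following plain JavaScript sketch. The field names and sample values are assumptions made for illustration, and the array stands in for a Users collection.

```javascript
// One document holds the user and both sets of contact info (sample data).
const userDoc = {
  _id: 1,
  name: "Alice",
  home: { phone: "555-0100", email: "alice@home.example" },
  work: { phone: "555-0199", email: "alice@work.example" }
};

// In-memory stand-in for the Users collection.
const usersCollection = [userDoc];

// A single lookup returns the user together with the embedded subdocuments.
const found = usersCollection.find((u) => u._id === 1);
console.log(found.home.phone);
console.log(found.work.email);
```

If the work contact info were shared by many users, each user document would carry its own copy, which is the extra space and slower writes described above.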
Capped Collections, Understanding Atomic Write Operations (refer to textbook)
• Advantages of capped collections
• Capped collections keep documents in the same order they were
added.
• You don’t need an index to get documents in the order they were
stored — this saves extra work for the database.
• They don’t allow updates that make documents bigger, so the
documents stay in the same place on disk.
• This avoids the extra effort of moving documents around and tracking
their new locations.
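The insertion-order behaviour can be imitated with a small in-memory sketch in plain JavaScript. The class name CappedSketch is hypothetical, and the limit here is a document count for simplicity, whereas a real capped collection is limited mainly by size in bytes. Once the limit is reached, the oldest documents make room for new ones.

```javascript
// Toy model of a capped collection: fixed capacity, insertion order kept.
class CappedSketch {
  constructor(maxDocs) {
    this.maxDocs = maxDocs;
    this.docs = [];
  }

  insert(doc) {
    this.docs.push(doc);            // new documents always go at the end
    if (this.docs.length > this.maxDocs) {
      this.docs.shift();            // the oldest document is dropped first
    }
  }

  // Documents come back in the order they were stored; no index is needed.
  find() {
    return this.docs.slice();
  }
}

const log = new CappedSketch(3);
["start", "login", "click", "logout"].forEach((event) => log.insert({ event }));
console.log(log.find().map((d) => d.event));
```

This pattern suits logs and recent-activity feeds, where only the latest entries matter.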
When you update a document, think about whether the new data will make it
bigger.
MongoDB gives some extra space (padding) to handle small changes.
But if the document grows too much, MongoDB must move it to a new place on
the disk.
This slows down performance and can cause disk fragmentation (messy
storage).
Example: If you keep adding items to an array, the document might grow too
big.
To avoid this, use normalized objects for parts that grow a lot.
Instead of putting all cart items in an array inside a Cart document,
create a separate CartItems collection.
Each cart item is a new document linked to the user's cart.
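A plain JavaScript sketch of this layout is shown below, with arrays standing in for the Cart and CartItems collections. The field names (cartId, product, quantity) and helper functions are assumptions made for illustration.

```javascript
// In-memory stand-ins for the two collections (sample data).
const carts = [{ _id: "cart1", userId: 1 }];
const cartItems = [];

// Adding an item creates a new small document; the Cart document never grows.
function addItem(cartId, product, quantity) {
  cartItems.push({ cartId, product, quantity });
}

// All items of one cart are found through the shared cartId reference.
function itemsFor(cartId) {
  return cartItems.filter((item) => item.cartId === cartId);
}

addItem("cart1", "notebook", 2);
addItem("cart1", "pen", 5);
console.log(itemsFor("cart1").length);
console.log(Object.keys(carts[0]));
```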
Indexes make frequent searches faster by creating a quick lookup system.
MongoDB automatically creates an index on the _id field because it's commonly
used to find data.
You should also create extra indexes based on how users search your data.
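The idea of a quick lookup system can be sketched in plain JavaScript with a Map from field value to matching documents. This is a simplification for illustration only (the function name buildIndex and the sample data are assumptions); it is not how MongoDB stores its indexes internally.

```javascript
// Sample documents standing in for a collection.
const docs = [
  { _id: 1, name: "Asha", city: "Pune" },
  { _id: 2, name: "Ravi", city: "Delhi" },
  { _id: 3, name: "Meena", city: "Pune" }
];

// Build the index once: field value -> list of documents with that value.
function buildIndex(collection, field) {
  const index = new Map();
  for (const doc of collection) {
    const key = doc[field];
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(doc);
  }
  return index;
}

const cityIndex = buildIndex(docs, "city");

// Without an index: every document is checked.
const scanned = docs.filter((d) => d.city === "Pune");
// With an index: one direct lookup by value.
const indexed = cityIndex.get("Pune") || [];

console.log(scanned.length, indexed.length);
```

The trade-off is that the index must be kept up to date on every insert and update, so indexes are worth creating only for fields that are searched often.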
Sharding means splitting big collections across different MongoDB servers (called
shards).
This helps handle large amounts of data and traffic, improving performance by
sharing the load (horizontal scaling).
Use sharding if your data is too big or gets lots of requests.
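How documents might be spread across shards can be sketched in plain JavaScript. The simple string hash below and the choice of userId as the key are assumptions made for illustration; they are not MongoDB's actual hashing or balancing logic.

```javascript
// Toy hash: turn a key into an unsigned 32-bit number, then pick a shard.
function shardFor(key, shardCount) {
  let hash = 0;
  for (const ch of String(key)) {
    hash = (hash * 31 + ch.codePointAt(0)) >>> 0; // keep within 32 bits
  }
  return hash % shardCount;
}

// Three arrays stand in for three shard servers.
const shards = [[], [], []];
const orders = [
  { userId: "alice", total: 40 },
  { userId: "bob", total: 15 },
  { userId: "carol", total: 72 },
  { userId: "alice", total: 9 }
];

// Each document goes to the shard chosen by its key.
for (const order of orders) {
  shards[shardFor(order.userId, shards.length)].push(order);
}

console.log(shards.map((s) => s.length));
```

Documents with the same key always land on the same shard, so a query for one user only needs to contact one server.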
1. In the Connection URL
You can provide `username`, `password`, and `database` right in the URL like this:

    // client is assumed to be MongoClient from the mongodb package
    client.connect(
      'mongodb://dbadmin:test@localhost:27017/testDB',
      { poolSize: 5, reconnectInterval: 500 },
      function(err, db) {
        if (err) console.log("Failed");
        else {
          console.log("Connected!");
          db.logout(() => {
            console.log("Logged out");
            db.close();
          });
        }
      }
    );

2. With `.authenticate()` After Connecting
You can also connect without giving username and password in the URL, and instead use
`.authenticate()` after connecting:

    client.connect(
      'mongodb://localhost:27017',
      { poolSize: 5, reconnectInterval: 500 },
      function(err, db) {
        if (err) {
          console.log("Failed");
        } else {
          const testDB = db.db("testDB");
          testDB.authenticate("dbadmin", "test", function(err, result) {
            if (err) {
              console.log("Authentication Failed");
              db.close();
            } else {
              console.log("Authenticated!");
              db.logout(() => {
                console.log("Logged out");
                db.close();
              });
            }
          });
        }
      }
    );

Output
    Connected Via Client Object ...
    Authenticated Via Client Object ...
    Logged out Via Client Object ...
    Connection closed ...
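The connection URL itself can be assembled from its parts, which helps when the username or password contains characters such as @ or : that must be percent-encoded. The helper below is a sketch in plain JavaScript; the name buildMongoUrl and its parameter names are hypothetical and not part of the MongoDB driver.

```javascript
// Hypothetical helper: build a mongodb:// URL from its parts.
// Credentials and database name are optional.
function buildMongoUrl({ user, password, host, port, database }) {
  // Percent-encode credentials so special characters do not break the URL.
  const auth = user
    ? `${encodeURIComponent(user)}:${encodeURIComponent(password)}@`
    : "";
  const dbPart = database ? `/${database}` : "";
  return `mongodb://${auth}${host}:${port}${dbPart}`;
}

const withAuth = buildMongoUrl({
  user: "dbadmin", password: "test",
  host: "localhost", port: 27017, database: "testDB"
});
const withoutAuth = buildMongoUrl({ host: "localhost", port: 27017 });

console.log(withAuth);
console.log(withoutAuth);
```

The two results match the URLs used in the two connection styles above.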