MongoDB For Data Persistence

Uploaded by

rupali nehe
Copyright
© © All Rights Reserved
MongoDB for Data Persistence

MongoDB
MongoDB is a document-oriented NoSQL database system designed for high
scalability, flexibility, and performance.
Unlike relational databases, MongoDB stores data as JSON-like documents
(BSON, a binary encoding of JSON) rather than as rows in tables.
This makes it easy to work with dynamic and loosely structured data. MongoDB is
cross-platform, and its Community Server is source-available under the Server
Side Public License (SSPL).
4.1 Advanced Mongoose Features

4.1.1 Schema validation and middleware

4.1.2 Population, aggregation pipelines, and custom queries

4.1.3 Mongoose with TypeScript for type safety


Mongoose
Mongoose is an Object Data Modeling (ODM) library for MongoDB and Node.js. It
manages relationships between data, provides schema validation, and is used to
translate between objects in code and the representation of those objects in
MongoDB.
4.1.1 Schema validation and middleware
Schema Validation: Mongoose allows you to define schemas that enforce the
structure of documents within a collection. Each schema can specify types, required
fields, default values, and custom validation logic.
import mongoose from 'mongoose';

const userSchema = new mongoose.Schema({
  name: { type: String, required: true },
  // unique builds a unique index; it is not a validator
  email: { type: String, required: true, unique: true },
  age: { type: Number, min: 0 }
});

const User = mongoose.model('User', userSchema);
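The schema above covers types and required fields. A minimal sketch of the other two features mentioned, default values and custom validation logic, could look like the following. The isValidEmail helper, the createdAt field, and the error messages are assumptions made for illustration, and the definition is kept as a plain object so it can be inspected without a database.

```javascript
// Hypothetical helper: a deliberately simple email check, not a full RFC 5322 validator.
const isValidEmail = (value) => /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value);

// Plain definition object; pass it to `new mongoose.Schema(userSchemaDefinition)`.
const userSchemaDefinition = {
  name: { type: String, required: [true, 'Name is required'], trim: true },
  email: {
    type: String,
    required: true,
    // Custom validator: Mongoose calls `validator` with the field value.
    validate: {
      validator: isValidEmail,
      message: (props) => `${props.value} is not a valid email address`
    }
  },
  age: { type: Number, min: [0, 'Age cannot be negative'] },
  // Default value: filled in when the field is omitted (hypothetical field).
  createdAt: { type: Date, default: Date.now }
};
```

When a document built from this schema fails validation, save() rejects with a ValidationError that carries the custom message.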
Middleware:
Mongoose supports middleware (also known as pre and post hooks) that runs at defined stages of the
document lifecycle. This is useful for tasks like validation or modifying data before saving it to the database.

userSchema.pre('save', function (next) {
  this.email = this.email.toLowerCase(); // Normalize email before saving
  next();
});

Here the pre('save') middleware lowercases the email each time a document is saved with save() or
Model.create(). Register hooks before calling mongoose.model(), because hooks added after the model is
compiled are not applied. Query updates such as updateOne() do not trigger save hooks.
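Save hooks run only for save() and Model.create(), so an email changed through a query update would skip the normalization. One possible way to cover that path is query middleware, sketched below. The function name lowercaseEmailOnUpdate is hypothetical; getUpdate() and setUpdate() are methods of the Mongoose Query that `this` refers to inside a query hook.

```javascript
// Query middleware: inside the hook, `this` is the Mongoose Query, not a document.
function lowercaseEmailOnUpdate(next) {
  const update = this.getUpdate() || {};
  // The new value may sit at the top level or under $set, depending on the caller.
  const target =
    update.$set && typeof update.$set.email === 'string' ? update.$set : update;
  if (typeof target.email === 'string') {
    target.email = target.email.toLowerCase();
  }
  this.setUpdate(update);
  next();
}

// Registration, assuming the userSchema from this section and done before mongoose.model():
// userSchema.pre('findOneAndUpdate', lowercaseEmailOnUpdate);
// userSchema.pre('updateOne', lowercaseEmailOnUpdate);
```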
4.1.2 Population, aggregation pipelines, and custom queries

Population: Mongoose's population feature lets a document reference documents in other collections.
When you call populate() on a query or document, Mongoose replaces the specified paths with the
referenced documents from those collections.

Need of Population: Whenever the schema of one collection holds a reference (in any field) to a
document from another collection, populate() can fill that field with the referenced document.
const mongoose = require('mongoose');

const userSchema = new mongoose.Schema({
  username: String,
  email: String
});

const postSchema = new mongoose.Schema({
  title: String,
  // Stores the _id of a User document; ref tells populate() which model to load
  postedBy: {
    type: mongoose.Schema.Types.ObjectId,
    ref: "User"
  }
});

const User = mongoose.model('User', userSchema);
const Post = mongoose.model('Post', postSchema);

module.exports = {
  User, Post
};
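With the schemas above, a query that fills postedBy might look like the following sketch. The helper name findPostWithAuthor is hypothetical, and the Post model is passed in as a parameter so the function does not depend on how the models are exported.

```javascript
// Load one post and replace the postedBy ObjectId with the referenced User,
// limited to the username and email fields.
function findPostWithAuthor(Post, postId) {
  return Post.findById(postId)
    .populate('postedBy', 'username email')
    .exec();
}

// Usage, assuming an open connection and an existing post id:
// const post = await findPostWithAuthor(Post, postId);
// console.log(post.postedBy.username);
```

Without the populate() call, post.postedBy would hold only the ObjectId of the user.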
Aggregation Pipelines

In MongoDB, an aggregation pipeline is a powerful tool for processing and transforming data within a collection.
It consists of a sequence of stages, where each stage performs a specific operation on the documents, and the
output from one stage serves as the input for the next. This enables complex data analysis and manipulation,
similar to SQL's GROUP BY with aggregate functions.

Key Features of the Aggregation Pipeline:

1. Stages: Each stage in the pipeline processes documents in a specific way, either modifying or filtering
them. Common stages include:
  ○ $match: Filters documents based on specified conditions (akin to a SQL WHERE clause).
  ○ $group: Groups documents by a field and applies aggregation functions like sum, count, or
    average (similar to SQL's GROUP BY).
  ○ $project: Modifies the structure of documents by including, excluding, or adding fields.
  ○ $sort: Orders documents according to specified criteria.
  ○ $limit and $skip: Limits or skips a certain number of documents in the output.
2. Operators: Various operators are used within stages to perform calculations or transformations.
Examples include:
  ○ Accumulator operators like $sum, $avg, $min, and $max, which compute a value per group.
  ○ Array-building accumulators such as $push and $addToSet.
  ○ Conditional operators like $cond and $ifNull.
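The stages listed above can be combined as in the following sketch. It assumes a hypothetical orders collection whose documents have status, customerId, and amount fields; none of these names come from the text. The pipeline is a plain array, and the Order model is passed in as a parameter.

```javascript
// Top five customers by total value of completed orders.
const topCustomersPipeline = [
  // Keep only completed orders (like SQL WHERE).
  { $match: { status: 'completed' } },
  // One output document per customer, with a running total and a count.
  {
    $group: {
      _id: '$customerId',
      totalSpent: { $sum: '$amount' },
      orderCount: { $sum: 1 }
    }
  },
  // Highest spenders first, then keep the top five.
  { $sort: { totalSpent: -1 } },
  { $limit: 5 },
  // Rename _id to customerId in the output.
  { $project: { _id: 0, customerId: '$_id', totalSpent: 1, orderCount: 1 } }
];

function topCustomers(Order) {
  return Order.aggregate(topCustomersPipeline).exec();
}
```

Placing $match first is a common choice because it reduces the number of documents the later stages have to process.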
4.1.3 Mongoose with TypeScript for type safety

Using Mongoose with TypeScript enhances type safety by allowing you to define interfaces that represent your
document structure.

This lets the compiler catch many type mismatches before the code runs, reducing runtime errors.

1. Define an Interface:

import mongoose from 'mongoose';

interface IUser {
  name: string;
  email: string;
  age?: number; // optional field
}

2. Create a Schema:

const userSchema = new mongoose.Schema<IUser>({
  name: { type: String, required: true },
  email: { type: String, required: true },
  age: Number,
});

3. Create a Model:

const UserModel = mongoose.model<IUser>('User', userSchema);

4. Using the Model:

async function createUser(userData: IUser): Promise<void> {
  const user = new UserModel(userData);
  await user.save();
}

// Example usage
createUser({ name: 'Alice', email: '[email protected]' }).catch(console.error);
4.2 Database Authentication and Security

User Authentication is the process of verifying the identity of a user attempting to access a database.
Authorization determines what an authenticated user is allowed to do.
Key Aspects of Authentication in MongoDB:

● User-Based Authentication: When access control is enabled, MongoDB requires users to authenticate
before they can interact with the database. Each user has a username and a password (or another
credential, such as a certificate) and must authenticate with these credentials.
● Initial Admin User: A self-managed MongoDB deployment starts with no users and with access control
disabled. An administrator first creates a user in the admin database (for example with the
userAdminAnyDatabase or root role) and then enables access control. Without this step, anyone who can
reach the server can read and modify its data.
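MongoDB supports several authentication methods, including:
● SCRAM-SHA-1 and SCRAM-SHA-256: The default mechanisms. They use a salted
challenge-response exchange to authenticate users, so the password itself is never sent over the network.
● LDAP (Lightweight Directory Access Protocol): Allows integration with existing
directory services (a MongoDB Enterprise feature).
● x.509 Certificates: Used for client certificate authentication.
For example, the following command creates a user with read and write access to one database: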
db.createUser({
  user: "myUser",
  pwd: "myPassword",
  roles: [{ role: "readWrite", db: "myDatabase" }]
});
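Once such a user exists, an application authenticates by including the credentials in its connection string. The sketch below builds one, assuming the user was created in myDatabase (so that database is the authSource) and that the server listens on 127.0.0.1:27017. The helper name buildMongoUri and the sample password are hypothetical. Credentials are percent-encoded because characters such as @ and / have a special meaning in a URI.

```javascript
// Build a mongodb:// connection string with percent-encoded credentials.
function buildMongoUri({ user, password, host, database, authSource }) {
  const credentials = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  return `mongodb://${credentials}@${host}/${database}` +
    `?authSource=${encodeURIComponent(authSource)}`;
}

const uri = buildMongoUri({
  user: 'myUser',
  password: 'p@ss/word', // sample value; read real secrets from the environment
  host: '127.0.0.1:27017',
  database: 'myDatabase',
  authSource: 'myDatabase'
});

// Usage with Mongoose:
// await mongoose.connect(uri);
```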
User Authorization:
Authorization determines what actions an authenticated user is allowed to perform. It
controls permissions by assigning roles to users, specifying what resources (databases,
collections) they can access and what operations (read, write, administration tasks) they
can perform.

MongoDB uses roles to manage permissions. Roles can be predefined (like read, readWrite,
dbAdmin) or custom-defined.

To assign a role to an existing user, you can use:

db.grantRolesToUser("myUser", [{ role: "dbAdmin", db: "myDatabase" }]);


Key Aspects of Authorization in MongoDB:

● Role-Based Access Control (RBAC): MongoDB uses RBAC to manage permissions. Each user
is assigned one or more roles that define what actions they can perform within the database. Roles
can be predefined by MongoDB or custom-defined by administrators.
● Predefined Roles: MongoDB provides several built-in roles with specific permissions, such as:
  ○ Database User Roles:
    ■ read: Provides read-only access to a specific database.
    ■ readWrite: Allows reading and writing to a specific database.
  ○ Database Administration Roles:
    ■ dbAdmin: Grants the ability to manage database indexes, view statistics, and perform
      administrative tasks.
  ○ Cluster Administration Roles:
    ■ clusterAdmin: Grants the broadest cluster-management access, covering tasks such as
      administering replication and sharding and monitoring the cluster.
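When no built-in role fits, a custom role can list exactly the privileges a user needs. The following sketch defines a role document for read-only access to a single collection. The role name orderReader and the orders collection are hypothetical; the document shape (role, privileges, roles) is the one accepted by the mongosh method db.createRole().

```javascript
// Custom role: may run find on myDatabase.orders and nothing else.
const orderReaderRole = {
  role: 'orderReader',
  privileges: [
    {
      resource: { db: 'myDatabase', collection: 'orders' },
      actions: ['find']
    }
  ],
  roles: [] // no inherited roles
};

// In mongosh, as a user allowed to manage roles:
// db.getSiblingDB('myDatabase').createRole(orderReaderRole);
// db.getSiblingDB('myDatabase').grantRolesToUser('myUser', ['orderReader']);
```

Keeping the privilege list this narrow follows the least-privilege principle: the role grants one action on one collection.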
Data encryption at rest and in transit
In the context of MongoDB, data encryption can be applied in two main states: at rest and in transit. These
measures ensure the confidentiality and security of sensitive data, protecting it from unauthorized access or
interception.

Data Encryption at Rest: This involves encrypting data stored on disk to protect it from unauthorized
access.
MongoDB Enterprise provides built-in encryption for data at rest through the WiredTiger storage engine.
To enable encryption at rest with a local key file, you can start your MongoDB instance with the following options:
mongod --enableEncryption --encryptionKeyFile /path/to/keyfile
The equivalent settings in your configuration file look like this:
security:
  enableEncryption: true
  encryptionKeyFile: /path/to/keyfile
Key Aspects of Encryption at Rest in MongoDB:

● Encrypted Storage Engine: MongoDB Enterprise supports encryption at rest
through the WiredTiger storage engine. It uses industry-standard encryption
algorithms, by default AES-256 (Advanced Encryption Standard with a 256-bit
key).
● Encryption Keys: Data is encrypted with keys that are themselves protected by a
master key. The master key can come from a local key file, as in the command
above, or from an external key manager, which is the safer option for production.
Self-managed deployments integrate with KMIP-compliant key managers, while
MongoDB Atlas can use cloud services such as AWS KMS or Azure Key Vault.
Keys can be rotated periodically to enhance security.
● Finer-Grained Encryption: The encrypted storage engine encrypts the data files
of the whole deployment. To protect individual fields with separate keys, MongoDB
offers Client-Side Field Level Encryption, in which the driver encrypts selected
fields before they reach the server.
● Automatic Decryption: Data is automatically decrypted when accessed by
authorized applications or users, without requiring manual intervention.
Data Encryption in Transit: Encryption in transit refers to encrypting data while it is being
transmitted between clients, applications, and MongoDB servers, as well as between
servers in a MongoDB cluster.
This ensures that data is protected from interception during transmission.

MongoDB supports TLS/SSL to encrypt network traffic.


mongod --tlsMode requireTLS --tlsCertificateKeyFile /path/to/certificate.pem
Key Aspects of Encryption in Transit in MongoDB:

● TLS/SSL: MongoDB uses TLS (Transport Layer Security), the successor to SSL
(Secure Sockets Layer), to encrypt network traffic between clients and
the database server. This prevents third parties from reading data sent
between clients and servers.
● Server and Client Authentication: MongoDB supports mutual authentication
using X.509 certificates. This ensures that both the server and the client are
authenticated before any data is exchanged.
● Internal Node Communication: In a MongoDB replica set or sharded cluster,
data transmitted between MongoDB nodes can also be encrypted to secure
inter-node communication.
● Configuration: You can enable TLS in MongoDB by configuring the
net.tls.mode setting (net.ssl.mode before MongoDB 4.2) and providing the
necessary certificates for the server and clients.
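On the client side, a Node.js application has to opt in to TLS as well. The sketch below assembles connection options for that purpose. tls, tlsCAFile, and tlsCertificateKeyFile are connection options of the MongoDB Node.js driver, which Mongoose passes through. The helper name buildTlsOptions, the file paths, and the host name are hypothetical.

```javascript
// Assemble TLS options for a client connection.
// caFile verifies the server certificate; clientCertFile is only needed
// when the server also requires a client certificate (mutual TLS).
function buildTlsOptions({ caFile, clientCertFile }) {
  const options = { tls: true, tlsCAFile: caFile };
  if (clientCertFile) {
    options.tlsCertificateKeyFile = clientCertFile;
  }
  return options;
}

const tlsOptions = buildTlsOptions({
  caFile: '/path/to/ca.pem',
  clientCertFile: '/path/to/client.pem'
});

// Usage with Mongoose:
// await mongoose.connect('mongodb://db.example.com:27017/myDatabase', tlsOptions);
```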
Best Practices for Secure MongoDB Deployments
1. Use Strong Authentication Mechanisms: Always enable authentication and use strong
passwords or certificates.
2. Limit User Privileges: Assign the least privilege necessary for users by using roles effectively.
3. Enable Encryption:
● Use encryption at rest to protect stored data.
● Use TLS/SSL for encrypting data in transit.
4. Regularly Update Software: Keep MongoDB and its dependencies up-to-date to mitigate
vulnerabilities.
5. Network Security:
● Use firewalls to restrict access to your MongoDB instances.
● Avoid exposing your database directly to the internet; use VPNs or SSH tunnels instead.
6. Monitor and Audit Access: Implement logging and monitoring to detect unauthorized access
attempts or anomalies in database usage.
7. Backup Data Securely: Ensure that backups are encrypted and stored securely, following best
practices for disaster recovery.
Scalability and High Availability
MongoDB is designed to handle large volumes of data and high traffic loads, making it suitable for
applications that require scalability and high availability.

The key features that support these requirements are sharding, replication, and MongoDB Atlas.
Sharding for Horizontal Scaling
Sharding is a method used to distribute data across multiple servers or clusters, allowing for horizontal
scaling.

This approach enables MongoDB to handle large datasets and high throughput operations by splitting data
into smaller, more manageable pieces called shards.

Example of Sharding
To implement sharding in MongoDB, you first need to enable sharding on your database:

sh.enableSharding("myDatabase");

Next, you can shard a collection by specifying the shard key:

sh.shardCollection("myDatabase.myCollection", { "userId": 1 });

In this example, the collection myCollection is sharded on the userId field. MongoDB splits the
collection into ranges of userId values and distributes those ranges across the shards, which
spreads both storage and query load. Both commands are run from mongosh connected to a mongos router.
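A ranged key on a steadily increasing value can send most new writes to a single shard. A hashed shard key is one common alternative. The sketch below issues the same two steps as database commands from Node.js, using a hashed key on userId. The function name shardByHashedUserId is hypothetical, and adminDb is assumed to be the admin database handle of a MongoDB Node.js driver client connected to a mongos router.

```javascript
// Enable sharding for the database, then shard the collection on a hashed key.
async function shardByHashedUserId(adminDb) {
  await adminDb.command({ enableSharding: 'myDatabase' });
  return adminDb.command({
    shardCollection: 'myDatabase.myCollection',
    key: { userId: 'hashed' } // hash of userId decides the target shard
  });
}

// Usage, assuming `client` is a connected MongoClient:
// await shardByHashedUserId(client.db('admin'));
```

Hashed keys spread writes evenly, at the cost of turning range queries on userId into queries that must contact every shard.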
Replication for Data Redundancy and Failover
Replication in MongoDB provides data redundancy and high availability by maintaining
multiple copies of data across different servers.
A replica set consists of a primary node (which receives all write operations) and one or more
secondary nodes (which replicate the primary's data).

Example of Replication
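To set up a replica set, start each MongoDB instance with the --replSet option:
mongod --replSet "myReplicaSet" --port 27017 --dbpath /data/db1
mongod --replSet "myReplicaSet" --port 27018 --dbpath /data/db2
After starting the instances, connect to one of them with mongosh and initiate the replica set:
rs.initiate();
Called without arguments, rs.initiate() creates a replica set that contains only the current instance.
Add the other instances with rs.add(), or pass a configuration document that lists all members.
Once configured, if the primary node fails, one of the secondary nodes can be automatically
elected as the new primary, ensuring continuous availability of your application. An election needs a
majority of voting members, so production replica sets usually have at least three members.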
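As a sketch of the configuration document that rs.initiate() accepts, the helper below builds one from a list of hosts, together with a matching connection string for applications. The function names are hypothetical, and the third member on port 27019 is an assumption added so that the set can still elect a primary when one member fails.

```javascript
// Build the document passed to rs.initiate(): one entry per member.
function buildReplicaSetConfig(name, hosts) {
  return {
    _id: name,
    members: hosts.map((host, index) => ({ _id: index, host }))
  };
}

// Build a connection string that lists all members, so the driver can
// find the current primary after a failover.
function buildReplicaSetUri(name, hosts, database) {
  return `mongodb://${hosts.join(',')}/${database}?replicaSet=${encodeURIComponent(name)}`;
}

const hosts = ['localhost:27017', 'localhost:27018', 'localhost:27019'];
const config = buildReplicaSetConfig('myReplicaSet', hosts);
const uri = buildReplicaSetUri('myReplicaSet', hosts, 'myDatabase');

// In mongosh: rs.initiate(config)
// In Node.js: await mongoose.connect(uri)
```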
MongoDB Atlas for Managed Database Services
MongoDB Atlas is a fully managed cloud database service that simplifies deploying and managing
MongoDB databases on cloud providers such as AWS, Google Cloud, and Azure.

Atlas offers a suite of data services with built-in automation, performance optimization tools,
and security features, and it stores data in the same JSON-like document model as MongoDB itself.

Key Features of MongoDB Atlas


1. Multi-Cloud Deployment: Atlas allows you to deploy databases across multiple cloud providers,
enhancing flexibility and resilience.
2. Automated Scaling: Atlas can automatically scale resources based on your application's needs
without downtime.
3. Built-in Security: With features like encryption at rest and in transit, IP access lists, and
role-based access control, Atlas helps keep your data secure.
4. Monitoring and Alerts: Atlas provides real-time monitoring tools and customizable alerts to help
you manage performance effectively.
5. Global Clusters: You can create global clusters that replicate data across different regions to
minimize latency for users worldwide.
Write a Node.js program that saves a document in MongoDB using Mongoose.
// Import Mongoose
const mongoose = require('mongoose');

// Define a Schema
const userSchema = new mongoose.Schema({
  name: String,
  email: String,
  age: Number
});

// Create a Model based on the schema
const User = mongoose.model('User', userSchema);

async function main() {
  // Connect to MongoDB (Replace 'your_database' with your DB name).
  // useNewUrlParser and useUnifiedTopology are omitted: they have had
  // no effect since Mongoose 6.
  await mongoose.connect('mongodb://127.0.0.1:27017/your_database');
  console.log('Connected to MongoDB');

  try {
    // Create a new document (user)
    const newUser = new User({
      name: 'John Doe',
      email: '[email protected]',
      age: 30
    });

    // Save the document to MongoDB
    await newUser.save();
    console.log('Document saved successfully');
  } finally {
    // Close the connection whether or not the save succeeded
    await mongoose.connection.close();
  }
}

main().catch((err) => {
  console.error('Error:', err);
});
