Unit 5
Greater Noida
Unit: 5
Introduction to NoSQL with Cloud Database
Recap:
• Discussion about Cloud and Database Management System.
Objective:
In this topic we focus on the advantages of working with NoSQL
databases such as MongoDB and Cassandra. The main advantages are
high scalability and high availability. High scalability: a NoSQL
database such as MongoDB uses sharding for horizontal scaling.
Recap:
Revision of Database Management Systems.
NoSQL databases (aka "not only SQL") are non-tabular databases and
store data differently than relational tables. NoSQL databases come in a
variety of types based on their data model. The main types are
document, key-value, wide-column, and graph. They provide flexible
schemas and scale easily with large amounts of data and high user
loads.
When people use the term “NoSQL database,” they typically use it to
refer to any non-relational database. Some say the term “NoSQL” stands
for “non SQL” while others say it stands for “not only SQL.” Either way,
most agree that NoSQL databases are databases that store data in a
format other than relational tables.
Introduction NoSQL
1. Stands for Not Only SQL.
2. The term NoSQL was first used in 1998 by Carlo Strozzi, for a
lightweight open-source database that did not expose a SQL interface.
3. Many NoSQL databases are open source.
4. NoSQL is often described as the future of databases.
5. Well suited to distributed systems.
6. Lower cost.
7. High-performance database.
8. Designed to handle huge volumes of data.
9. Used by Facebook, Google, Wikipedia …
Document Store.
Documents are structured as JSON, XML, …
No joins (handle them in your code).
Column Database.
Each storage block contains data from only one column.
Reduces access and scanning time.
Still uses tables, but without join statements.
Better for data analytics.
Data Model: Table-based (SQL) vs. Key-Value, Document, Column, Graph (NoSQL)
• Flexible schemas
• Horizontal scaling
• Fast queries due to the data model
• Ease of use for developers
1. Flexible Schemas
• What It Means: Unlike traditional relational databases that require predefined
schemas (structured tables with fixed columns), NoSQL databases support
dynamic schemas. You can store data without defining a strict structure
beforehand.
• Benefits:
• Accommodates changes easily when the application evolves.
• Useful for applications with variable or hierarchical data (e.g., JSON,
XML).
• Reduces downtime when the schema needs to be updated.
• Use Case: In e-commerce, product attributes (like size, color, weight) can vary
significantly. NoSQL databases allow for flexible attributes for each product.
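The flexible-schema idea can be illustrated with a small sketch in plain JavaScript. It uses an in-memory array in place of a real collection, and the product documents and helper names (fieldsInUse, findByField) are hypothetical, chosen only for illustration.

```javascript
// Hypothetical product catalog: each document carries only the attributes it needs.
const products = [
  { _id: 1, name: "T-shirt", size: "M", color: "blue" },
  { _id: 2, name: "Laptop", weightKg: 1.4, specs: { ramGb: 16, storageGb: 512 } },
  { _id: 3, name: "Gift card" },
];

// Collect the set of field names actually used across the collection.
function fieldsInUse(docs) {
  const fields = new Set();
  for (const doc of docs) {
    for (const key of Object.keys(doc)) fields.add(key);
  }
  return [...fields].sort();
}

// Query on an optional field: documents without the field simply do not match.
function findByField(docs, field, value) {
  return docs.filter((doc) => doc[field] === value);
}

console.log(fieldsInUse(products));
console.log(findByField(products, "color", "blue").map((d) => d.name));
```

No schema change is needed to add the nested specs field to one product; in a relational table the same change would typically require altering the table or adding a side table.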
2. Horizontal Scaling
• What It Means: Horizontal scaling involves adding more servers (nodes) to
distribute the database workload, as opposed to vertical scaling, which
upgrades a single server's hardware.
• Benefits:
• Scalability to handle massive amounts of data and traffic.
• Cost-efficient because adding more servers is often cheaper than
upgrading a single powerful server.
• Enables high availability and fault tolerance through replication across
nodes.
• Use Case: Social media platforms like Facebook or Instagram use horizontal
scaling to handle billions of users and their interactions in real time.
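A minimal sketch of how keys can be spread across several nodes, assuming simple hash-based placement. The shard names and the functions fnv1a, shardFor, and distribute are hypothetical; real systems such as MongoDB use their own hashing and chunk-management logic.

```javascript
// Hypothetical cluster of three shards (names are an assumption for illustration).
const shards = ["shard-a", "shard-b", "shard-c"];

// Deterministic 32-bit string hash in the FNV-1a style (applied to UTF-16 code units).
function fnv1a(key) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep the value an unsigned 32-bit integer
  }
  return hash;
}

// Pick the shard responsible for a key: the same key always maps to the same shard.
function shardFor(key, shardList) {
  return shardList[fnv1a(key) % shardList.length];
}

// Group a list of keys by the shard they are placed on.
function distribute(keys, shardList) {
  const placement = new Map(shardList.map((s) => [s, []]));
  for (const key of keys) placement.get(shardFor(key, shardList)).push(key);
  return placement;
}

console.log(distribute(["user:1", "user:2", "user:3", "user:4"], shards));
```

Plain modulo placement is the simplest choice, but it moves most keys when a shard is added or removed; production systems usually rely on range/chunk maps or consistent hashing to limit that movement.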
Uses of NoSQL
NoSQL databases are widely used across various industries for applications requiring
scalability, flexibility, and high performance. Here are the primary uses of NoSQL
databases:
1. Big Data Applications
• Why: NoSQL databases are designed to handle massive volumes of unstructured or
semi-structured data generated at high velocity.
• Examples:
• Storing logs, metrics, and event data for monitoring and analytics.
• Managing clickstream data in web and mobile applications.
2. Real-Time Applications
• Why: NoSQL databases provide low-latency read and write operations, essential for
real-time interactions.
• Examples:
• Chat applications (e.g., WhatsApp or Slack).
• Online multiplayer games that require real-time updates.
• Financial systems for processing live transactions.
9. Gaming
• Why: Gaming platforms need real-time data access for player stats, leaderboards, and
multiplayer synchronization.
• Examples:
• Managing game state and progress in online multiplayer games.
• Storing in-game purchases and virtual assets.
ACID
BASE
1. Basically Available.
2. Soft state.
3. Eventually Consistent.
4. Weak consistency.
5. Availability first.
ACID: data arrives from one or a few locations. BASE: data arrives from many locations.
Uses of MongoDB
MongoDB is a document-oriented NoSQL database, designed for flexibility,
scalability, and high performance. It is widely used in applications requiring rapid
development and handling of large-scale unstructured or semi-structured data.
Uses of Cassandra
Apache Cassandra is a distributed NoSQL database designed for high
availability, fault tolerance, and scalability. It follows a peer-to-peer
architecture and is optimized for handling large-scale, high-velocity
data.
Common Use Cases:
✅ Real-Time Big Data Applications – Logs, metrics, sensor data, IoT
devices.
✅ High-Throughput Transactional Applications – Banking, fraud
detection, stock trading.
✅ Social Media and Messaging Apps – Facebook, Twitter, Discord,
WhatsApp.
Uses of HBase
HBase is a distributed, column-family NoSQL database built on top of Hadoop and
HDFS. It is designed for real-time read/write access to large datasets and supports
horizontal scaling. Unlike traditional relational databases, HBase is optimized for
sparse, unstructured, and semi-structured data.
Common Use Cases:
✅ Big Data Analytics – Large-scale data processing using Hadoop.
✅ Time-Series Data Storage – Sensor data, event logs, IoT data streams.
✅ Real-Time Data Processing – Clickstream analysis, fraud detection.
✅ Search Engines & Indexing – Scalable indexing for text search engines.
✅ Data Warehousing – High-volume, structured storage with quick retrieval.
✅ Recommendation Systems – AI-driven personalization for e-commerce,
streaming services.
✅ Financial Services – Stock market analytics, risk analysis.
✅ Government & Research – Genomic data processing, satellite image storage.
Uses of Neo4j
Neo4j is a graph database designed for highly connected data and complex relationships.
Unlike relational databases and other NoSQL stores, it stores and queries data as nodes, edges, and
properties, making it ideal for graph traversal and relationship-heavy applications.
Common Use Cases:
✅ Social Networks – Analyzing user connections (e.g., LinkedIn, Facebook).
✅ Fraud Detection – Identifying suspicious transactions using pattern recognition.
✅ Recommendation Engines – AI-driven suggestions for e-commerce, streaming, and news.
✅ Knowledge Graphs – Building semantic web applications (e.g., Google’s Knowledge
Graph).
✅ Network & IT Infrastructure Management – Analyzing dependencies in telecom and
cloud networks.
✅ Supply Chain Optimization – Managing logistics and real-time inventory tracking.
✅ Cybersecurity – Threat analysis, attack path visualization.
✅ Healthcare & Genomics – Mapping disease relationships and DNA sequencing.
Uses of Riak
Riak is a distributed NoSQL key-value store designed for high availability, fault
tolerance, and scalability. It is built on eventual consistency and follows an AP
(Availability & Partition Tolerance) model of the CAP theorem. Riak is optimized
for large-scale, decentralized applications and workloads requiring low-latency,
high-throughput storage.
Common Use Cases:
✅ Distributed Storage – Large-scale, highly available storage solutions.
✅ Internet of Things (IoT) – Handling massive amounts of sensor data.
✅ Session Management – Storing user sessions for high-traffic applications.
✅ E-commerce & Retail – Product catalogs, recommendation engines, and
transaction logs.
✅ Messaging & Chat Applications – Fast, scalable message storage.
✅ Financial Services & Banking – Fraud detection, real-time analytics.
✅ Log Management & Analytics – Storing large-scale system logs.
✅ Content Delivery Networks (CDN) – Caching and distributing content globally.
Objective:
In this topic we focus on MongoDB which is a source-available cross-
platform document-oriented database program. Classified as a
NoSQL database program, MongoDB uses JSON-like documents with
optional schemas. MongoDB is developed by MongoDB Inc. and
licensed under the Server Side Public License.
Recap:
RDBMS              MongoDB
Database           Database
Table              Collection
Tuple/Row          Document
Column             Field
mysqld/Oracle      mongod (server)
mysql/sqlplus      mongo (client shell)
02/25/2025 Dr. Nidhi Sharma UNIT 05 61
Introduction MongoDB
MongoDB: Goal
• Goal: bridge the gap between key-value stores (which are fast and scalable) and
relational databases (which have rich functionality).
Is It Fast?
NoSQL: Categories
Objective:
In this topic we focus on introducing the essential ways of
interacting with NoSQL data stores. The types of NoSQL stores vary
and so do the ways of accessing and interacting with them. This
topic attempts to summarize a few of the most prominent of these
disparate ways of accessing and querying data in NoSQL databases.
Recap:
Objective:
In this topic we focus on MongoDB datatypes.
1. String − This is the most commonly used datatype to store data. Strings in MongoDB must be valid UTF-8.
Recap:
Revision of NoSQL Databases.
• Field Value
– Scalar (Int, Boolean, String, Date, …)
– Document (Embedding or Nesting)
• One document: remember it is stored in a binary format (BSON)
• Or: the _id is generated automatically as a 12-byte ObjectId:
• 1st 4 bytes: timestamp
• Next 3 bytes: machine id
• Next 2 bytes: process id
• Last 3 bytes: incremental value
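The byte layout listed above (4 + 3 + 2 + 3 = 12 bytes) can be checked with a short sketch that splits a 24-character hexadecimal id into those four parts. The function name decodeObjectId and the sample id are assumptions for illustration. Newer MongoDB versions replace the machine id and process id with a single 5-byte random value, so only the timestamp and counter fields carry over unchanged.

```javascript
// Split a 24-character hex id into the four fields of the layout described above.
function decodeObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hex characters (12 bytes)");
  }
  const seconds = parseInt(hex.slice(0, 8), 16); // bytes 0-3: Unix timestamp in seconds
  return {
    timestamp: new Date(seconds * 1000),
    machineId: hex.slice(8, 14),                 // bytes 4-6, kept as hex text
    processId: parseInt(hex.slice(14, 18), 16),  // bytes 7-8
    counter: parseInt(hex.slice(18, 24), 16),    // bytes 9-11: incrementing value
  };
}

// Sample id (hypothetical; any well-formed 24-hex-character string works).
const parts = decodeObjectId("507f1f77bcf86cd799439011");
console.log(parts.timestamp.toISOString(), parts.machineId, parts.processId, parts.counter);
```

Because the timestamp occupies the leading bytes, ids generated later compare greater than earlier ones, which is why sorting by _id roughly follows insertion time.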
Objective:
In this topic we focus on the NoSQL database approach which is
characterized by a move away from the complexity of SQL based
servers. The logic of validation, access control, mapping queryable
indexed data, correlating related data, conflict resolution,
maintaining integrity constraints, and triggered procedures is moved
out of the database layer.
Recap:
• Advantages:
• Can handle large amounts of data and heavy load,
• Easy retrieval of data by keys.
• Limitations:
NoSQL Storage Architecture
• Complex queries may involve multiple key-value pairs, which can degrade performance.
• Data involving many-to-many relationships may collide.
• Examples:
• DynamoDB
• Berkeley DB
• Advantages:
• HBase
• Bigtable by Google
• Cassandra
• Advantages:
• Examples:
• Neo4j
• FlockDB (used by Twitter)
Objective:
In this topic we focus on CRUD. CRUD is an acronym that
comes from the world of computer programming and refers to the
four functions that are considered necessary to implement a
persistent storage application: create, read, update and delete.
Recap:
Manual: http://docs.mongodb.org/master/MongoDB-manual.pdf
(Focus on Ch. 3, 4 for now)
Dataset: http://docs.mongodb.org/manual/reference/bios-example-collection/
• Create
– db.collection.insert( <document> )
– db.collection.save( <document> )
– db.collection.update( <query>, <update>, { upsert: true } )
• Read
– db.collection.find( <query>, <projection> )
– db.collection.findOne( <query>, <projection> )
• Update
– db.collection.update( <query>, <update>, <options> )
• Delete
– db.collection.remove( <query>, <justOne> )
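To make the four operations concrete without a running server, here is a minimal in-memory sketch in plain JavaScript. TinyCollection is a hypothetical stand-in, not the MongoDB shell or driver API: its update() merges fields (similar in spirit to a $set update) and its queries only support exact field equality. Note also that insert(), save(), update(), and remove() are legacy shell helpers; newer MongoDB shells and drivers favour insertOne/insertMany, updateOne/updateMany/replaceOne, and deleteOne/deleteMany.

```javascript
// Minimal in-memory stand-in for a collection; NOT the MongoDB driver or shell API.
class TinyCollection {
  constructor() {
    this.docs = [];
    this.nextId = 1;
  }

  // True when every field in the query equals the same field in the document.
  static matches(doc, query) {
    return Object.entries(query).every(([field, value]) => doc[field] === value);
  }

  // Create: store a copy of the document, assigning a numeric _id if none is given.
  insert(doc) {
    const stored = { _id: this.nextId++, ...doc };
    this.docs.push(stored);
    return stored._id;
  }

  // Read: return copies of all documents matching the query.
  find(query = {}) {
    return this.docs
      .filter((d) => TinyCollection.matches(d, query))
      .map((d) => ({ ...d }));
  }

  // Update: merge `changes` into the first match; insert a new document when upsert is set.
  update(query, changes, { upsert = false } = {}) {
    const target = this.docs.find((d) => TinyCollection.matches(d, query));
    if (target) {
      Object.assign(target, changes);
      return { matched: 1, upserted: false };
    }
    if (upsert) {
      this.insert({ ...query, ...changes });
      return { matched: 0, upserted: true };
    }
    return { matched: 0, upserted: false };
  }

  // Delete: remove matching documents (only the first one when justOne is true).
  remove(query, justOne = false) {
    let removed = 0;
    this.docs = this.docs.filter((d) => {
      const hit = TinyCollection.matches(d, query) && !(justOne && removed > 0);
      if (hit) removed++;
      return !hit;
    });
    return removed;
  }
}

const inventory = new TinyCollection();
inventory.insert({ item: "BE10", qty: 5 });
inventory.insert({ item: "ZZ01", qty: 1 });
inventory.update({ item: "BE10" }, { qty: 7 });
inventory.update({ item: "NEW1" }, { qty: 3 }, { upsert: true });
inventory.remove({ item: "ZZ01" }, true);
console.log(inventory.find());
```

The upsert path mirrors the third Create variant in the list: when no document matches the query, the update creates one instead.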
CRUD operations with MongoDB
CRUD Examples
Objective:
In this topic we focus on the horizontal partitioning, or sharding,
implemented by most NoSQL and NewSQL data stores: sets of
rows/records are stored in different segments (or shards), which
may be located on different servers.
Recap:
In RDBMS / In MongoDB (side-by-side examples; slide figures not reproduced)
• Either insert the 1st document
• You can put a condition on any field in the document (even _id); the slide gives the equivalent in SQL.
• Figure labels: two operators, query condition, new doc.
• For the document having item = “BE10”, replace it with the given document
Objective:
In this topic we focus on multikey indexes, which MongoDB uses to
index the content stored in arrays. When you index a field that holds
an array value, MongoDB creates separate index entries for every
element of the array. These multikey indexes allow queries to select
documents that contain arrays by matching on one or more elements
of the arrays.
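The idea of one index entry per array element can be sketched in plain JavaScript. This is only an illustration of the concept, not MongoDB's internal B-tree implementation; the sample documents and the function buildMultikeyIndex are hypothetical.

```javascript
// Hypothetical documents with an array-valued field `tags`.
const articles = [
  { _id: 1, title: "Intro to NoSQL", tags: ["nosql", "database"] },
  { _id: 2, title: "Sharding basics", tags: ["nosql", "scaling"] },
  { _id: 3, title: "SQL joins", tags: ["sql"] },
];

// Build one index entry per distinct array element: element value -> list of document _ids.
function buildMultikeyIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    const values = Array.isArray(doc[field]) ? doc[field] : [doc[field]];
    for (const value of new Set(values)) {
      if (!index.has(value)) index.set(value, []);
      index.get(value).push(doc._id);
    }
  }
  return index;
}

const tagIndex = buildMultikeyIndex(articles, "tags");
console.log(tagIndex.get("nosql")); // [ 1, 2 ]
```

A query for documents whose tags array contains "nosql" then becomes a single lookup in the index rather than a scan of every array in the collection.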
Recap:
Revision of DBMS architecture.
• Indexes are special data structures that store a small portion of the data set in an
easy-to-traverse form. The index stores the value of a specific field or set of fields,
ordered by the value of the field as specified in the index.
• Syntax
• The basic syntax of the createIndex() method is as follows:
• >db.COLLECTION_NAME.createIndex({KEY:1})
• Here key is the name of the field on which you want to create index and 1 is for
ascending order. To create index in descending order you need to use -1.
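As a rough illustration of why an ordered index is easy to traverse, the sketch below keeps sorted [value, position] pairs for one field and looks a value up by binary search. It is a simplification under stated assumptions: MongoDB itself uses B-tree structures, and the names buildIndex, findByIndex, and the sample mycol array are hypothetical.

```javascript
// Hypothetical collection contents.
const mycol = [
  { title: "MongoDB Overview" },
  { title: "Cassandra Basics" },
  { title: "NoSQL Intro" },
];

// Build a single-field index as sorted [value, position] pairs.
// direction follows the createIndex convention: 1 ascending, -1 descending.
function buildIndex(docs, field, direction = 1) {
  return docs
    .map((doc, position) => [doc[field], position])
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0) * direction);
}

// Binary search over an ASCENDING index; returns the matching document or undefined.
function findByIndex(docs, index, value) {
  let lo = 0;
  let hi = index.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const [key, position] = index[mid];
    if (key === value) return docs[position];
    if (key < value) lo = mid + 1;
    else hi = mid - 1;
  }
  return undefined;
}

const titleIndex = buildIndex(mycol, "title", 1);
console.log(titleIndex.map(([value]) => value));
console.log(findByIndex(mycol, titleIndex, "NoSQL Intro"));
```

The lookup touches about log2(n) index entries instead of scanning all n documents, which is the practical benefit of storing the field values in order.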
Indexing and ordering datasets (MongoDB)
• Example
>db.mycol.createIndex({"title":1})
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 1,
    "numIndexesAfter" : 2,
    "ok" : 1
}
>
• In the createIndex() method you can pass multiple fields to create an index
on multiple fields.
>db.mycol.createIndex({"title":1,"description":-1})
>
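The ordering implied by a compound index specification such as {"title":1,"description":-1} can be sketched as a comparator: entries are ordered by title ascending, and ties are broken by description descending. This is only an illustration in plain JavaScript; compareBySpec and the sample documents are hypothetical and do not reflect MongoDB's internal storage.

```javascript
// Build a comparator from an index specification such as { title: 1, description: -1 }.
// Fields are compared in the order they are listed; 1 means ascending, -1 descending.
function compareBySpec(spec) {
  const fields = Object.entries(spec);
  return (a, b) => {
    for (const [field, direction] of fields) {
      if (a[field] < b[field]) return -direction;
      if (a[field] > b[field]) return direction;
    }
    return 0; // equal on every indexed field
  };
}

// Hypothetical documents.
const docs = [
  { title: "B", description: "x" },
  { title: "A", description: "a" },
  { title: "A", description: "z" },
];

const ordered = [...docs].sort(compareBySpec({ title: 1, description: -1 }));
console.log(ordered.map((d) => `${d.title}/${d.description}`)); // [ 'A/z', 'A/a', 'B/x' ]
```

Field order in the specification matters: an index on title then description supports sorting by title alone, but not by description alone.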
Objective:
In this topic we focus on the cloud database, which is a database
service built and accessed through a cloud platform. It serves many of
the same functions as a traditional database, with the added flexibility
of cloud computing. Users install software on a cloud infrastructure to
implement the database.
Recap:
• Key features:
• Scalability
• Cloud databases can expand their storage capacity at run time to
accommodate changing needs. Organizations only pay for what they
use.
• Disaster recovery
• In the event of a natural disaster, equipment failure or power outage,
data is kept secure through backups on remote servers.
Cloud database: - Introduction of Cloud database
• Considerations for cloud databases
• Control options
• Users can opt for a virtual machine image managed like a traditional database or a provider’s
database as a service (DBaaS).
• Database technology
• SQL databases are difficult to scale but very common. NoSQL databases scale more easily but
do not work with some applications.
• Security
• Most cloud database providers encrypt data and provide other security measures;
organizations should research their options.
• Maintenance
• When using a virtual machine image, one should ensure that IT staffers can maintain the
underlying infrastructure.
• Platform as a service (PaaS)
• Software as a service (SaaS)
• Infrastructure as a service (IaaS)
• Platform as a service, or PaaS, is the most common type here. It provisions servers, data
storage, and operating systems, and acts as a platform for the virtual database, which saves
hardware cost and makes the data accessible from anywhere in the world.
• SaaS, on the other hand, provides complete software as a service to the organization for a
fee, and is a good business option for organizations that serve a large number of web users.
• Cloud computing is on the rise because of the flexibility and ease of use of the
services it provides. Several well-known IT giants are planning to capture the market.
Most cloud databases run on well-known cloud computing platforms such as
Rackspace, Salesforce, GoGrid, and Amazon EC2.
• Here are the top five most beneficial cloud services for data storage.
Objective:
In this topic we focus on NoSQL databases, which are specifically
designed for low-cost commodity hardware. These databases are mostly
used for storing and accessing data across multiple storage clusters.
For example, Google (Bigtable), Amazon (Dynamo), Facebook, Google+,
and Twitter collect and store terabytes of data for their users every
day.
Recap:
Most of the time (and for most of the remainder of this page), the
term “cloud database” refers to a cloud-based database-as-a-service.
NoSQL with Cloud Database
• Why use a cloud database/DBaaS?
• The key benefits of cloud databases are that they are accessible from anywhere, scalable
from day one, and designed for reliability and performance.
• Common cloud database use cases
• Cloud databases work in most cases that traditional databases do. They are particularly
valuable when building software products that:
• Are cloud-native [Cloud-native technologies empower organizations to build and run
scalable applications in modern, dynamic environments such as public, private, and
hybrid clouds.]
• Require large volumes of data
• Need to handle high-scale traffic
• Are distributed geographically
• Data applications that take advantage of centralization, like analytics,
are also fantastic candidates for cloud database usage.
• While certain use cases are more obvious candidates for cloud
database usage, more traditional use cases, like real-time online
transaction processing, caching, or data warehousing work just as well
in the fully managed paradigm.
• Cloud database considerations
• Whether you’re still thinking about whether a cloud
database is right for you, or in the process of selecting the
ideal database-as-a-service for your needs, there are a few
key factors to take into consideration:
• Database Technology
• Management System
• Cost Model
• Security
• MongoDB Atlas cloud database
MongoDB can be installed and run on any cloud provider or on-premises network as a
self-managed database cluster or virtual machine, or on AWS or Azure using MongoDB
Atlas, MongoDB's cloud database-as-a-service (DBaaS) offering. There are major benefits to
adopting the DBaaS option, including:
• Simplified management
• Elastic autoscaling: Atlas can automatically scale your cluster tier, storage
capacity, or both in response to cluster usage.
• Redundancy, backup, and restore
• Charts: Atlas Charts offers a quick, simple, and powerful way to visualize your data
from Atlas and Atlas Data Lake. Atlas Data Lake allows you to natively query and
combine data across MongoDB Atlas and AWS S3 without complex integrations.
• Connectors: The MongoDB Connector for Business Intelligence for Atlas (BI
Connector) is only available for M10 and larger clusters. The BI Connector is a
powerful tool which provides users SQL-based access to their MongoDB databases.
As a result, the BI Connector performs operations which may be CPU and memory
intensive.
• Schema navigator: The Schema tab provides an overview of the data type and shape
of the fields in a particular collection.
• The way a cloud database works is that rather than installing, configuring,
and maintaining a database instance or instances, an automated system is
able to provision, manage, and scale the underlying database cluster for you.
Q3: Explain difference between scaling horizontally and vertically for databases
Q7: Does MongoDB support ACID transaction management and locking functionalities?
Q9: How can you achieve primary key - foreign key relationships in MongoDB?
• 3. ________ stores are used to store information about networks, such as social connections.
• (a) Key-value
• (b) Wide-column
• (c) Document
• (d) Graph
• 4. NoSQL databases are used mainly for handling large volumes of ______________ data.
• (a) unstructured
• (b) structured
• (c) semi-structured
• (d) all of the mentioned
• 5. Which of the following languages is MongoDB written in?
• (a) Javascript
• (b) C
• (c) C++
• (d) All of the mentioned
• 8. NoSQL was designed with security in mind, so developers or security teams don't need to worry about
implementing a security layer. Is it true or false?
• (a) True
• (b) False
• 9. Which of the following is not a reason NoSQL has become a popular solution for some organizations?
• (a) Better scalability
• (b) Improved ability to keep data consistent
• (c) Faster access to data than relational database management systems (RDBMS)
• (d) More easily allows for data to be held across multiple servers
http://www.nptelvideos.com/lecture.php?id=6516
http://www.nptelvideos.com/lecture.php?id=6517
http://www.nptelvideos.com/lecture.php?id=6518
http://www.nptelvideos.com/lecture.php?id=6519
https://www.youtube.com/watch?v=2yQ9TGFpDuM