
DBMS Unit-V 1

Uploaded by

agrasen09

Noida Institute of Engineering and Technology, Greater Noida

Introduction To NOSQL with Cloud Database

Unit: 5

DBMS
ACSAI0402 Raj Kumar Gupta
Assistant Professor
CSE(DS)
B-Tech IV Sem

12/27/2024 Mr. Rajkumar Gupta ACSAI0402 DBMS UNIT 05 1
Brief Introduction of Faculty member

Name: Shabnam Firdaus


Area of Research: Machine Learning, Artificial
Intelligence
Contact Details:
Email: [email protected]



Evaluation Scheme

• B. Tech (Data Science & Related Branches)


• Fourth Semester
• Professional Core Course

DATABASE MANAGEMENT SYSTEMS

LTP Credits
3–1–0 4



Evaluation Scheme

Course Contents / Syllabus

Branch wise Applications

• From the 1980s to the Internet era in the late 1990s, SQL databases
dominated the development landscape. Large commercial applications, niche
products, and custom applications of all types were based on SQL.
• But the rise of the Internet has changed application development profoundly.
The amount of data, the structure of the data, the scale of applications, the
way applications have developed have all changed dramatically.
• These changes have led many organizations of all sizes to adopt NoSQL
database technology.
• Data can now be captured and accessed easily from many sources, such as Facebook and Google.
• Users' personal information, geographic location data, user-generated content, social graphs, and machine logging data are some examples of data that is growing rapidly.
• Making use of this data requires processing it in large volumes, a task for which relational databases are often poorly suited.
• NoSQL databases evolved to handle such large volumes of data.
Course Objective

The objective of this course is to present an introduction to database management systems, with an emphasis on how to organize, maintain, and retrieve information efficiently and effectively in relational and non-relational databases.
After successfully completing this course, students will be able to:
• Distinguish the different types of NoSQL databases
• Understand the impact of the cluster on database design
• State the CAP theorem and explain its main points
• Explain where HBase, MongoDB, Cassandra, Neo4j, and Redis fit with the
CAP theorem
• Work with the Hadoop Distributed File System (HDFS) as a foundation for
NoSQL technologies
• Describe the design of HBase, MongoDB, Cassandra, Neo4j, and Redis
• Use the data control, definition, and manipulation languages of the NoSQL
databases covered in the course
Course Outcomes



Program Outcomes (POs)

Engineering Graduates will be able to:


1. Engineering knowledge: Apply the knowledge of mathematics, science, engineering fundamentals, and an engineering specialization to the solution of complex engineering problems.

2. Problem analysis: Identify, formulate, review research literature, and analyze complex engineering problems reaching substantiated conclusions using first principles of mathematics, natural sciences, and engineering sciences.

3. Design/development of solutions: Design solutions for complex engineering problems and design system components or processes that meet the specified needs with appropriate consideration for the public health and safety, and the cultural, societal, and environmental considerations.

4. Conduct investigations of complex problems: Use research-based knowledge and research methods including design of experiments, analysis and interpretation of data, and synthesis of the information to provide valid conclusions.



Program Outcomes (POs)

Contd..
5. Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern engineering and IT tools including prediction and modeling to complex engineering activities with an understanding of the limitations.

6. The engineer and society: Apply reasoning informed by the contextual knowledge to assess societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to the professional engineering practice.

7. Environment and sustainability: Understand the impact of the professional engineering solutions in societal and environmental contexts, and demonstrate the knowledge of, and need for sustainable development.

8. Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms of the engineering practice.
Program Outcomes (POs)

Contd..
9. Individual and team work: Function effectively as an individual, and as a
member or leader in diverse teams, and in multidisciplinary settings.
10. Communication: Communicate effectively on complex engineering activities
with the engineering community and with society at large, such as, being able to
comprehend and write effective reports and design documentation, make
effective presentations, and give and receive clear instructions.
11. Project management and finance: Demonstrate knowledge and
understanding of the engineering and management principles and apply these
to one’s own work, as a member and leader in a team, to manage projects and
in multidisciplinary environments.
12. Life-long learning: Recognize the need for, and have the preparation and
ability to engage in independent and life-long learning in the broadest context of
technological change.

COs and POs Mapping

CO            PO1   PO2   PO3   PO4   PO5   PO6   PO7   PO8   PO9   PO10  PO11  PO12
ACSAI0402.1   2     2     3     3     3     2     3     2     2     2     2     3
ACSAI0402.2   3     3     3     2     2     2     2     2     2     2     2     3
ACSAI0402.3   2     3     3     3     3     2     2     2     2     2     2     2
ACSAI0402.4   2     3     2     2     2     2     2     2     2     3     2     2
ACSAI0402.5   2     3     2     2     2     3     2     2     3     2     2     2
AVG           2.20  2.80  2.60  2.40  2.40  2.20  2.20  2.00  2.20  2.20  2.00  2.40

Program Specific Outcomes (PSOs)

On successful completion of the graduation degree, the Engineering graduates will be able to:

PSO1: The ability to identify and analyze real-world problems and design ethical solutions using artificial intelligence, robotics, virtual/augmented reality, data analytics, blockchain technology, and cloud computing.

PSO2: The ability to design and develop hardware sensor devices and related interfacing software systems for solving complex engineering problems.

PSO3: The ability to understand interdisciplinary computing techniques and to apply them in the design of advanced computing.

PSO4: The ability to conduct investigations of complex problems with the help of technical, managerial, and leadership qualities and modern engineering tools provided by industry-sponsored laboratories.



COs and PSOs Mapping

Program Specific Outcomes

CO            PSO1  PSO2  PSO3  PSO4
ACSAI0402.1   3     1     3     1
ACSAI0402.2   3     1     3     1
ACSAI0402.3   3     1     3     1
ACSAI0402.4   3     1     3     1
ACSAI0402.5   3     1     3     1
AVG           3.00  1.00  3.00  1.00

Program Educational Objectives (PEOs)

PEO1: To have an excellent scientific and engineering breadth so as to comprehend, analyze, design and provide sustainable solutions for real-life problems using state-of-the-art technologies.

PEO2: To have a successful career in industries, to pursue higher studies or to support entrepreneurial endeavors and to face global challenges.

PEO3: To have effective communication skills, professional attitude, ethical values and a desire to learn specific knowledge in emerging trends, technologies for research, innovation and product development and contribution to society.

PEO4: To have life-long learning for up-skilling and re-skilling for a successful professional career as engineer, scientist, entrepreneur and bureaucrat for the betterment of society.

Result Analysis

Question Paper Template
SECTION – A CO

1. Attempt all parts- [10×1=10]

1-a. Question- (1)


1-b. Question- (1)
1-c. Question- (1)
1-d. Question- (1)
1-e. Question- (1)
1-f. Question- (1)
1-g. Question- (1)
1-h. Question- (1)
1-i. Question- (1)
1-j. Question- (1)

2. Attempt all parts- [5×2=10] CO

2-a. Question- (2)


2-b. Question- (2)
2-c. Question- (2)
2-d. Question- (2)
2-e. Question- (2)

SECTION – B CO

3. Answer any five of the following- [5×6=30]


3-a. Question- (6)
3-b. Question- (6)
3-c. Question- (6)
3-d. Question- (6)
3-e. Question- (6)
3-f. Question- (6)
3-g. Question- (6)
SECTION – C CO

4 Answer any one of the following- [5×10=50]

4-a. Question- (10)

4-b. Question- (10)


5. Answer any one of the following-
5-a. Question- (10)

5-b. Question- (10)


6. Answer any one of the following-
6-a. Question- (10)

6-b. Question- (10)


7. Answer any one of the following-
7-a. Question- (10)

7-b. Question- (10)

8. Answer any one of the following-


8-a. Question- (10)

8-b. Question- (10)

Prerequisite and Recap

Prerequisites:
• Linux/Windows operating system.
• Database management software such as Oracle.
• Knowledge of SQL and PL/SQL.
• Cloud infrastructure.
• Handling big data in unstructured databases.
• A programming language (Python or Java).
Recap:
• Discussion about Cloud and Database Management System.

Brief Intro of Subject with video
Digital transformation is the name for the trend toward serving customers using scalable,
customizable, Internet and mobile applications. These applications are often hard to build and
evolve rapidly using SQL technology. For this reason, from the mid-2000s to 2020 we have seen a
steady rise in the adoption of NoSQL database technology.

The rise of NoSQL is an important event in computer science and in application development
because SQL has been so dominant for so long. Many other forms of database technology have
come and gone, but few have had the wide adoption of NoSQL.

By understanding the rise in popularity of NoSQL databases, we should be able to shed light on
when it makes sense to use NoSQL.
NoSQL covers a lot of different database structures and data models.

• http://www.nptelvideos.com/lecture.php?id=6516
• http://www.nptelvideos.com/lecture.php?id=6517
• http://www.nptelvideos.com/lecture.php?id=6518
• http://www.nptelvideos.com/lecture.php?id=6519
• https://www.youtube.com/watch?v=2yQ9TGFpDuM
Content – Unit 5

• Definition of NoSQL, History of NoSQL and Different NoSQL products, Exploring MongoDB, Interfacing and Interacting with NoSQL, NoSQL Storage Architecture, CRUD operations with MongoDB, Querying, Modifying and Managing NoSQL Data stores, Indexing and ordering datasets (MongoDB).

• Cloud database: Introduction of Cloud database, NoSQL with Cloud Database, Introduction to Real time Database.

Unit Objective

The objectives of Unit 5 are:

1. To provide an overview of the growing field of NoSQL databases.

2. To build preliminary knowledge of the DBMS domain and explain when NoSQL should be used:
• When a huge amount of data needs to be stored and retrieved.
• When the relationships between the stored data items are not that important.
• When the data changes over time and is not structured.
• When support for constraints and joins is not required at the database level.
• When the data is growing continuously and the database must be scaled regularly to handle it.

Topic mapping with CO

Topic                                            CO
Definition of NoSQL                              CO5
History of NoSQL and Different NoSQL products    CO5
Exploring MongoDB                                CO5
Interfacing and Interacting with NoSQL           CO5
NoSQL Storage Architecture                       CO5
CRUD operations with MongoDB                     CO5
Querying                                         CO5
Modifying and Managing NoSQL Data stores         CO5
Indexing and ordering datasets (MongoDB)         CO5

Topic mapping with CO

Topic                                            CO
Introduction of Cloud database                   CO5
NoSQL with Cloud Database                        CO5
Introduction to Real time Database               CO5

• Lecture 1

• Introduction to NoSQL

• Characteristics of NoSQL

• History of NoSQL and Different NoSQL products

Definition of NoSQL

Objective:
• This topic focuses on the advantages of working with NoSQL databases such as MongoDB and Cassandra. The main advantages are high scalability and high availability. For scalability, a NoSQL database such as MongoDB uses sharding to scale horizontally.

Recap:
• Revision of Database Management Systems.

Definition of NoSQL

NoSQL databases (aka "not only SQL") are non-tabular databases and store data differently
than relational tables. NoSQL databases come in a variety of types based on their data
model. The main types are document, key-value, wide-column, and graph. They provide
flexible schemas and scale easily with large amounts of data and high user loads.

When people use the term “NoSQL database,” they typically use it to refer to any non-
relational database. Some say the term “NoSQL” stands for “non SQL” while others say it
stands for “not only SQL.” Either way, most agree that NoSQL databases are databases that
store data in a format other than relational tables.

A NoSQL (originally referring to "non-SQL" or "non-relational") database provides a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases. Such databases have existed since the late 1960s, but the name "NoSQL" was only coined in the early 21st century, triggered by the needs of Web 2.0 companies. NoSQL databases are increasingly used in big data and real-time web applications. NoSQL systems are also sometimes called "Not only SQL" to emphasize that they may support SQL-like query languages or sit alongside SQL databases in polyglot-persistent architectures.

Definition of NoSQL

Introduction to NoSQL
1. Commonly read as "Not Only SQL".
2. The name "NoSQL" was first used in 1998 by Carlo Strozzi for his lightweight open-source relational database, which did not expose a SQL interface.
3. Many NoSQL products are open source.
4. Often described as the database of the future.
5. Well suited to distributed systems.
6. Lower cost.
7. High performance.
8. Designed to handle very large volumes of data.
9. Used by Facebook, Google, Wikipedia, and others.

Definition of NoSQL
[Figure: Increasing Web Application Data]

Definition of NoSQL
Characteristics of NoSQL
1. Large data volumes.
2. Scalable replication and distribution (horizontal scaling).
3. Queries need to return answers quickly.
4. Asynchronous inserts and updates.
5. Schema-less.
6. Designed to support caching without third-party tools.
7. Full ACID transaction properties are often relaxed.
8. BASE semantics are used instead.
9. Trade-offs are described by the CAP theorem.
10. No join statements.
11. No complicated relationships.
12. Less administration time (lower cost).

Definition of NoSQL
ACID

1. Atomic – Either all of a transaction completes (commits) or none of it does.
2. Consistent – A transaction takes the database from one valid state to another, where validity is defined in terms of constraints.
3. Isolated – The transaction behaves as if it were the only operation being performed on the database.
4. Durable – Once the transaction completes, its effects persist and will not be reversed.
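
The all-or-nothing behaviour of an atomic transaction can be sketched with a small in-memory example. The `accounts` object and the `transfer` function are hypothetical names chosen for illustration, not part of any database API, and the snapshot-based rollback is only one simple way to get the effect.

```javascript
// Hypothetical in-memory "database": account balances keyed by owner.
const accounts = { alice: 100, bob: 50 };

// Apply a transfer atomically: either both updates happen or neither does.
function transfer(db, from, to, amount) {
  const snapshot = { ...db };            // remember the state before the transaction
  try {
    db[from] -= amount;
    if (db[from] < 0) {
      // Consistency constraint for this sketch: balances must never be negative.
      throw new Error("insufficient funds");
    }
    db[to] += amount;
    return true;                         // commit
  } catch (err) {
    Object.assign(db, snapshot);         // roll back to the snapshot
    return false;
  }
}

const ok = transfer(accounts, "alice", "bob", 30);      // succeeds
const failed = transfer(accounts, "bob", "alice", 500); // violates the constraint
```

The second transfer leaves the balances exactly as they were after the first one, because the partial debit is undone before the function returns.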

Definition of NoSQL
BASE

1. Basically Available.

2. Soft state (expiration of information).

3. Eventually Consistent.

4. Weak consistency.

5. Availability first.

6. Simpler and faster.
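
Eventual consistency can be illustrated with a toy model of two replicas. Everything here (`primary`, `replica`, `pending`, `write`, `replicate`) is a hypothetical name for this sketch; real systems replicate over a network and handle conflicts, which this example ignores.

```javascript
// Two hypothetical replicas of the same key-value data.
const primary = new Map();
const replica = new Map();
const pending = [];                      // writes not yet copied to the replica

// A write is acknowledged as soon as the primary has it (availability first).
function write(key, value) {
  primary.set(key, value);
  pending.push([key, value]);
}

// Replication runs later; until then the replica may return stale data.
function replicate() {
  while (pending.length > 0) {
    const [key, value] = pending.shift();
    replica.set(key, value);
  }
}

write("status", "A");
const staleRead = replica.get("status"); // the replica has not caught up yet
replicate();
const freshRead = replica.get("status"); // the replicas have converged
```

Between the write and the call to `replicate()`, a reader of the replica sees the old state; afterwards both copies agree, which is the "eventually consistent" property.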

Definition of NoSQL
CAP Theorem

Definition of NoSQL
Major NoSQL Types
• Key-Value Store.
  • Hashing.
  • Basic get/put/delete.
  • Very fast, because a key leads directly to its set of values.
• Document Store.
  • Documents structured as JSON, XML, and similar formats.
  • No joins (handle them in your code).
• Column Database.
  • Each storage block contains data from only one column.
  • Reduces access and scanning time.
  • Still uses tables, but without join statements.
  • Better for data analytics.
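
The get/put/delete interface of a key-value store can be sketched in a few lines. `KeyValueStore` and the `cart:42` key are hypothetical names for this illustration; the sketch relies on JavaScript's built-in `Map`, which engines typically implement with hashing.

```javascript
// A key-value store reduced to its three basic operations.
class KeyValueStore {
  constructor() {
    this.items = new Map();              // key -> value lookup table
  }
  put(key, value) { this.items.set(key, value); }
  get(key) { return this.items.get(key); }
  delete(key) { return this.items.delete(key); }
}

const store = new KeyValueStore();
store.put("cart:42", ["pen", "notebook"]);   // the value can be any structure
const cart = store.get("cart:42");
store.delete("cart:42");
const afterDelete = store.get("cart:42");    // the key no longer exists
```

The store never inspects the value, which is why such systems are simple and fast but cannot answer queries on the contents of a value.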

Definition of NoSQL
Developer's Viewpoint
• Is SQL better?
  • A natural reaction.
  • It matches everyone's experience.
  • Fear of change.
• What NoSQL can offer:
  • A simpler data model.
  • Easy installation.
  • Fewer bugs that are easier to find.
  • Lower administration effort / fewer DBAs.
  • High performance.
  • Much simpler scaling.
  • Rapid development.
  • Support for large binary objects.
  • Support for graphs/relationships.

Definition of NoSQL
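A Look at Query Syntax
// Note: insert(), remove(), update(), and save() are legacy shell helpers;
// current shells offer insertOne(), deleteMany(), updateMany(), and replaceOne().
// Find documents whose type is "snacks".
db.inventory.find( { type: "snacks" } );
// Find food items priced below 9.95.
db.inventory.find( { type: 'food', price: { $lt: 9.95 } } );
// Match an embedded document exactly.
db.inventory.find( { producer: { company: 'ABC123', address: '123 Street' } } );
// Match documents with at least one memos element that satisfies both conditions.
db.inventory.find( { memos: { $elemMatch: { memo: 'on time', by: 'shipping' } } } );
// Insert a document with an explicit _id.
db.inventory.insert( { _id: 10, type: "misc", item: "card", qty: 15 } );
// Remove all users whose status is "D".
db.users.remove( { status: "D" } )
// Insert a user document.
db.users.insert( { name: "sue", age: 26, status: "A" } )
// Set status to "A" for every user older than 18.
db.users.update( { age: { $gt: 18 } }, { $set: { status: "A" } }, { multi: true } )
// Insert a new document (save() inserts when the document has no _id).
db.inventory.save( { type: "book", item: "notebook", qty: 40 } )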
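
To make the filter syntax above less opaque, the following sketch evaluates a simplified filter against plain JavaScript objects. It is not how MongoDB is implemented: `matches`, `operators`, and the sample `inventory` array are assumptions for illustration, and only equality, `$lt`, and `$gt` are handled (no embedded-document matching or `$elemMatch`).

```javascript
// Hypothetical documents, shaped like those in the inventory queries above.
const inventory = [
  { type: "food", item: "apple", price: 4.5 },
  { type: "food", item: "cake", price: 12.0 },
  { type: "snacks", item: "chips", price: 2.0 },
];

// Comparison operators supported by this sketch only.
const operators = {
  $lt: (value, bound) => value < bound,
  $gt: (value, bound) => value > bound,
};

// Return true when every field condition in the filter holds for the document.
function matches(doc, filter) {
  return Object.entries(filter).every(([field, condition]) => {
    if (condition !== null && typeof condition === "object") {
      // Operator form, e.g. { $lt: 9.95 }
      return Object.entries(condition).every(([op, bound]) =>
        operators[op](doc[field], bound)
      );
    }
    return doc[field] === condition;     // plain equality, e.g. { type: "food" }
  });
}

const cheapFood = inventory.filter((doc) =>
  matches(doc, { type: "food", price: { $lt: 9.95 } })
);
```

All conditions in one filter object are combined with a logical AND, which mirrors how the second `find()` call above selects only food items under 9.95.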

Definition of NoSQL

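What about functions?

// Iterate over a cursor and print each document.
var myCursor = db.inventory.find( { type: 'food' } );
myCursor.forEach(printjson);

// Load the cursor's documents into an array and pick the fourth one.
var myCursor = db.inventory.find( { type: 'food' } );
var documentArray = myCursor.toArray();
var myDocument = documentArray[3];

Query Analysis
// Show the query plan for this find().
db.inventory.find( { type: 'food' } ).explain()
[Figure: explain() output]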

Definition of NoSQL
Summary

NoSQL:
• Handles huge data.
• High availability at small cost.
• More data redundancy.
• High performance.
• Less administration time.
• Fewer standards.

SQL:
• Well suited to workloads that need ACID guarantees.
• Expensive.
• Less data redundancy.
• Increasing availability means increasing cost.
• More standards.
• More administration.

Pick the right tool for your job!

History of NoSQL and Different NoSQL products

Objective:
• This topic focuses on the motivations for the NoSQL approach: simplicity of design, simpler "horizontal" scaling to clusters of machines (which is a problem for relational databases), finer control over availability, and limiting the object-relational impedance mismatch.

Recap:
• Revision of Database Management Systems.

History of NoSQL and Different NoSQL products

Brief history of NoSQL databases

NoSQL databases emerged in the late 2000s as the cost of storage dramatically decreased. Gone were the days of needing to create a complex, difficult-to-manage data model in order to avoid data duplication. Developers (rather than storage) were becoming the primary cost of software development, so NoSQL databases optimized for developer productivity.

[Figure: Cost per MB of data over time (log scale)]

History of NoSQL and Different NoSQL products

As storage costs rapidly decreased, the amount of data that applications needed
to store and query increased. This data came in all shapes and sizes — structured,
semi-structured, and polymorphic — and defining the schema in advance became
nearly impossible. NoSQL databases allow developers to store huge amounts of
unstructured data, giving them a lot of flexibility.

Additionally, the Agile Manifesto was rising in popularity, and software engineers
were rethinking the way they developed software. They were recognizing the
need to rapidly adapt to changing requirements. They needed the ability to iterate
quickly and make changes throughout their software stack — all the way down to
the database. NoSQL databases gave them this flexibility.

Cloud computing also rose in popularity, and developers began using public clouds
to host their applications and data. They wanted the ability to distribute data
across multiple servers and regions to make their applications resilient, to scale
out instead of scale up, and to intelligently geo-place their data. Some NoSQL
databases like MongoDB provide these capabilities.
History of NoSQL and Different NoSQL products

NoSQL database features


Each NoSQL database has its own unique features. At a high level, many NoSQL
databases have the following features:
• Flexible schemas
• Horizontal scaling
• Fast queries due to the data model
• Ease of use for developers

Benefits of NoSQL Databases


NoSQL databases offer several benefits over relational databases: flexible data models, horizontal scaling, fast queries for the access patterns they are designed around, and ease of use for developers.
1. Flexible data models

NoSQL databases typically have very flexible schemas. A flexible schema allows
you to easily make changes to your database as requirements change. You can
iterate quickly and continuously integrate new application features to provide
value to your users faster.
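
A flexible schema can be illustrated with plain objects. The `users` array and the `email` field below are hypothetical; the point is only that a new field can appear on new documents without changing the existing ones.

```javascript
// A hypothetical collection modelled as an array of documents.
const users = [
  { name: "sue", age: 26, status: "A" },
];

// A later requirement adds an "email" field; existing documents are left untouched.
users.push({ name: "bob", age: 31, status: "A", email: "bob@example.com" });

// Application code handles documents that lack the new field.
const emails = users.map((user) => user.email ?? "(none)");
```

The trade-off is that the application, rather than the database, becomes responsible for coping with documents of different shapes.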
History of NoSQL and Different NoSQL products

2. Horizontal scaling
Most SQL databases require you to scale-up vertically (migrate to a larger, more expensive
server) when you exceed the capacity requirements of your current server. Conversely,
most NoSQL databases allow you to scale-out horizontally, meaning you can add cheaper,
commodity servers whenever you need to.
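
Scaling out means deciding which server holds which piece of data. The sketch below routes keys to shards with a simple hash; the shard names, the `hash` function, and `shardFor` are assumptions for illustration. Production systems use more elaborate schemes, partly because plain modulo hashing moves most keys whenever the number of shards changes.

```javascript
// Hypothetical shard names; adding capacity means adding entries here.
const shards = ["shard-0", "shard-1", "shard-2"];

// Simple deterministic string hash (illustrative only).
function hash(key) {
  let h = 0;
  for (const ch of key) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0;   // keep h an unsigned 32-bit integer
  }
  return h;
}

// Route each key to one shard by hashing it.
function shardFor(key) {
  return shards[hash(key) % shards.length];
}

const placement = ["user:1", "user:2", "user:3", "user:4"].map(
  (key) => [key, shardFor(key)]
);
```

Because the hash is deterministic, every client computes the same shard for the same key, so reads and writes for that key always reach the same server.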

3. Fast queries
Queries in NoSQL databases can be faster than in SQL databases. Why? Data in SQL databases is typically normalized, so queries for a single object or entity require you to join data from multiple tables. As your tables grow in size, the joins can become expensive. However, data in NoSQL databases is typically stored in a way that is optimized for queries. The rule of thumb when you use MongoDB is: data that is accessed together should be stored together. Queries typically do not require joins, so they can be very fast.
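
The difference between joining normalized data and reading an embedded document can be sketched as follows. The `customers`, `orders`, and `customerDocs` structures and the `loadWithJoin` function are hypothetical and stand in for tables and a document collection.

```javascript
// Normalized layout: two "tables" linked by customerId, as in a relational schema.
const customers = [{ id: 1, name: "sue" }];
const orders = [
  { customerId: 1, item: "notebook" },
  { customerId: 1, item: "card" },
];

// Reading one customer with their orders requires a join across both arrays.
function loadWithJoin(customerId) {
  const customer = customers.find((c) => c.id === customerId);
  const items = orders
    .filter((o) => o.customerId === customerId)
    .map((o) => o.item);
  return { name: customer.name, orders: items };
}

// Document layout: the orders are embedded, so one lookup returns everything.
const customerDocs = new Map([
  [1, { name: "sue", orders: ["notebook", "card"] }],
]);

const joined = loadWithJoin(1);
const embedded = customerDocs.get(1);
```

Both reads produce the same result, but the embedded version needs a single lookup, at the cost of duplicating data if the same order information is needed elsewhere.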

4. Easy for developers


Some NoSQL databases like MongoDB map their data structures to those of popular
programming languages. This mapping allows developers to store their data in the same
way that they use it in their application code. While it may seem like a trivial advantage, this
mapping can allow developers to write less code, leading to faster development time and
fewer bugs.
History of NoSQL and Different NoSQL products
Drawbacks of NoSQL Databases
One of the most frequently cited drawbacks of NoSQL databases is that they don’t support
ACID (atomicity, consistency, isolation, durability) transactions across multiple documents.
With appropriate schema design, single record atomicity is acceptable for lots of
applications. However, there are still many applications that require ACID across multiple
records.

To address these use cases MongoDB added support for multi-document ACID transactions
in the 4.0 release, and extended them in 4.2 to span sharded clusters.

Since data models in NoSQL databases are typically optimized for queries and not for
reducing data duplication, NoSQL databases can be larger than SQL databases. Storage is
currently so cheap that most consider this a minor drawback, and some NoSQL databases
also support compression to reduce the storage footprint.

Depending on the NoSQL database type you select, you may not be able to achieve all of
your use cases in a single database. For example, graph databases are excellent for analyzing
relationships in your data but may not provide what you need for everyday retrieval of the
data such as range queries. When selecting a NoSQL database, consider what your use cases
will be and if a general purpose database like MongoDB would be a better option.
History of NoSQL and Different NoSQL products

Types of NoSQL databases


Over time, four major types of NoSQL databases emerged: document databases, key-value
databases, wide-column stores, and graph databases.

• Document databases
• Key-value stores
• Column-oriented databases
• Graph databases

1. Document databases store data in documents similar to JSON (JavaScript Object Notation) objects. Each document contains pairs of fields and values. The values can typically be a variety of types, including strings, numbers, booleans, arrays, or objects.
2. Key-value databases are a simpler type of database where each item contains a key and a value.
3. Wide-column stores store data in tables, rows, and dynamic columns.
4. Graph databases store data in nodes and edges. Nodes typically store information about people, places, and things, while edges store information about the relationships between the nodes.
History of NoSQL and Different NoSQL products

Document Databases
A document database stores data in JSON, BSON, or XML documents (not Word documents or Google docs, of course). In a document database, documents can be nested. Particular elements can be indexed for faster querying.

Documents can be stored and retrieved in a form that is much closer to the data objects
used in applications, which means less translation is required to use the data in an
application. SQL data must often be assembled and disassembled when moving back and
forth between applications and storage.

Document databases are popular with developers because they have the flexibility to
rework their document structures as needed to suit their application, shaping their data
structures as their application requirements change over time. This flexibility speeds
development because in effect data becomes like code and is under the control of
developers. In SQL databases, intervention by database administrators may be required to
change the structure of a database.

The most widely adopted document databases are usually implemented with a scale-out
architecture, providing a clear path to scalability of both data volumes and traffic.

History of NoSQL and Different NoSQL products

Use cases include ecommerce platforms, trading platforms, and mobile app
development across industries.


Key-Value Stores
The simplest type of NoSQL database is a key-value store. Every data element in the database is stored as a key-value pair consisting of an attribute name (or "key") and a value. In a sense, a key-value store is like a relational database with only two columns: the key or attribute name (such as state) and the value (such as Alaska).

Use cases include shopping carts, user preferences, and user profiles.

History of NoSQL and Different NoSQL products

Column-Oriented Databases
While a relational database stores data in rows and reads data row by row, a
column store is organized as a set of columns. This means that when you want to
run analytics on a small number of columns, you can read those columns directly
without consuming memory with the unwanted data. Columns are often of the
same type and benefit from more efficient compression, making reads even
faster. Columnar databases can quickly aggregate the value of a given column
(adding up the total sales for the year, for example). Use cases include analytics.
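
The contrast between row and column layouts can be sketched with the "total sales" aggregation mentioned above. The sample data and variable names are hypothetical; real column stores add compression and on-disk block layouts that this example leaves out.

```javascript
// Row-oriented layout: each record keeps all of its fields together.
const rows = [
  { region: "north", product: "pen", sales: 120 },
  { region: "south", product: "pen", sales: 80 },
  { region: "north", product: "card", sales: 50 },
];

// Column-oriented layout: each column is stored as its own array.
const columns = {
  region: ["north", "south", "north"],
  product: ["pen", "pen", "card"],
  sales: [120, 80, 50],
};

// Row store: every record is visited, although only "sales" is needed.
const totalFromRows = rows.reduce((sum, row) => sum + row.sales, 0);

// Column store: only the "sales" column is read.
const totalFromColumns = columns.sales.reduce((sum, value) => sum + value, 0);
```

Both layouts give the same total, but the columnar version touches one array of numbers instead of every field of every record.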

Unfortunately, there is no free lunch: while columnar databases are great for analytics, the way in which they write data makes it very difficult for them to be strongly consistent, as writing all the columns of a record requires multiple write events on disk. Relational databases don't suffer from this problem as row data is written contiguously to disk.

History of NoSQL and Different NoSQL products

Graph Databases

A graph database focuses on the relationship between data elements. Each element is
stored as a node (such as a person in a social media graph). The connections between
elements are called links or relationships. In a graph database, connections are first-class
elements of the database, stored directly. In relational databases, links are implied, using
data to express the relationships.

A graph database is optimized to capture and search the connections between data
elements, overcoming the overhead associated with JOINing multiple tables in SQL.
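
Storing connections directly makes relationship queries a matter of following edges. The sketch below keeps a small hypothetical social graph as an adjacency list and finds everyone within a given number of hops; `follows` and `reachable` are illustrative names, not a graph database API.

```javascript
// Hypothetical social graph: each node maps to the nodes it is connected to.
const follows = new Map([
  ["ann", ["bob", "cal"]],
  ["bob", ["dee"]],
  ["cal", ["dee", "eve"]],
  ["dee", []],
  ["eve", []],
]);

// Breadth-first traversal: collect every node reachable within maxHops edges.
function reachable(graph, start, maxHops) {
  const seen = new Set([start]);
  let frontier = [start];
  for (let hop = 0; hop < maxHops; hop++) {
    const next = [];
    for (const node of frontier) {
      for (const neighbour of graph.get(node) ?? []) {
        if (!seen.has(neighbour)) {
          seen.add(neighbour);
          next.push(neighbour);
        }
      }
    }
    frontier = next;
  }
  seen.delete(start);                    // the start node is not part of the result
  return [...seen];
}

const withinTwoHops = reachable(follows, "ann", 2);
```

In a relational schema the same question would need one self-join per hop, which is the overhead the paragraph above refers to.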

Very few real-world business systems can survive solely on graph queries. As a result graph
databases are usually run alongside other more traditional databases.

Use cases include fraud detection, social networks, and knowledge graphs.

As you can see, despite a common umbrella, NoSQL databases are diverse in their data
structures and their applications.

Short Quiz

1. Which of the following is a NoSQL Database Type?


a) SQL
b) Document databases
c) JSON
d) All of the mentioned

2. Point out the correct statement.


a) Documents can contain many different key-value
pairs, or key-array pairs, or even nested documents
b) MongoDB has official drivers for a variety of popular
programming languages and development
environments
c) When compared to relational databases, NoSQL
databases are more scalable and provide superior
performance
d) All of the mentioned
• Lecture 2

• MongoDB

• Interfacing and Interacting with NoSQL

• Data types

• NoSQL Storage Architecture

Exploring MongoDB

Objective:
• This topic focuses on MongoDB, a source-available, cross-platform, document-oriented database program. Classified as a NoSQL database program, MongoDB uses JSON-like documents with optional schemas. MongoDB is developed by MongoDB Inc. and licensed under the Server Side Public License.

Recap:
• Revision of Database Management Systems.

NoSQL Databases: Introduction to NoSQL & MongoDB

About MongoDB

• MongoDB is a source-available (originally open-source) document database and a leading NoSQL
database. MongoDB is written in C++.
• This study will give you a good understanding of the MongoDB concepts
needed to create and deploy a highly scalable and performance-oriented database.


MongoDB
MongoDB is a cross-platform, document-oriented database that provides high
performance, high availability, and easy scalability. MongoDB works on the concepts of
collection and document.
Database
Database is a physical container for collections. Each database gets its own set of files on
the file system. A single MongoDB server typically has multiple databases.
Collection
Collection is a group of MongoDB documents. It is the equivalent of an RDBMS table. A
collection exists within a single database. Collections do not enforce a schema.
Documents within a collection can have different fields. Typically, all documents in a
collection are of similar or related purpose.
Document
A document is a set of key-value pairs. Documents have dynamic schema. Dynamic
schema means that documents in the same collection do not need to have the same set
of fields or structure, and common fields in a collection's documents may hold different
types of data.
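The idea of a dynamic schema can be sketched with plain JavaScript objects standing in for documents of one collection. This is only an illustration under stated assumptions: the `users` array and all field names and values are hypothetical, and no MongoDB server is involved.

```javascript
// Plain JavaScript objects standing in for documents in one collection.
// The field names and values are hypothetical.
const users = [
  { _id: 1, name: "Asha", age: 29 },
  { _id: 2, name: "Ravi", email: "ravi@example.com", tags: ["admin", "dev"] },
  { _id: 3, name: "Meera", age: "unknown" } // same field, different type
];

// Collect the (sorted) field names used by each document.
const fieldSets = users.map((doc) => Object.keys(doc).sort().join(","));

// Collect the JavaScript types seen for the shared "age" field.
const ageTypes = users
  .filter((doc) => "age" in doc)
  .map((doc) => typeof doc.age);

console.log(fieldSets); // the three documents do not share one field set
console.log(ageTypes);  // the "age" field holds two different types
```

A relational table would reject the second and third rows unless the table definition were changed first; a collection accepts all three as they are.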
The following table shows the relationship of RDBMS terminology with MongoDB.
RDBMS            MongoDB
Database         Database
Table            Collection
Tuple/Row        Document
Column           Field
Table Join       Embedded Documents
Primary Key      Primary Key (default key _id provided by MongoDB itself)

Database Server and Client
mysqld/Oracle    mongod
mysql/sqlplus    mongo
Introduction to MongoDB

Name comes from “humongous”, reflecting huge data

Written in C++, first released in 2009

Creator: 10gen (now MongoDB Inc.), founded by former DoubleClick engineers

MongoDB: Goal

• Goal: bridge the gap between key-value stores (which are fast and
scalable) and relational databases (which have rich functionality).


What is MongoDB?

• Definition: MongoDB is an open source, document-oriented
database designed with both scalability and developer agility in
mind.

• Instead of storing your data in tables and rows as you would with a
relational database, in MongoDB you store JSON-like documents
with dynamic schemas (schema-free, schemaless).

What is MongoDB? (Cont’d)

• Document-Oriented DB
– Unit object is a document instead of a row (tuple) in relational
DBs

Is It Fast?

• For semi-structured & complex relationships: Yes

It is Growing Fast

Integration with Others

NoSQL: Categories

Interfacing and Interacting with NoSQL

Objective:
• In this topic we focus on introducing the essential ways of
interacting with NoSQL data stores. The types of NoSQL stores vary
and so do the ways of accessing and interacting with them. This
topic attempts to summarize a few of the most prominent of these
disparate ways of accessing and querying data in NoSQL databases.
Recap:

• Revision of Database Management Systems.

Data types

Objective:
• In this topic we focus on MongoDB datatypes, for example:

1. String − This is the most commonly used datatype to store the data.

2. Integer − This type is used to store a numerical value. ...

3. Boolean − This type is used to store a boolean (true/false) value.

4. Double − This type is used to store floating point values.

Recap:
• Revision of NoSQL databases.

Data types
• MongoDB supports many datatypes. Some of them are −
• String − This is the most commonly used datatype to store the data. A string in MongoDB must be valid UTF-8.
• Integer − This type is used to store a numerical value. BSON provides both 32-bit and 64-bit integer types.
• Boolean − This type is used to store a boolean (true/false) value.
• Double − This type is used to store floating point values.
• Min/Max keys − This type is used to compare a value against the lowest and highest BSON elements.
• Arrays − This type is used to store arrays, lists or multiple values in one key.
• Timestamp − This can be handy for recording when a document has been modified or added.
• Object − This datatype is used for embedded documents.
• Null − This type is used to store a Null value.
• Symbol − This datatype is used identically to a string; however, it's generally reserved for languages
that use a specific symbol type.
• Date − This datatype is used to store a date and time in UNIX time format (milliseconds since the epoch). You can specify your
own date time by creating a Date object and passing the year, month and day into it.
• Object ID − This datatype is used to store the document’s ID.
• Binary data − This datatype is used to store binary data.
• Code − This datatype is used to store JavaScript code in the document.
• Regular expression − This datatype is used to store a regular expression.
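As a rough illustration of the list above, the sketch below classifies plain JavaScript values by the BSON type they most resemble. The helper `roughBsonType` and the `sample` document are hypothetical and written only for this example; how a real shell or driver stores a given number (int, long or double) depends on that tool, so the numeric rule here is an assumption.

```javascript
// Hypothetical helper: rough BSON type name for a JavaScript value.
function roughBsonType(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  if (value instanceof Date) return "date";
  if (value instanceof RegExp) return "regex";
  switch (typeof value) {
    case "string":
      return "string";
    case "boolean":
      return "bool";
    case "number":
      if (!Number.isInteger(value)) return "double";
      // Assumption: integers that fit in 32 bits are treated as "int".
      return value >= -(2 ** 31) && value < 2 ** 31 ? "int" : "long";
    case "object":
      return "object"; // embedded document
    default:
      return "unsupported in this sketch";
  }
}

// Hypothetical document mixing several of the listed datatypes.
const sample = {
  title: "DBMS",
  pages: 204,
  price: 19.5,
  published: true,
  tags: ["nosql", "cloud"],
  author: { name: "A. Writer" },
  reviewedOn: new Date(0),
  pattern: /^db/i,
  subtitle: null
};

const types = Object.fromEntries(
  Object.entries(sample).map(([field, value]) => [field, roughBsonType(value)])
);
console.log(types);
```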
Data Model

• BSON format (binary JSON)

• Developers can easily map to modern object-oriented languages
without a complicated ORM layer.

• Lightweight, traversable, efficient

Terms Mapping (DB vs. MongoDB)

JSON

One document: a set of Field Name / Field Value pairs

• Field Value
– Scalar (Int, Boolean, String, Date, …)

– Document (Embedding or Nesting)

– Array of JSON objects

Another Example

Remember it is stored in
binary formats (BSON)

MongoDB Model

One document (e.g., one tuple in RDBMS)
One Collection (e.g., one Table in RDBMS)

• Collection is a group of similar documents

• Within a collection, each document must have a unique Id

Unlike RDBMS:
No integrity constraints (such as foreign keys) are enforced in MongoDB

MongoDB Model (Cont’d)

One document (e.g., one tuple in RDBMS)
One Collection (e.g., one Table in RDBMS)

• The field names cannot start with the $ character

• The field names cannot contain the . character

• Max size of a single document: 16 MB


Example Document in MongoDB


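• _id is a special field in each
document

• Unique within each collection

• _id → Primary Key in RDBMS

• The default _id (an ObjectId) is 12 bytes; you can also set
_id yourself

• Otherwise it is generated (legacy ObjectId layout):
• 1st 4 bytes → timestamp
• Next 3 bytes → machine id
• Next 2 bytes → process id
• Last 3 bytes → incremental values
• Newer MongoDB versions keep the 4-byte timestamp and the 3-byte counter
but replace the machine and process ids with a 5-byte random value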
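Because the first 4 bytes of a generated ObjectId hold a timestamp in seconds, the creation time can be recovered from the hex form of the id. The sketch below is a minimal, hedged illustration: `parseObjectIdHex` is a hypothetical helper written for this example (it is not a MongoDB API), and the sample id is just an example value.

```javascript
// Split a 24-character hex ObjectId string into its leading 4-byte
// timestamp and the remaining 8 bytes.
function parseObjectIdHex(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hex characters (12 bytes)");
  }
  const seconds = parseInt(hex.slice(0, 8), 16); // first 4 bytes
  return {
    seconds,
    createdAt: new Date(seconds * 1000),
    rest: hex.slice(8) // remaining 8 bytes; layout depends on the version
  };
}

const parsed = parseObjectIdHex("507f1f77bcf86cd799439011");
console.log(parsed.seconds, parsed.createdAt.toISOString(), parsed.rest);
```

One consequence of this layout is that sorting documents by a generated _id roughly sorts them by creation time.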
NoSQL Storage Architecture

Objective:
• In this topic we focus on the NoSQL database approach, which is
characterized by a move away from the complexity of SQL-based
servers. The logic of validation, access control, mapping queryable
indexed data, correlating related data, conflict resolution,
maintaining integrity constraints, and triggered procedures is moved
out of the database layer.
Recap:

• Revision of RDBMS architecture.


• An architecture pattern is a logical way of categorizing data that will be
stored in the database. NoSQL is a type of database which helps to
perform operations on big data and store it in a valid format. It is widely
used because of its flexibility and wide variety of services.

• Architecture Patterns of NoSQL:

• The data is stored in NoSQL in any of the following four data architecture
patterns.

• 1. Key-Value Store Database
• 2. Column Store Database
• 3. Document Database
• 4. Graph Database


• These patterns are explained below.

• 1. Key-Value Store Database:

• This model is one of the most basic models of NoSQL databases. As the
name suggests, the data is stored in the form of key-value pairs. The key is
usually a string, an integer or a sequence of characters, but can also be a more
advanced data type. The value is typically linked or co-related to the key.
Key-value databases generally store data as a hash table
where each key is unique. The value can be of any type (JSON, BLOB (Binary
Large Object), strings, etc.). This type of pattern is usually used in shopping
websites or e-commerce applications.

• Advantages:
• Can handle large amounts of data and heavy load.
• Easy retrieval of data by keys.
• Limitations:
• Complex queries may involve multiple key-value pairs, which may
delay performance.
• Data involving many-to-many relationships may collide.
• Examples:

• DynamoDB
• Berkeley DB
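The key-value pattern described above can be sketched with a JavaScript `Map` standing in for the store. This is an in-memory illustration only, not how DynamoDB or Berkeley DB are accessed; the keys and values are hypothetical.

```javascript
// A Map standing in for a key-value store: unique keys, opaque values.
const store = new Map();

// Values can be of any type; here a JSON string and a plain string.
store.set("cart:42", JSON.stringify({ items: ["pen", "book"], total: 310 }));
store.set("session:9f", "user-42");

// Retrieval by key is a single lookup.
const cart = JSON.parse(store.get("cart:42"));

// Writing to an existing key overwrites the old value (keys are unique).
store.set("session:9f", "user-77");

// Anything other than a key lookup means scanning every pair, which is
// why complex queries are a limitation of this pattern.
const keysForUser77 = [...store]
  .filter(([, value]) => value === "user-77")
  .map(([key]) => key);

console.log(cart.total, store.size, keysForUser77);
```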


• 2. Column Store Database:

• Rather than storing data in relational tuples, the data is stored in individual cells
which are further grouped into columns. Column-oriented databases work only on
columns. They store large amounts of data in columns together. The format and
titles of the columns can diverge from one row to another. Every column is treated
separately. Still, each individual column may contain multiple other columns,
as in traditional databases.
• Basically, columns are the mode of storage in this type.

• Advantages:

• Data is readily available.
• Queries like SUM, AVERAGE, COUNT can be easily performed on columns.
• Examples:

• HBase
• Bigtable by Google
• Cassandra
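To see why aggregates such as SUM, AVERAGE and COUNT suit this pattern, the sketch below turns a few row-style records into one array per column and aggregates a single column. It is a simplified in-memory illustration with hypothetical data, not the storage format of HBase, Bigtable or Cassandra.

```javascript
// Row-oriented view: one object per record (hypothetical sample data).
const rows = [
  { id: 1, city: "Noida", amount: 120 },
  { id: 2, city: "Delhi", amount: 80 },
  { id: 3, city: "Noida", amount: 200 }
];

// Column-oriented view: one array per column.
const columns = {
  id: rows.map((r) => r.id),
  city: rows.map((r) => r.city),
  amount: rows.map((r) => r.amount)
};

// Aggregates touch only the "amount" column; the other columns are never read.
const sum = columns.amount.reduce((total, value) => total + value, 0);
const count = columns.amount.length;
const average = sum / count;

console.log(sum, count, average);
```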

3. Document Database:
The document database fetches and accumulates data in the form of key-value pairs,
but here the values are called documents. A document can be regarded as a
complex data structure. It can take the form of text, arrays, strings,
JSON, XML or any such format. The use of nested documents is also very
common. It is very effective, as most of the data created is usually in the form of
JSON and is unstructured.
Advantages:
This type of format is very useful and apt for semi-structured data.
Storage, retrieval and management of documents is easy.
Limitations:
Handling multiple documents is challenging.
Aggregation operations may not work accurately.
Examples:
MongoDB
CouchDB

Figure – Document Store Model in form of JSON documents

• 4. Graph Databases:
• This architecture pattern deals with the storage and management of
data in graphs. Graphs are structures that depict connections
between two or more objects in some data. The objects or entities are called
nodes and are joined together by relationships called edges. Each edge has
a unique identifier. Each node serves as a point of contact for the graph. This
pattern is very commonly used in social networks, where there are a large
number of entities and each entity has one or many characteristics which are
connected by edges. The relational database pattern has tables that are
loosely connected, whereas graphs are often very strong and rigid in nature.

• Advantages:
• Fastest traversal because of connections.
• Spatial data can be easily handled.
• Limitations:
• Wrong connections may lead to infinite loops.

• Examples:

Figure – Graph model format of NoSQL Databases
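The graph pattern, and the infinite-loop limitation noted for it, can be sketched with an adjacency list and a breadth-first traversal. This is a minimal in-memory illustration with hypothetical node names, not the API of any graph database; the `visited` set is what keeps a cycle from being followed forever.

```javascript
// Adjacency list: node -> list of nodes it is connected to (edges).
const follows = {
  asha: ["ravi", "meera"],
  ravi: ["meera"],
  meera: ["asha"], // edge back to asha creates a cycle
  kiran: []
};

// Breadth-first traversal that records visited nodes so that a cycle
// cannot cause an infinite loop.
function reachableFrom(graph, start) {
  const visited = new Set([start]);
  const queue = [start];
  while (queue.length > 0) {
    const node = queue.shift();
    for (const next of graph[node] || []) {
      if (!visited.has(next)) {
        visited.add(next);
        queue.push(next);
      }
    }
  }
  return [...visited];
}

const fromAsha = reachableFrom(follows, "asha");
const fromKiran = reachableFrom(follows, "kiran");
console.log(fromAsha, fromKiran);
```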

Short Quiz

1. Which of the following is a wide-column store?


a) Cassandra
b) Riak
c) MongoDB
d) Redis

2. Point out the correct statement


a) MongoDB is classified as a NoSQL database
b) MongoDB favours XML format more than JSON
c) MongoDB is column oriented database store
d) None of the mentioned

• Lecture 3

• CRUD operations with MongoDB

• Querying, Modifying and Managing NoSQL Data stores

• Creating, Updating and Deleting documents & Querying

• Indexing and ordering datasets (MongoDB)

• Capped Collections

CRUD operations with MongoDB

Objective:
• In this topic we focus on CRUD, an acronym that
comes from the world of computer programming and refers to the
four functions that are considered necessary to implement a
persistent storage application: create, read, update and delete.
Recap:

• Revision of Database Management Systems.

Must Practice It

Install it → Practice simple stuff → Move to complex stuff

Install it from here: http://www.mongodb.org

Manual: http://docs.mongodb.org/master/MongoDB-manual.pdf
(Focus on Ch. 3, 4 for now)

Dataset: http://docs.mongodb.org/manual/reference/bios-example-collection/

CRUD

• Create
– db.collection.insert( <document> )
– db.collection.save( <document> )
– db.collection.update( <query>, <update>, { upsert: true } )
• Read
– db.collection.find( <query>, <projection> )
– db.collection.findOne( <query>, <projection> )
• Update
– db.collection.update( <query>, <update>, <options> )
• Delete
– db.collection.remove( <query>, <justOne> )
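To make the `<query>` and `<projection>` arguments of find() more concrete, here is a toy, in-memory imitation of how a query document selects documents and a projection trims fields. It is a sketch under stated assumptions, not MongoDB's implementation: `matches`, `project` and `find` are hypothetical helpers, only equality and the `$gt` / `$lt` operators are handled, and the `users` data is made up.

```javascript
// Toy matcher: supports equality and the $gt / $lt operators only.
function matches(doc, query) {
  return Object.entries(query).every(([field, cond]) => {
    const value = doc[field];
    if (cond !== null && typeof cond === "object") {
      return Object.entries(cond).every(([op, operand]) => {
        if (op === "$gt") return value > operand;
        if (op === "$lt") return value < operand;
        throw new Error("operator not supported in this sketch: " + op);
      });
    }
    return value === cond;
  });
}

// Toy projection: keep _id plus every field marked with 1.
function project(doc, projection) {
  const out = { _id: doc._id };
  for (const field of Object.keys(projection)) {
    if (projection[field] === 1 && field in doc) out[field] = doc[field];
  }
  return out;
}

// Toy find: filter by query, then apply the optional projection.
function find(collection, query, projection) {
  const hits = collection.filter((doc) => matches(doc, query));
  return projection ? hits.map((doc) => project(doc, projection)) : hits;
}

const users = [
  { _id: 1, name: "Asha", age: 29, city: "Noida" },
  { _id: 2, name: "Ravi", age: 17, city: "Delhi" },
  { _id: 3, name: "Meera", age: 41, city: "Noida" }
];

const adultsInNoida = find(users, { city: "Noida", age: { $gt: 18 } }, { name: 1 });
console.log(adultsInNoida);
```

The same query and projection documents would be passed to db.collection.find() in the shell; only the evaluation happens on the server instead of in this toy loop.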

CRUD Examples

Querying, Modifying and Managing NoSQL Data stores

Objective:
• In this topic we focus on how most NoSQL and NewSQL data stores
implement some sort of horizontal partitioning or sharding,
which involves storing sets of rows/records in different segments
(or shards) which may be located on different servers.
Recap:

• Revision of Database Management Systems.

Examples

In RDBMS In MongoDB
Either insert the 1st document

Or create “Users” collection explicitly

Creating, Updating and Deleting documents
& Querying
Insertion

• The collection “users” is created automatically if it does not exist

Multi-Document Insertion
(Use of Arrays)

All the documents are inserted at once

Multi-Document Insertion
(Bulk Operation)
• A temporary object in memory
• Holds your insertions and uploads them at once
• There is also Bulk Ordered object

_id field is added automatically

Deletion
(Remove Operation)

• You can put condition on any field in the document (even _id)

db.users.remove( { } ) Removes all documents from users collection

Update

Otherwise, it will update only the 1st matching document

Equivalent to in SQL:

Update (Continued)

Two
operators

Replace a document

Query Condition

New
doc

For the document having item = “BE10”, replace it with the given document

Insert or Replace

The upsert option

If the document having item = “TBD1” is in the DB, it will be replaced


Otherwise, it will be inserted.
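The replace-or-insert behaviour of the upsert option can be sketched in memory as follows. This is only an illustration of the semantics described above: `replaceOrInsert` is a hypothetical helper (not a MongoDB method), the query supports equality only, and the way a new _id is chosen here is an assumption made for the example.

```javascript
// Hypothetical helper imitating "replace, or insert when upsert is true".
function replaceOrInsert(collection, query, replacement, options = {}) {
  const index = collection.findIndex((doc) =>
    Object.entries(query).every(([field, value]) => doc[field] === value)
  );
  if (index >= 0) {
    // A match exists: replace the whole document but keep its _id.
    collection[index] = { _id: collection[index]._id, ...replacement };
    return "replaced";
  }
  if (options.upsert) {
    // No match and upsert requested: insert a new document.
    // Assumption: a simple counter stands in for a generated _id.
    collection.push({ _id: collection.length + 1, ...replacement });
    return "inserted";
  }
  return "no match";
}

const inventory = [{ _id: 1, item: "BE10", qty: 5 }];

const first = replaceOrInsert(inventory, { item: "TBD1" }, { item: "TBD1", qty: 10 }, { upsert: true });
const second = replaceOrInsert(inventory, { item: "TBD1" }, { item: "TBD1", qty: 20 }, { upsert: true });
const third = replaceOrInsert(inventory, { item: "ZZ99" }, { item: "ZZ99", qty: 1 });

console.log(first, second, third, inventory);
```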


Any relational database has a typical schema design that
shows the number of tables and the relationships between these
tables. In MongoDB, by contrast, there is no built-in concept of relationships between collections.

MongoDB Create Database

MongoDB Drop Database

MongoDB Create Collection

Indexing and ordering datasets (MongoDB)

Objective:
• In this topic we focus on how MongoDB uses multikey indexes to index
the content stored in arrays. When you index a field that holds
an array value, MongoDB creates separate index entries for every
element of the array. These multikey indexes allow queries to select
documents that contain arrays by matching on an element or elements
of the arrays.

Recap:
• Revision of DBMS architecture.


• Indexes support the efficient resolution of queries. Without indexes, MongoDB must
scan every document of a collection to select those documents that match the query
statement. This scan is highly inefficient and requires MongoDB to process a large
volume of data.

• Indexes are special data structures that store a small portion of the data set in an
easy-to-traverse form. The index stores the value of a specific field or set of fields,
ordered by the value of the field as specified in the index.

• The createIndex() Method

• To create an index, use the createIndex() method of MongoDB.

• Syntax
• The basic syntax of the createIndex() method is as follows.

• >db.COLLECTION_NAME.createIndex({KEY:1})
• Here KEY is the name of the field on which you want to create the index and 1 is for
ascending order. To create an index in descending order, use -1.
Example
>db.mycol.createIndex({"title":1})
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
>
In createIndex() method you can pass multiple fields, to create index on multiple
fields.

>db.mycol.createIndex({"title":1,"description":-1})
>
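Why an ordered index avoids a full collection scan can be sketched in memory: the index below is a sorted list of (key, position) pairs that is searched with binary search, while the unindexed lookup examines documents one by one. This is a simplified illustration with hypothetical data and helper names; MongoDB's real indexes are B-tree structures managed by the server.

```javascript
// Hypothetical collection of documents.
const books = [
  { _id: 1, title: "NoSQL Basics" },
  { _id: 2, title: "Cloud Databases" },
  { _id: 3, title: "MongoDB in Practice" },
  { _id: 4, title: "Graph Stores" }
];

// Without an index: examine documents one by one.
function scanByTitle(docs, title) {
  let examined = 0;
  for (const doc of docs) {
    examined++;
    if (doc.title === title) return { doc, examined };
  }
  return { doc: null, examined };
}

// Ascending index on one field: [key, position] pairs sorted by key.
function buildIndex(docs, field) {
  return docs
    .map((doc, position) => [doc[field], position])
    .sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0));
}

// Binary search over the sorted index entries.
function lookup(docs, index, key) {
  let lo = 0;
  let hi = index.length - 1;
  let examined = 0;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    examined++;
    const current = index[mid][0];
    if (current === key) return { doc: docs[index[mid][1]], examined };
    if (current < key) lo = mid + 1;
    else hi = mid - 1;
  }
  return { doc: null, examined };
}

const titleIndex = buildIndex(books, "title");
const scanned = scanByTitle(books, "Graph Stores");
const indexed = lookup(books, titleIndex, "Graph Stores");
console.log(scanned.examined, indexed.examined);
```

With only four documents the difference is small, but the scan grows linearly with the collection while the indexed lookup grows logarithmically.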


The dropIndex() method

You can drop a particular index using the dropIndex() method of MongoDB.
Syntax
The basic syntax of the dropIndex() method is as follows.
>db.COLLECTION_NAME.dropIndex({KEY:1})
Here KEY is the name of the indexed field, and 1 or -1 is the order (ascending or descending)
that was given when the index was created.
Example
If no index with the given key exists, the call reports an error:
> db.mycol.dropIndex({"title":1})
{
"ok" : 0,
"errmsg" : "can't find index with key: { title: 1.0 }",
"code" : 27,
"codeName" : "IndexNotFound"
}
The dropIndexes() method
This method deletes multiple (specified) indexes on a collection.
Syntax
The basic syntax of the dropIndexes() method is as follows −


>db.COLLECTION_NAME.dropIndexes()
Example
Assume we have created a compound index on the collection named mycol as shown below (together
with the default _id index, the collection then has 2 indexes) −
> db.mycol.createIndex({"title":1,"description":-1})
The following example removes the index created above −
>db.mycol.dropIndexes({"title":1,"description":-1})
{ "nIndexesWas" : 2, "ok" : 1 }
The getIndexes() method
This method returns the description of all the indexes in the collection.
Syntax
Following is the basic syntax of the getIndexes() method −
db.COLLECTION_NAME.getIndexes()
Example
Assume we have created a compound index on the collection named mycol as shown below −

> db.mycol.createIndex({"title":1,"description":-1})

Capped Collections

Objective:
• In this topic we focus on capped collections: fixed-size collections
that support high-throughput operations that insert and retrieve
documents based on insertion order.

Recap:
• Revision of NoSQL architecture.

Capped Collections
Overview
Capped collections are fixed-size collections that support high-throughput operations that insert and
retrieve documents based on insertion order. Capped collections work in a way similar to circular
buffers: once a collection fills its allocated space, it makes room for new documents by overwriting
the oldest documents in the collection.
Behavior
Insertion Order
Capped collections guarantee preservation of the insertion order. As a result, queries do not need an
index to return documents in insertion order. Without this indexing overhead, capped collections can
support higher insertion throughput.

Automatic Removal of Oldest Documents


To make room for new documents, capped collections automatically remove the oldest documents in
the collection without requiring scripts or explicit remove operations.
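The circular-buffer behaviour described above can be sketched with a small in-memory class. This is an illustration only: `CappedBuffer` is a hypothetical name, and it caps by number of documents for simplicity, whereas a real capped collection is primarily limited by size in bytes.

```javascript
// Minimal circular-buffer sketch of a capped collection.
class CappedBuffer {
  constructor(maxDocs) {
    this.maxDocs = maxDocs;
    this.docs = [];
  }

  // Insert keeps insertion order and evicts the oldest document when full.
  insert(doc) {
    this.docs.push(doc);
    if (this.docs.length > this.maxDocs) {
      this.docs.shift(); // automatic removal of the oldest document
    }
  }

  // Documents come back in insertion order without any index.
  find() {
    return [...this.docs];
  }

  // Reverse insertion order, similar in spirit to sorting by $natural: -1.
  findNewestFirst() {
    return [...this.docs].reverse();
  }
}

const log = new CappedBuffer(3);
for (let n = 1; n <= 5; n++) {
  log.insert({ seq: n, msg: "event " + n });
}

const oldestFirst = log.find().map((d) => d.seq);
const newestFirst = log.findNewestFirst().map((d) => d.seq);
console.log(oldestFirst, newestFirst);
```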

Consider the following potential use cases for capped collections:

Store log information generated by high-volume systems. Inserting documents in a capped collection
without an index is close to the speed of writing log information directly to a file system.
Furthermore, the built-in first-in-first-out property maintains the order of events, while managing
storage use.
Cache small amounts of data in a capped collection. Since caches are read rather than write heavy,
you would either need to ensure that this collection always remains in the working set (i.e. in RAM) or ...

_id Index
Capped collections have an _id field and an index on the _id field by default.
Restrictions and Recommendations
Updates
If you plan to update documents in a capped collection, create an index so that these update
operations do not require a collection scan.
Document Size
Changed in version 3.2.
If an update or a replacement operation changes the document size, the operation will fail.
Document Deletion
You cannot delete documents from a capped collection. To remove all documents from a collection,
use the drop() method to drop the collection and recreate the capped collection.
Sharding
You cannot shard a capped collection.
Query Efficiency
Use natural ordering to retrieve the most recently inserted elements from the collection efficiently.
This is similar to using the tail command on a log file.
Aggregation $out
The aggregation pipeline stage $out cannot write results to a capped collection.
Transactions
Starting in MongoDB 4.2, you cannot write to capped collections in transactions. Reads from capped
collections are still supported in transactions.
• Procedures
• Create a Capped Collection
• You must create capped collections explicitly using the db.createCollection()
method, which is a helper in the mongo shell for the create command. When
creating a capped collection you must specify the maximum size of the
collection in bytes, which MongoDB will pre-allocate for the collection. The size
of the capped collection includes a small amount of space for internal overhead.
• db.createCollection( "log", { capped: true, size: 100000 } )
• If the size field is less than or equal to 4096, then the collection will have a cap
of 4096 bytes. Otherwise, MongoDB will raise the provided size to make it an
integer multiple of 256.
• Additionally, you may also specify a maximum number of documents for the
collection using the max field as in the following document:

• db.createCollection("log", { capped : true, size : 5242880, max : 5000 } )
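The size rule stated above (a minimum of 4096 bytes, otherwise rounded up to a multiple of 256) can be worked through with a small helper. `effectiveCappedSize` is a hypothetical function written only to illustrate the arithmetic as this slide describes it; it is not part of MongoDB, and the actual behaviour may differ between server versions.

```javascript
// Hypothetical helper applying the size rule described in the text.
function effectiveCappedSize(requestedBytes) {
  if (requestedBytes <= 4096) return 4096;      // minimum cap
  return Math.ceil(requestedBytes / 256) * 256; // round up to a multiple of 256
}

const sizes = [100, 4096, 100000, 5242880].map((requested) => ({
  requested,
  effective: effectiveCappedSize(requested)
}));
console.log(sizes);
```

For the two examples above, a requested size of 100000 bytes is raised to 100096, while 5242880 is already a multiple of 256 and stays unchanged.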


• Query a Capped Collection


• If you perform a find() on a capped collection with no ordering specified,
MongoDB guarantees that the ordering of results is the same as the insertion
order.
• To retrieve documents in reverse insertion order, issue find() along with the sort()
method with the $natural parameter set to -1, as shown in the following example:
• db.cappedCollection.find().sort( { $natural: -1 } )
• Check if a Collection is Capped
• Use the isCapped() method to determine if a collection is capped, as follows:
• db.collection.isCapped()
• Convert a Collection to Capped
• You can convert a non-capped collection to a capped collection with the
convertToCapped command:

• db.runCommand({"convertToCapped": "mycoll", size: 100000});


• The size parameter specifies the size of the capped collection in bytes.

• This holds a database exclusive lock for the duration of the operation. Other
operations which lock the same database will be blocked until the operation
completes. See What locks are taken by some common client operations? for
operations that lock the database.

• Tailable Cursor
• You can use a tailable cursor with capped collections. Similar to the Unix tail -f
command, the tailable cursor "tails" the end of a capped collection. As new
documents are inserted into the capped collection, you can use the tailable
cursor to continue retrieving documents.

Short Quiz

1. In CRUD Operator, U is an acronym of –

a) Upper
b) Unique
c) Update
d) Uppercase

2. Read in CRUD Operator means –

a) To retrieve data
b) To fetch data
c) Both A. and B.
d) None of the above

• Lecture 4

• Introduction of Cloud database

• NoSQL with Cloud Database

Cloud database: - Introduction of Cloud database

Objective:
• In this topic we focus on the cloud database, which is a database service
built and accessed through a cloud platform. It serves many of the
same functions as a traditional database with the added flexibility of
cloud computing. Users install software on a cloud infrastructure to
implement the database.
Recap:

• Revision of Cloud Architecture.


A cloud database is a database service built and accessed through a cloud
platform. It serves many of the same functions as a traditional database with
the added flexibility of cloud computing. Users install software on a cloud
infrastructure to implement the database.

Key features:

• A database service built and accessed through a cloud platform
• Enables enterprise users to host databases without buying dedicated
hardware
• Can be managed by the user or offered as a service and managed by a
provider
• Can support relational databases (including MySQL and PostgreSQL) and
NoSQL databases (including MongoDB and Apache CouchDB)
• Accessed through a web interface or vendor-provided API


• Why cloud databases


• Ease of access
• Users can access cloud databases from virtually anywhere, using a
vendor’s API or web interface.

• Scalability
• Cloud databases can expand their storage capacities at run time to
accommodate changing needs. Organizations only pay for what they use.

• Disaster recovery
• In the event of a natural disaster, equipment failure or power outage,
data is kept secure through backups on remote servers.


• Considerations for cloud databases


• Control options

• Users can opt for a virtual machine image managed like a traditional database or a provider’s
database as a service (DBaaS).

• Database technology

• SQL databases are difficult to scale but very common. NoSQL databases scale more easily but
do not work with some applications.

• Security

• Most cloud database providers encrypt data and provide other security measures;
organizations should research their options.

• Maintenance

• When using a virtual machine image, one should ensure that IT staffers can maintain the
underlying infrastructure.

• WHAT IS A CLOUD DATABASE?


• Let’s dig deeper into the cloud-based world that we are living in. Cloud database services include
everything from storing all kinds of required data to providing access and delivering the data to the
parties involved. In other words, the data is stored on the internet, and cloud services are
normally of three kinds.

• Platform as a service(PaaS)

• Software as a service(SaaS)

• Infrastructure as a service(IaaS)

• Platform as a service or PaaS is the most common type here, providing the provision of servers,
data storage, and operating systems. It helps in the storage and acts as a platform for the virtual
database, saving the hardware cost and helping to access the data from all around the world.

• SaaS, on the other hand, provides the entire software as a service to the organization in exchange
for an amount and is an excellent business option for all those organizations involving a lot of web
users.

• IaaS helps to provide a complete infrastructure where the business can run their applications.

CLOUD DATABASE TECHNOLOGIES LIST

• Cloud computing is on the rise because of the flexibility and the ease of services
that it provides. Several well-known IT giants are planning to capture the market.
Most of the cloud databases run on well-known cloud computing platforms like
Rackspace, Salesforce, GoGrid, and Amazon EC2.

• Here are some of the most beneficial cloud services for data storage.

• Amazon Web Services or AWS - AWS needs no introduction as it is already counted as
one of the top cloud database technologies.
• Azure by Microsoft - This is Microsoft’s entry into the cloud space which has already
gained a lot of momentum.
• Oracle Database cloud - Everyone has heard about Oracle because of its traditional
database system, and now it is capturing the cloud storage space.
• SAP - SAP is a giant when it comes to offering software for enterprises and is now
ready for cloud storage with its platform called HANA.
NoSQL with Cloud Database

Objective:
• In this topic we focus on NoSQL databases, which are specifically designed
for low-cost commodity hardware. These databases are mostly used
to store and access data across multiple storage clusters. For
example, companies and systems such as Google, Facebook, Google+, Google Bigtable, Amazon
Dynamo, and Twitter collect and store terabytes of data for their
users every day.
• Recap:

• Revision of Cloud Architecture.


• What is a cloud database?


• A “cloud database” can be one of two distinct things: a traditional or NoSQL
database installed and running on a cloud virtual machine (be it public cloud,
private cloud, or hybrid cloud platforms), or a cloud provider’s fully managed
database-as-a-service (DBaaS) offering. The former, running your own self-
managed database in a cloud environment, is really no different from
operating a traditional database. Cloud DBaaS, on the other hand, is the
natural database equivalent of software-as-a-service (SaaS): pay as you go,
and only for what you use, and let the system handle all the details of
provisioning and scaling to meet demand, while maintaining consistently high
performance.
• Cloud database options:
• Traditional database running on cloud virtual machine (VM)
• Fully managed database-as-a-service

Most of the time (and for most of the remainder of this page), the
term “cloud database” refers to a cloud-based database-as-a-service.
• Why use a cloud database/DBaaS?
• The key benefits of cloud databases are that they are accessible from anywhere, scalable from day one,
and designed for reliability and performance.
• Common cloud database use cases
• Cloud databases work in most cases that traditional databases do. They are particularly valuable when
building software products that:

• Are cloud-native

• Require large volumes of data

• Need to handle high-scale traffic

• Are distributed geographically

• Data applications that take advantage of centralization, like legacy modernization
and analytics, are also fantastic candidates for cloud database usage.
• While certain use cases are more obvious candidates for cloud database usage,
more traditional use cases, like real-time online transaction processing, caching, or
data warehousing work just as well in the fully managed paradigm.


• Cloud database considerations


• Whether you’re still thinking about whether a cloud database is
right for you, or in the process of selecting the ideal database-as-a-
service for your needs, there are a few key factors to take into
consideration:

• Cloud Database Providers

• Database Technology

• Management System

• Cost Model

• Security

• Extras

• MongoDB Atlas cloud database


• MongoDB can be installed and run on any cloud provider or on-premise network as
a self-managed database cluster or virtual machine, or on AWS, GCP, or Azure using
MongoDB Atlas, MongoDB's cloud database-as-a-service (DBaaS) offering. There are major
benefits to adopting the DBaaS option, including:

• Simplified management

• Elastic autoscaling

• Redundancy, backup, and restore

• Charts

• Connectors

• Schema navigator

• MongoDB Atlas, part of MongoDB’s broader data-as-a-service (DaaS)
development platform, is a powerful and compelling alternative to managing
your own NoSQL or traditional database, or to using a cloud provider-specific
managed offering.

• The way a cloud database works is that rather than installing, configuring,
and maintaining a database instance or instances, an automated system is
able to provision, manage, and scale the underlying database cluster for you.

• Fully managed database services handle the complexities of maintaining a
consistently available, high performance cluster in a way that allows you, the
developer, to access it as a simple, globally available resource.

• You can treat the cluster as a single database instance, covered by a
transparent usage-based pricing model, so you’re never worrying about over-
or under-provisioning.

• Lecture 5
• Introduction to Real time Database.
• RTDBS Structure
• Services and Examples
• System Models and Timing Deadlines
• Scheduling
• Synchronization
• Serializability

Introduction to Real time Database.

Objective:
• In this topic we focus on cloud database which is a database service
built and accessed through a cloud platform. It serves many of the
same functions as a traditional database with the added flexibility of
cloud computing. Users install software on a cloud infrastructure to
implement the database.
• Recap:

• Revision of Cloud Architecture.


Definition

• A Real-Time Database System (RTDBS) can be defined as a computing system
that is designed to operate in a timely manner.

• It must perform certain actions within specific timing constraints
(producing results while meeting predefined deadlines).

• An RTDBS can also be defined as a traditional database
extended to give it the additional power to yield reliable responses.

RTDBS Structure

• A typical Real-Time Database System consists of:
– Controlled System: the underlying application
– Controlling System:
• A computer monitoring the state of the environment
• Supplying the environment with the appropriate driving
signals.
• The state of the environment as perceived by the controlling system
must be consistent with the actual state of the environment.

Specifications
• An effective RTDBS must consider:
– Temporal consistency: maintaining consistency between the
actual state of the environment and the state as reflected or
perceived by the system.
– Deadlines: timing constraints which must be met in addition to the
desired computations
– Priority Scheduling: a policy for ordering the execution of the
outstanding processes according to some predefined criteria.
• In conclusion, the correctness of a Real-Time Database System depends not only
on its logical correctness but also on the timeliness of its
actions.

Services and Examples

• Telecommunication Systems
– Routers and network management systems
– Telephone switching systems
• Control Systems
– Automatic tracking and object positioning
– Engine control in automobiles
• Multimedia servers for real-time streaming
• E-commerce and e-business
– Stock market: program stock trading
– Financial services: credit card transactions
• Web-based data services


System Models and Timing Deadlines

• Soft-Deadline:
– desirable but not critical
– missing a soft-deadline does not cause a system failure or
compromise the system’s integrity
– Example: operator switchboard for a telephone

[Figure: task value v(t) versus time t for a soft deadline, with the value level v0 and the time points d1 and d2 marked]

Deadlines

• Firm-Deadline:
– Desirable but not critical (like Soft-Deadline case)
– A task is not executed after its deadline, and the system gains no value
from tasks that miss their deadlines
– Example: an autopilot system

[Figure: task value v(t) versus time t for a firm deadline, with the value level v0 and the deadline d marked]

Deadlines
• Hard-Deadline:
– Timely and logically correct execution is considered to be critical
– Missing a hard-deadline can result in catastrophic consequences
– Also known as Safety-Critical
– Example: data gathered by a sensor
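
The three deadline types above can be compared through the value v(t) that a task contributes when it completes at time t. The following is a minimal sketch, not taken from the slides: the function names are hypothetical, the linear decline between d1 and d2 for a soft deadline is an assumed shape, and negative infinity is used only as a stand-in for "catastrophic" in the hard case.

```python
def soft_value(t, v0, d1, d2):
    """Soft deadline: full value up to d1, then an assumed linear decline to 0 at d2."""
    if t <= d1:
        return v0
    if t >= d2:
        return 0.0
    return v0 * (d2 - t) / (d2 - d1)


def firm_value(t, v0, d):
    """Firm deadline: full value up to d, no value afterwards (the task is dropped)."""
    return v0 if t <= d else 0.0


def hard_value(t, v0, d):
    """Hard deadline: missing d is treated as catastrophic (modelled here as -inf)."""
    return v0 if t <= d else float("-inf")


# Completing one time unit late under each model (illustrative numbers).
print(soft_value(5, v0=10, d1=4, d2=6))
print(firm_value(5, v0=10, d=4))
print(hard_value(5, v0=10, d=4))
```

A scheduler that uses such value functions can decide, for example, to drop a firm task once its deadline has passed, while still finishing a late soft task for partial value.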

[Figure: task value v(t) versus time t for a hard deadline, with the value level v0 and the deadline d marked]

Design Paradigms

• Time-Triggered (TT)
– Systems are initiated at predefined instants
– Assessments of resource requirements and resource availability are
required
– TT architecture can provide predictable behavior due to its pre-planned
execution pattern.

Design Paradigms
• Event-Triggered (ET)
– Systems are initiated in response to the occurrence of
particular events that are possibly caused by the environment
– The resource-need assessment in ET architecture is usually
probabilistic
– ET is not as reliable as TT, but it provides more flexibility and is ideal
for more classes of applications
– ET behavior usually is not predictable.

Tasks Periodicity

• Periodic Tasks
– Execute at regular intervals of time
– Correspond to TT architecture
– Have Hard-Deadlines characterized by their periods (requires worst-
case analysis).

• Aperiodic Tasks
– Execution time cannot be anticipated a priori
– Activation of a task is a random event caused by a trigger
– Correspond to ET architecture
– Have Soft-Deadlines (no worst-case analysis)

Tasks Periodicity

• Sporadic Tasks
• Tasks which are aperiodic in nature, but have Hard-Deadlines
• Used to handle emergency conditions or exceptional situations
• Worst-case calculations are done using a Schedulability-Constraint
• The Schedulability-Constraint defines a minimum period between any two
sporadic events from the same source.


Scheduling

• Each task within a real-time system has
– A deadline
– An arrival time
– Possibly an estimated worst-case execution time
• A Scheduler can be defined as an algorithm or policy for ordering the
execution of the outstanding processes
• A scheduler may be:
– Preemptive
• Can arbitrarily suspend and resume the execution of a task
without affecting its behavior


Scheduling (Cont.)

– Non-preemptive
• A task must be run without interruption until completion
– Hybrid
• Preemptive scheduler, but preemption is only allowed at
certain points within the code of each task.
• Real-Time scheduling algorithms can be:
– Static
» Known as fixed-priority, where priorities are computed
off-line
» Requires complete a priori knowledge of the real-time
environment in which it is deployed
» Inflexible: the scheme is workable only if all the tasks are
effectively periodic.
» Works only for simple systems; performs inconsistently
as the load increases.

Scheduling (Continue)

• Dynamic
– Assumes unpredictable task-arrival times
– Attempts to schedule tasks dynamically upon arrival
– Dynamically computes and assigns a priority value to each
task
– Decisions are based on task characteristics and the current
state of the system
– Flexible scheduler that can deal with unpredictable events.


Priority-Based Scheduling

• Conventional scheduling algorithms aim at balancing the
number of CPU-bound and I/O-bound jobs to maximize
system utilization and throughput
• Real-Time tasks need to be scheduled according to their
criticalness and timeliness
• A Real-Time system must ensure that the progress of higher-
priority tasks (ideally) is never hindered by lower-priority
tasks.


Priority-Based Scheduling
Methods

• Earliest-Deadline-First (EDF):
• the task with the current closest (earliest) deadline is
assigned the highest priority in the system and executed
next
• Value-Functions : highest value (benefit) first
• the scheduler is required to assign priorities as well as
define the value to the system of completing each task at any
instant in time
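
Earliest-Deadline-First can be illustrated with a small simulation. This is a sketch under stated assumptions rather than a reference implementation: it models a single processor, runs each task non-preemptively once chosen, and the function name `edf_order` and the task tuples are hypothetical.

```python
import heapq


def edf_order(tasks):
    """Run tasks non-preemptively on one CPU, always picking the earliest deadline.

    tasks: list of (name, arrival_time, exec_time, deadline).
    Returns a list of (name, finish_time, met_deadline).
    """
    pending = sorted(tasks, key=lambda t: t[1])  # by arrival time
    ready, order, time, i = [], [], 0, 0
    while i < len(pending) or ready:
        # Move every task that has arrived by now into the ready heap.
        while i < len(pending) and pending[i][1] <= time:
            name, _arrival, exec_time, deadline = pending[i]
            heapq.heappush(ready, (deadline, name, exec_time))
            i += 1
        if not ready:
            time = pending[i][1]  # idle until the next arrival
            continue
        deadline, name, exec_time = heapq.heappop(ready)  # earliest deadline first
        time += exec_time
        order.append((name, time, time <= deadline))
    return order


schedule = edf_order([("T1", 0, 3, 10), ("T2", 0, 2, 4), ("T3", 1, 1, 3)])
for name, finish, met in schedule:
    print(name, finish, met)
```

The heap keeps the ready task with the closest deadline at the top, which is exactly the priority rule stated in the EDF bullet.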

Priority-Based Scheduling
Methods
• Value-Density (VD): highest (value/computation) first
• The scheduler tends to select the tasks that earn more
value per time unit they consume
• It is a greedy technique, since it always schedules the task
that has the highest expected value within the shortest
possible time unit.
• Complex functions of deadline, value and slack time.
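
The Value-Density rule can be sketched in a few lines. The task records and the function name below are hypothetical and only illustrate the ratio value / computation; a real scheduler would recompute it as values change over time.

```python
def order_by_value_density(tasks):
    """Sort tasks so the highest value per unit of computation time comes first."""
    return sorted(tasks, key=lambda t: t["value"] / t["computation"], reverse=True)


tasks = [
    {"name": "A", "value": 10, "computation": 5},  # density 2.0
    {"name": "B", "value": 6, "computation": 2},   # density 3.0
    {"name": "C", "value": 4, "computation": 4},   # density 1.0
]
print([t["name"] for t in order_by_value_density(tasks)])
```

Note that task A has the highest absolute value, yet B is scheduled first because it earns more value per time unit consumed.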

Synchronization

• Priority inversion problem: a higher-priority task can be blocked by
a lower-priority task, possibly an unbounded number of times
and for unbounded periods.
• Solutions:
– The Priority Inheritance Protocol
• execute the blocking transaction (low priority) with the
priority of the blocked transaction (high priority)
• The task inherits the highest priority level of all the tasks it
blocks while it executes its critical section (holds the resource)
• “intermediate” blocking is eliminated
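
The inheritance rule above can be sketched as follows. The `Task` class is hypothetical, larger numbers are assumed to mean higher priority, and no real locking is performed; the point is only how the effective priority is derived.

```python
class Task:
    def __init__(self, name, priority):
        self.name = name
        self.base_priority = priority
        self.blocked_tasks = []  # tasks waiting for a resource this task holds

    @property
    def effective_priority(self):
        """Own priority, raised to the highest priority among the tasks it blocks."""
        inherited = [t.effective_priority for t in self.blocked_tasks]
        return max([self.base_priority] + inherited)


low = Task("low", 1)
mid = Task("mid", 5)
high = Task("high", 10)

# 'high' requests a resource held by 'low' and becomes blocked on it.
low.blocked_tasks.append(high)

# Among the runnable tasks, 'low' now outranks 'mid', so 'mid' cannot preempt it.
runnable = [low, mid]
next_task = max(runnable, key=lambda t: t.effective_priority)
print(next_task.name, next_task.effective_priority)
```

Without inheritance, the medium-priority task would preempt the low-priority holder and indirectly delay the high-priority task, which is the "intermediate" blocking the protocol removes.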


Synchronization (Continue)

• Priority Abort Protocol


– abort the low priority transaction - no blocking at all
– quick resolution, but wasted resources
• Conditional Priority Inheritance Protocol
– based on the estimated length of transaction
– inherit the priority only if the blocking transaction is close to completion;
otherwise abort.
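
The conditional rule can be expressed as a single decision function. This is only a sketch: the slides do not say how "close to completion" is measured, so the remaining-work fraction and the 0.2 threshold below are assumptions, as is the function name.

```python
def conditional_inheritance(holder_remaining, holder_total, threshold=0.2):
    """Decide what happens to a low-priority lock holder that blocks a high-priority one.

    holder_remaining / holder_total is the estimated fraction of work left.
    Returns 'inherit' if the holder is nearly done, otherwise 'abort'.
    """
    if holder_remaining / holder_total <= threshold:
        return "inherit"  # let it finish quickly at the higher priority
    return "abort"        # too much work left: abort it, as in Priority Abort


print(conditional_inheritance(holder_remaining=1, holder_total=10))
print(conditional_inheritance(holder_remaining=8, holder_total=10))
```

The trade-off is the one listed above: aborting resolves the conflict quickly but wastes the work already done, so it is avoided when little work remains.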

Real Time Database Systems
Overview
• Topics related to design of RTDBS in a centralized uni-processor system:
– RTDBS System Models
– Scheduling RTDB Transactions
• Concurrency Control
• Conflict Resolution
• Deadlocks
– Admission Control
– Memory Management
– I/O and Disk Scheduling

Conventional Databases:
Transactions and Serializability
• Transaction: is a collection of read and write operations which comprises a
consistent transformation of the system state.
• When executed alone, each transaction transforms a consistent state into
a new consistent state
• Transactions preserve consistency of the database information
• Schedule: a particular sequencing of the actions from different
transactions.
• Consistent Schedule: a schedule that gives each transaction a consistent
view of the database-state.

Conventional Databases:
Transactions and Serializability

• Database inconsistencies can be caused by:


– Failures
– Concurrency
• Four properties associated with transactions known as ACID properties are
used to prevent such problems


Conventional Databases:
ACID Properties

• A (Atomicity): Either all or none of the transaction's
operations are performed. All the operations of a
transaction are treated as a single, indivisible, atomic
unit.

• C (Consistency): A transaction maintains the integrity
constraints on the database.

• I (Isolation): Transactions can execute concurrently but
with no interference with each other’s operations.

• D (Durability): All changes made by a committed
transaction become permanent in the database,
surviving any subsequent failures.
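
Atomicity and consistency can be observed with Python's built-in `sqlite3` module. The sketch below is illustrative only: the `account` table, the `transfer` helper, and the amounts are made up, and an in-memory database is used, so durability is not demonstrated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"  # integrity constraint
)
conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts as one atomic unit."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint failed; both updates were undone


ok = transfer(conn, 1, 2, 30)       # succeeds
failed = transfer(conn, 1, 2, 500)  # would overdraw account 1, so it is rolled back
balances = dict(conn.execute("SELECT id, balance FROM account"))
print(ok, failed, balances)
```

In the failed transfer the credit to account 2 is applied first and then undone, so no partial effect of the transaction remains visible.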

Conventional Databases:
ACID Properties (Cont.)

• Consistency of the database is preserved by each
transaction
• Recovery Protocols are used to ensure the Atomicity and
Durability properties
• The difficulty in dealing with traditional transactions is
that different execution paths can have significantly different
requirements
• Concurrent execution may violate the database integrity
constraints regardless of the correctness of individual
transactions.

Serializability

• An execution is said to be serializable if it produces the same output and
has the same effect on the database as some serial execution of the same
transactions.
• Serializability is a notion of correctness for concurrent executions in a DBMS
• Conflict-Serializability:
– the simplest and most common form of Serializability
– ensures that conflicting operations appear in the same order in two
equivalent executions
– Two operations conflict when they come from different transactions, access the
same data object, and at least one of them is a write.
• View Serializability
– Two executions are equivalent if each transaction reads the same
values in the two executions.
– The final value of the database is the same in both executions
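
Conflict-serializability is commonly tested with a precedence graph: add an edge Ti → Tj whenever an operation of Ti conflicts with a later operation of Tj, and check the graph for cycles. The following is a minimal sketch of that test; the schedule encoding as (transaction, operation, item) tuples and the function name are assumptions made for the example.

```python
def is_conflict_serializable(schedule):
    """schedule: list of (txn, op, item) with op in {'r', 'w'}, in execution order."""
    # Build precedence edges from every pair of conflicting operations.
    graph = {}
    for i, (ti, op_i, x_i) in enumerate(schedule):
        for tj, op_j, x_j in schedule[i + 1:]:
            if ti != tj and x_i == x_j and "w" in (op_i, op_j):
                graph.setdefault(ti, set()).add(tj)

    # Depth-first search; reaching a node that is still 'visiting' means a cycle.
    state = {}

    def has_cycle(node):
        state[node] = "visiting"
        for nxt in graph.get(node, ()):
            if state.get(nxt) == "visiting":
                return True
            if nxt not in state and has_cycle(nxt):
                return True
        state[node] = "done"
        return False

    txns = {t for t, _, _ in schedule}
    return not any(has_cycle(t) for t in txns if t not in state)


interleaved = [("T1", "r", "x"), ("T2", "w", "x"), ("T1", "w", "x")]
serial_like = [("T1", "r", "x"), ("T1", "w", "x"), ("T2", "r", "x"), ("T2", "w", "x")]
print(is_conflict_serializable(interleaved))
print(is_conflict_serializable(serial_like))
```

In the first schedule T1 must come both before and after T2, so no equivalent serial order exists; in the second, the only edge is T1 → T2.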

Recoverable History

• Cascading-Aborts: If a transaction Tj reads a value that was last written by
an aborted transaction Ti, then Tj must also be aborted

• To keep Durability, once a transaction commits, it cannot subsequently
be aborted, nor can its effects be changed, due to cascading-aborts.

• To assure Atomicity and Durability, an execution must be Recoverable

• An execution is Recoverable if, once a transaction is committed, the
transaction is guaranteed not to be involved in cascading aborts.


Recoverable History (Cont)

• Cascadeless: Read only committed written data. That is, if transaction Tj
reads from Ti, then Ti must already have committed before the read; i.e.,
– Wi [x] → Rj [x] ⇒ Ci → Rj [x]

• Strict: Read and write only committed written data. That is, if transaction
Tj reads from Ti, or overwrites a data item that was last written by Ti, then
Ti must already have committed; i.e.,
– Wi [x] → Rj [x] ⇒ Ci → Rj [x]
– Wi [x] → Wj [x] ⇒ Ci → Wj [x]
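
The recoverable and cascadeless conditions can be checked mechanically on a history. The sketch below is a simplification under stated assumptions: the history contains only reads, writes, and commits (no aborts), it is encoded as hypothetical (transaction, operation, item) tuples, and strictness is not checked.

```python
def classify_history(history):
    """history: list of (txn, op, item), op in {'r', 'w', 'c'}; item is None for 'c'.

    Returns (recoverable, cascadeless).
    """
    last_writer = {}   # item -> transaction that wrote it most recently
    committed_at = {}  # txn -> position of its commit
    reads_from = []    # (reader, writer, position of the read)

    for pos, (txn, op, item) in enumerate(history):
        if op == "w":
            last_writer[item] = txn
        elif op == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.append((txn, writer, pos))
        elif op == "c":
            committed_at[txn] = pos

    never = float("inf")
    # Recoverable: the writer commits before any reader of its data commits.
    recoverable = all(
        committed_at.get(w, never) < committed_at[r]
        for r, w, _ in reads_from
        if r in committed_at
    )
    # Cascadeless: the writer has committed before the read itself happens.
    cascadeless = all(committed_at.get(w, never) < pos for _, w, pos in reads_from)
    return recoverable, cascadeless


h1 = [("T1", "w", "x"), ("T2", "r", "x"), ("T2", "c", None), ("T1", "c", None)]
h2 = [("T1", "w", "x"), ("T2", "r", "x"), ("T1", "c", None), ("T2", "c", None)]
h3 = [("T1", "w", "x"), ("T1", "c", None), ("T2", "r", "x"), ("T2", "c", None)]
for h in (h1, h2, h3):
    print(classify_history(h))
```

The three histories show the containment between the classes: h1 is not recoverable, h2 is recoverable but allows a cascading abort, and h3 is cascadeless.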


RTDBS vs. Conventional DB

• Conventional Transactions
– Logically correct and consistent (ACID):
• atomicity
• consistency
• isolation
• durability

• Real-Time Transactions
– Logically correct and consistent (ACID)
– “Approximately correct”
• trade quality or correctness for timeliness
– Time correctness
• time constraints on transactions
• temporal constraints on data

Conventional DB vs. RTDBS

• Conventional Databases:
– Logical consistency
• ACID properties of transactions: Atomicity, Isolation, Consistency, Durability
• Data integrity constraints

• Real-Time Database Systems:
– Logical consistency
• ACID properties (may be relaxed)
• Data integrity constraints
– Enforce time constraints
• Deadlines of transactions
– External consistency
• absolute validity interval (AVI)
– Temporal consistency
• relative validity interval (RVI)
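
The two validity intervals can be turned into simple checks on data timestamps. This is a sketch using the usual textbook reading of the terms, which the slides do not spell out: a data item is absolutely valid while its age does not exceed its AVI, and a set of items used together is relatively valid when their timestamps lie within the RVI of each other. The function names and numbers are illustrative.

```python
def absolutely_valid(timestamp, avi, now):
    """The item still reflects the environment if its age is within the AVI."""
    return now - timestamp <= avi


def relatively_valid(timestamps, rvi):
    """Items used together must have been sampled within the RVI of each other."""
    return max(timestamps) - min(timestamps) <= rvi


# A temperature reading taken at t=10 with an AVI of 5 time units.
print(absolutely_valid(timestamp=10.0, avi=5.0, now=12.0))
print(absolutely_valid(timestamp=10.0, avi=5.0, now=16.0))

# Temperature and pressure readings combined in one computation, RVI of 2.
print(relatively_valid([10.0, 11.0], rvi=2.0))
print(relatively_valid([10.0, 12.5], rvi=2.0))
```

A transaction that reads sensor-derived data would typically have to finish before the data's validity expires, which is why deadlines end up attached to data as well as to tasks.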

Conventional DB vs. RTDBS

• Real-time systems
• Task centric
– Deadlines attached to tasks

• Real-time databases
• Data centric
– Data has temporal validity, i.e., deadlines also attached to data
– Transactions must be executed by deadline to keep the data
valid, in addition to produce results in a timely manner

A Real-Time Database Model

[Figure: Real-Time Database Model]



A Real-Time Database Model

• Any new transaction must pass through an Admission Control mechanism,
which monitors and regulates the total number of concurrently active
transactions within the system in order to avoid thrashing

• Every new or resubmitted transaction is assigned a Priority Level, which
orders its scheduling preference relative to the other concurrent
transactions within the system

• Before a transaction performs an operation on a data object, it must go
through the Concurrency Control component in order to achieve the
required synchronization. If the transaction’s request for a granule is
denied, the transaction will be placed into a Wait Queue.

• The waiting transaction will be reactivated when the requested granule
becomes available, after which the transaction performs its operation.


A Real-Time Database Model

• Similarly, if a transaction requests an item that is currently not in main-
memory, an I/O request is initiated and the transaction will be placed into
a wait queue.

• The waiting transaction will be reactivated when the requested granule
becomes available in main-memory, and there is no active higher-priority
transaction.

• When a transaction completes all of its operations, it commits its result(s)
and releases all of the data items in its possession.

A Real-Time Database Model

• A transaction may abort/restart a number of times before it commits.
There are various types of aborts:

– Terminating abort:
• An abort due to missing a deadline, or
• Self-abort – a transaction may abort itself due to an exceptional
condition.

– Non-terminating abort: An abort due to a deadlock or a data conflict.
In this case, the transaction may be restarted if its deadline remains
feasible.
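
The distinction between the two kinds of abort can be sketched as a decision function. The reason labels, the function name, and the feasibility test (current time plus the estimated execution time must not exceed the deadline) are assumptions made for this example.

```python
def after_abort(reason, now, deadline, exec_time):
    """Decide whether an aborted transaction is restarted or terminated.

    reason: 'missed_deadline', 'self_abort', 'deadlock' or 'data_conflict'.
    exec_time: estimated time a full re-execution would need.
    """
    if reason in ("missed_deadline", "self_abort"):
        return "terminate"  # terminating abort
    if reason in ("deadlock", "data_conflict"):
        # Non-terminating abort: restart only if the deadline is still feasible.
        return "restart" if now + exec_time <= deadline else "terminate"
    raise ValueError(f"unknown abort reason: {reason}")


print(after_abort("deadlock", now=5, deadline=20, exec_time=10))
print(after_abort("deadlock", now=15, deadline=20, exec_time=10))
print(after_abort("missed_deadline", now=25, deadline=20, exec_time=10))
```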


Scheduling RTDB Transactions

• In RTDB systems, the data objects stored in the database are a resource
in addition to the standard physical resources, and transactions
accessing this data have to be scheduled in accordance with real-time
performance objectives.

• The scheduling process of transactions in a RTDB system consists of:


– Concurrency Control
– Conflict Resolution

Scheduling RTDB Transactions

• Concurrency Control Protocols


– Locking
– Time-stamping
– Multiversion
– Validation
• all of which have the same goal; i.e., enforcing serializability.
• These Protocols need to be modified and their trade-off(s) must be
reevaluated under RTDB systems.


Scheduling RTDB Transactions


Concurrency Control Protocol

• Locks are used to synchronize concurrent actions


• Two-Phase Locking (2PL)
– all locking operations precede the first unlock operation in the
transaction
– expanding phase (locks are acquired)
– shrinking phase (locks are released)
– suffers from deadlock
– suffers from priority inversion
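
The two-phase rule itself is easy to enforce in code. The class below is a hypothetical sketch that tracks only one transaction's own phase; it does not model lock conflicts between transactions, lock modes, or waiting.

```python
class TwoPhaseTransaction:
    """Tracks the expanding and shrinking phases of a single transaction."""

    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False  # becomes True at the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(
                f"{self.name}: 2PL violation, lock({item!r}) after an unlock"
            )
        self.held.add(item)

    def unlock(self, item):
        self.shrinking = True
        self.held.discard(item)


t = TwoPhaseTransaction("T1")
t.lock("x")
t.lock("y")      # expanding phase: all locks are acquired here
t.unlock("x")    # shrinking phase starts
try:
    t.lock("z")  # not allowed once a lock has been released
except RuntimeError as err:
    print(err)
```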

Scheduling RTDB Transactions
Conflict Resolution Protocol

• Conflict Resolution Protocol


– Priority-based Wound-Wait Conflict Resolution
• The original scheme was designed to use timestamps.
• It was modified so that the scheme uses priorities instead of
timestamps
• The modified scheme is known as High-Priority (HP) or
Priority-Abort (PA)
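
The High-Priority rule reduces to a comparison between the lock requester and the lock holder. The sketch below assumes that larger numbers mean higher priority and that ties make the requester wait; neither detail is stated on the slide, and the function name is made up.

```python
def high_priority_resolve(requester_priority, holder_priority):
    """Resolve a lock conflict in favour of the higher-priority transaction."""
    if requester_priority > holder_priority:
        return "abort_holder"    # the holder is 'wounded' and restarted later
    return "requester_waits"     # a lower-priority requester never aborts the holder


print(high_priority_resolve(requester_priority=9, holder_priority=3))
print(high_priority_resolve(requester_priority=2, holder_priority=3))
```

Because a high-priority transaction never waits for a lower-priority one, this scheme avoids priority inversion at the cost of the aborted transaction's work.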


Scheduling RTDB Transactions


Deadlocks

• Deadlocks
– A deadlock occurs whenever a set of transactions becomes involved in a circular wait,
which shows up as a cycle in the wait-for graph
– Five deadlock resolution policies take into account:
• the timing properties of the transactions
• the cost of abort operations


Scheduling RTDB Transactions


Deadlocks

• Policy 1:
– Always abort the transaction invoking deadlock detection.
• Policy 2:
– Trace the deadlock cycle
– abort the first tardy transaction encountered in a deadlock cycle.
– If no tardy transaction is found, abort the transaction with the
furthest deadline.
• Policy 3:
– Trace the deadlock cycle
– abort the first tardy transaction encountered in a deadlock cycle.
– If no tardy transaction is found, abort the transaction with the earliest
deadline.


Scheduling RTDB Transactions


Deadlocks

• Policy 4:
– Trace the deadlock cycle, and abort the first tardy transaction
encountered in a deadlock cycle.
– If no tardy transaction is found, abort the transaction with the least
criticalness.
• Policy 5:
– Abort the infeasible transaction with the least criticalness.
– If all transactions are feasible, then abort a feasible transaction with
the least criticalness.
– This policy is sensitive to the accuracy of the computation-time estimate,
because it requires information about the remaining execution time
– So the total execution time requirement of each transaction
must be known when it starts.
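
Policies 2, 3 and 4 share the same skeleton: trace the cycle in the wait-for graph, abort the first tardy transaction found, and otherwise fall back to a policy-specific choice. The following sketch assumes that each transaction waits for at most one other, that "tardy" means the deadline has already passed, and that the dictionaries and function names are purely illustrative. Policies 1 and 5 are not covered.

```python
def find_cycle(wait_for, start):
    """Follow wait-for edges from start; return the cycle as a list, or None."""
    path, node = [], start
    while node is not None and node not in path:
        path.append(node)
        node = wait_for.get(node)
    if node is None:
        return None
    return path[path.index(node):]


def choose_victim(cycle, txns, now, policy):
    """Pick the transaction to abort according to policy 2, 3 or 4."""
    for name in cycle:
        if txns[name]["deadline"] < now:  # first tardy transaction in the cycle
            return name
    if policy == 2:
        return max(cycle, key=lambda n: txns[n]["deadline"])      # furthest deadline
    if policy == 3:
        return min(cycle, key=lambda n: txns[n]["deadline"])      # earliest deadline
    if policy == 4:
        return min(cycle, key=lambda n: txns[n]["criticalness"])  # least critical
    raise ValueError("only policies 2, 3 and 4 are sketched here")


wait_for = {"T1": "T2", "T2": "T3", "T3": "T1"}
txns = {
    "T1": {"deadline": 30, "criticalness": 3},
    "T2": {"deadline": 50, "criticalness": 2},
    "T3": {"deadline": 40, "criticalness": 1},
}
cycle = find_cycle(wait_for, "T1")
for policy in (2, 3, 4):
    print(policy, choose_victim(cycle, txns, now=10, policy=policy))
```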


Scheduling RTDB Transactions


Conflict Resolution Protocol

[Figure: outline of the conflict resolution protocol]


Scheduling RTDB Transactions


Admission Control

• Admission Controller:
• Reject transaction
• Admit contingency action

• Scheduler:
• Drop transaction (firm/soft)
• Replace transaction with contingency action (hard)
• Postpone transaction execution (soft)


Scheduling RTDB Transactions Memory Management

• Memory management is concerned with three types of


decisions:
– transaction admission
– buffer allocation
– buffer replacement


Future Research Areas in RTDBS


• Resource management and scheduling
• Recovery
• Concurrency Control
• Fault tolerance and security models to interact with RTDBS
• Query languages for explicit specification of real-time constraints ->
RT-SQL
• Distributed real-time databases
• Data models to support complex multimedia objects
• Schemes to process a mixture of hard, soft, and firm timing
constraints and complex transaction structures
• Support for more active features in real-time context
• Interaction with legacy systems (conventional databases)

Short Quiz

1. Which company developed NoSQL database Apache Cassandra?


a) Microsoft
b) Twitter
c) Facebook
d) Google

2. Which of the following is not a NoSQL database?


a) SQL Server
b) MongoDB
c) Cassandra
d)None of the above

Daily Quiz

Q1. Compare NoSQL & RDBMS


Q2. What is NoSQL?
Q3. What are the features of NoSQL?
Q4. Explain the difference between NoSQL v/s Relational database?
Q5. Explain “Polyglot Persistence” in NoSQL?
Q6. How does NoSQL DB budget memory?
Q7. How to script NoSQL DB configuration?
Q8. Does NoSQL Database Interact With Oracle Database?
Q9. What is the difference between NoSQL and MySQL databases?
Q10. Explain Oracle NoSQL database?

MCQ

1. Most NoSQL databases support automatic __________ meaning that you get high
availability and disaster recovery.
(a) processing
(b) scalability
(c) replication
(d) all of the mentioned

2. Which of the following are the simplest NoSQL databases?


(a) Key-value
(b) Wide-column
(c) Document
(d) All of the mentioned

3.________ stores are used to store information about networks, such as social connections.
(a) Key-value
(b) Wide-column
(c) Document
(d) Graph

4. NoSQL databases are used mainly for handling large volumes of ______________ data.
(a) unstructured
(b) structured
(c) semi-structured
(d) all of the mentioned

5. In which of the following languages is MongoDB written?


(a) Javascript
(b) C
(c) C++
(d) All of the mentioned

6. Point out the correct statement.


(a) MongoDB is classified as a NoSQL database
(b) MongoDB favors XML format more than JSON
(c) MongoDB is column-oriented database store
(d) All of the mentioned
7. Which of the following formats is supported by MongoDB?
(a) SQL
(b) XML
(c) BSON
(d) All of the mentioned
8. NoSQL was designed with security in mind, so developers or security teams don't need
to worry about implementing a security layer. Is it true or false?
(a) True
(b) False
9. Which of the following is not a reason NoSQL has become a popular solution for some
organizations?
(a) Better scalability
(b) Improved ability to keep data consistent
(c) Faster access to data than relational database management systems (RDBMS)
(d) More easily allows for data to be held across multiple servers
10. NoSQL prohibits structured query language (SQL). Is it True or False?
(a) True
(b) False
Glossary Questions

Fill the following blanks with one of the given options-


1(a) Horizontally scalable (b) Vertically scalable (c) Sharding (d) They don't scale
(i) SQL databases are such that
(ii)_______ means scaling by adding more machines to your pool of
resources.
(iii)________ refers to scaling by adding more power (e.g.,CPU,RAM) to
an existing machine .
(iv) __________is a method of splitting and storing a single logical
dataset in multiple databases.

2 (a) Database (b) Field (c) Document (d) Collection


(i) _______ is a type of nonrelational database that is designed to store
and query data as JSON-like documents.
(ii)_________ database (MongoDB) collects and processes analytics data
from all your websites.
(iii) A database________is a set of data values, of the same data type, in
a table.

(iv) ________ is an organized collection of structured information, or data,
typically stored electronically in a computer system.

Weekly Assignment

3. __________ is online NoSQL developed by Cloudera.


(a) HCatalog
(b) HBase
(c) Impala
(d) Oozie

4. Which of the following is not a NoSQL database?


(a) SQL Server
(b) MongoDB
(c) Cassandra
(d) None of the mentioned

Weekly/monthly/Unit Wise Assignment.

Assignment
Q1: What are NoSQL databases? What are the different types of NoSQL databases?

Q2: What do you understand by NoSQL databases? Explain.

Q3: Explain difference between scaling horizontally and vertically for databases

Q4: What are the advantages of NoSQL over traditional RDBMS?

Q5: When should we embed one document within another in MongoDB?

Q6: Define ACID Properties?

Q7: Does MongoDB support ACID transaction management and locking functionalities?

Q8: Explain advantages of BSON over JSON in MongoDB?

Q9: How can you achieve primary key - foreign key relationships in MongoDB?

Q10: How do I perform the SQL JOIN equivalent in MongoDB?


Sessional Paper-1

Sessional Paper-2

Old University Question Paper

Expected Questions for University Exam

1. What do you mean by NoSQL?
2. What are the features of NoSQL?
3. What is the CAP theorem? How is it applicable to NoSQL systems?
4. Explain the differences between RDBMS and NoSQL.
5. What are the major challenges with traditional RDBMS?
6. What are the different types of NoSQL databases?
7. How does NoSQL relate to big data?
8. Explain how NoSQL systems support transactions using the BASE model.
9. What is a key-value store or key-value database?
10. What is a column-store database?

Faculty Video Links, YouTube & NPTEL Video Links, and Online
Course Details

YouTube and NPTEL videos

http://www.nptelvideos.com/lecture.php?id=6516

http://www.nptelvideos.com/lecture.php?id=6517

http://www.nptelvideos.com/lecture.php?id=6518

http://www.nptelvideos.com/lecture.php?id=6519

https://www.youtube.com/watch?v=2yQ9TGFpDuM

Recap

• This unit covers the fundamentals of NoSQL and its latest trends in
industry.
• It also introduces the different types of NoSQL databases.
• Whether you experience a natural disaster, power failure, or other
crisis, storing your data in the cloud helps keep it backed up and
protected in a secure location. Being able to access your data
again quickly allows you to conduct business as usual, minimizing
downtime and loss of productivity.
• The unit also covers cloud databases and how to query them.

Thank You
