DBMS Series Part-1
DATABASE MANAGEMENT
SYSTEM (DBMS)
-By Riti Kumari
TOPICS TO BE COVERED
1 DBMS Introduction 8 Normalisation
5 ER Model 12 SQL
6 Relational Model
7 Type of keys
DATA & INFORMATION
img+video staff+student
EXAMPLE
ID Name subject ID Name Place
Users can access databases, save data, retrieve it, update it,
and manage it safely and effectively with the use of a
software program or combination of programs.
Banks
Airlines
APPLICATION OF DBMS
Schools and Colleges - DBMS is used to create and
maintain a student information system that stores student
records, including personal details, academic performance,
attendance, and extracurricular activities.
6 Spatial Databases
7 Multimedia Databases
TYPES OF DATABASES
Relational Databases (RDBMS)- These databases
structure data into organized tables that have predefined
connections between them. Data manipulation and
querying are performed using SQL (Structured Query
Language). Well-known examples include MySQL,
PostgreSQL, Oracle Database, and Microsoft SQL Server.
logical level
physical level
TYPES OF LEVEL
Physical level- This is the lowest level of data abstraction.
It describes how data is actually stored in the database. You
can find the details of the underlying complex data structures at this level.
Logical Schema
1-Tier Architecture
2-Tier Architecture
3-Tier Architecture
DBMS ARCHITECTURE
CLIENT
Presentation layer
SERVER
Application layer
DATABASE
Data layer
Advantages of 3-tier-architecture
Scalability: Easily adjust each tier to handle changing user
demands.
Modularity and Maintainability: Simplify maintenance by
separating responsibilities.
Security: Protect sensitive data with an additional layer.
Performance: Optimize presentation and application tiers
for better performance.
Disadvantages of 3-tier-architecture
The disadvantages of 3-Tier Architecture include increased
complexity, potential latency issues, longer development time,
resource overhead, and the possibility of bottlenecks.
DATA MODEL
A data model within a Database Management System (DBMS) serves as an abstract
representation of how data gets structured and organized within a database.
It outlines the logical arrangement of data and the connections between various data
components.
Data models play a crucial role in comprehending and shaping databases, acting as a vital
link between real-world entities and the actual storage of data within the database.
Types of Data Model
School
department Infrastructure
Playground
Lab School
Student
TYPES OF DATA MODEL
Relational Data Model: The relational model organizes data
into tables (known as relations) consisting of rows and
columns. It is the most prevalent data model, is rooted in
the principles of set theory, and relies on Structured Query
Language (SQL) for data manipulation.
ID Name Place
2 Raj KOLKATA
3 Riti MUMBAI
Column/Attribute
Student name
Relationship Entity Attribute
Object-Oriented Data Model: Extending the principles of
object-oriented programming into the database domain,
this model depicts data as objects complete with
attributes and methods, fostering support for inheritance
and encapsulation.
view 1 View
DBMS level
Introduction
Logical Data
Independence
logical level
Physical Data
Independence
physical level
ESSENTIAL COMPONENTS OF TABLES
Column/Attribute - Columns represent the attributes of the data being stored and are
named to describe the information they hold (e.g., "ID," "Name," "Age").
ID Name Place
1 Rahul DELHI
2 Raj KOLKATA
3 Riti MUMBAI
Constraints - Constraints define rules or conditions that must be satisfied by the data in
the table.
Common constraints include uniqueness, nullability, default values, etc.
Unique constraint: Ensures values in a column are unique across the table.
Not null constraint: Ensures a column cannot have a null value.
Check constraint: Enforces a condition to be true for each row.
Default constraint: Provides a default value for a column if no value is specified.
Keys - A primary key is a unique identifier for each record in the table. It ensures that each
row can be uniquely identified and accessed within the table.
A foreign key is a field in a table that refers to the primary key of another table. It
establishes relationships between tables.
VIEWS IN DBMS
View is a virtual table that is derived from one or more underlying tables.
This means that it doesn't physically store data but rather provides a logical
representation of data.
Customer DB
Types of keys :
Candidate Key
Primary Key
Foreign Key
Super Key
KEYS IN DBMS
Candidate Key : A candidate key is a minimal set of attributes that can uniquely
identify a record within a table. Among the candidate keys, one is selected to serve as the
primary key.
Ex- For student possible attributes for candidate key could be
20 Rahul KOLKATA
21 Raj KOLKATA
20 Riti DELHI
Primary Key : A primary key is a key which uniquely identifies each record in a table.
It ensures that each tuple or record can be uniquely identified within the table.
It is always unique and not null.
ID Name Hometown
Student (base/referenced table): Roll no, Name, Hometown
Subject (referencing table): Roll no, Name, subject
Now consider two tables: one is the referencing table and the other is the
referenced table.
Let's see how operations such as insert, update, and delete work
here.
Referential Integrity in Foreign key
No violation
When a row is deleted from the referenced table, we can use an action such as
ON DELETE CASCADE so that the referencing rows are deleted too. Alternatively,
ON DELETE SET NULL sets the referencing values to NULL.
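The two delete actions can be sketched as follows. This is only an illustration, assuming MySQL with InnoDB; the table and column names (student, subject, roll_no) are hypothetical and only loosely mirror the Student/Subject example.

```sql
-- Hypothetical parent (referenced) table.
CREATE TABLE student (
    roll_no INT PRIMARY KEY,
    name VARCHAR(50)
);

-- Hypothetical child (referencing) table.
-- ON DELETE SET NULL keeps the subject row and sets roll_no to NULL
-- when the matching student is deleted. Writing ON DELETE CASCADE
-- instead would delete the subject row as well.
CREATE TABLE subject (
    subject_id INT PRIMARY KEY,
    subject_name VARCHAR(50),
    roll_no INT NULL,
    FOREIGN KEY (roll_no) REFERENCES student(roll_no)
        ON DELETE SET NULL
);
```

Note that SET NULL only works when the foreign key column allows NULL values.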
No violation
Ex-> Check for date column so that it contains valid date values
INTEGRITY CONSTRAINT IN DBMS
Key Constraint
Check Constraint
Null Constraint
Unique Constraint
Default Constraint
ENTITY: Strong Entity, Weak Entity
ATTRIBUTE: Simple Attribute, Composite Attribute, Single valued Attribute, Multivalued Attribute, Stored Attribute, Derived Attribute
RELATIONSHIP: one to one, one to many, many to one, many to many
ER MODEL IN DBMS
Symbols used in ER Model
Rectangle - Entity
Ellipse - Attribute
Diamond - Relationship
Line - Attribute to entity relationship
Double ellipse - Multivalued attributes
Attributes
Professor teaches Student
Entity Entity
Relationship subject
Student name
Entity
Types of Entity
Strong Entity: A strong entity is an entity that has its own unique
identifier (primary key) and is not dependent on any other entity
for its existence within the database. Strong entities stand alone
and have their own set of attributes.
Ex-Person
Attribute
Simple Attribute
Composite Attribute
Ex- Age
Types of Attributes
Multivalued Attribute
Stored Attribute
Derived Attribute
Complex Attribute
Relationship in ER Model
Types of Relationship
Strong Relationship
A strong relationship exists when two entities are highly dependent on each other,
and one entity cannot exist without the other.
Ex-
Weak Relationship
A weak relationship, on the other hand, exists when two entities are related, but one
entity can exist without the other.
Ex-
Degree in DBMS
Types of Degree
Don't Know the Answer: Every now and then, we're asked a
question, but we don't have an answer yet.
Forgot to Fill In: Like when you're filling out a form, and you
accidentally miss putting in some important information.
1 to 1 Relationship (1:1)
Student Enrolls Course
c3 bio sumit
s3 riti 16 c3 mar
Types of Relationship
1 to Many Relationship(1:N)
b3 ef drama a1 mar
Many to 1 Relationship(N:1)
works
Employees Department
Enrolls
Students Courses
Person Pan
Book Author
Student Course
Participation Constraints
works
Employees Department
Types of Participation Constraints
Partial Participation(Optional)
works
Employees Department
Extended ER features
Why do we need?
Specialization Generalization
Aggregation
Specialization
Generalization
Generalization is like finding things that are alike and putting them into a big group to
represent what they have in common. It helps make things simpler and organized.
It is a Bottom-Up approach.
Inheritance
Attribute Participation
Aggregation
1. Recognize entities.
2. Specify entity characteristics/attributes.
3. Discover connections/relationships (also constraints like
mapping/participation)
4. Define the connection type (how entities connect)/cardinality.
5. Construct an ERD (Entity-Relationship Diagram).
6. Annotate relationships and attributes.
7. Review and refine the model.
8. Document the model.
9. Validate with stakeholders.
10. Implement the database schema.
ER Model of Instagram
Instagram is a social media platform that allows users to share photos and
videos.
Entities
userLikes
userProfile
userFriends
userPost
userLogin
Step-2 : Specify entity characteristics/attributes
Attributes
Step-3 : Discover connections/relationships (also constraints like
mapping/participation)
1. Table - Relation
2. Row - Tuple
3. Column - Attribute
4. Record - Each row in a table
5. Domain - The type of value an attribute can hold
6. Degree - Number of columns in a relation
7. Cardinality - Number of tuples in a relation
RELATIONAL MODEL
Relational model is all about:
3. Each row must be unique; this is where keys come into the picture, i.e. candidate, super,
primary, etc.
CONVERT AN ER MODEL TO RELATIONAL MODEL
Converting an Entity-Relationship (ER) model to a relational model involves several
steps:
Step 1: Identify the entities - List down all the entities like strong and weak.
Multivalued attribute
Composite attribute
CONVERT AN ER MODEL TO RELATIONAL MODEL
Step 3: Key selection - Choose the primary key for each table; for some tables it can be
a composite key (e.g. weak entities).
Step 4: If entities have relationships, break them down and reduce the number of tables where
possible.
1. 1-1 Relationship : 2 tables; the primary key of one table can be added as a foreign key on either side
2. 1-Many Relationship : 2 tables; the primary key of the "one" side is added as a foreign key on the many side
3. Many-1 Relationship : 2 tables; the primary key of the "one" side is added as a foreign key on the many side
4. Many-Many Relationship : 3 tables; the relation table holds the primary keys of
both tables as foreign keys
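The many-to-many case can be sketched as below. This is a minimal illustration assuming MySQL syntax; the table and column names (students, courses, enrollments) are hypothetical, chosen to echo the Students/Courses example.

```sql
-- Two entity tables, each with its own primary key.
CREATE TABLE students (
    student_id INT PRIMARY KEY,
    name VARCHAR(50) NOT NULL
);

CREATE TABLE courses (
    course_id INT PRIMARY KEY,
    title VARCHAR(50) NOT NULL
);

-- Relation table for the many-to-many "enrolls" relationship.
-- It stores the primary keys of both tables as foreign keys,
-- and the pair together forms its own primary key.
CREATE TABLE enrollments (
    student_id INT,
    course_id INT,
    PRIMARY KEY (student_id, course_id),
    FOREIGN KEY (student_id) REFERENCES students(student_id),
    FOREIGN KEY (course_id) REFERENCES courses(course_id)
);
```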
LET'S LEARN
Database
LET’S START WITH SQL :)
Database
MySQL MongoDb
Oracle
MariaDB,
Why SQL?
SQL (Structured Query Language)
How SQL helps us?
SQL commands are divided into different categories based on their functionalities.
Database - School
table1 - Student (Sname, Rollno)
table2 - Teacher (Tname, Tid)
School Hospital
Patient
Course Fees
Creation of Database
IF NOT EXISTS and IF EXISTS clauses are commonly used in conjunction with the
CREATE TABLE and DROP TABLE statements to avoid errors
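A short sketch of how these clauses behave, assuming MySQL. The names schoolDb and student are placeholders used only for illustration.

```sql
-- Creates the database only if it is not already present,
-- so re-running the script does not raise an error.
CREATE DATABASE IF NOT EXISTS schoolDb;
USE schoolDb;

-- Same idea for tables.
CREATE TABLE IF NOT EXISTS student (
    rollNo INT PRIMARY KEY,
    sName VARCHAR(50)
);

-- Drops the objects only if they exist; otherwise MySQL
-- reports a warning instead of an error.
DROP TABLE IF EXISTS student;
DROP DATABASE IF EXISTS schoolDb;
```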
Deletion of Database
Deleting a Database
We use the DROP DATABASE statement to delete a database.
Dropping a database means deleting the entire database, including all tables,
data, and other objects within it. DROP is a DDL command.
These commands are not case-sensitive.
Using a Database
We use the USE statement (USE databaseName;) to select the database to work with.
These commands are not case-sensitive.
Showing a Database
We use the SHOW DATABASES statement to see all the databases present on a
server.
Command:
Creating a table
employee
CREATE- DDL Command
empId name salary
Example:
USE instagramDb;
Step 3 : Create tables into the db
5. DECIMAL(p, s) - Used for exact numeric representation. p is the precision and s is the
scale.
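For instance, DECIMAL(7, 2) allows 7 digits in total, 2 of them after the decimal point. A small sketch, assuming MySQL and a hypothetical product table:

```sql
-- price can hold values from -99999.99 to 99999.99.
CREATE TABLE product (
    productId INT PRIMARY KEY,
    price DECIMAL(7, 2)
);

INSERT INTO product (productId, price) VALUES (1, 1234.50);
```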
Command :
3.BLOB (Binary Large Object)- Used for storing large amounts of binary data.(var len)
Command :
Unique constraint: Ensures values in a column are unique across the table.
Not null constraint: Ensures a column cannot have a null value.
Check constraint: Enforces a condition to be true for each row.
Default constraint: Provides a default value for a column if no value is specified.
Primary key : Enforces the uniqueness of values in one or more columns
Foreign key: Enforces a link between two tables by referencing a column in one
table that is a primary key in another table.
Constraints in SQL
Unique constraint:
CREATE TABLE example1 (
phoneNbr INT UNIQUE);
Check constraint:
CREATE TABLE example1 (
age INT CHECK (age >= 18));
Default constraint:
CREATE TABLE example1 (
enrolled VARCHAR(20) DEFAULT 'no' );
Keys in SQL
Primary key- A primary key is a unique identifier for each record in the table.
It ensures that each row can be uniquely identified and accessed within the
table.
Foreign key-A foreign key is a field in a table that refers to the primary key of
another table. It establishes relationships between tables.
Student: base/referenced/parent table
Course: referencing/child table
Foreign Key
Foreign key helps to perform operations related to the parent table, such as joining
tables or ensuring referential integrity.
Query :
These cascading actions help maintain the integrity of the data across related
tables in the database.
QUERY:
CREATE TABLE childtableName (
childId INT PRIMARY KEY,
baseId INT,
FOREIGN KEY (baseId) REFERENCES baseTableName(baseId)
ON DELETE CASCADE
);
Cascading in Foreign Key
QUERY :
CREATE TABLE childtableName (
childId INT PRIMARY KEY,
baseId INT,
FOREIGN KEY (baseId) REFERENCES baseTableName(baseId)
ON UPDATE CASCADE
);
Let's make a database for all SQL commands
Let's make a Database for a Company
Requirements :
QUERY :
UPDATE table_name
SET columnName1 = value1, columnName2 = value2
WHERE condition;
UPDATE Command (Practice Question)
1. Write a query to update the salary for all employees in the 'HR' department to 50000.
QUERY :
UPDATE employee
SET salary = 50000
WHERE department = 'HR';
UPDATE Command (Practice Question)
QUERY :
UPDATE employee
SET name = 'raj'
WHERE name = 'raaj';
DELETE Command
QUERY:
DELETE FROM table_name
WHERE condition;
DELETE Command (Practice Question)
1. Write a query to DELETE all records from the employee table where the department is
'HR'
QUERY :
DELETE FROM employee
WHERE department = 'HR';
DELETE Command (Practice Question)
QUERY :
DELETE FROM employee
WHERE name = 'raj';
SELECT * FROM tableName; -> to retrieve all the data present in table
INSERT CREATE
SELECT
UPDATE ALTER
DELETE DROP
TRUNCATE
RENAME
ALTER Command
Let's see what ALTER can do. It is mostly used to modify the schema, so we will
focus on how it helps modify columns: adding a new column, dropping a column,
modifying a column, and more.
1. ADD a column
Query :
ALTER TABLE tableName
ADD columnName datatype constraint ;
2. Drop a column
Query :
ALTER TABLE tableName
DROP COLUMN columnName ;
3. Modify the data type of an existing column
MODIFY clause : The MODIFY clause is often used within an ALTER TABLE
statement in SQL. It allows us to change the definition or properties of an
existing column in a table.
Query :
ALTER TABLE tableName
MODIFY columnName newdatatype ;
4. Change the name of an existing column
CHANGE : The CHANGE clause is often used within an ALTER TABLE
statement in SQL. It helps to change the name or data type of a column
within a table.
Query :
ALTER TABLE tableName
CHANGE oldcolumnName newcolumnName newdatatype;
5. Rename an existing column
RENAME COMMAND : The RENAME command is used to change the name of an
existing database object, such as a table, column, index, or constraint.
Query :
ALTER TABLE tableName
RENAME COLUMN oldcolumnName TO newcolumnName ;
RENAME Command
RENAME : RENAME command is used to change the name of an existing
database object, such as a table, column, index, or constraint.
Query (Column Renaming ) :
ALTER TABLE tablename
RENAME COLUMN oldcolumnname TO newcolumnname;
TRUNCATE Command
TRUNCATE command - This command removes all rows from the given
table, leaving the table empty but preserving its structure.
QUERY :
TRUNCATE TABLE tableName;
DISTINCT - DISTINCT keyword is used within the SELECT statement to retrieve unique
values from a column or combination of columns.
Query :
SELECT DISTINCT col1 -> retrieves a list of unique values for col1
FROM tableName;
SELECT DISTINCT col1, col2 ->return unique combinations of col1 & col2
FROM tableName;
Operators in SQL
Comparison Operators : equal to (=) , not equal to (<> or !=) , greater than (>)
less than (<), greater than or equal to (>=), less than or equal to (<=)
1. AND : It combines two conditions and returns true if both are true
QUERY: SELECT * FROM employee WHERE city= 'Pune' AND age > 18;
2. OR : It combines two conditions and returns true if either one is true
QUERY: SELECT * FROM employee WHERE city= 'Pune' OR age > 18;
3. NOT: It reverses the result of a condition, returns true if the condition is false
IS NULL / IS NOT NULL Operators : IS NULL (checks for null values) , IS NOT
NULL(checks for not null values)
LIKE & Wildcard Operators : LIKE operator is used to search for a specified
pattern in a column. It uses wildcard operators for matching patterns.
QUERY : SELECT * FROM employee WHERE salary BETWEEN 1200 AND 1500;
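The operators described above without a query of their own (NOT, IS NULL / IS NOT NULL, LIKE) can be sketched like this. The examples assume the same employee table with name and city columns that the other queries use.

```sql
-- NOT: employees who are not from Pune.
SELECT * FROM employee WHERE NOT city = 'Pune';

-- IS NULL / IS NOT NULL: rows where city is missing or present.
SELECT * FROM employee WHERE city IS NULL;
SELECT * FROM employee WHERE city IS NOT NULL;

-- LIKE with wildcards: % matches any sequence of characters,
-- _ matches exactly one character.
SELECT * FROM employee WHERE name LIKE 'R%';   -- names starting with R
SELECT * FROM employee WHERE name LIKE '_a%';  -- second letter is a
```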
Clauses in SQL
Clauses are like tools/conditions that help us make queries more specific
or decide what data to fetch.
WHERE clause
LIMIT CLAUSE
LIMIT clause - The LIMIT clause in SQL is used to restrict the number of rows
returned by a query.
QUERY :
SELECT col1 , col2 FROM tableName
LIMIT noOfRows;
QUERY :
SELECT col1 , col2 FROM tableName
ORDER BY col1 [ASC|DESC], col2 [ASC|DESC];
Practice question
QUERY :
SELECT * FROM employee
WHERE id=1;
Practice question
Write a SQL Query to fetch the details of employees having id as 1 and city
as MUMBAI
QUERY :
SELECT * FROM employee
WHERE id=1 AND city = 'MUMBAI';
Practice question
Write a SQL Query to fetch the details of employees having salary greater
than 1200 and city as MUMBAI.
QUERY :
SELECT * FROM employee
WHERE salary>1200 AND city = 'MUMBAI';
Practice question
Write a SQL Query to fetch the details of employees who are not from
MUMBAI.
QUERY :
SELECT * FROM employee
WHERE city NOT IN ('MUMBAI');
Practice question
Write a SQL Query to fetch the details of employees having the maximum
salary.
QUERY :
SELECT * FROM employee
ORDER BY salary DESC
LIMIT 1;
Practice question
QUERY :
SELECT * FROM employee
ORDER BY salary DESC
LIMIT 2;
Aggregate Functions
Aggregate functions perform an operation on a set of rows and then return a
single value summarizing the data. They are used with SELECT statements to
perform calculations.
COUNT()
SUM()
AVG()
MIN()
MAX()
GROUP_CONCAT()
COUNT() - It counts the number of rows in a table or the number of non-null values
in a column.
This counts how many things are in a list or a group.
Query : SELECT count(name) FROM employee ; -> this will tell the number of
employees in a company
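To make the difference between counting rows and counting non-null values concrete, here is a sketch. It assumes the employee table has a city column that may contain NULL.

```sql
-- COUNT(*) counts every row in the table.
SELECT COUNT(*) AS total_rows FROM employee;

-- COUNT(city) counts only rows where city is not NULL.
SELECT COUNT(city) AS rows_with_city FROM employee;

-- COUNT(DISTINCT city) counts the different non-NULL city values.
SELECT COUNT(DISTINCT city) AS distinct_cities FROM employee;
```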
Query : SELECT SUM(salary) FROM employee ; -> this will tell the total amount the
company is paying to its employees
Query : SELECT AVG(salary) FROM employee ; -> this will tell the average amount the
company is paying to its employees
Query : SELECT MIN(salary) FROM employee ; -> this will tell the minimum
salary the company is paying to its employees
Query : SELECT MAX(salary) FROM employee ; -> this will tell the maximum salary the
company is paying to its employees
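GROUP_CONCAT() appears in the list of aggregate functions but has no query of its own. A minimal sketch, assuming MySQL and the same employee table with name and city columns:

```sql
-- Joins the names of all employees in each city into one
-- comma-separated string (comma is the default separator).
SELECT city, GROUP_CONCAT(name) AS employees
FROM employee
GROUP BY city;

-- ORDER BY and SEPARATOR can be set inside the function.
SELECT city, GROUP_CONCAT(name ORDER BY name SEPARATOR ' | ') AS employees
FROM employee
GROUP BY city;
```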
Grouping data with the GROUP BY clause.
GROUP BY clause - This is used to group rows that have the same values together. It
helps to organize data into groups so that you can do calculations, like finding totals or
averages, for each group.
QUERY :
SELECT col1, aggregateFun(col2)
FROM tableName
GROUP BY col1 ;
QUERY :
SELECT col1, col2, aggregateFun(col3)
FROM tableName
GROUP BY col1, col2
HAVING condition;
WHERE vs HAVING
WHERE: used to filter rows from the result based on a condition applied to a row before the aggregation. It is used with SELECT, UPDATE, or DELETE commands.
HAVING: used to filter rows from the result based on a condition applied after the aggregation. It is used with GROUP BY and aggregate functions.
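Both filters can appear in the same query. The sketch below assumes the employee table with name, city and salary columns; the thresholds are arbitrary values chosen for illustration.

```sql
-- WHERE removes individual rows before grouping;
-- HAVING removes whole groups after aggregation.
SELECT city, COUNT(name) AS no_of_emp
FROM employee
WHERE salary > 1200          -- row-level filter
GROUP BY city
HAVING COUNT(name) >= 2;     -- group-level filter
```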
Query :
SELECT city, COUNT(name) AS no_of_emp
FROM employee
GROUP BY city;
Practice Questions
2. Write a query to find the maximum salary of employees in each city in descending
order
Query :
SELECT city, MAX(salary) AS max_salary
FROM employee
GROUP BY city
ORDER BY max_salary DESC;
Practice Questions
3. Write a query to display the department names alongside the total count
of employees in each department, sorting the results by the total number of
employees in descending order.
Query :
4. Write a query to list the departments where the average salary is greater
than 1200, also display the department name and the average salary.
Query :
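Possible answers to questions 3 and 4, offered as a sketch. They assume the employee table has department and salary columns, as in the earlier UPDATE examples; the aliases total_employees and avg_salary are illustrative.

```sql
-- 3. Department names with their employee count, largest first.
SELECT department, COUNT(*) AS total_employees
FROM employee
GROUP BY department
ORDER BY total_employees DESC;

-- 4. Departments whose average salary is greater than 1200.
SELECT department, AVG(salary) AS avg_salary
FROM employee
GROUP BY department
HAVING AVG(salary) > 1200;
```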
Joins are used to combine rows from two or more tables based on a related or
shared (common) column between them. The common types of joins are
INNER JOIN, LEFT JOIN, RIGHT JOIN, FULL JOIN, SELF JOIN, and CROSS JOIN.
Joins in SQL
Example tables used in the join diagrams:
Student (rollno, name): (1, Ram), (2, Rahul), (3, Riti)
Course (rollno, c_name): (2, Hindi), (3, Eng), (4, Maths)
Join types illustrated: Inner Join, Left Join, Right Join, Full Join, Self Join, Cross Join
1. Inner Join : It helps us in getting the rows that have matching values in both
tables, according to the given join condition.
Query:
SELECT columns
FROM table1
INNER JOIN table2
ON table1.colName = table2.colName;
Example tables:
Customer (id, name): (101, Ram), (102, Rahul), (103, Riti)
Order (id, o_name): (102, Fruit), (103, Ball), (104, Utensils)
Query: It only returns rows where there is a matching id in both tables.
(ORDER is a reserved word in MySQL, so the table name is quoted with backticks.)
SELECT *
FROM customer
INNER JOIN `order`
ON customer.id = `order`.id;
Result (id, name, id, o_name):
102 Rahul 102 Fruit
103 Riti 103 Ball
2. Left Join/Left Outer Join : It is used to fetch all the records from the left
table along with matched records from the right table.
If there are no matching records in the right table, NULL values are returned for
the columns of the right table.
Query:
SELECT columns
FROM table1
LEFT JOIN table2
ON table1.colName = table2.colName;
Left table : the table specified before the LEFT JOIN keyword
Right table : the table specified after the LEFT JOIN keyword
Query:
SELECT *
FROM customer
LEFT JOIN `order`
ON customer.id = `order`.id;
3. Right Join/ Right Outer Join : It is used to fetch all the records from the right
table along with matched records from the left table.
If there are no matching records in the left table, NULL values are returned for
the columns of the left table.
Query:
SELECT columns
FROM table1
RIGHT JOIN table2
ON table1.colName = table2.colName;
Left table : the table specified before the RIGHT JOIN keyword
Right table : the table specified after the RIGHT JOIN keyword
Query:
SELECT *
FROM customer
RIGHT JOIN `order`
ON customer.id = `order`.id;
4. Full Join/Full Outer Join: It returns the matching rows of both left and right
table and also includes all rows from both tables even if they don’t have
matching rows.
If there is no match, NULL values are returned for the columns of the missing
table.
In MySQL, the syntax for a full join is different compared to other SQL databases like
PostgreSQL or SQL Server.
MySQL does not support the FULL JOIN keyword directly. So we use a combination of
LEFT JOIN, RIGHT JOIN, and UNION to achieve the result.
Query:
SELECT columns
FROM table1
LEFT JOIN table2
ON table1.colName = table2.colName
UNION
SELECT columns
FROM table1
RIGHT JOIN table2
ON table1.colName = table2.colName;
Query:
SELECT *
FROM customer
LEFT JOIN `order`
ON customer.id = `order`.id
UNION
SELECT *
FROM customer
RIGHT JOIN `order`
ON customer.id = `order`.id;
Result (id, name, id, o_name):
101 Ram NULL NULL
102 Rahul 102 Fruit
103 Riti 103 Ball
NULL NULL 104 Utensils
5. Cross Join: It combines each row of the first table with every row of the
second table.
Query:
SELECT *
FROM table1
CROSS JOIN table2;
Example tables:
Customer (id, name): (101, Ram), (102, Rahul)
Order (o_id, o_name): (1, Fruit), (2, Ball)
It results in a new table where the number of rows is equal to the product of
the number of rows in each table. (m*n)
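As a sketch of the m*n rule, assume a customer table with 2 rows and an order table with 2 rows; the query would then return 2 * 2 = 4 rows. The table name order is quoted with backticks because ORDER is a reserved word in MySQL, and the aliases c and o are illustrative.

```sql
-- Every customer row is paired with every order row.
SELECT c.id, c.name, o.o_id, o.o_name
FROM customer AS c
CROSS JOIN `order` AS o;
```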
6. Self Join: A self join in SQL is a type of join where a table is joined with itself.
It is a type of inner join.
Query:
SELECT columns
FROM tableName AS t1
JOIN tableName AS t2
ON t1.colName = t2.colName;
t1 and t2 are aliases for the table, used to distinguish between its two copies in the query.
Example: student (s_id, name, mentor_id), where Ram is a mentor. Sample rows: (3, Riti, 1), (4, Riya, 3).
Query :
SELECT s1.name AS mentor_name, s2.name AS name
FROM student AS s1
JOIN student AS s2
ON s1.s_id = s2.mentor_id;
mentor_name name
Ram Rahul
Riti Riya
Exclusive Joins in SQL
Exclusive joins are used when we want to retrieve data from two tables excluding matched
rows. They are built from outer joins (left, right, or full) with a filter on NULL values.
Types : Left Exclusive Join, Right Exclusive Join, Full Exclusive Join
Left Exclusive JOIN: When we retrieve records from the left table, excluding the ones
that have a match in the right table.
Query:
SELECT columns
FROM table1
LEFT JOIN table2
ON table1.colName = table2.colName
WHERE table2.colName IS NULL;
Example tables:
Customer (id, name): (101, Ram), (102, Rahul), (103, Riti)
Order (id, o_name): (102, Fruit), (103, Ball), (104, Utensils)
Right Exclusive JOIN: When we retrieve records from the right table, excluding the ones
that have a match in the left table.
Query:
SELECT columns
FROM table1
RIGHT JOIN table2
ON table1.colName = table2.colName
WHERE table1.colName IS NULL;
Full Exclusive JOIN: When we retrieve records from both the right table and the left table, excluding
the ones that have a match in both tables.
Query:
SELECT columns
FROM table1
LEFT JOIN table2
ON table1.colName = table2.colName
WHERE table2.colName IS NULL
UNION
SELECT columns
FROM table1
RIGHT JOIN table2
ON table1.colName = table2.colName
WHERE table1.colName IS NULL;
UNION Operator in SQL
UNION: The UNION operator in SQL is used to combine the results of two or more SELECT
queries into a single result set; it returns unique rows by removing duplicates.
QUERY:
SELECT columns
FROM table1
UNION
SELECT columns
FROM table2;
Example: ids (1, 2, 3) from table1 UNION ids (2, 3, 4) from table2 give (1, 2, 3, 4).
UNION ALL Operator in SQL
UNION ALL: The UNION ALL operator in SQL is used to combine the results of two or more SELECT
queries into a single result set; it keeps all rows, including duplicates.
QUERY:
SELECT columns
FROM table1
UNION ALL
SELECT columns
FROM table2;
Example: ids (1, 2, 3) from table1 UNION ALL ids (2, 3, 4) from table2 give (1, 2, 3, 2, 3, 4).
SQL Subqueries/Nested queries
Outer Query
Inner Query
QUERY:
SELECT columns, (subquery)
FROM tableName;
QUERY:
SELECT *
FROM tableName
WHERE columnName operator (subquery);
QUERY:
SELECT *
FROM (subquery) AS altName;
1. Find all the employees who have salary greater than the min salary
First, find the min salary:
QUERY:
SELECT MIN(salary) FROM employee;
To find all the employees having salary greater than min salary
QUERY:
SELECT name, salary
FROM employee
WHERE salary > (subquery);
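Written out in full, with the inner query placed where (subquery) appears, the statement might look like this (a sketch on the same employee table):

```sql
-- The inner query runs first and returns a single value,
-- which the outer query then compares each salary against.
SELECT name, salary
FROM employee
WHERE salary > (SELECT MIN(salary) FROM employee);
```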
Find employee having the min age
QUERY:
SELECT MIN(age) FROM employee
QUERY:
SELECT name, age
FROM employee
WHERE age =(subquery);
Find employee having age > min age
QUERY:
SELECT min(age) AS min_age FROM employee;
QUERY:
SELECT emp.name
FROM employee emp, (subquery) AS subquery
WHERE emp.age > subquery.min_age;
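With the subquery expanded in the FROM clause, the query could be written as below. The alias min_table is a hypothetical name; MySQL requires every derived table to have some alias.

```sql
-- The derived table returns one row with one column (min_age),
-- which is then compared against each employee's age.
SELECT emp.name
FROM employee AS emp,
     (SELECT MIN(age) AS min_age FROM employee) AS min_table
WHERE emp.age > min_table.min_age;
```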
Print each employee's age along with the average age (avg_age)
QUERY:
SELECT AVG(age) FROM employee
QUERY:
SELECT (subquery)AS avg_age , age
FROM employee;
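With the subquery expanded in the SELECT list, the query might read as follows (a sketch on the same employee table):

```sql
-- The scalar subquery returns one value, repeated on every row.
SELECT age,
       (SELECT AVG(age) FROM employee) AS avg_age
FROM employee;
```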
Nth Highest Salary
Step 1: Select the column which you want to show the final result i.e salary.
Step 2: Order the salary in descending order so that you have the max at the first.
Step 3: Now the value of n could be 1, 2, 3, and so on, so we have to write the query in such a
way that it provides the result whatever the value of n.
Step 4: So at the end of the query we will provide a LIMIT so that on the data set
which we have got after ordering the salary in descending order, we can fetch the
nth highest one.
LIMIT- LIMIT clause is used to restrict the number of rows returned by a query.
QUERY:
SELECT DISTINCT Salary
FROM tableName
ORDER BY Salary DESC
LIMIT n-1, 1;
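In a plain MySQL query, LIMIT expects literal numbers rather than an expression, so n-1 has to be worked out before writing the query. As a sketch for n = 3, assuming the employee table and its salary column:

```sql
-- Skip the 2 highest distinct salaries, then return 1 row:
-- the 3rd highest salary.
SELECT DISTINCT salary
FROM employee
ORDER BY salary DESC
LIMIT 2, 1;

-- Equivalent form using OFFSET.
SELECT DISTINCT salary
FROM employee
ORDER BY salary DESC
LIMIT 1 OFFSET 2;
```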
Stored Procedures
Stored Procedure- These are programs that can perform specific tasks
based on the stored query. It is basically a collection of pre-written SQL
statements grouped together under a specific name.
Query: (to create a procedure)
CREATE PROCEDURE procedureName()
BEGIN
Query
END;
Query 1:
CREATE PROCEDURE getAllOrderDetails()
BEGIN
Select * from orders;
END;
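Query 1 (written with a custom delimiter, which the mysql client needs so that the semicolons inside the body do not end the CREATE PROCEDURE statement early):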
DELIMITER /
CREATE PROCEDURE getAllOrderDetails()
BEGIN
SELECT * FROM orders;
END/
DELIMITER ;
Examples: Return the details of the order by id (Stored procedure with params)
Query 2:
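CREATE PROCEDURE getAllOrderDetailsById(IN orderId INT)
BEGIN
-- The parameter is named orderId: a parameter called id would shadow the id column,
-- so id = id would be true for every row.
SELECT * FROM Orders WHERE id = orderId;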
END;
QUERY:
CREATE VIEW viewName AS
SELECT columns FROM baseTableName; -- specify the columns to be included in the view
QUERY:
SELECT * FROM viewName ;
To drop a view
QUERY:
DROP VIEW IF EXISTS viewName;
LET’S START WITH SQL :)
CASE AND IF IN SQL
CASE: It allows you to perform conditional logic within a query. It can be used in
both SELECT and UPDATE statements to evaluate conditions and return specific
values based on those conditions.
QUERY:
CASE
WHEN condition1 THEN result1
WHEN condition2 THEN result2
...
ELSE resultN
END
LET’S START WITH SQL :)
CASE with Select statement
Q. Categorise the students on basis of their percentage to Top, Pass and fail in a new
column category
QUERY:
SELECT sid, name, percentage,
CASE
WHEN percentage > 90 THEN 'Top'
WHEN percentage BETWEEN 34 AND 90 THEN 'Pass'
ELSE 'Fail'
END AS category
FROM student;
LET’S START WITH SQL :)
CASE with Update statement
Q. Students have got some grace marks so update their grades. Where its A update to A+
and where its B update to A.
QUERY:
UPDATE student
SET grade = CASE
WHEN grade = 'B' THEN 'A'
WHEN grade = 'A' THEN 'A+'
ELSE grade
END;
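The two CASE uses above can be checked with a short sketch. It assumes Python's sqlite3 module, made-up student rows, and pass marks of 34 to 90. An ELSE grade branch is included so that grades other than A and B are left unchanged instead of becoming NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (sid INTEGER PRIMARY KEY, name TEXT, percentage REAL, grade TEXT)"
)
# Illustrative rows only.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "Asha", 95.0, "A"), (2, "Ravi", 60.0, "B"), (3, "Meena", 20.0, "C")],
)

# CASE in SELECT: derive a category column from the percentage.
categories = conn.execute(
    """
    SELECT name,
           CASE
               WHEN percentage > 90 THEN 'Top'
               WHEN percentage BETWEEN 34 AND 90 THEN 'Pass'
               ELSE 'Fail'
           END AS category
    FROM student
    ORDER BY sid
    """
).fetchall()

# CASE in UPDATE: each row is evaluated against its original grade.
conn.execute(
    """
    UPDATE student
    SET grade = CASE
                    WHEN grade = 'B' THEN 'A'
                    WHEN grade = 'A' THEN 'A+'
                    ELSE grade
                END
    """
)
grades = conn.execute("SELECT name, grade FROM student ORDER BY sid").fetchall()

print(categories, grades)
```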
LET’S START WITH SQL :)
IF IN SQL
IF: It is used to return one of two values depending on whether a condition is true or
false. It is not supported in many databases, but it is supported in MySQL.
QUERY:
IF(condition, value_if_true, value_if_false)
LET’S START WITH SQL :)
IF with Select statement
Q. Categorise the students on basis of their percentage to Top, Pass and fail in a new
column category
QUERY:
SELECT sid, name, percentage,
IF(percentage > 90, 'Top', IF(percentage BETWEEN 34 AND 90, 'Pass', 'Fail')) AS category
FROM student;
LET’S START WITH SQL :)
IF with Update statement
Q. Swap all 'f' and 'm' values (i.e., change all 'f' values to 'm' and vice versa) with a single
update statement and no intermediate temporary tables.
QUERY:
UPDATE employee
SET gender = IF(gender = 'm', 'f', 'm');
LET’S START WITH SQL :)
Top SQL Questions asked in interviews
Q. Write the SQL Query to 1. create a database Company, 2. create a table employee in it,
3. delete/drop the database
Q. Write the SQL Query to 1. create a table employee, 2. insert data into the table employee,
3. update Salary for all people in HR department to 20000, 4. delete data for the employee
having empId = 1, 5. delete the entire table
Q.Write a query to find the total number of employees working in the 'IT' department.
COUNT(*) is a SQL aggregate function that returns the total number of rows in a
specified table or query. It counts all the rows, regardless of whether they contain NULL
values or not.
LET’S START WITH SQL :)
Top SQL Questions asked in interviews
Q.Write a query to find all the employees that have their name starting from ‘R’
Condition Query
Q.Write a query to calculate the total salary and average salary in a department
Q.Write a query to find the rows where a department has NULL values
Q.Write a query to find the duplicate rows in employee for column department.
Q. What is SQL?
-> SQL stands for Structured Query Language
It is a standard language used for managing and manipulating databases.
while, TRUNCATE removes all rows from a table without logging individual row
deletions and, in MySQL, cannot be rolled back.
while, UNION ALL combines the results of two queries and includes all duplicates.
1. Data retrieval
SELECT: Use for retrieving data from one or more tables.
2. Data Filtering
WHERE: Filter records based on specific conditions
AND, OR, NOT: Combine multiple conditions.
BETWEEN : Range search
IN: Checks whether a specified value matches any value in a subquery or a list
LIKE : Pattern matching(%, _)
LET’S START WITH SQL :)
Top SQL Questions asked in interviews(Part-2)
How to think on what SQL clause/operator/function to choose?
3. Aggregation on data
AVG(): Returns the average value of a numeric column.
MIN(): Returns the minimum value in a column.
MAX(): Returns the maximum value in a column.
SUM(): Returns the total sum of a numeric column.
COUNT(): Counts the number of non-NULL values in a specified column
COUNT(*) : Counts the total number of rows in a table, including rows with
NULL values.
5. Sorting data
ORDER BY: Sorts the result set by one or more columns.
7. Conditional logic
CASE : Perform conditional logic within a query using WHEN, THEN and ELSE
IF : Return values depending on whether a condition is true or false.
Step 2: Check data types, constraints ,primary key, foreign keys, and relationships
between tables
Step 4: Ensure appropriate indexes on columns used in WHERE, JOIN, and ORDER
BY clauses.
Leetcode Questions
Swap Salary
Q. Write a solution to swap all 'f' and 'm' values (i.e., change all 'f'
values to 'm' and vice versa) with a single update statement and
no intermediate temporary tables.
Duplicate Emails
Q.Write a solution to report all the duplicate emails. Note that it's guaranteed that the
email field is not NULL.
Return the result table in any order.
Employees Earning More Than Their Managers
Q. Write a solution to find the employees who earn more than their managers.
Not Boring Movies
Q. Write a solution to report the movies with an odd-numbered ID and a description that
is not "boring".
Return the result table ordered by rating in descending order.
Classes More Than 5 Students
Write a solution to find all the classes that have at least five students.
LET’S START WITH DBMS :)
Intension and Extension in DataBase
Intension in Database :
The intension defines what kind of data can be stored and the relationships
between them. This is basically the blueprint or definition of the database
structure. It doesn't change frequently and it's the permanent definition of the
database structure.
It includes:
Table definitions( name of tables, their columns, and the data types allowed in
each column)
Constraints(Rules that govern the data, such as primary keys, foreign keys, and
data validation rules)
Relationships between tables( how tables are connected through shared
columns)
Intension in Database
Example:
customer(
id INT PRIMARY KEY,
name VARCHAR(50))
Extension in Database :
The extension is the actual data stored in the database at a given instance in time.
Basically the data which is stored in tuples/rows at a given instance of time.
When there are more tuples added the data can change .
Employee (data at instance t1)
id name department
1 Rahul 'IT'
2 Afsara 'HR'
3 Abhimanyu 'IT'
Employee (data at instance t2)
id name department
1 Rahul 'IT'
2 Afsara 'HR'
3 Abhimanyu 'IT'
4 Aditya 'Marketing'
Employee (data at instance t3)
id name department
1 Rahul 'IT'
2 Afsara 'HR'
3 Abhimanyu 'IT'
4 Aditya 'Marketing'
5 Raj 'Finance'
LET’S START WITH DBMS :)
RDBMS
What is RDBMS ?
MySQL MongoDb
Oracle
MariaDB,
These databases structure data into organized tables that have predefined
connections between them.
Data manipulation and querying are performed using SQL (Structured Query
Language).
Database
LET’S START WITH DBMS :)
Normalisation and its types
Normalisation
Normalization is a process in which we organize data to reduce
redundancy (duplication) and improve data consistency. It involves dividing a database
into two or more tables.
When the same set of data is repeated again and again, it results in
duplication of data (either at row or column level).
Row-level duplication can be removed by using a primary key, which enforces unique
values.
Now when we have same data for some set of columns , it leads to different
anomalies (inconsistencies or errors that occur when manipulating or
querying data in a database)
1. Insertion Anomaly
2. Updation Anomaly
3. Deletion Anomaly
It also increases the size of the database by storing the same data repeatedly.
data into the database due to the 2 Afsara 26 'HR' Avinash 1000
Abhimany
3 27 IT
u
4 Aditya 25 HR
5 Raj 24 HR
Types of Normalisation
Consider if you wish to find the salary of Rahul
Combined table:
1 Rahul 25 'IT' Raj 1500
2 Afsara 26 'HR' Avinash 1000
3 Abhimanyu 27 'IT' Raj 1500
4 Aditya 25 'HR' Avinash 1000
5 Raj 24 'HR' Avinash 1000
Split into two tables:
1 Rahul 25 IT
2 Afsara 26 HR
3 Abhimanyu 27 IT
4 Aditya 25 HR
5 Raj 24 HR
IT Raj 1500
HR Avinash 1000
... table to find the salary of Rahul
LET’S START WITH DBMS :)
Denormalization
Benefits
Faster Queries : It can reduce the need for complex joins between tables
during queries which can eventually improve the speed of retrieving frequently
accessed data.
Simpler Queries: It can simplify queries by allowing them to be executed on a
single table instead of requiring joins across multiple tables.
Disadvantages
R
FD : X(determinant) -> Y(dependent)
X Y
1 Riti Kumari
2 Rahul Kumar
3 Suraj Singh
LET’S START WITH DBMS :)
Functional Dependency
Trivial dependency
A functional dependency X -> Y is trivial if Y is a subset of X.
A special case is X -> X.
{EmpID, EmpFirstName} -> {EmpID}
is trivial because {EmpID} is a subset of {EmpID, EmpFirstName}.
X ∩ Y = Y
EmpID EmpFirstName EmpLastName
1 Riti Kumari
2 Rahul Kumar
3 Suraj Singh
Non-Trivial dependency
A functional dependency X -> Y is non-trivial if Y is not a subset of X. In the example
below X and Y share no attributes, i.e. X ∩ Y = ∅.
{EmpID} -> {EmpFirstName}
is non-trivial because {EmpFirstName} is not a subset of {EmpID}.
X ∩ Y = empty
EmpID EmpFirstName EmpLastName
1 Riti Kumari
2 Rahul Kumar
3 Suraj Singh
LET’S START WITH DBMS :)
Attribute closure/closure set
Attribute closure helps us in identifying candidate keys, checking functional
dependencies, and in normalisation.
1. According to the rule of reflexivity, every attribute determines itself.
A->A, B->B, C->C, D->D, E->E
3. According to the rule of union, when the determinant is the same we can combine the
dependents.
For A: A->B, A->C, A->D, A->E, A->A, so A->ABCDE
For B: B->C, B->D, B->E, B->B, so B->BCDE
For C: C->D, C->E, C->C, so C->CDE
For D: D->E, D->D, so D->DE
For E: E->E
Ques : Consider we have a relation R with attributes A,B,C,D,E and the FDs are
A-> B, B-> C, C-> D, D-> E
A+ = {A,B,C,D,E}
B+ = {B,C,D,E}
C+ = {C,D,E}
D+ = {D,E}
E+ = {E}
AB+ = {A,B,C,D,E}
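The closures listed above can be reproduced with a short sketch. The function name closure and the representation of each FD as a pair of strings are assumptions made for this illustration.

```python
def closure(attrs, fds):
    """Return the closure of attrs under the functional dependencies fds.

    fds is a list of (lhs, rhs) pairs, each written as a string of
    single-letter attribute names, e.g. ("A", "B") for A -> B.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If everything on the left is already known, the right side follows.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# FDs of the example relation R(A, B, C, D, E).
fds = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
closures = {x: "".join(sorted(closure(x, fds))) for x in ["A", "B", "C", "D", "E", "AB"]}
print(closures)
```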
5. Now let's find the candidate key, super key, prime and non-prime attributes
Super key : A set of attributes whose closure contains all the attributes of the relation,
so it can uniquely identify each tuple. Any superset of a candidate key is a super key.
C+ ={C}
The closure of A gives (determines) all the attributes in the table, so we can say it is a
super key.
A power set is the set of all subsets of a given set, including the empty set and
the set itself. If you have a set X, the power set of X is denoted as 2^X
Example:
To find the max number of super keys when the relation has a single candidate key, we can
use the formula 2^(n-k), where
k - number of attributes in the candidate key (k < n)
n - total no of attributes in the relation
Q. Find no of candidate and super key for the given relation R (A,B,C,D) and functional
dependency A->B, B->C, C->D , B->A
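One way to work through this question is a brute-force sketch over every subset of attributes. The helper names below are hypothetical, and enumeration like this is only practical for small relations.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def super_keys(attributes, fds):
    """All attribute sets whose closure covers the whole relation."""
    all_attrs = set(attributes)
    found = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            if closure(combo, fds) == all_attrs:
                found.append(frozenset(combo))
    return found

def candidate_keys(attributes, fds):
    """Super keys that have no smaller super key inside them."""
    sks = super_keys(attributes, fds)
    return [k for k in sks if not any(other < k for other in sks)]

attributes = "ABCD"
fds = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "A")]

cks = sorted("".join(sorted(k)) for k in candidate_keys(attributes, fds))
num_super_keys = len(super_keys(attributes, fds))
print(cks, num_super_keys)
```

With two candidate keys (A and B) the single-key formula is applied to each key and the overlap is subtracted: 2^3 + 2^3 - 2^2 = 12, which is what the enumeration counts.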
First normal form is the first step in the normalisation process, which helps us to
reduce data redundancy. Every table should have atomic values, i.e. there shouldn't be
any multivalued attributes.
Atomicity: Each column contains only indivisible (atomic) values, meaning each
attribute holds a single value.
ID PersonName Order
1 Raj Muffin,Sugar
2 Riti Muffin
3 Rahul Sugar,Egg
1. Repeat the values in the ID and PersonName columns so that each row stores a single
value of the multivalued attribute Order
ID PersonName Order
1 Raj Muffin
1 Raj Sugar
2 Riti Muffin
3 Rahul Sugar
3 Rahul Egg
PK - ID
3. Divide the table into a person (base) table and an order (referencing) table based on the
multivalued attribute Order.
Base table (pk: ID)
ID PersonName
1 Raj
2 Riti
3 Rahul
Referencing table (fk: ID)
ID Order
1 Muffin
1 Sugar
2 Muffin
3 Sugar
3 Egg
LET’S START WITH DBMS :)
Normalisation and its types
CustomerId OrderId OrderName
1 1 Muffin
2 1 Muffin
1 2 Sugar
4 2 Sugar
Candidate key : CustomerId+OrderId
Prime attribute : {CustomerId, OrderId}
Non-prime attribute : {OrderName}
2NF
CustomerId OrderId
1 1
2 1
1 2
4 2
OrderId OrderName
1 Muffin
2 Sugar
Consider there is a relation R(A,B,C,D) with FD : AB->C, AB->D, B->C. Find if this is in
2NF?
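Not in 2NF. The candidate key is AB, and B->C makes the non-prime attribute C depend on
only a part of it (a partial dependency).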
For any functional dependency X→Y, one of the following conditions must be true to be in
3rd normal form.
X is a superkey or candidate key(LHS)
Y is a prime attribute (i.e., part of some candidate key). (RHS)
Consider there is a relation R(A,B,C,D) with FD : AB->C, C->D. Find if this is in 3NF?
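Not in 3NF. The candidate key is AB; in C->D, C is not a super key and D is not a prime
attribute, so D depends on the key transitively through C.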
Consider there is a relation R(A,B,C,D) with FD : AB->C, AB->D. Find if this is in BCNF?
It is in BCNF. The candidate key is AB, and the left-hand side of every FD is AB, a super key.
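The three worked questions (2NF, 3NF, BCNF) can be cross-checked with a sketch like the one below. It assumes the relation is already in 1NF, checks only the FDs that are given, and uses hypothetical helper names; it is an illustration, not a general-purpose normaliser.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    all_attrs, keys = set(attributes), []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            candidate = set(combo)
            if any(k <= candidate for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == all_attrs:
                keys.append(candidate)
    return keys

def highest_normal_form(attributes, fds):
    all_attrs = set(attributes)
    keys = candidate_keys(attributes, fds)
    prime = set().union(*keys)
    non_prime = all_attrs - prime

    def is_superkey(x):
        return closure(x, fds) == all_attrs

    # BCNF: every non-trivial FD has a super key on the left.
    if all(set(r) <= set(l) or is_superkey(l) for l, r in fds):
        return "BCNF"
    # 3NF: otherwise the newly determined attributes must all be prime.
    if all(is_superkey(l) or (set(r) - set(l)) <= prime for l, r in fds):
        return "3NF"
    # 2NF: no proper part of a candidate key determines a non-prime attribute.
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(sorted(key), size):
                if (closure(part, fds) - set(part)) & non_prime:
                    return "1NF"
    return "2NF"

q_2nf = highest_normal_form("ABCD", [("AB", "C"), ("AB", "D"), ("B", "C")])
q_3nf = highest_normal_form("ABCD", [("AB", "C"), ("C", "D")])
q_bcnf = highest_normal_form("ABCD", [("AB", "C"), ("AB", "D")])
print(q_2nf, q_3nf, q_bcnf)
```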
LET’S START WITH DBMS :)
Dependency Preserving decomposition
Consider a relation R(A, B, C) with FD : A->B, B->C, find if its dependency preserving when
divided into R1(AB) and R2(BC)
R1(AB) : A → B
R2(BC) : B → C
The decomposition is dependency preserving because the functional dependencies A → B
and B → C are preserved in R1 and R2
Lossy Decomposition
Original relation R:
A B C
1 2 3
4 2 6
Step 1: Let's decompose the relation based on any attribute and keep that
attribute as common, for now let's use B as the common attribute
Decomposed relations: R1(A,B) and R2(B,C)
R1
A B
1 2
4 2
R2
B C
2 3
2 6
Lossy decomposition
R1 natural join R2
A B C
1 2 3
1 2 6
4 2 3
4 2 6
We can see some additional tuples that were not in the original relation R (lossy
decomposition)
LET’S START WITH DBMS :)
Lossy and Lossless decomposition
Lossy decomposition
How to ensure a decomposition is lossless
1. Divide or decompose the table on the basis of a CK or SK present in the relation
so that there is no duplication
2. For a decomposition to be lossless
a. R1 U R2 = R
b. R1 ∩ R2 = common attribute (must not be empty)
3. The common attribute should be a candidate key or super key of R1 or R2 (or both).
Dependency preservation is a separate property and does not by itself make a
decomposition lossless.
Lossless Decomposition
So if the table is decomposed and we want to query the attributes present in both the
tables we will use the join operation.
Natural Join :
The natural join operation combines tuples(rows) from two relations based on
common attributes.
It only includes those combinations of tuples that have the same values for the
common attributes.
Lossless decomposition
Original relation R:
A B C
1 2 3
4 5 6
R1
A B
1 2
4 5
R2
A C
1 3
4 6
R1 natural join R2
A B C
1 2 3
4 5 6
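The lossy and lossless examples can both be replayed with a small natural-join sketch. Relations are modelled as lists of Python dicts, and the helper names are assumptions. The check works on one particular set of rows, so it illustrates the idea and does not prove anything about the schema.

```python
def project(relation, attrs):
    """Project a relation (list of dicts) onto attrs, dropping duplicate rows."""
    seen, rows = set(), []
    for row in relation:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            rows.append(dict(zip(attrs, key)))
    return rows

def natural_join(r1, r2):
    """Combine rows that agree on every attribute the two relations share."""
    joined = []
    for row1 in r1:
        for row2 in r2:
            common = row1.keys() & row2.keys()
            if all(row1[a] == row2[a] for a in common):
                joined.append({**row1, **row2})
    return joined

def rejoin(relation, attrs1, attrs2):
    """Decompose into two projections, join them back, return the tuples as a set."""
    joined = natural_join(project(relation, attrs1), project(relation, attrs2))
    return {(row["A"], row["B"], row["C"]) for row in joined}

# Lossy example: B is the common attribute but is not a key of R1 or R2.
lossy_r = [{"A": 1, "B": 2, "C": 3}, {"A": 4, "B": 2, "C": 6}]
lossy_result = rejoin(lossy_r, ["A", "B"], ["B", "C"])

# Lossless example: A is the common attribute and identifies each row.
lossless_r = [{"A": 1, "B": 2, "C": 3}, {"A": 4, "B": 5, "C": 6}]
lossless_result = rejoin(lossless_r, ["A", "B"], ["A", "C"])

print(sorted(lossy_result), sorted(lossless_result))
```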
Multivalued dependency :
A multi-valued dependency X→→Y and X→→Z in a relation R(X,Y,Z) implies that for each value
of X, there is a set of values for Y and a set of values for Z that are independent of each
other.
StudentId Course PhoneNbr
1 Sci 123
1 Sci 345
2 Hin 678
2 Eng 678
2 Hin 910
2 Eng 910
4NF
StudentId Course
1 Sci
2 Hin
2 Eng
StudentId PhoneNbr
1 123
1 345
2 678
2 910
R1
A B
1 2
4 5
R2
A C
1 3
4 6
R1 natural join R2
A B C
1 2 3
4 5 6
Step 1: Identify the candidate key for the given relation using FD and
closure method.
Step 2: Find the prime and non-prime attributes.
Step 3: Start checking for normal forms one by one according to their rule
BCNF
3NF
2NF
1NF
A→BC = A is a CK
B→C = B is a CK
A→B= A is a CK
AB→C= AB is a combination of candidate keys, Its SK
B→A= B is a CK , R is in BCNF.
The highest normal form for the given relation R(A,B,C,D) is BCNF.
LET’S START WITH DBMS :)
How to normalise table
Step 2: ABCDE, Since we are assuming our relation R is in a standard relational model, it is
already in 1NF
Find Minimal Cover FD: A→BC, B→C, A→B, AB→C
Step 1: Decompose FDs (RHS) i.e X->AB can be written as X->A, X->B
A->B, A->C, B->C, A->B, AB->C
After dropping the repeated A->B: FD: A->B, A->C, B->C, AB->C
Step 2: ... all the attributes of a table, if yes you can remove that, if no jump to the next
one.
1. For A->B
FD: A->C, B->C, AB->C
A+={A,C} since A+ doesn’t have all the attributes we shouldn’t discard this
2. For A->C
FD: A->B, B->C, AB->C
A+={A,B,C}, since A+ has all the attributes we can discard this
3. For B->C
FD: A->B, AB->C
B+={B}, since B+ doesn’t have all the attributes we shouldn’t discard this
4. For AB->C
FD: A->B, B->C
AB+={A,B,C} since AB+ has all the attributes we can discard this
Step 3: Remove unnecessary attributes from LHS, if the determinant is a super
key, it can be reduced to CK (minimal super key)
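A minimal-cover computation for this example can be sketched as follows. The helper names are hypothetical. The sketch trims left-hand sides before removing redundant FDs, which is the usual textbook order and differs from the step order on the slide, and it treats an FD as redundant when the closure of its left side still contains its right side.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) attribute collections."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def minimal_cover(fds):
    # Split right-hand sides so each FD determines one attribute; skip repeats.
    cover = []
    for lhs, rhs in fds:
        for attr in rhs:
            fd = (frozenset(lhs), attr)
            if fd not in cover:
                cover.append(fd)

    # Drop left-hand attributes that are not needed to derive the right side.
    for i, (lhs, attr) in enumerate(cover):
        for extra in sorted(lhs):
            smaller = lhs - {extra}
            if smaller and attr in closure(smaller, cover):
                lhs = smaller
        cover[i] = (lhs, attr)

    # Trimming can create repeats (AB->C becomes B->C here), so remove them.
    unique = []
    for fd in cover:
        if fd not in unique:
            unique.append(fd)

    # Remove every FD that still follows from the remaining ones.
    result = list(unique)
    for fd in unique:
        rest = [f for f in result if f != fd]
        if fd[1] in closure(fd[0], rest):
            result = rest
    return ["".join(sorted(lhs)) + "->" + attr for lhs, attr in result]

cover = minimal_cover([("A", "BC"), ("B", "C"), ("A", "B"), ("AB", "C")])
print(cover)
```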
Step 3: Ensure that every functional dependency in set2 is in set1 closure
Step 4: If both subset checks pass, then set1 and set2 are equivalent.
F={A→B,B→C}
Closure of F, attributes -> A,B,C
A+={A,B,C}
B+={B,C}
C+={C}
G={A→C,A→B}
Closure of G, attributes -> A,B,C
A+={A,C,B}
B+={B}
C+={C}
Under G, B+ = {B}, so B→C from F cannot be derived from G. F covers G but G does not
cover F, so F and G are not equivalent.
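The two subset checks can be written as a short sketch. The function names covers and closure are assumptions; each FD is a pair of strings of single-letter attributes.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def covers(f, g):
    """True if every FD in g can be derived from f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)

F = [("A", "B"), ("B", "C")]
G = [("A", "C"), ("A", "B")]

f_covers_g = covers(F, G)   # A->C and A->B both follow from F
g_covers_f = covers(G, F)   # B->C does not follow from G
equivalent = f_covers_g and g_covers_f
print(f_covers_g, g_covers_f, equivalent)
```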
Transaction is a logical unit of work that comprises one or more database operations(like
Read/write/commit/rollback) . In a transaction both read and write operations are
fundamental actions that ensure ACID properties of transactions (data consistency and
integrity)
Read(R)-> A read operation involves retrieving/fetching data from the database.
Write(W)->A write operation in a transaction involves modifying data in the database
Example flow: Swiggy order -> payment page -> enter bank details -> OTP -> network
failure/issues -> rollback
Concurrency control ensures that multiple transactions can run concurrently without
compromising data consistency.
Example : Consider a banking system where two transactions are happening concurrently
1. Ram giving Shyam 100rs
2. Shyam giving Ram 50rs , the data should be consistent for both transactions
ACID properties are the properties which ensure that transactions are processed reliably
and accurately, even in complex situations (system failures/network issues)
This guarantees that the database remains in a consistent state despite any failures or
interruptions during the transaction.
Ex :Consider Ram is transferring money to Shyam. The transaction must deduct the amount from
the Ram’s account and add it to the Shyam's account as a single operation.
If at any moment or at any part, this transaction fails (e.g., due to insufficient funds/system
error/network error), the entire transaction is rolled back, ensuring that none of the accounts is
affected partially.
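The all-or-nothing behaviour in the transfer example can be sketched with Python's sqlite3 module. The account table, the CHECK constraint that stands in for "insufficient funds", and the transfer function are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint makes an overdraft fail, which triggers the rollback.
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("Ram", 100), ("Shyam", 50)])
conn.commit()

def transfer(conn, sender, receiver, amount):
    """Move amount from sender to receiver as one transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

failed = transfer(conn, "Ram", "Shyam", 500)   # Ram only has 100
after_failure = balances(conn)
succeeded = transfer(conn, "Ram", "Shyam", 100)
after_success = balances(conn)
print(failed, after_failure, succeeded, after_success)
```

The receiver is credited first on purpose: when the debit then fails, the rollback also undoes the credit, so neither account is partially affected.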
LET’S START WITH DBMS :)
ACID Properties
It guarantees that the database remains in a consistent state before and after the
execution of each transaction.
Ex: Consider you had 100rs in your account but you want 50rs cash, so you
transferred 50rs to a person X and he gave you 50rs cash.
Before transaction- 100rs(in acc)
After transaction- 100rs( 50rs in acc+ 50 rs cash)
I-> Isolation : It ensures that if there are two transactions 1 and 2, then the changes
made by Transaction 1 are not visible to Transaction 2 until Transaction 1 commits.
While the transaction is reading data, the dbms ensures that the data is consistent and
isolated from other transactions. This means that other transactions cannot modify the
data being read by the current transaction until it is committed or rolled back.
Most DBMS use a technique called Write-Ahead Logging (WAL) to ensure durability.
Before modifying data in the database, the DBMS writes the changes to a transaction log
(often stored on disk) in a sequential manner. This ensures that if there is a failure event, the
database can recover to a consistent state.
Ex : Consider you are transferring 100rs to your friend and there is a sudden power outage
or the system crashes right after the transaction is committed, the changes (the transfer of
100) will still be saved in the database. When the system is back up, both your account and
your friend's account will reflect the updated balances.
T1 T2
W R
W W
Isolation level : It determines the degree to which the operations in one transaction are
isolated from those in other transactions.
[Figure: transactions T1 and T2 sent by the application to the DB]
LET’S START WITH DBMS :)
Isolation levels and its types
T1 T2
W(A)
R(A)
Consider if T2 modifies the data which T1 has already read; if T1 then continues the
transaction and reads again, the data will have changed.
T1 T2
R(A)
R(A)
W(A)
Commit
R(A)
Read Uncommitted: The lowest isolation level where transactions can see uncommitted
changes made by other transactions. If Transaction T1 is writing a value to a table,
Transaction T2 can read this value before T1 commits.
Read Committed: It ensures that any data read during the transaction is committed at the
moment it is read. If T1 has done some write operation, T2 can only read the data once T1 has
committed.
Dirty Reads: No
Non-Repeatable Reads: Yes
Phantom Reads: Yes
Repeatable Read: It ensures that if a transaction reads a row, it will see the same values for
that row during the entire transaction, even if other transactions modify the data and
commit. If Transaction T1 reads a value, Transaction T2 cannot modify that value until T1
completes. But T2 can insert new rows that T1 can see on subsequent reads.
Dirty Reads: No
Non-Repeatable Reads: No
Phantom Reads: Yes
Serializable: The highest isolation level.
Dirty Reads: No
Non-Repeatable Reads: No
Phantom Reads: No
LET’S START WITH DBMS :)
Schedule and its Types
Schedule : It refers to the sequence in which a set of concurrent/multiple transactions are
executed. You can also say it as a sequence in which the operations (such as read, write,
commit, and abort) of multiple transactions are executed. It is really helpful to ensure data
consistency and integrity.
If there are n transactions T1, T2, T3 ... Tn, then the number of possible serial schedules = n! (n factorial)
Ex : Schedule sc1 T1 T2
R(A)
R(A)
W(A)
Commit
Commit
Incomplete schedule : An incomplete schedule is one where not all transactions have
reached their final state of either commit or abort.
T1:Read(A)
T1:Write(A)
T2:Read(B)
T2:Write(B)
T2:COMMIT
Here, T1 is still in progress as there is no COMMIT for transaction T1.
Complete schedule : A complete schedule is one where all the transactions in the schedule
have either committed or aborted.
T1:Read(A)
T1:Write(A)
T1:COMMIT
T2:Read(B)
T2:Write(B)
T2:COMMIT
Types of Schedule
1. Serial Schedule
2. Concurrent or Non-Serial Schedule
3. Conflict-Serializable Schedule
4. View-Serializable Schedule
5. Recoverable Schedule
6. Irrecoverable Schedule
7. Cascadeless Schedule
8. Cascading Schedule
9. Strict Schedule
1.Serial Schedule : A serial schedule is one where transactions are executed one after
another. For example, if there are two transactions T1 and T2, T1 should run to
completion (commit) before T2 starts.
T1:Read(A)
T1:Write(A)
T1:COMMIT(T1)
T2:Read(B)
T2:Write(B)
T2:COMMIT(T2)
Challenges:
1. Throughput (no of transactions completed per unit time) and memory
utilisation are poor, so it can be inefficient and is not suggested.
2. Since wait time is high, fewer transactions are completed.
2. Non-Serial/Concurrent Schedule : A non-serial schedule is one where multiple
transactions execute at the same time, with the operations of the different transactions
interleaved. For example, if there are two transactions T1 and T2,
T2 doesn’t need to wait for T1 to commit, it can start at any point.
T1 T2 T3
Example : T1 ,T2, T3
R(A)
R(A)
R(B)
W(A)
COMMIT
Challenges:
1. Consistency issues may arise because of non-serial execution. It requires robust
concurrency control mechanisms to ensure data consistency and integrity.
2. We can use Serializability and Concurrency Control Mechanisms to ensure consistency.
3. Conflict-Serializable Schedule : A schedule is conflict-serializable if it can be
transformed into a serial schedule by swapping adjacent non-conflicting operations.
R(A)
W(A)
R(A)
W(A)
COMMIT
COMMIT
Recoverable Schedule
6. Irrecoverable Schedule : An irrecoverable schedule allows a transaction to commit even
if it has read data from another uncommitted transaction. This can lead to inconsistencies
and make it impossible to recover from certain failures.
T1 T2
R(A)
W(A)
R(A)
W(A)
COMMIT
FAIL
Irrecoverable Schedule
7. Cascading Schedule : This schedule happens when the failure or abort of one transaction
causes a series of other transactions to also abort.
T1 Writes to A: T1 writes to data item A
T2 Reads A: T2 reads the uncommitted value of A written by T1
Now, if T1 fails and aborts, T2 must also abort because it
has read an uncommitted value from T1.
Issues:
1. Performance degradation because multiple transactions need to be rolled back
2. Improper CPU resource utilisation
8. Cascadeless Schedule : It ensures transactions only read committed data, so that the
abort of one transaction does not force other transactions to abort, because none of them
has read its uncommitted changes.
T1 Writes to A: T1 writes to data item A
T2 Reads A: T2 reads the committed value of A
Issues : R(A)
W(A)
COMMIT
Strict Schedule
LET’S START WITH DBMS :)
Concurrent VS Parallel Schedule
Example
Concurrent: Multi-threading on a single-core CPU, where threads take turns using the CPU.
Parallel: Multi-threading on a multi-core CPU, where threads run simultaneously on different cores.
LET’S START WITH DBMS :)
Serializability and its types
Serializability: It ensures that concurrent transactions yield results that are consistent with
some serial execution i.e the final state of the database after executing a set of transactions
concurrently should be the same as if the transactions had been executed one after another
in some order.
R(A) W(A) T1 T2
W(A)
Commit
SERIAL SCHEDULE CONCURRENT SCHEDULE
A concurrent schedule does not always have a cycle.
A concurrent schedule can be conflict-serializable, meaning that it is equivalent to some serial schedule of
transactions and its conflict graph does not have any cycles.
LET’S START WITH DBMS :)
Now, since a cycle is detected we need to serialize them
So, we use serializability here
Conflict-Serializability-> to detect the cycle using the conflict graph
View-Serializability-> to check if the schedule is serializable after a cycle is detected.
[Schedule figure with operations R(A), R(A), W(A), W(A) across T1 and T2]
Now, why are we only swapping the non-conflict pairs and not the conflict ones?
So if we swap the conflict pairs, e.g. if the order of execution was
T1 : R(A)
T2: W(A)
the results values may change as first we were reading A and then writing/modifying it, but
now it will be writing A and then reading the modified value so the result might change if we
change the order of execution.
LET’S START WITH DBMS :)
Conflict-Serializability
T1 T2
R(X)
W(X)
S1
R(X)
R(Y)
W(Y)
R(Y)
T1 T2
Conflict-Serializability W(X)
R(X)
Q. Find a conflict equivalent for a schedule S1
R(Y)
R(X)
S1
W(X)
R(Y)
R(X)
W(Y)
R(Y)
T1 T2
Conflict-Serializability W(X)
R(Y)
2. After the first swap again search for adjacent non-conflicting W(Y)
Conflicts occur when two operations from different transactions access the same
data item and at least one of them is a write operation.
Cycle Detection: The schedule is conflict-serializable if and only if the conflict graph is
acyclic. If there are no cycles in the graph, it means that the schedule can be serialized
without violating the order of conflicting operations.
LET’S START WITH DBMS :)
Conflict-Serializability
Schedule (in order of execution):
T1 reads A
T2 reads A
T1 writes A
T3 writes A
T2 writes B
T3 reads B
Find the indegree(the number of edges directed into that node) and if its 0 it can be the first in serial
execution
T1 - 1 ,T2- 0, T3- 2 , T2 would be the first as indegree is 0
T2 must precede T1
T1 must precede T3
Therefore, one possible equivalent serial schedule is T2→T1→T3.
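The conflict-graph test and the indegree-based ordering can be sketched as below. The schedule is encoded as (transaction, action, item) tuples and the function names are assumptions. The ordering step is a plain topological sort (Kahn's algorithm), which returns None when the graph has a cycle.

```python
def precedence_graph(schedule):
    """Edges (Ti, Tj) for each pair of conflicting operations, Ti's first.

    schedule is a list of (transaction, action, item) with action "R" or "W".
    """
    edges = set()
    for i, (t1, a1, x1) in enumerate(schedule):
        for t2, a2, x2 in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if t1 != t2 and x1 == x2 and "W" in (a1, a2):
                edges.add((t1, t2))
    return edges

def serial_order(transactions, edges):
    """Equivalent serial order, or None if the graph has a cycle."""
    indegree = {t: 0 for t in transactions}
    for _, dst in edges:
        indegree[dst] += 1
    ready = sorted(t for t in transactions if indegree[t] == 0)
    order = []
    while ready:
        current = ready.pop(0)
        order.append(current)
        for src, dst in sorted(edges):
            if src == current:
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    ready.append(dst)
    return order if len(order) == len(transactions) else None

schedule = [
    ("T1", "R", "A"), ("T2", "R", "A"), ("T1", "W", "A"),
    ("T3", "W", "A"), ("T2", "W", "B"), ("T3", "R", "B"),
]
edges = precedence_graph(schedule)
order = serial_order(["T1", "T2", "T3"], edges)

# A schedule whose graph has a cycle (T1 -> T2 and T2 -> T1).
cyclic = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]
cyclic_order = serial_order(["T1", "T2"], precedence_graph(cyclic))
print(sorted(edges), order, cyclic_order)
```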
T1 T2
View-Serializability R(A)
T1 T2 T3 T1 T2 T3
R(A) W(A)
S W(A) R(A) S’
W(B) W(B)
LET’S START WITH DBMS :)
View-Serializability
T1 T2 T3 T1 T2 T3
W(B) R(A)
S S’
R(B) W(B)
R(A) R(B)
T1 T2 T3 T1 T2 T3
W(A) R(A)
S S’
R(A) W(A)
W(A) W(A)
The number of possible serial schedules for n transactions is given by the number of permutations of the
transactions: n!
T1 T2 T3
R(A)
W(A)
W(A)
W(A)
Step 1: Find if conflict serializable or not.
Step 2: Find the possible serial schedules -> 3!
Step 3: Choose one possibility and check for view equivalent
conditions (T1->T2->T3)
How it helps?
1. Data Consistency: Ensures that data remains accurate and reliable despite
concurrent access.
2. Isolation: Maintains the isolation property of transactions, so the outcome of a
transaction is not affected by other concurrently executing transactions.
3. Serializability: Ensures that the result of concurrent transactions is the same as if
the transactions had been executed serially
LET’S START WITH DBMS :)
Concurrency control mechanisms
Dirty Reads: When a transaction reads data that has been modified by another
transaction but not yet committed. If the first transaction rolls back, the other
transaction will have read invalid data. (WR)
Phantom Reads: Occurs when a transaction reads a set of rows that satisfy a
condition, but another transaction inserts or deletes rows that affect the set
before the first transaction completes. This results in the first transaction reading
different sets of rows if it re-executes the query.
Concurrency control mechanisms
a. Binary Locks: A simple mechanism where a data item can be either locked (in use) or
unlocked. If a transaction tries to acquire the lock when it is already locked, it must wait
until the transaction currently holding the lock releases it.
Shared Lock (S-lock): Allows multiple transactions to read a data item simultaneously
but prevents any of them from modifying it. Multiple transactions can hold a shared
lock on the same data item at the same time.
Exclusive Lock (X-lock): Allows a transaction to both read and modify a data item.
When an exclusive lock is held by a transaction, no other transaction can read or
modify the data item.
Concurrency control mechanisms
Note : When a transaction acquires a shared lock on a data item, other transactions can also
acquire shared locks on that same item, enabling concurrent reads. However, no transaction
can acquire an exclusive lock on that item as long as one or more shared locks are held.
When a transaction acquires an exclusive lock on a data item, it has full control over that
item, meaning it can both read and modify it. No other transaction can acquire a lock on the
same data item until the exclusive lock is released.
While shared and exclusive locks are vital for maintaining data integrity and consistency in
concurrent environments, they can introduce significant challenges in terms of performance,
deadlocks, reduced concurrency, and system complexity.
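The compatibility rules in the note above (shared with shared is allowed, anything with exclusive is not) can be sketched as a small lock table. This is a simplified illustration, not how any particular DBMS implements locking: the class name LockTable is hypothetical, requests that cannot be granted return False instead of waiting, and lock upgrades are not handled.

```python
class LockTable:
    """Tracks, per data item, the lock mode ("S" or "X") and the transactions holding it."""

    def __init__(self):
        self.holders = {}  # item -> (mode, set of transactions)

    def try_lock(self, txn, item, mode):
        """Grant the lock if it is compatible with the locks already held; otherwise refuse."""
        held = self.holders.get(item)
        if held is None:
            self.holders[item] = (mode, {txn})
            return True
        held_mode, txns = held
        # Only shared + shared is compatible.
        if mode == "S" and held_mode == "S":
            txns.add(txn)
            return True
        return False

    def unlock(self, txn, item):
        """Release txn's lock on item; the entry disappears when nobody holds it."""
        held = self.holders.get(item)
        if held is None:
            return
        _, txns = held
        txns.discard(txn)
        if not txns:
            del self.holders[item]

table = LockTable()
print(table.try_lock("T1", "A", "S"))  # True: first lock on A
print(table.try_lock("T2", "A", "S"))  # True: concurrent readers are allowed
print(table.try_lock("T3", "A", "X"))  # False: shared locks are still held
```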
Concurrency control mechanisms
Drawbacks of shared-exclusive locks
Performance issues: Managing locks requires additional CPU and memory resources. The
process of acquiring, releasing, and managing locks can introduce significant overhead.
Concurrency issues: Exclusive locks prevent other transactions from accessing locked
data, which can significantly reduce concurrency.
Deadlocks: Shared and exclusive locks can lead to deadlocks, where two or more
transactions hold locks that the other transactions need.
Irrecoverable schedules: If transaction A releases its lock early, transaction B can read
A's modified value and commit. If A fails afterwards and rolls back, B's committed result
cannot be undone.
Concurrency control mechanisms
Deadlock: A situation in which two or more transactions wait for one another to give up
their locks.
Figure: resource R1 is assigned to P1, and P2 is waiting for R1.
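One common way to detect the mutual waiting described above is to look for a cycle in a wait-for graph. The sketch below assumes the graph is given as a dictionary from each transaction to the set of transactions it is waiting on; the function name has_deadlock is hypothetical.

```python
def has_deadlock(waits_for):
    """Return True if the wait-for graph contains a cycle (depth-first search)."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, fully explored
    colour = {}

    def visit(t):
        colour[t] = GREY
        for u in waits_for.get(t, ()):
            c = colour.get(u, WHITE)
            if c == GREY:          # back edge: u is already on the current path
                return True
            if c == WHITE and visit(u):
                return True
        colour[t] = BLACK
        return False

    return any(colour.get(t, WHITE) == WHITE and visit(t) for t in list(waits_for))

# T1 waits for T2 and T2 waits for T1: neither can proceed.
print(has_deadlock({"T1": {"T2"}, "T2": {"T1"}}))  # True
# T1 waits for T2, but T2 waits for nobody: no deadlock.
print(has_deadlock({"T1": {"T2"}, "T2": set()}))   # False
```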
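Two-Phase Locking (2PL): This protocol ensures serializability by dividing the execution
of a transaction into two distinct phases: a growing phase, in which the transaction
acquires locks and releases none, and a shrinking phase, in which it releases locks and
acquires none.
A schedule in which every transaction follows 2PL is serializable, which preserves consistency.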
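The two-phase rule can be checked mechanically: once a transaction has released any lock, it must not acquire another. The sketch below assumes a transaction's locking behaviour is given as a list of ("lock" or "unlock", item) pairs; the function name follows_2pl is hypothetical.

```python
def follows_2pl(actions):
    """Return True if no lock is acquired after the first unlock."""
    shrinking = False
    for kind, _item in actions:
        if kind == "unlock":
            shrinking = True          # the shrinking phase has begun
        elif kind == "lock" and shrinking:
            return False              # acquiring a lock after releasing one violates 2PL
    return True

# All locks are taken before any is released.
print(follows_2pl([("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]))  # True
# B is locked after A was released.
print(follows_2pl([("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]))  # False
```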
Concurrency control mechanisms
Two-Phase Locking (2PL)
Advantages :
1. It guarantees that the schedule of transactions will be serializable, meaning the
results of executing transactions concurrently will be the same as if they were
executed in some serial order.
2. By ensuring that transactions are serializable, 2PL helps maintain data integrity and
consistency, which is critical in environments where data accuracy is essential.
Disadvantages :
1. Deadlocks, starvation, and cascading rollbacks are still possible.
2. Transactions must wait for locks to be released by other transactions. This can lead
to increased waiting times and lower system throughput.
3. In case of a system failure, recovering from a crash can be complex.
Concurrency control mechanisms
Strict Two-Phase Locking (Strict 2PL): exclusive (write) locks are held until the
transaction commits or aborts.
Advantages:
Prevents cascading aborts
Ensures strict, serializable schedules
Disadvantages:
Since write locks are held until the end of the transaction, other transactions may be
blocked for extended periods
Transactions may experience longer wait times to acquire locks
Deadlocks and starvation are still possible
Concurrency control mechanisms
Rigorous Two-Phase Locking (Rigorous 2PL): all locks, shared and exclusive, are held
until the transaction commits or aborts.
Advantages:
Since all locks are held until the end of the transaction, the system can easily ensure
that transactions are serializable and can be recovered
Prevents cascading aborts and dirty reads
Disadvantages:
Performance bottlenecks
Increased transaction duration
Deadlocks and starvation are still possible
Concurrency control mechanisms
Conservative (Static) Two-Phase Locking: a transaction requests all the locks it needs
before it starts executing.
If the transaction is unable to acquire all the required locks (because some are already
held by other transactions), it waits and retries. The transaction only starts execution
once it has successfully acquired all the necessary locks.
Since a transaction never starts executing until it has all the locks it needs, deadlocks
cannot occur, because no transaction will ever hold some locks and wait for others.
In this scenario, deadlocks cannot occur because neither T1 nor T2 starts execution until
it has all the locks it needs.
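The all-or-nothing acquisition described above can be sketched as follows. This is a simplified illustration that assumes exclusive locks only and a plain dictionary as the lock table; the function names try_acquire_all and release_all are hypothetical, and a refused transaction is expected to retry later.

```python
def try_acquire_all(locked, txn, items):
    """Take every requested lock, or none of them if any is already held."""
    if any(item in locked for item in items):
        return False                 # hold nothing; the caller waits and retries
    for item in items:
        locked[item] = txn
    return True

def release_all(locked, txn):
    """Release every lock owned by txn (done when the transaction finishes)."""
    for item in [i for i, owner in locked.items() if owner == txn]:
        del locked[item]

locked = {}                                         # item -> owning transaction
print(try_acquire_all(locked, "T1", ["A", "B"]))    # True: T1 may start executing
print(try_acquire_all(locked, "T2", ["B", "C"]))    # False: B is held, so T2 takes nothing
print("C" in locked)                                # False: T2 is not holding C while it waits
release_all(locked, "T1")
print(try_acquire_all(locked, "T2", ["B", "C"]))    # True: T2 can now start
```

Because T2 never holds C while waiting for B, no cycle of waiting transactions can form.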
Concurrency control mechanisms
Write Timestamp (W_TS): The largest timestamp of any transaction that has
successfully written the data item.
1. Check the following conditions whenever a transaction Ti issues a Read(X) operation:
If W_TS(X) > TS(Ti), the operation is rejected (Ti is rolled back).
If W_TS(X) <= TS(Ti), the operation is executed, and R_TS(X) is set to
max(R_TS(X), TS(Ti)).
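The read rule above can be sketched in a few lines. The Item class and the read function are hypothetical names, timestamps are assumed to be plain integers, and only the read check from the text is covered.

```python
from dataclasses import dataclass

@dataclass
class Item:
    r_ts: int = 0   # R_TS(X): largest timestamp of any transaction that read X
    w_ts: int = 0   # W_TS(X): largest timestamp of any transaction that wrote X

def read(item, ts):
    """Apply the timestamp-ordering read rule for a transaction with timestamp ts."""
    if item.w_ts > ts:
        return False                     # a younger transaction already wrote X: reject, roll back
    item.r_ts = max(item.r_ts, ts)       # the read is allowed; record it
    return True

x = Item(r_ts=3, w_ts=5)
print(read(x, 4))   # False: W_TS(X) = 5 > TS(Ti) = 4
print(read(x, 7))   # True: W_TS(X) = 5 <= 7
print(x.r_ts)       # 7
```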
System Failure: Occurs when the entire system crashes due to hardware or
software failures, leading to loss of in-memory data.
Media Failure: Occurs when the physical storage (e.g., hard drives) is damaged,
resulting in data loss or corruption.
Database recovery management
Recovery Phases
Analysis Phase: Identifies the point of failure and the transactions that were
active at that time.
Recovery Techniques
Backup and Restore: Regular backups are taken to ensure data can be
restored. Full, incremental, and differential backups are common types.