Pdf1 Merged

Uploaded by

rakeshbishoyi28
Copyright
© © All Rights Reserved
DBMS (Database Management System)

Module 1

Database:-
-> It is an organized collection of related data.

DBMS:-
It is a collection of software programs used to create, modify and control the database for the
benefit of the user.

Advantages:-
• It reduces data redundancy.
• It allows sharing of data among multiple users.
• It maintains data integrity.
• It maintains data consistency.
• It provides better security for data.
• It provides database backup and recovery methods.
Disadvantages:-
• Problems associated with centralisation.
• Problems associated with database backup and recovery.
• Hardware and software costs are high.

Characteristics of DB approach:-

bit -> character -> field -> record -> file -> DB
1. Bit:-
. A bit is the smallest unit of data representation.
. The value of a bit may be 0 or 1.
. 8 bits = 1 byte.
2. Character:-
. It is a collection of bits.
. To store one character in memory, the computer takes 1 byte of memory space.
. 1 character = 1 byte = 8 bits.
. A character may be a symbol, a letter or a blank space.


3. Field:-
. A field is a collection of characters.
. Each column in a table is called a field or attribute.

4. Record:-
. The organized collection of related fields is called a record.
. Each row in a table is known as a record or tuple.

5. File:-
. The organized collection of related records is called a file.
6. Database:-
. The organized collection of related information is called a database.
Ex:- a product table of a database is given below:

Product code   Product Name   Product Price
001            TV             25,000
002            WM             50,000
003            AC             30,000

SCHEMA
. It is a logical description of the database and is drawn as a chart of the types of data that are used.
. It gives the names of the entities and attributes and specifies the relationships between them.
. It is a framework into which the values of the data items can be fitted.
Ex:- the logical representation of a Student schema
(in SQL, do not use spaces in names; write Student_name or studentName):
Student_name : string
Student_id : int
Student_age : int
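
The Student schema above can be tried out with a short sketch. This is only an illustration: it assumes Python's built-in sqlite3 module and an in-memory database, and the SQLite column types INTEGER and TEXT stand in for the "int" and "string" of the notes.

```python
import sqlite3

# In-memory database, used only for illustration.
conn = sqlite3.connect(":memory:")

# Column names contain no spaces, as the note on SQL names advises.
conn.execute(
    """
    CREATE TABLE Student (
        Student_id   INTEGER,
        Student_name TEXT,
        Student_age  INTEGER
    )
    """
)

# The schema (structure) exists even before any rows are stored.
# PRAGMA table_info returns one row per column: (cid, name, type, ...).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Student)")]
print(columns)
```

The schema describes only the structure; the values of the data items are fitted into it later with INSERT statements.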

. Logical Data and Physical Data:
Logical data refers to the way in which programmers see the data, and physical data refers to the way in which
the data are actually recorded on a storage medium such as a magnetic disk or hard disk.

3-Schema Architecture :
i. The architecture of a DBMS is divided into 3 levels --> External level, Conceptual level, and Internal
level.
ii. The purpose of the 3-Schema architecture is to separate the user applications from the physical database.

3-Schema architecture (diagram):

User1   User2   User3
          |
External Level or Schema     (User view)
          |
Conceptual Level or Schema   (Conceptual view)
          |
Internal Level or Schema     (Physical view; handled by the Database (DB) administrator)
          |
 DB      DB      DB

---> External level:-
An external level schema describes the part of the database that a particular user group interacts with,
thereby hiding the rest of the database from that user group.
---> Conceptual level:-
It describes the structure of the whole database for a group of users and hides the details of the physical
storage structure.
It describes entities, data types, relationships, data constraints and user operations.
It is defined by the DBA (Database Administrator).
---> Internal level:-
It describes the physical storage structure of the database.
It contains the definition of the stored records and the method of representing the data fields.

ER Diagram
. It describes the structure of a database with the help of a diagram.
. An ER diagram shows the logical structure of the database.
. An ER diagram consists of the following symbols :-
rectangle -> (for entity)
double rectangle -> (for weak entity)
ellipse -> (for attribute)
ellipse with underlined name -> (for key attribute)
dashed ellipse -> (for derived attribute)
diamond -> (for relationship)
line -> (for linking attribute to entity)
double ellipse -> (for multivalue attribute)
ellipse with further ellipses attached -> (for composite attribute)

1. Entity:-
It is an object which is used to represent a real-world thing.
Ex: table, chair, etc.
. Entity set:-
It is a collection of entities of a similar type.
Ex: "all Doctors", "all Staff", etc.
Entities are divided into two types:
i. Weak Entity ii. Strong Entity

i. Weak Entity:-
An entity that can't be uniquely identified by its own attributes and depends on its relationship with
another entity is called a Weak Entity.
Ex: a bank account can't be uniquely identified without knowing the bank to which the account belongs.

ii. Strong Entity :-
An entity that can be uniquely identified by its own attributes.
Ex: a student, identified by roll no. or registration no.
2. Attributes:-
Every entity has certain properties known as attributes.
Attributes are divided into four types:-
i. Key Attribute ii. Composite Attribute iii. Multivalue Attribute iv. Derived Attribute

i. Key :- It is an attribute that uniquely identifies an entity in an entity set.
Ex: Student roll number, Aadhaar number, etc.
ii. Composite :- It is an attribute which is a combination of two or more attributes.
iii. Multivalue :- An attribute which can have more than one value is called a multivalue attribute.
Ex: mobile number, etc.
iv. Derived :- An attribute which is derived from another attribute is called a derived attribute.
Ex: Age is derived from Date Of Birth (DOB).
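
As a small illustration of a derived attribute, the sketch below computes Age from DOB instead of storing it. The function name age_on and the sample dates are made up for this example.

```python
from datetime import date


def age_on(dob: date, today: date) -> int:
    """Derive the age in completed years from the stored date of birth."""
    years = today.year - dob.year
    # If this year's birthday has not happened yet, one year less is complete.
    if (today.month, today.day) < (dob.month, dob.day):
        years -= 1
    return years


# Hypothetical student born on 20 May 2000.
print(age_on(date(2000, 5, 20), date(2024, 5, 19)))  # day before the birthday
print(age_on(date(2000, 5, 20), date(2024, 5, 20)))  # on the birthday
```

Because the value is computed whenever it is needed, only DOB has to be stored and the age never goes out of date.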

3. Relationship:-
It is an association between two or more entities. In other words, a relationship shows how two or more
entities are related.
Relationships are divided into 4 types:-
i. One-to-One relationship ii. One-to-Many relationship
iii. Many-to-One relationship iv. Many-to-Many relationship
i. One-to-One :- One entity is associated with exactly one other entity.
Ex:- College --- has --- Principal
ii. One-to-Many :- One entity is associated with more than one entity.
Ex:- CUTM Chatrapur --- has --- Staff
iii. Many-to-One :- More than one entity is associated with one entity.
Ex:- Student --- studies in --- College
iv. Many-to-Many :- More than one entity is associated with more than one entity of another set.
Ex:- Teachers --- teach --- Students

Data Models in DBMS

A data model gives us an idea of how the final system will look after its complete implementation.
-> A data model in DBMS is a set of concepts and tools developed to summarize the description of the
database.
-> It defines how the logical structure of a database is modeled.
-> A data model is a collection of conceptual tools for describing :
• Data
• Data relationships
• Data semantics and
• Consistency constraints
-> It describes the design of a database at each level of data abstraction.
-> It defines how data items are connected to each other and how they are processed and stored inside the system.

Data models include the following types :

1. Relational Model
2. Hierarchical Model
3. Network Model

1. Relational Model :-
-> It is the model most widely used by commercial data processing applications.
-> It uses a collection of tables for representing data and the relationships among those data.
-> Data is stored in tables called relations.
-> Each table is a group of columns and rows, where the columns represent attributes of an entity and the rows
represent records (or tuples).
-> Attributes or Fields :- Each column in a relation is called an attribute. The values of an attribute should
be from the same domain.
Ex:- a student has different attributes such as Student_id, Student_name,
Student_age, etc.
-> Tuples or records :- Each row in the relation is called a tuple. A tuple is a collection of attribute values,
and no two rows in a relation are identical.
Ex:- each row holds all the information about one specific individual, e.g. the first row holds the
information about the student Ashish.
-> This model was initially described by Edgar F. Codd in 1969.
Diagrammatically:- Relational Model in DBMS

Course                 Duration   Type
Data Science           5 months   Cohort based
Full Stack             5 months   Cohort based
Software Development   6 months   1:1
Product Management     4 months   Cohort based

(Labels in the figure: Primary key, Tuples (rows), Attributes (columns).)
2. Hierarchical Model :-
-> It was the first DBMS model
-> In the hierarchical model, data is organized into a tree-like structure in which each record has one
parent record and may have many children.
-> The main drawback of this model is that it can represent only one-to-many relationships between nodes.
-> Hierarchical models are rarely used now.
Diagrammatically:- Hierarchical model in DBMS

Electronics
    Televisions
        Tube
        LCD
        Plasma
    Portable Electronics
        MP3 Players
            Flash
        CD Players
        2 way Radios
3. Network Model :-
-> This model is an extension of the hierarchical model. It was the most widely used database model before
the relational model was introduced.
-> The network model is the same as the hierarchical model except that it has a graph-like structure rather than a
tree-based structure, and a record is allowed to have more than one parent node.
-> It supports many-to-many data relationships.
Diagrammatically:- Network Model in DBMS

College -> CSE Department -> Student
College -> Library -> Student

(Student has two parent nodes: CSE Department and Library.)
Keys

• Keys play an important role in a relational database.
• A key is used to uniquely identify any record or row of data in a table.
• It is also used to establish and identify relationships between tables.
For example, ID is used as a key in the Student table because it is unique for each student. In the
PERSON table, passport_number, license_number and SSN are keys since they are unique for each person.

Types of Keys in DBMS(Database Management System)

There are mainly eight different types of keys in DBMS, and each key has its own functionality:

-> Super Key
-> Primary Key
-> Candidate Key
-> Alternate Key
-> Foreign Key
-> Compound Key
-> Composite Key
-> Surrogate Key
Let’s look at each of the keys in DBMS with an example :

• Super Key :- is a set of one or more attributes (columns) that uniquely identifies rows in a table.
• Primary Key :- is a column or a group of columns in a table that uniquely identifies every row in that
table.
• Candidate Key :- is a minimal set of attributes that uniquely identifies tuples in a table.
• Alternate Key :- is a candidate key (a column or group of columns that uniquely identifies every row in the
table) that was not chosen as the primary key.
• Foreign Key :- is a column that creates a relationship between two tables. The purpose of
foreign keys is to maintain data integrity and allow navigation between two different
instances of an entity.

What is the Super Key?

A Super Key is a set of one or more attributes that identifies each row in a table. A super key may have
additional attributes that are not needed for unique identification.

Example:

EmpSSN       EmpNum   Empname
9812345098   AB05     Shown
9876512345   AB06     Roslyn
199937890    AB07     James

In the above example, EmpSSN and EmpNum are superkeys.
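
A quick way to see what "uniquely identifies each row" means is to test a set of columns against the sample rows. The sketch below is illustrative only; is_superkey is a name made up here, and passing the test on three sample rows does not prove that a column set is a key for every possible row.

```python
employees = [
    {"EmpSSN": "9812345098", "EmpNum": "AB05", "Empname": "Shown"},
    {"EmpSSN": "9876512345", "EmpNum": "AB06", "Empname": "Roslyn"},
    {"EmpSSN": "199937890", "EmpNum": "AB07", "Empname": "James"},
]


def is_superkey(rows, columns):
    """Return True if no two rows share the same values in `columns`."""
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key in seen:
            return False  # two rows cannot be told apart by these columns
        seen.add(key)
    return True


print(is_superkey(employees, ["EmpSSN"]))
# A superkey may carry extra attributes that are not needed for uniqueness.
print(is_superkey(employees, ["EmpNum", "Empname"]))
```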

What is a Primary Key?

A Primary Key in DBMS is a column or group of columns in a table that uniquely identifies every row in that table.
A primary key value can't be duplicated, meaning the same value can't appear more than once in the table. A table
cannot have more than one primary key.

Rules for defining a Primary key :-

• Two rows can’t have the same primary key value.
• Every row must have a primary key value.
• The primary key field cannot be null.
• The value in a primary key column can never be modified or updated if any foreign key refers to that
primary key.

Example :-
In the following example, StudID is the Primary key.

StudID   Roll No.   First Name   LastName   Email
1        11         Tom          Price      [email protected]
2        12         Nick         Wright     [email protected]
3        13         Dana         Natan      [email protected]

What is an Alternate key?


An alternate key is a column or group of columns in a table that uniquely identifies every row in that table. A table
can have multiple choices for a primary key, but only one can be set as the primary key. All candidate keys which
are not chosen as the primary key are called alternate keys.

Example:
In this table, StudID, Roll No and Email are qualified to become a primary key. But since StudID is the primary
key, Roll No and Email become the alternate keys.

StudID   Roll No.   First Name   LastName   Email
1        11         Tom          Price      [email protected]
2        12         Nick         Wright     [email protected]
3        13         Dana         Natan      [email protected]

What is Candidate Key?

A candidate key in SQL is a set of attributes that uniquely identifies tuples in a table. A candidate key is a super key
with no redundant attributes. The primary key should be selected from the candidate keys. Every table must have
at least one candidate key. A table can have multiple candidate keys but only a single primary key.

Properties of a Candidate key:

• It must contain unique values.
• A candidate key in SQL may have multiple attributes.
• It must not contain null values.
• It should contain the minimum number of fields needed to ensure uniqueness.
• It uniquely identifies each record in a table.

Example:
In the given table StudID, Roll No, and Email are candidate keys which help us to uniquely identify the student
record in the table.

StudID   Roll No.   First Name   LastName   Email
1        11         Tom          Price      [email protected]
2        12         Nick         Wright     [email protected]
3        13         Dana         Natan      [email protected]

(Figure labels: StudID, Roll No and Email are the Candidate Keys; StudID is the Primary Key; Roll No and
Email are the Alternate Keys.)
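
The idea that a candidate key is a super key with the minimum number of fields can be sketched as a search for minimal unique column sets. Everything here is illustrative: the e-mail values are placeholders (the originals are not readable in the table), the fourth row is a hypothetical student added so that names repeat, and candidate_keys is a name invented for this example.

```python
from itertools import combinations

columns = ["StudID", "RollNo", "FirstName", "LastName", "Email"]
students = [
    dict(zip(columns, row))
    for row in [
        (1, 11, "Tom", "Price", "a@example.com"),
        (2, 12, "Nick", "Wright", "b@example.com"),
        (3, 13, "Dana", "Natan", "c@example.com"),
        (4, 14, "Tom", "Price", "d@example.com"),  # hypothetical extra row
    ]
]


def candidate_keys(rows, cols):
    """Return the minimal column sets whose values are unique in `rows`."""
    def unique(subset):
        return len({tuple(r[c] for c in subset) for r in rows}) == len(rows)

    keys = []
    # Try small sets first so that every key found is minimal.
    for size in range(1, len(cols) + 1):
        for subset in combinations(cols, size):
            if any(set(k) <= set(subset) for k in keys):
                continue  # contains a smaller key, so it is only a super key
            if unique(subset):
                keys.append(subset)
    return keys


keys = candidate_keys(students, columns)
primary_key = keys[0]      # one candidate key is chosen as the primary key
alternate_keys = keys[1:]  # the remaining candidate keys are alternate keys
print(primary_key, alternate_keys)
```

In practice, keys are decided from the meaning of the data rather than from a sample, so a check like this can only rule column sets out.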

What is the Foreign Key?

A Foreign Key is a column that creates a relationship between two tables. The purpose of foreign keys is to
maintain data integrity and allow navigation between two different instances of an entity. It acts as a
cross-reference between two tables as it references the primary key of another table.

Example:

DeptCode   DeptName
001        Science
002        English
003        Computer

Teacher ID   Fname   Lname
B002         David   Warner
B017         Sara    Joseph
B009         Mike    Brunton

In this example, we have two tables, Teacher and Department, in a school. However, there is no way
to see which teacher works in which department.

By adding DeptCode as a foreign key to the Teacher table, we can create a relationship between the
two tables.

Teacher ID   DeptCode   Fname   Lname
B002         002        David   Warner
B017         002        Sara    Joseph
B009         001        Mike    Brunton

This concept is also known as Referential Integrity.
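
A minimal sketch of referential integrity, assuming Python's built-in sqlite3 module and an in-memory database. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON; the teacher "B099" and department code "999" are made up to show a rejected row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

conn.execute("CREATE TABLE Department (DeptCode TEXT PRIMARY KEY, DeptName TEXT)")
conn.execute(
    """
    CREATE TABLE Teacher (
        TeacherID TEXT PRIMARY KEY,
        DeptCode  TEXT REFERENCES Department(DeptCode),
        Fname     TEXT,
        Lname     TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO Department VALUES (?, ?)",
    [("001", "Science"), ("002", "English"), ("003", "Computer")],
)
conn.executemany(
    "INSERT INTO Teacher VALUES (?, ?, ?, ?)",
    [
        ("B002", "002", "David", "Warner"),
        ("B017", "002", "Sara", "Joseph"),
        ("B009", "001", "Mike", "Brunton"),
    ],
)

# A teacher pointing at a department that does not exist is rejected.
try:
    conn.execute("INSERT INTO Teacher VALUES ('B099', '999', 'Test', 'Teacher')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The foreign key lets us navigate from a teacher to the department name.
works_in = dict(
    conn.execute(
        "SELECT t.Fname, d.DeptName FROM Teacher t "
        "JOIN Department d ON d.DeptCode = t.DeptCode"
    )
)
print(rejected, works_in)
```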

Normalization

Normalization in DBMS is a technique using which you can organize the data in the database tables so that :
• There is less repetition of data.
• A large set of data is structured into a bunch of smaller tables.
• And the tables have a proper relationship between them.
DBMS Normalization is a systematic approach to decompose (break down) tables to eliminate data
redundancy(repetition) and undesirable characteristics like Insertion anomaly in DBMS, Update anomaly in
DBMS, and Delete anomaly in DBMS.
It is a multi-step process that puts data into tabular form, removes duplicate data, and set up the relationship
between tables.
Why do we need Normalization in DBMS?
Normalization is required for,
•Eliminating redundant (repeated) data, which protects data integrity, because if data is repeated it
increases the chances of inconsistent data.
•Keeping data consistent by storing each piece of data in one table and referencing it
everywhere else.
•Storage optimization, although that is less of an issue these days because database storage is cheap.
•Breaking down large tables into smaller tables with relationships, so it makes the database structure
more scalable and adaptable.
•Ensuring data dependencies make sense i.e. data is logically stored.
Problems without Normalization in DBMS :
If a table is not properly normalized and has data redundancy(repetition) then it will not only eat up extra
memory space but will also make it difficult for you to handle and update the data in the database, without
losing data.
Insertion, Updation, and Deletion Anomalies are very frequent if the database is not normalized.
To understand these anomalies let us take an example of a Student table.
rollno name branch hod office_tel
401 Akon CSE Mr. X 53337
402 Bkon CSE Mr. X 53337
403 Ckon CSE Mr. X 53337
404 Dkon CSE Mr. X 53337

In the table above, we have data for four Computer Sci. students.
As we can see, data for the fields branch, hod(Head of Department), and office_tel are repeated for the
students who are in the same branch in the college, this is Data Redundancy.
1. Insertion Anomaly in DBMS
•Suppose for a new admission, until and unless a student opts for a branch, data of the student cannot be
inserted, or else we will have to set the branch information as NULL.
•Also, if we have to insert data for 100 students of the same branch, then the branch information will be
repeated for all those 100 students.
•These scenarios are nothing but Insertion anomalies.
•If you have to repeat the same data in every row of data, it's better to keep the data separately and
reference that data in each row.
•So in the above table, we can keep the branch information separately, and just use the branch_id in the
student table, where branch_id can be used to get the branch information.
2. Updation Anomaly in DBMS
•What if Mr. X leaves the college? or Mr. X is no longer the HOD of the computer science department?
In that case, all the student records will have to be updated, and if by mistake we miss any record, it will
lead to data inconsistency.
•This is an Updation anomaly because you need to update all the records in your table just because one
piece of information got changed.
3. Deletion Anomaly in DBMS
•In our Student table, two different pieces of information are kept together, the Student information
and the Branch information.
•So if only a single student is enrolled in a branch, and that student leaves the college, or for some
reason, the entry for the student is deleted, we will lose the branch information too.
•So in DBMS we should never keep two different entities together in one table, which in the above example are
Student and Branch.
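
The fix suggested above (keep the branch information separately and store only a branch_id with each student) can be sketched as follows. This assumes Python's built-in sqlite3 module; the table layout and the new HOD name "Mr. Y" are illustrative choices, not part of the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Branch (branch_id INTEGER PRIMARY KEY, branch TEXT, hod TEXT, office_tel TEXT)"
)
conn.execute(
    "CREATE TABLE Student (rollno INTEGER PRIMARY KEY, name TEXT, "
    "branch_id INTEGER REFERENCES Branch(branch_id))"
)

# Branch details are stored once, not once per student.
conn.execute("INSERT INTO Branch VALUES (1, 'CSE', 'Mr. X', '53337')")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, 1)",
    [(401, "Akon"), (402, "Bkon"), (403, "Ckon"), (404, "Dkon")],
)

# Update anomaly avoided: a change of HOD touches exactly one row.
updated = conn.execute("UPDATE Branch SET hod = 'Mr. Y' WHERE branch = 'CSE'").rowcount
hods = [
    row[0]
    for row in conn.execute(
        "SELECT b.hod FROM Student s JOIN Branch b ON b.branch_id = s.branch_id"
    )
]

# Deletion anomaly avoided: removing every student keeps the branch record.
conn.execute("DELETE FROM Student")
branches_left = conn.execute("SELECT COUNT(*) FROM Branch").fetchone()[0]
print(updated, hods, branches_left)
```

A new student can also be inserted with branch_id left NULL until a branch is chosen, without inventing branch details.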
Primary Key and Non-key attributes :
Before we move on to learn the different Normal Forms in DBMS, let's first understand what a primary key is and
what non-key attributes are.
In the Student table above, the rollno column is the primary key because using the rollno value
we can uniquely identify each row of data; the remaining columns then become the non-key attributes.
Types of DBMS Normal forms
Normalization rules are divided into the following normal forms:
1.First Normal Form
2.Second Normal Form
3.Third Normal Form
4.BCNF
5.Fourth Normal Form
6.Fifth Normal Form
Let's cover all the Database Normal forms one by one with some basic examples to help you understand the
DBMS normal forms.
1. First Normal Form (1NF)
For a table to be in the First Normal Form, it should follow the following 4 rules:
1.It should only have single(atomic) valued attributes/columns.
2.Values stored in a column should be of the same domain.
3.All the columns in a table should have unique names.
4.And the order in which data is stored should not matter.
Let's see an example.
If we have an Employee table in which we store the employee information along with the employee skillset, the
table will look like this:
emp_id emp_name emp_mobile emp_skills

1 John Tick 9999957773 Python, JavaScript

2 Darth Trader 8888853337 HTML, CSS, JavaScript

3 Rony Shark 7777720008 Java, Linux, C++


The above table has 4 columns:
•All the columns have different names.
•All the columns hold values of the same type like emp_name has all the names, emp_mobile has all
the contact numbers, etc.
•The order in which we save data doesn't matter
•But the emp_skills column holds multiple comma-separated values, while as per the First Normal
form, each column should have a single value.
Hence the above table fails to pass the First Normal form.
So how do you fix the above table? There are two ways to do this:
1.Remove the emp_skills column from the Employee table and keep it in some other table.
2.Or add multiple rows for the employee and each row is linked with one skill.
1. Create Separate tables for Employee and Employee Skills
So the Employee table will look like this,
emp_id emp_name emp_mobile

1 John Tick 9999957773

2 Darth Trader 8888853337

3 Rony Shark 7777720008

And the new Employee_Skill table:

emp_id emp_skill

1 Python

1 JavaScript

2 HTML

2 CSS

2 JavaScript

3 Java

3 Linux

3 C++

2. Add Multiple rows for Multiple skills


You can also simply add multiple rows to add multiple skills. This will lead to repetition of the data, but that can
be handled as you further Normalize your data using the Second Normal form and the Third Normal form.

emp_id emp_name emp_mobile emp_skill

1 John Tick 9999957773 Python

1 John Tick 9999957773 JavaScript

2 Darth Trader 8888853337 HTML

2 Darth Trader 8888853337 CSS

2 Darth Trader 8888853337 JavaScript

3 Rony Shark 7777720008 Java



3 Rony Shark 7777720008 Linux

3 Rony Shark 7777720008 C++


2. Second Normal Form (2NF)
For a table to be in the Second Normal Form,
1.It should be in the First Normal form.
2.And, it should not have Partial Dependency.
Let's take an example to understand Partial dependency and the Second Normal Form.
What is Partial Dependency?
When a table has a primary key that is made up of two or more columns, then all the columns(not included in
the primary key) in that table should depend on the entire primary key and not on a part of it. If any
column(which is not in the primary key) depends on a part of the primary key then we say we have Partial
dependency in the table.
Confused? Let's take an example.
If we have two tables Students and Subjects, to store student information and information related to subjects.
Student table:
student_id student_name branch

1 Akon CSE

2 Bkon Mechanical

Subject Table:
subject_id subject_name

1 C Language

2 DSA

3 Operating System

And we have another table Score to store the marks scored by students in any subject like this,
student_id   subject_id   marks   teacher_name
1            1            70      Miss. C
1            2            82      Mr. D
2            1            65      Miss. C
Now in the above table, the primary key is student_id + subject_id, because both values are
required to select any row of data.
But in the Score table, we have a column teacher_name, which depends on the subject information or just the
subject_id, so we should not keep that information in the Score table.
The column teacher_name should be in the Subjects table. And then the entire system will be Normalized as
per the Second Normal Form.
Updated Subject table:
subject_id subject_name teacher_name

1 C Language Miss. C

2 DSA Mr. D

3 Operating System Mr. Op


Updated Score table:
student_id subject_id marks

1 1 70

1 2 82

2 1 65

3. Third Normal Form (3NF)


A table is said to be in the Third Normal Form when,
1.It satisfies the First Normal Form and the Second Normal form.
2.And, it doesn't have Transitive Dependency.
What is Transitive Dependency?
In a table we have some column that acts as the primary key, and the other columns depend on this column. But
what if a column that is not part of the primary key depends on another column that is also not part of the
primary key? Then we have a Transitive dependency in our table.
Let's take an example. We had the Score table in the Second Normal Form above. If we have to store some
extra information in it, like,
1.exam_type
2.total_marks
To store the type of exam and the total marks in the exam so that we can later calculate the percentage of marks
scored by each student.
The Score table will look like this,
student_id subject_id marks exam_type total_marks

1 1 70 Theory 100

1 2 82 Theory 100

2 1 42 Practical 50
•In the table above, the column exam_type depends on both student_id and subject_id, because a
student can be in the CSE branch or the Mechanical branch, and based on that they may have different
exam types for different subjects.
•The CSE students may have both Practical and Theory exams for Compiler Design, whereas Mechanical
branch students may only have Theory exams for Compiler Design.
•But the column total_marks depends only on the exam_type column, and exam_type is not a part of
the primary key (the primary key is student_id + subject_id). Hence we have a Transitive dependency here.
How to remove Transitive Dependency?
You can create a separate table for ExamType and use it in the Score table.
New ExamType table,
exam_type_id exam_type total_marks duration

1 Practical 50 45

2 Theory 100 180

3 Workshop 150 300


We have created a new table ExamType and we have added more related information in it like
duration(duration of exam in mins.), and now we can use the exam_type_id in the Score table.
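
A short sketch of how the Score table can use exam_type_id once ExamType is a separate table. The exam_type_id values attached to the three Score rows follow from their exam types in the earlier table (Theory = 2, Practical = 1); the variable names are chosen for this example.

```python
# ExamType table: exam_type_id -> (exam_type, total_marks, duration in mins.)
exam_type = {
    1: ("Practical", 50, 45),
    2: ("Theory", 100, 180),
    3: ("Workshop", 150, 300),
}

# Score table now stores only exam_type_id instead of exam_type + total_marks.
# Columns: student_id, subject_id, marks, exam_type_id
score = [
    (1, 1, 70, 2),
    (1, 2, 82, 2),
    (2, 1, 42, 1),
]

# total_marks is looked up through exam_type_id when the percentage is needed.
percentages = []
for student_id, subject_id, marks, type_id in score:
    _, total_marks, _ = exam_type[type_id]
    percentages.append((student_id, subject_id, marks / total_marks * 100))

print(percentages)
```

If the total marks of the Practical exam change, only one row of ExamType has to be edited.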

4. Boyce-Codd Normal Form (BCNF)


•Boyce-Codd Normal Form is a stricter version of the Third Normal Form.
•This form deals with a certain type of anomaly that is not handled by 3NF.
•A 3NF table that does not have multiple overlapping candidate keys is said to be in BCNF.
•For a table R to be in BCNF, the following conditions must be satisfied:
R must be in the Third Normal Form and, for each non-trivial functional dependency ( X → Y ), X should be
a Super Key.
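
The condition "X should be a Super Key" can be tested with the attribute closure of X: X is a super key when its closure contains every attribute of the table. The sketch below is a simplified illustration that checks only the functional dependencies it is given; closure and bcnf_violations are names invented here, and the two dependencies are the ones discussed for the Score and Subject tables in the 2NF example.

```python
def closure(attrs, fds):
    """All attributes that can be derived from `attrs` using the FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Non-trivial FDs X -> Y whose left side X is not a super key."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(all_attrs)
    ]


# Score table before decomposition.
score_attrs = ["student_id", "subject_id", "marks", "teacher_name"]
score_fds = [
    (("student_id", "subject_id"), ("marks",)),
    (("subject_id",), ("teacher_name",)),
]
print(bcnf_violations(score_attrs, score_fds))

# Subject table after teacher_name has been moved into it.
subject_attrs = ["subject_id", "subject_name", "teacher_name"]
subject_fds = [(("subject_id",), ("subject_name", "teacher_name"))]
print(bcnf_violations(subject_attrs, subject_fds))
```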
5. Fourth Normal Form (4NF)
A table is said to be in the Fourth Normal Form when,
1.It is in the Boyce-Codd Normal Form.
2.And, it doesn't have Multi-Valued Dependency.
Transaction Management in DBMS
A transaction is a set of logically related operations. For example, you are transferring money
from your bank account to your friend’s account, the set of operations would be like this:

Simple Transaction Example


1. Read your account balance
2. Deduct the amount from your balance
3. Write the remaining balance to your account
4. Read your friend’s account balance
5. Add the amount to his account balance
6. Write the new updated balance to his account
This whole set of operations can be called a transaction. Although only read, write
and update operations are shown in the above example, a transaction can have operations like read,
write, insert, update and delete.

In DBMS, we write the above 6-step transaction like this:

Let's say your account is A and your friend’s account is B, and you are transferring 10000 from A
to B. The steps of the transaction are:
1. R(A);
2. A = A - 10000;
3. W(A);
4. R(B);
5. B = B + 10000;
6. W(B);
In the above transaction R refers to the Read operation and W refers to the write operation.

Transaction failure in between the operations

Now that we understand what a transaction is, we should understand the problems
associated with it.

The main problem that can happen during a transaction is that the transaction can fail before
finishing all the operations in the set. This can happen due to a power failure, a system crash,
etc. This is a serious problem that can leave the database in an inconsistent state. Assume that the
transaction fails after the third operation (see the example above); then the amount would be
deducted from your account but your friend would not receive it.

To solve this problem, we have the following two operations

Commit: If all the operations in a transaction are completed successfully then commit those
changes to the database permanently.

Rollback: If any of the operation fails then rollback all the changes done by previous
operations.
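
Commit and rollback can be tried out with the transfer from A to B. This is a minimal sketch assuming Python's built-in sqlite3 module and an in-memory database; the opening balances and the fail_midway switch (which simulates a crash after W(A)) are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
# Hypothetical opening balances.
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 50000), ("B", 20000)])
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst as one transaction."""
    try:
        # R(A); A = A - amount; W(A)
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        if fail_midway:
            raise RuntimeError("simulated failure after W(A)")
        # R(B); B = B + amount; W(B)
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        conn.commit()    # all operations succeeded: make the changes permanent
        return True
    except RuntimeError:
        conn.rollback()  # undo the changes made by the earlier operations
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


failed = transfer(conn, "A", "B", 10000, fail_midway=True)
after_failure = balances(conn)   # unchanged, thanks to the rollback
succeeded = transfer(conn, "A", "B", 10000)
after_success = balances(conn)
print(failed, after_failure, succeeded, after_success)
```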
Even though these operations help us avoid several issues that may arise during a
transaction, they are not sufficient when two transactions are running concurrently. To
handle those problems we need to understand the database ACID properties.

ACID Properties in DBMS


A transaction is a single logical unit of work which accesses and possibly modifies the
contents of a database. Transactions access data using read and write operations.
In order to maintain consistency in a database, before and after the transaction, certain
properties are followed. These are called ACID properties.

Atomicity
By this, we mean that either the entire transaction takes place at once or doesn’t happen at all.
There is no midway i.e. transactions do not occur partially. Each transaction is considered as
one unit and either runs to completion or is not executed at all. It involves the following two
operations.
—Abort: If a transaction aborts, changes made to database are not visible.
—Commit: If a transaction commits, changes made are visible.
Atomicity is also known as the ‘All or nothing rule’.

Consider the following transaction T consisting of two parts, T1 and T2: transfer of 100 from
account X to account Y. T1 deducts 100 from X and ends with write(X); T2 adds 100 to Y and ends
with write(Y).
If the transaction fails after completion of T1 but before completion of T2 (say,
after write(X) but before write(Y)), then the amount has been deducted from X but not added
to Y. This results in an inconsistent database state. Therefore, the transaction must be executed
in its entirety in order to ensure correctness of the database state.

Consistency
This means that integrity constraints must be maintained so that the database is consistent
before and after the transaction. It refers to the correctness of a database. Referring to the
example above, The total amount before and after the transaction must be maintained.

Total before T occurs = 500 + 200 = 700.
Total after T occurs = 400 + 300 = 700.
Therefore, the database is consistent. Inconsistency occurs in case T1 completes but T2 fails. As
a result, T is incomplete.

Isolation
This property ensures that multiple transactions can occur concurrently without leading to an
inconsistent database state. Transactions occur independently without interference.
Changes made in a particular transaction are not visible to any other transaction until that
transaction has been committed. This property ensures that executing transactions concurrently
results in a state that is equivalent to a state achieved if they were executed serially in
some order.
Let X = 500, Y = 500.

Consider two transactions T and T’’. The figures below imply that T multiplies X by 100 and
subtracts 50 from Y, while T’’ reads X and Y and computes their sum.

Suppose T has been executed till Read(Y) and then T’’ starts. As a result, interleaving of
operations takes place, due to which T’’ reads the correct value of X but an incorrect value of
Y, and the sum computed by

T’’: (X + Y = 50,000 + 500 = 50,500)

is thus not consistent with the sum at the end of transaction

T: (X + Y = 50,000 + 450 = 50,450).

This results in database inconsistency, due to a loss of 50 units. Hence, transactions must take
place in isolation, and changes should be visible only after they have been committed.
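
The interleaving can be replayed in a few lines of plain Python. This is a sketch under an assumption: the operations of T are not listed in the text, so the code assumes T multiplies X by 100 and subtracts 50 from Y, which is what the figures 50,000 and 450 imply. The function names are illustrative only.

```python
def interleaved():
    """T'' runs after T has written X and read Y, but before T writes Y."""
    db = {"X": 500, "Y": 500}
    db["X"] = db["X"] * 100              # T: read X, X := X * 100, write X
    y_read_by_t = db["Y"]                # T: read Y (new value not written yet)
    sum_seen_by_t2 = db["X"] + db["Y"]   # T'': sees new X but old Y
    db["Y"] = y_read_by_t - 50           # T: Y := Y - 50, write Y
    return sum_seen_by_t2, db["X"] + db["Y"]

def serial():
    """T'' starts only after T has finished."""
    db = {"X": 500, "Y": 500}
    db["X"] = db["X"] * 100
    db["Y"] = db["Y"] - 50
    sum_seen_by_t2 = db["X"] + db["Y"]
    return sum_seen_by_t2, db["X"] + db["Y"]

interleaved_result = interleaved()   # (50500, 50450): T'' is off by 50
serial_result = serial()             # (50450, 50450): both agree
```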

Durability
This property ensures that once the transaction has completed execution, the updates and
modifications to the database are written to disk and persist even if a system failure
occurs. These updates become permanent and are stored in non-volatile memory.
The effects of the transaction are thus never lost.

The ACID properties, in totality, provide a mechanism to ensure the correctness and consistency
of a database, such that each transaction is a group of operations that acts as a single unit,
produces consistent results, acts in isolation from other operations, and makes updates that
are durably stored.

DBMS Transaction States


A transaction in DBMS can be in one of the following states.

[Figure: DBMS transaction states diagram]

Let's discuss these states one by one.

Active State
As discussed in the introduction to transactions, a transaction is a sequence of operations.
If a transaction is in execution, it is said to be in the active state. It doesn’t matter
which step is in execution; as long as the transaction is executing, it remains in the active state.
Failed State
If a transaction is executing and a failure occurs, either a hardware failure or a software failure
then the transaction goes into failed state from the active state.
Partially Committed State
A transaction consists of a number of read and write operations. Once its final operation has
executed, the transaction moves from the active state to the “partially committed” state, as
shown in the diagram above. At this point all the read and write operations have been performed
on main memory (a local buffer) rather than on the actual database.
This state exists because a transaction can fail during execution. If changes were made
directly to the actual database instead of local memory, a failure could leave the database in
an inconsistent state. Keeping the changes in local memory makes it possible to roll them back
if a failure occurs.
Committed State
If a transaction completes the execution successfully then all the changes made in the local
memory during partially committed state are permanently stored in the database. You can
also see in the above diagram that a transaction goes from partially committed state to
committed state when everything is successful.
Aborted State
As we have seen above, if a transaction fails during execution, it goes into the failed state.
The changes made in local memory (or the buffer) are rolled back to the previous consistent
state, and the transaction moves from the failed state to the aborted state. Refer to the
diagram to see the interaction between the failed and aborted states.
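
The states and the moves between them can be written down as a small lookup table. The sketch below is one possible encoding, not a DBMS API: the names `State`, `TRANSITIONS` and `can_move` are made up for illustration, and the move from partially committed to failed is an assumption taken from the usual textbook diagram rather than from the text above.

```python
from enum import Enum

class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    COMMITTED = "committed"
    FAILED = "failed"
    ABORTED = "aborted"

# Allowed moves between states. COMMITTED and ABORTED are terminal.
TRANSITIONS = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    # Assumption: a failure can still occur before the changes reach the database.
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},
    State.COMMITTED: set(),
    State.ABORTED: set(),
}

def can_move(current, target):
    """Return True if a transaction may go directly from current to target."""
    return target in TRANSITIONS[current]

def is_valid_path(path):
    """Check that every consecutive pair of states in path is an allowed move."""
    return all(can_move(a, b) for a, b in zip(path, path[1:]))

success_path = [State.ACTIVE, State.PARTIALLY_COMMITTED, State.COMMITTED]
failure_path = [State.ACTIVE, State.FAILED, State.ABORTED]
```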

Types of Schedules in DBMS


A schedule, as the name suggests, is an ordering of the operations of a set of transactions
for execution. When multiple transactions run concurrently, the order of their operations
must be fixed so that the operations do not interfere with each other. Scheduling is used
for this purpose, and the operations of the transactions are ordered accordingly.

1. Serial Schedules:

Schedules in which the transactions are executed non-interleaved are called serial
schedules. In a serial schedule, a transaction is executed completely before the
execution of another transaction starts. In other words, no transaction starts until
the currently running transaction has finished. This type of execution is also known
as non-interleaved execution.

Example: Consider the following schedule involving two transactions T1 and T2.

T1 T2

R(A)

W(A)

R(B)

W(B)

R(A)

R(B)

where R(A) denotes that a read operation is performed on some data item ‘A’.
This is a serial schedule since the transactions execute serially in the order T1 —> T2.

2. Non-Serial Schedule:

In this type of schedule the operations of multiple transactions are interleaved, which can
give rise to concurrency problems. Unlike a serial schedule, where one transaction must wait
for another to complete all its operations, in a non-serial schedule a transaction proceeds
without waiting for the previous transaction to complete. A non-serial schedule therefore
provides the benefit of concurrent execution, but its end result is correct only if it is the
same as that of some serial schedule.

The Non-Serial Schedule can be divided further into Serializable and Non-Serializable.

Serializable:
This is used to maintain the consistency of the database. It is mainly used with non-serial
schedules to verify whether the schedule will lead to any inconsistency. A serial schedule,
on the other hand, does not need a serializability check because a transaction starts only
when the previous transaction is complete. A non-serial schedule of n transactions is said to
be serializable only when it is equivalent to some serial schedule of the same n transactions.
Since concurrency is allowed in this case, multiple transactions can execute concurrently.
These are of two types:

1. Conflict Serializable: A schedule is called conflict serializable if it can be transformed
into a serial schedule by swapping non-conflicting operations.
2. View Serializable: A schedule is called view serializable if it is view equivalent to a
serial schedule (one with no overlapping transactions). Every conflict serializable schedule
is view serializable, but a view serializable schedule that contains blind writes may not be
conflict serializable.
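
Conflict serializability is usually tested with a precedence graph rather than by trying swaps directly: draw an edge Ti -> Tj whenever an operation of Ti conflicts with a later operation of Tj, and the schedule is conflict serializable exactly when the graph has no cycle. The sketch below uses that standard test. The tuple layout `(transaction, "R" or "W", item)` and the two sample schedules are assumptions made for the example.

```python
from itertools import combinations

def conflicts(a, b):
    """Operations conflict if they belong to different transactions,
    access the same item, and at least one of them is a write."""
    (txn_a, kind_a, item_a), (txn_b, kind_b, item_b) = a, b
    return txn_a != txn_b and item_a == item_b and "W" in (kind_a, kind_b)

def precedence_graph(schedule):
    """Edge (Ti, Tj) means an operation of Ti conflicts with a later one of Tj."""
    edges = set()
    for i, j in combinations(range(len(schedule)), 2):  # always i < j
        if conflicts(schedule[i], schedule[j]):
            edges.add((schedule[i][0], schedule[j][0]))
    return edges

def is_conflict_serializable(schedule):
    edges = precedence_graph(schedule)
    nodes = {txn for txn, _, _ in schedule}
    # Repeatedly remove transactions with no incoming edge from remaining ones.
    while nodes:
        free = {n for n in nodes
                if not any(v == n and u in nodes for u, v in edges)}
        if not free:
            return False  # every remaining node has a predecessor: a cycle
        nodes -= free
    return True

# Hypothetical schedules for illustration.
good = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
bad = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]
```

Here `good` yields only the edge T1 -> T2, while `bad` yields both T1 -> T2 and T2 -> T1, which forms a cycle.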

Non-Serializable:
The non-serializable schedule is divided into two types, Recoverable and Non-recoverable
Schedule.

Recoverable Schedule:

Schedules in which transactions commit only after all transactions whose changes they read
commit are called recoverable schedules. In other words, if some transaction Tj is reading value
updated or written by some other transaction Ti, then the commit of Tj must occur after the
commit of Ti.

Example – Consider the following schedule involving two transactions T1 and T2.

T1 T2

R(A)

W(A)

W(A)

R(A)

Commit

Commit

This is a recoverable schedule since T1 commits before T2, which makes the value read by
T2 correct.

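Non-Recoverable Schedule:

A schedule is non-recoverable if a transaction commits after reading a value written by
another transaction that has not yet committed and that later aborts.

Example: Consider the following schedule involving two transactions T1 and T2.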

T1 T2

R(A)

W(A)

W(A)

R(A)

Commit

Abort
T2 read the value of A written by T1, and committed. T1 later aborted, therefore the value
read by T2 is wrong, but since T2 committed, this schedule is non-recoverable.
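
The recoverability rule (Tj may commit only after every Ti it read from has committed) can be checked mechanically. The sketch below is a simplified illustration: the tuple layout `(transaction, action, item)` with actions R, W, C (commit) and A (abort) is an assumption, and the checker does not restore earlier values when a writer aborts.

```python
def is_recoverable(schedule):
    """Return False if some transaction commits before a transaction
    whose write it has read."""
    last_writer = {}   # item -> transaction that wrote it most recently
    reads_from = {}    # transaction -> set of transactions it read from
    committed = set()
    for txn, action, item in schedule:
        if action == "W":
            last_writer[item] = txn
        elif action == "R":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.setdefault(txn, set()).add(writer)
        elif action == "C":
            # Every transaction this one read from must already be committed.
            if not reads_from.get(txn, set()) <= committed:
                return False
            committed.add(txn)
    return True

# T2 reads A written by T1; T1 commits first, then T2.
recoverable = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"),
               ("T1", "C", None), ("T2", "C", None)]
# T2 reads A written by T1 and commits; T1 aborts afterwards.
non_recoverable = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"),
                   ("T2", "C", None), ("T1", "A", None)]
```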
SQL Commands
o SQL commands are instructions used to communicate with the database.
They are also used to perform specific tasks, functions, and queries on data.

o SQL can perform various tasks like creating a table, adding data to tables, dropping
a table, modifying a table, and setting permissions for users.

Types of SQL Commands


There are five types of SQL commands: DDL, DML, DCL, TCL, and DQL.

1. Data Definition Language (DDL)


o DDL changes the structure of the table, like creating a table, deleting a table,
altering a table, etc.
o All DDL commands are auto-committed, which means they permanently save
all the changes in the database.

Here are some commands that come under DDL:

o CREATE

o ALTER

o DROP

o TRUNCATE

a. CREATE: It is used to create a new table in the database.

Syntax:

CREATE TABLE TABLE_NAME
(
COLUMN_NAME1 DATATYPE(size),
COLUMN_NAME2 DATATYPE(size),
--------------
COLUMN_NAMEN DATATYPE(size)
);

Example:

CREATE TABLE EMP
(
EMPNo VARCHAR2(20),
EName VARCHAR2(20),
Job VARCHAR2(20),
DOB DATE
);
b. DROP: This statement is used to drop an existing database. When you use this statement,
the complete information present in the database will be lost.

Syntax
DROP DATABASE DatabaseName;
Example
DROP DATABASE Employee;
The ‘DROP TABLE’ Statement
This statement is used to drop an existing table. When you use this statement, the complete
information present in the table will be lost.
Syntax
DROP TABLE TableName;
Example
DROP TABLE Emp;
c. ALTER
This command is used to delete, modify or add constraints or columns in an existing
table.
The ‘ALTER TABLE’ Statement
This statement is used to add, delete, modify columns in an existing table.
The ‘ALTER TABLE’ Statement with ADD/DROP COLUMN
You can use the ALTER TABLE statement with ADD/DROP Column command
according to your need. If you wish to add a column, then you will use the ADD
command, and if you wish to delete a column, then you will use the DROP COLUMN
command.
Syntax
ALTER TABLE TableName ADD ColumnName Datatype;

ALTER TABLE TableName DROP COLUMN ColumnName;

Example
--ADD Column MobNo:
ALTER TABLE Emp ADD MobNo Number(10);
--DROP Column MobNo:
ALTER TABLE Emp DROP COLUMN MobNo ;

The ‘ALTER TABLE’ Statement with ALTER/MODIFY COLUMN

This statement is used to change the datatype of an existing column in a table. Oracle and
MySQL use the MODIFY keyword, while SQL Server uses ALTER COLUMN.
Syntax
ALTER TABLE TableName MODIFY ColumnName Datatype;
Example
--Change the data type of the column DOB to DATE:
ALTER TABLE Emp MODIFY DOB DATE;
d. TRUNCATE
This command is used to delete the information present in the table but does not delete
the table. So, once you use this command, your information will be lost, but not the
table.

Syntax:

TRUNCATE TABLE table_name;

Example:

TRUNCATE TABLE EMPLOYEE;
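
The DDL commands can be tried out with Python's built-in sqlite3 module. This is a rough sketch with SQLite as a stand-in: SQLite accepts the Oracle-style type names as written, it has no TRUNCATE command (DELETE FROM without a WHERE clause plays that role), and the helper names `column_names` and `table_names` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def column_names(table):
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

def table_names():
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return [name for (name,) in rows]

# CREATE: define the table structure.
conn.execute("""
    CREATE TABLE EMP (
        EMPNo VARCHAR2(20),
        EName VARCHAR2(20),
        Job   VARCHAR2(20),
        DOB   DATE
    )
""")
cols_after_create = column_names("EMP")

# ALTER: add a column to the existing table.
conn.execute("ALTER TABLE EMP ADD COLUMN MobNo NUMBER(10)")
cols_after_alter = column_names("EMP")

# DROP: remove the table together with its data.
conn.execute("DROP TABLE EMP")
tables_after_drop = table_names()
```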

2. Data Manipulation Language


o DML commands are used to modify the data in the database. They are responsible for all
forms of changes to the data.

o DML commands are not auto-committed, which means their changes are not saved
permanently until they are committed. They can be rolled back.

Here are some commands that come under DML:

o INSERT

o UPDATE

o DELETE
a. INSERT: The INSERT statement is used to insert data into a row of a table.

Syntax:

INSERT INTO TABLE_NAME (col1, col2, col3, .... colN)
VALUES (value1, value2, value3, .... valueN);

Or

INSERT INTO TABLE_NAME
VALUES (value1, value2, value3, .... valueN);

For example:

INSERT INTO EMP (EName, Job) VALUES ('SCOTT', 'MANAGER');

b. UPDATE: This command is used to update or modify the value of a column in the table.

Syntax:

UPDATE table_name SET column1 = value1, column2 = value2, ....
columnN = valueN WHERE CONDITION;

For example:

UPDATE Emp SET Ename = 'SMITH' WHERE EmpNo = '1003';

c. DELETE: It is used to remove one or more rows from a table.

Syntax 1:

DELETE FROM table_name;

Syntax 2:

DELETE FROM table_name WHERE condition;

Example 1: Delete all rows from the Emp table

DELETE FROM Emp;

Example 2: Delete all rows from the Emp table whose EName is SCOTT

DELETE FROM Emp WHERE EName = 'SCOTT';
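
The three DML commands can be run end to end with Python's sqlite3 module. The sketch below is only an illustration: it uses an in-memory SQLite database as a stand-in, a cut-down Emp table, and `?` placeholders instead of literal values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emp (EmpNo TEXT, EName TEXT, Job TEXT)")

# INSERT: add one row.
conn.execute("INSERT INTO Emp (EmpNo, EName, Job) VALUES (?, ?, ?)",
             ("1003", "SCOTT", "MANAGER"))

# UPDATE: change a column value in the rows that match the condition.
conn.execute("UPDATE Emp SET EName = ? WHERE EmpNo = ?", ("SMITH", "1003"))
name_after_update = conn.execute(
    "SELECT EName FROM Emp WHERE EmpNo = ?", ("1003",)).fetchone()[0]

# DELETE: remove the rows that match the condition.
deleted_rows = conn.execute(
    "DELETE FROM Emp WHERE EName = ?", ("SMITH",)).rowcount
remaining_rows = conn.execute("SELECT COUNT(*) FROM Emp").fetchone()[0]
```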

3. Data Control Language


DCL commands are used to grant and take back authority from any database
user.

Here are some commands that come under DCL:

o Grant

o Revoke

a. Grant: It is used to give user access privileges to a database.

Example

GRANT SELECT, UPDATE ON MY_TABLE TO SOME_USER, ANOTHER_USER;

b. Revoke: It is used to take back permissions from the user.

Example

REVOKE SELECT, UPDATE ON MY_TABLE FROM USER1, USER2;

4. Transaction Control Language


TCL commands can be used only with DML commands such as INSERT, DELETE and
UPDATE.

DDL operations are automatically committed in the database, which is why TCL
commands cannot be used while creating tables or dropping them.

Here are some commands that come under TCL:

o COMMIT

o ROLLBACK

o SAVEPOINT

a. Commit: The COMMIT command is used to save all the changes made by the current
transaction to the database.
Syntax:

COMMIT;

Example:

DELETE FROM CUSTOMERS WHERE AGE = 25;

COMMIT;

b. Rollback: The ROLLBACK command is used to undo changes that have not
already been saved to the database.

Syntax:

ROLLBACK;

Example:

DELETE FROM CUSTOMERS WHERE AGE = 25;

ROLLBACK;

c. SAVEPOINT: It is used to mark a point within a transaction to which the
transaction can later be rolled back, without rolling back the entire transaction.

Syntax:

SAVEPOINT SAVEPOINT_NAME;
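
The effect of a savepoint is easiest to see in a small run. The sketch below uses Python's sqlite3 module as a stand-in; the three sample rows and the savepoint name `after_first_delete` are assumptions made for the example.

```python
import sqlite3

# isolation_level=None stops the sqlite3 module from opening transactions
# implicitly, so BEGIN / COMMIT below are fully under our control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE CUSTOMERS (NAME TEXT, AGE INTEGER)")
conn.executemany("INSERT INTO CUSTOMERS VALUES (?, ?)",
                 [("A", 25), ("B", 30), ("C", 35)])

conn.execute("BEGIN")
conn.execute("DELETE FROM CUSTOMERS WHERE AGE = 25")
conn.execute("SAVEPOINT after_first_delete")      # mark a point to return to
conn.execute("DELETE FROM CUSTOMERS WHERE AGE = 30")
conn.execute("ROLLBACK TO SAVEPOINT after_first_delete")  # undo second delete only
conn.execute("COMMIT")                            # first delete becomes permanent

ages_left = [age for (age,) in
             conn.execute("SELECT AGE FROM CUSTOMERS ORDER BY AGE")]
```

Only the delete issued after the savepoint is undone, so the rows with ages 30 and 35 remain.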

5. Data Query Language


DQL is used to fetch the data from the database.

SELECT
This statement is used to select data from a database. The data returned is stored in
a result table, called the result-set.
Syntax

SELECT Column1, Column2, ...ColumnN FROM TableName;

--(*) is used to select all columns from the table

SELECT * FROM table_name;

-- To limit the number of records returned (SQL Server syntax):

SELECT TOP 3 * FROM TableName;

Apart from just using the SELECT keyword individually, you can use the following
keywords with the SELECT statement:
o DISTINCT
o ORDER BY
o GROUP BY
o HAVING Clause
o INTO

The ‘SELECT DISTINCT’ Statement


This statement is used to return only distinct (different) values.
Syntax

SELECT DISTINCT Column1, Column2, ...ColumnN FROM TableName;

Example

SELECT DISTINCT MobNo FROM Emp;

The ‘ORDER BY’ Statement


The ‘ORDER BY’ statement is used to sort the required results in ascending or
descending order. The results are sorted in ascending order by default. If you wish
to get the results in descending order, you have to use the DESC keyword.
Syntax

SELECT Column1, Column2, ...ColumnN FROM TableName
ORDER BY Column1 ASC|DESC, Column2 ASC|DESC, ...;

Example
-- Select all employees from the 'Emp' table sorted by EmpNo:

SELECT * FROM Emp ORDER BY EmpNo;

-- Select all employees from the 'Emp' table sorted by EmpNo in descending order:

SELECT * FROM Emp ORDER BY EmpNo DESC;

-- Select all employees from the 'Emp' table sorted by EmpNo and EName:

SELECT * FROM Emp ORDER BY EmpNo, EName;

/* Select all employees from the 'Emp' table sorted by EmpNo in ascending order and EName in
descending order: */

SELECT * FROM Emp ORDER BY EmpNo ASC, EName DESC;

The ‘GROUP BY’ Statement


The ‘GROUP BY’ statement is used with aggregate functions to group the result-set
by one or more columns.
Syntax

SELECT Column1, Column2,..., ColumnN FROM TableName
WHERE Condition GROUP BY ColumnName(s) ORDER BY ColumnName(s);

Example
To list the number of employees from each city:

SELECT COUNT(EmpNo), City FROM Emp GROUP BY City;

The ‘HAVING’ Clause


The ‘HAVING’ clause is used in SQL because the WHERE keyword cannot be used with
aggregate functions.
Syntax

SELECT ColumnName(s) FROM TableName WHERE Condition GROUP BY
ColumnName(s) HAVING Condition ORDER BY ColumnName(s);

Example
To list the number of employees in each city, sorted from high to low, including only
those cities that have more than 5 employees:

SELECT COUNT(EmpNo), City FROM Emp GROUP BY City HAVING COUNT(EmpNo) > 5 ORDER BY
COUNT(EmpNo) DESC;
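
The difference between grouping alone and grouping with HAVING shows up clearly on a small data set. The sketch below runs both queries with Python's sqlite3 module; the city names and head counts are invented sample data, and the threshold of 5 follows the "more than 5 employees" wording of the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emp (EmpNo INTEGER, City TEXT)")

# Hypothetical data: 6 employees in Delhi, 2 in Pune, 1 in Goa.
rows = [(i, "Delhi") for i in range(1, 7)] + [(7, "Pune"), (8, "Pune"), (9, "Goa")]
conn.executemany("INSERT INTO Emp VALUES (?, ?)", rows)

# GROUP BY: one output row per city, with the aggregate computed per group.
per_city = conn.execute(
    "SELECT City, COUNT(EmpNo) FROM Emp "
    "GROUP BY City ORDER BY COUNT(EmpNo) DESC").fetchall()

# HAVING: filter the groups themselves, after aggregation.
large_cities = conn.execute(
    "SELECT City, COUNT(EmpNo) FROM Emp "
    "GROUP BY City HAVING COUNT(EmpNo) > 5 "
    "ORDER BY COUNT(EmpNo) DESC").fetchall()
```

WHERE filters individual rows before grouping, whereas HAVING filters whole groups afterwards, which is why only the city with six employees survives the second query.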

The ‘SELECT INTO’ Statement


The ‘SELECT INTO’ statement is used to copy data from one table into a new table.
Syntax

SELECT * INTO NewTable [IN ExternalDB] FROM OldTable WHERE Condition;

Example
To create a backup copy of the table 'Emp':

SELECT * INTO EmpBackup FROM Emp;


Integrity Constraints
o Integrity constraints are a set of rules used to maintain the quality of
information.

o Integrity constraints ensure that data insertion, updating, and other
processes are performed in such a way that data integrity is not
affected.

o Thus, integrity constraints guard against accidental damage to the
database.

Types of Integrity Constraint

1. Domain constraints
o A domain constraint can be defined as the definition of a valid set of values
for an attribute.

o The data type of a domain includes string, character, integer, time, date,
currency, etc. The value of the attribute must be available in the
corresponding domain.
Example:
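
A domain constraint can be approximated in SQLite with a CHECK clause. The sketch below is an illustration only: the `Student` table, the age range 0 to 150 and the helper `insert_student` are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Student (
        ID   INTEGER,
        NAME TEXT,
        -- The domain of AGE: whole numbers in a plausible range.
        AGE  INTEGER CHECK (typeof(AGE) = 'integer' AND AGE BETWEEN 0 AND 150)
    )
""")

def insert_student(student_id, name, age):
    """Return True if the row satisfies the domain constraint."""
    try:
        conn.execute("INSERT INTO Student VALUES (?, ?, ?)",
                     (student_id, name, age))
        return True
    except sqlite3.IntegrityError:
        return False

valid_age = insert_student(1000, "Tom", 17)       # inside the domain
text_age = insert_student(1001, "Johnson", "A")   # not an integer
huge_age = insert_student(1002, "Kate", 999)      # outside the range
```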

2. Entity integrity constraints


o The entity integrity constraint states that a primary key value can't be null.

o This is because the primary key value is used to identify individual rows in a
relation, and if the primary key has a null value, we can't identify those
rows.

o A table can contain null values in fields other than the primary key field.

Example:
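
Entity integrity can be observed directly by trying to insert a row without a key. The sketch below uses Python's sqlite3 module; the `Emp` table and the helper `insert_emp` are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL is spelled out because SQLite, for historical reasons, otherwise
# accepts NULL in a primary key column that is not an INTEGER PRIMARY KEY.
conn.execute("CREATE TABLE Emp (EmpNo TEXT PRIMARY KEY NOT NULL, EName TEXT)")

def insert_emp(emp_no, ename):
    """Return True if the row was accepted."""
    try:
        conn.execute("INSERT INTO Emp VALUES (?, ?)", (emp_no, ename))
        return True
    except sqlite3.IntegrityError:
        return False

null_key = insert_emp(None, "Jack")     # rejected: primary key is null
null_other = insert_emp("1001", None)   # accepted: a non-key field may be null
```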
3. Referential Integrity Constraints
o A referential integrity constraint is specified between two tables.

o Under a referential integrity constraint, if a foreign key in Table 1 refers to the
Primary Key of Table 2, then every value of the Foreign Key in Table 1 must
be null or be available in Table 2.

Example:
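
The rule can be tested with two small tables. The sketch below uses Python's sqlite3 module; the `Dept` and `Emp` tables, their columns and the helper `insert_emp` are assumptions made for the example. SQLite enforces foreign keys only when the `foreign_keys` pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite

conn.execute("CREATE TABLE Dept (DeptNo INTEGER PRIMARY KEY, DName TEXT)")
conn.execute("""
    CREATE TABLE Emp (
        EmpNo  INTEGER PRIMARY KEY,
        DeptNo INTEGER REFERENCES Dept(DeptNo)  -- foreign key into Dept
    )
""")
conn.execute("INSERT INTO Dept VALUES (10, 'Sales')")

def insert_emp(emp_no, dept_no):
    """Return True if the row satisfies referential integrity."""
    try:
        conn.execute("INSERT INTO Emp VALUES (?, ?)", (emp_no, dept_no))
        return True
    except sqlite3.IntegrityError:
        return False

existing_dept = insert_emp(1, 10)    # 10 exists in Dept
null_dept = insert_emp(2, None)      # a null foreign key is allowed
missing_dept = insert_emp(3, 99)     # 99 does not exist in Dept
```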

4. Key constraints
o A key is an attribute or set of attributes used to identify an entity uniquely
within its entity set.

o An entity set can have multiple keys, out of which one key will be the
primary key. A primary key must contain unique, non-null values in the
relational table.
Example:
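
The uniqueness part of a key constraint can be checked by inserting a duplicate key value. The sketch below uses Python's sqlite3 module; the `Student` table, the sample rows and the helper `insert_student` are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (ID INTEGER PRIMARY KEY, NAME TEXT)")

def insert_student(student_id, name):
    """Return True if the key value is not already in use."""
    try:
        conn.execute("INSERT INTO Student VALUES (?, ?)", (student_id, name))
        return True
    except sqlite3.IntegrityError:
        return False

first = insert_student(1000, "Tom")
second = insert_student(1001, "Johnson")
duplicate = insert_student(1000, "Kate")   # rejected: ID 1000 already exists
```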
