DBMS Unit-4 Notes

The document provides an overview of database systems, focusing on relational algebra and relational calculus, highlighting their differences and applications. It discusses functional dependencies, types of dependencies, and the importance of normalization in minimizing redundancy and ensuring data integrity. Additionally, it outlines various normal forms and their significance in database design, along with the advantages and disadvantages of normalization.

Uploaded by

mayur474645


Unit-4

Introduction to Database System

Relational Algebra
Relational Algebra is a procedural query language. In
relational algebra, a query specifies the order in which
operations are to be performed. The basic operations in
relational algebra are:
1. Select (σ)
2. Project (Π)
3. Union (∪)
4. Set Difference (-)
5. Cartesian product (×)
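
The five basic operations can be sketched in a few lines of Python. This is only an illustration, assuming a relation is modelled as a list of dictionaries; the function names and the sample data are hypothetical and are not part of any DBMS.

```python
# Relations are modelled as lists of dicts (one dict per tuple).
# All function names and the sample data are illustrative assumptions.

def select(relation, predicate):
    """σ: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """Π: keep only the named attributes and drop duplicate tuples."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result

def union(r, s):
    """∪: tuples that appear in r or in s (same attributes assumed)."""
    return r + [row for row in s if row not in r]

def difference(r, s):
    """-: tuples of r that do not appear in s."""
    return [row for row in r if row not in s]

def product(r, s):
    """×: every tuple of r paired with every tuple of s (no shared attribute names)."""
    return [{**a, **b} for a in r for b in s]

students = [
    {"roll_no": 42, "name": "abc", "dept_name": "CO"},
    {"roll_no": 43, "name": "pqr", "dept_name": "IT"},
    {"roll_no": 44, "name": "xyz", "dept_name": "CO"},
]

# The order of operations is explicit: first select, then project.
co_names = project(select(students, lambda t: t["dept_name"] == "CO"), ["name"])
print(co_names)  # [{'name': 'abc'}, {'name': 'xyz'}]
```

The nesting of the calls is what makes the language procedural: the caller decides that the selection runs before the projection.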

Relational Calculus
Relational Calculus is a formal query language. It is
also known as a declarative (non-procedural) language. In
relational calculus, the order in which operations are
performed is not specified. A query states only what result
we have to obtain, not how to obtain it. Relational
Calculus has two variations:
1. Tuple Relational Calculus (TRC)
2. Domain Relational Calculus (DRC)
A tuple relational calculus query is denoted as:
{ t | P(t) } where,
t: a tuple variable
P(t): a condition (predicate) that must be true for t to be part of the result.
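
A Python set comprehension reads much like the notation { t | P(t) }, so it can serve as a rough analogy. The sample relation and the condition below are assumptions made for illustration only.

```python
# Hypothetical sample relation: a set of (roll_no, name, dept_name) tuples.
student = {
    (42, "abc", "CO"),
    (43, "pqr", "IT"),
    (44, "xyz", "CO"),
}

def P(t):
    """The condition P(t): the student belongs to department CO."""
    return t[2] == "CO"

# { t | t is in student and P(t) }: says what is wanted, not how to find it.
result = {t for t in student if P(t)}
print(sorted(result))  # [(42, 'abc', 'CO'), (44, 'xyz', 'CO')]
```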
Difference between Relational Algebra and Relational Calculus:

1. Language type
Relational Algebra: It is a procedural language.
Relational Calculus: It is a declarative (non-procedural) language.

2. Procedure
Relational Algebra: It describes how to obtain the result.
Relational Calculus: It describes what result we have to obtain.

3. Order
Relational Algebra: The order in which the operations have to be performed is specified.
Relational Calculus: The order is not specified.

4. Domain
Relational Algebra: It is independent of the domain.
Relational Calculus: It can be domain-dependent.

5. Programming language
Relational Algebra: It is nearer to a programming language.
Relational Calculus: It is nearer to natural language.

6. Inclusion in SQL
Relational Algebra: SQL includes some features from the relational algebra.
Relational Calculus: SQL is based to a greater extent on the tuple relational calculus.

7. Relational completeness
Relational Algebra: It is one of the languages in which queries can be expressed, but the queries should also be expressible in relational calculus for the language to be relationally complete.
Relational Calculus: For a database language to be relationally complete, every query written in it must be expressible in relational calculus.

8. Query evaluation
Relational Algebra: The evaluation of the query relies on the specified order in which the operations must be performed.
Relational Calculus: The order of operations does not matter for the evaluation of queries.

9. Database access
Relational Algebra: It provides a solution in terms of what is required and how to get that information, by following a step-by-step description.
Relational Calculus: It provides a solution simply in terms of what is required and lets the system find the solution.

10. Expressiveness
Relational Algebra: The expressiveness of any given language is judged using relational algebra operations as a standard.
Relational Calculus: The completeness of a language is measured by whether it is at least as powerful as the calculus. That implies that a relation defined by some expression of the calculus is also definable by some expression of the language.

Functional Dependency
In relational database management, a functional
dependency is a constraint between two sets of attributes
in which the value of one set determines the value of the
other. It is denoted as X → Y, where the attribute set X on
the left side of the arrow is called the determinant and Y
is called the dependent.
Example:
roll_no name dept_name dept_building
42 abc CO A4
43 pqr IT A3
44 xyz CO A4
From the above table we can conclude some valid
functional dependencies:
o roll_no → { name, dept_name, dept_building }: roll_no
determines the values of name, dept_name and
dept_building, hence this is a valid functional
dependency.
o roll_no → dept_name: since roll_no determines the
whole set {name, dept_name, dept_building}, it also
determines its subset dept_name.
o dept_name → dept_building: each department is
located in exactly one building, so rows with the same
dept_name always have the same dept_building.

Here are some invalid functional dependencies. A
functional dependency must hold for every valid state of
the relation, not only for the three rows shown above:

o name → dept_name: students with the same name
can have different dept_name, hence this is not a
valid functional dependency.
o dept_building → dept_name: there can be multiple
departments in the same building. For example, if
departments ME and EC are both in building B2, then
B2 does not determine a single department, hence
dept_building → dept_name is an invalid functional
dependency.
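
Whether a dependency holds in a given set of rows can be checked mechanically. The sketch below assumes rows are Python dictionaries; `fd_holds` is a hypothetical helper name. A check on sample rows can only refute a dependency, it cannot prove that the dependency holds for every possible state of the relation.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs → rhs holds in this particular set of rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The same determinant value must always map to the same dependent value.
        if seen.setdefault(key, value) != value:
            return False
    return True

rows = [
    {"roll_no": 42, "name": "abc", "dept_name": "CO", "dept_building": "A4"},
    {"roll_no": 43, "name": "pqr", "dept_name": "IT", "dept_building": "A3"},
    {"roll_no": 44, "name": "xyz", "dept_name": "CO", "dept_building": "A4"},
]

print(fd_holds(rows, ["roll_no"], ["name", "dept_name", "dept_building"]))  # True
print(fd_holds(rows, ["dept_name"], ["dept_building"]))                     # True

# Hypothetical extra row: a second student named "abc" in another department.
extra = rows + [
    {"roll_no": 45, "name": "abc", "dept_name": "IT", "dept_building": "A3"}
]
print(fd_holds(extra, ["name"], ["dept_name"]))                             # False
```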
Types of Functional Dependencies in DBMS
1. Trivial Functional Dependency
In this, the dependent is always a subset of the determinant,
i.e. if X → Y and Y is a subset of X, then it is called a
trivial functional dependency.
Example:
roll_no name age

42 abc 17

43 pqr 18

44 xyz 18

Here, {roll_no, name} → name is a trivial functional
dependency, since the dependent name is a subset of the
determinant set {roll_no, name}.

2. Non-trivial Functional Dependency

In this, the dependent is not a subset of the
determinant, i.e. if X → Y and Y is not a subset of X, then
it is called a non-trivial functional dependency.
Here, {roll_no, name} → age is a non-trivial functional
dependency, since age is not a subset of {roll_no, name}.

3. Multivalued Functional Dependency

In this, the attributes of the dependent set are not
dependent on each other. If a → {b, c} and there exists no
functional dependency between b and c, then it is called
a multivalued functional dependency.
Here, roll_no → {name, age} is a multivalued functional
dependency, since the dependents name and age are not
dependent on each other.
Note that this is an informal usage. The multivalued
dependency used to define 4NF, written X →→ Y, is a
different concept.

4. Transitive Functional Dependency

In this, dependent is indirectly dependent on
determinant. i.e. If a → b & b → c, then according to
axiom of transitivity, a → c.
For example,
enrol_no name dept building_no

42 abc CO 4

43 pqr EC 2

44 xyz IT 1

45 abc EC 2

Here, enrol_no → dept and dept → building_no. Hence,
enrol_no → building_no is an indirect functional
dependency, called a transitive functional dependency.

5. Fully Functional Dependency

In this, an attribute depends on the whole of a set of
attributes and not on any part of it. An attribute Q is
fully functionally dependent on a set of attributes P if it
is functionally dependent on P and not on any proper
subset of P.
For example, if XY → Z holds and neither X → Z nor Y → Z
holds, then Z is fully functionally dependent on XY.

6. Partial Functional Dependency

In this, a non-key attribute depends on a part of the
composite key, rather than on the whole key. Suppose a
relation R has attributes X, Y, Z, where {X, Y} is the
composite key and Z is a non-key attribute. Then X → Z is a
partial functional dependency.
For example, if XY → Z holds and Z can also be determined by
X alone or by Y alone, then Z is partially dependent on XY.

Advantages of Functional Dependencies

1. Data Normalization
Data normalization is the process of organizing data in a
database in order to minimize redundancy and increase
data integrity. Functional dependencies play an
important part in data normalization.
2. Query Optimization
With the help of functional dependencies we are able to
decide how tables are connected and which attributes
need to be projected to retrieve the required data from
the tables. This helps in query optimization and improves
performance.
3. Consistency of Data
Functional dependencies help ensure consistency of data by
exposing redundancies and inconsistencies that exist
in the data. Once these are removed, a change to one
attribute does not leave conflicting copies elsewhere,
which maintains the consistency of data in the database.
4. Data Quality Improvement
Functional dependencies help keep the data in the database
accurate, complete and up to date. This reduces the errors
and inaccuracies that would otherwise surface during data
analysis and decision making, and so improves the overall
quality of data in the database.

The closure of a set F of FDs is the set F+ of all FDs that can
be inferred from F.
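
F+ itself can be very large, so in practice one usually computes the closure X+ of an attribute set X instead: X → Y belongs to F+ exactly when Y is a subset of X+. The sketch below assumes each FD is encoded as a pair of Python sets; `attribute_closure` is a hypothetical name, and the FDs are those of the student example above.

```python
def attribute_closure(attrs, fds):
    """Return X+, the set of attributes determined by attrs under fds.

    fds is a list of (lhs, rhs) pairs, each a set of attribute names.
    """
    closure = set(attrs)
    changed = True
    while changed:  # repeat until no FD adds a new attribute
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure

# FDs of the student example, encoded as sets (the encoding is an assumption).
F = [
    ({"roll_no"}, {"name", "dept_name"}),
    ({"dept_name"}, {"dept_building"}),
]

print(sorted(attribute_closure({"roll_no"}, F)))
# ['dept_building', 'dept_name', 'name', 'roll_no']
print(sorted(attribute_closure({"dept_name"}, F)))
# ['dept_building', 'dept_name']
```

Since the closure of {roll_no} contains dept_building, the FD roll_no → dept_building is in F+ even though it is not listed in F.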

Data modification anomalies are divided into three types:

o Insertion Anomaly: An insertion anomaly occurs
when one cannot insert a new tuple into a
relation because some other data is missing.
o Deletion Anomaly: A deletion anomaly occurs when
the deletion of data results in the
unintended loss of some other important data.
o Update Anomaly: An update anomaly occurs when
an update of a single data value requires multiple
rows of data to be updated.
Normalization
A large database defined as a single relation may result in
data duplication. This repetition of data may result in:
o Making relations very large.
o Difficulty in maintaining and updating data, as it would
involve searching many records in the relation.
o Wastage and poor utilization of disk space and
resources.
o An increased chance of errors and inconsistencies.

So to handle these problems, we should analyze and
decompose relations with redundant data into smaller,
simpler, and well-structured relations. Normalization can
be defined as follows:
o Normalization is the process of organizing the data
in the database.
o Normalization is a process of decomposing
relations into relations with fewer attributes.
o Normalization is used to minimize the redundancy
in a relation or set of relations. It is also used to
eliminate undesirable characteristics like Insertion,
Update, and Deletion Anomalies.
o Normalization divides a larger table into smaller
tables and links them using relationships.
o The normal forms are used to reduce redundancy in
the database tables.

Why do we need Normalization?
The main reason for normalizing the relations is
removing these anomalies. Failure to eliminate
anomalies leads to data redundancy and can cause data
integrity and other problems as the database grows.

Types of Normal Forms

Normalization works through a series of stages called
normal forms. The normal forms apply to individual
relations. A relation is said to be in a particular normal
form if it satisfies the constraints of that form.

Following are the various types of normal forms:

Normal Form   Description
1NF    A relation is in 1NF if every attribute contains only atomic values.
2NF    A relation is in 2NF if it is in 1NF and all non-key attributes are fully functionally dependent on the primary key.
3NF    A relation is in 3NF if it is in 2NF and no transitive dependency exists.
BCNF   A stronger definition of 3NF is known as Boyce-Codd normal form.
4NF    A relation is in 4NF if it is in Boyce-Codd normal form and has no multi-valued dependency.
5NF    A relation is in 5NF if it is in 4NF and does not contain any join dependency; joining should be lossless.

Advantages of Normalization
o Normalization helps to minimize data redundancy.
o Greater overall database organization.
o Data consistency within the database.
o Much more flexible database design.
o Enforces the concept of relational integrity.

Disadvantages of Normalization
o You cannot start building the database before knowing
what the user needs.
o Performance can degrade when normalizing
relations to higher normal forms, i.e., 4NF and 5NF,
because queries need more joins.
o It is very time-consuming and difficult to normalize
relations of a higher degree.
o Careless decomposition may lead to a bad database design.

First Normal Form (1NF)

o A relation is in 1NF if every attribute contains only atomic values.
o It states that an attribute of a table cannot hold
multiple values. It must hold only a single value.
o First normal form disallows multi-valued attributes,
composite attributes, and their combinations.
o Example: Relation EMPLOYEE is not in 1NF because
of the multi-valued attribute EMP_PHONE.

EMPLOYEE table:

EMP_ID EMP_NAME EMP_PHONE EMP_STATE
14 John 7272826385, 9064738238 UP
20 Harry 8574783832 Bihar
12 Sam 7390372389, 8589830302 Punjab

The decomposition of the EMPLOYEE table into 1NF is shown below:
EMP_ID EMP_NAME EMP_PHONE EMP_STATE
14 John 7272826385 UP
14 John 9064738238 UP
20 Harry 8574783832 Bihar
12 Sam 7390372389 Punjab
12 Sam 8589830302 Punjab
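
The conversion above can be sketched in Python, assuming the multi-valued cell is stored as a comma-separated string. The helper name `to_1nf` is hypothetical.

```python
employee = [
    {"EMP_ID": 14, "EMP_NAME": "John", "EMP_PHONE": "7272826385, 9064738238", "EMP_STATE": "UP"},
    {"EMP_ID": 20, "EMP_NAME": "Harry", "EMP_PHONE": "8574783832", "EMP_STATE": "Bihar"},
    {"EMP_ID": 12, "EMP_NAME": "Sam", "EMP_PHONE": "7390372389, 8589830302", "EMP_STATE": "Punjab"},
]

def to_1nf(rows, multi_valued):
    """Emit one row per value of the multi-valued attribute."""
    result = []
    for row in rows:
        for value in row[multi_valued].split(","):
            # Copy the row and overwrite the cell with a single atomic value.
            result.append({**row, multi_valued: value.strip()})
    return result

flat = to_1nf(employee, "EMP_PHONE")
for row in flat:
    print(row["EMP_ID"], row["EMP_NAME"], row["EMP_PHONE"], row["EMP_STATE"])
```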

Second Normal Form (2NF)

o To be in 2NF, a relation must be in 1NF.
o In the second normal form, all non-key attributes are
fully functionally dependent on the primary key.

Example: A school stores data about teachers and subjects.
In a school, a teacher can teach more than one subject.
TEACHER table
TEACHER_ID SUBJECT TEACHER_AGE

25 Chemistry 30

25 Biology 30

47 English 35

83 Math 38

83 Computer 38
In the given table, the candidate key is {TEACHER_ID,
SUBJECT}. The non-prime attribute TEACHER_AGE is
dependent on TEACHER_ID alone, which is a proper subset
of the candidate key. That is why it violates the rule for 2NF.
To convert the given table into 2NF, we decompose it
into two tables:
TEACHER_DETAIL table:
TEACHER_ID TEACHER_AGE

25 30

47 35

83 38

TEACHER_SUBJECT table:
TEACHER_ID SUBJECT

25 Chemistry

25 Biology

47 English

83 Math

83 Computer
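
The decomposition is two projections of the original table, and joining them on TEACHER_ID gives the original rows back. The sketch below checks this with plain Python sets; the tuple encoding is an assumption made for illustration.

```python
# TEACHER rows as (TEACHER_ID, SUBJECT, TEACHER_AGE) tuples.
teacher = [
    (25, "Chemistry", 30),
    (25, "Biology", 30),
    (47, "English", 35),
    (83, "Math", 38),
    (83, "Computer", 38),
]

# Projections: duplicates disappear because sets are used.
teacher_detail = {(tid, age) for tid, _, age in teacher}
teacher_subject = {(tid, subject) for tid, subject, _ in teacher}

# Natural join of the two projections on TEACHER_ID.
rejoined = {
    (tid, subject, age)
    for tid, subject in teacher_subject
    for tid2, age in teacher_detail
    if tid == tid2
}

print(len(teacher_detail))       # 3: each teacher's age is now stored once
print(rejoined == set(teacher))  # True: no rows lost, none invented
```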

Third Normal Form (3NF)

o A relation is in 3NF if it is in 2NF and does not contain
any transitive dependency of a non-prime attribute on the key.
o 3NF is used to reduce data duplication. It is also
used to achieve data integrity.
o If a relation in 2NF has no transitive dependency for
non-prime attributes, then the relation is in third normal
form.
A relation is in third normal form if it holds at least one of
the following conditions for every non-trivial functional
dependency X → Y:
1. X is a super key.
2. Y is a prime attribute, i.e., each element of Y is part
of some candidate key.
Example: EMPLOYEE_DETAIL table:
EMP_ID EMP_NAME EMP_ZIP EMP_STATE EMP_CITY

222 Harry 201010 UP Noida

333 Stephan 02228 US Boston

444 Lan 60007 US Chicago

555 Katharine 06389 UK Norwich

666 John 462007 MP Bhopal

Super keys in the table above:
{EMP_ID}, {EMP_ID, EMP_NAME}, {EMP_ID, EMP_NAME, EMP_ZIP}, and so on.
Candidate key: {EMP_ID}
Non-prime attributes: In the given table, all attributes
except EMP_ID are non-prime.
Here, EMP_STATE and EMP_CITY are dependent on
EMP_ZIP, and EMP_ZIP is dependent on EMP_ID. The
non-prime attributes (EMP_STATE, EMP_CITY) are therefore
transitively dependent on the super key (EMP_ID), which
violates the rule of third normal form. That is why we
need to move EMP_CITY and EMP_STATE to a new
EMPLOYEE_ZIP table with EMP_ZIP as its primary key.
EMPLOYEE table
EMP_ID EMP_NAME EMP_ZIP

222 Harry 201010

333 Stephan 02228

444 Lan 60007

555 Katharine 06389

666 John 462007

EMPLOYEE_ZIP table
EMP_ZIP EMP_STATE EMP_CITY

201010 UP Noida

02228 US Boston

60007 US Chicago

06389 UK Norwich

462007 MP Bhopal
Boyce Codd normal form (BCNF)
o BCNF is the advanced version of 3NF. It is stricter than
3NF.
o A table is in BCNF if, for every non-trivial functional
dependency X → Y, X is a super key of the table.
o For BCNF, the table should be in 3NF, and for every
FD, the LHS is a super key.

Example: Let's assume there is a company where
employees work in more than one department.
EMPLOYEE table:
EMP_ID EMP_COUNTRY EMP_DEPT DEPT_TYPE EMP_DEPT_NO

264 India Designing D394 283

264 India Testing D394 300

364 UK Stores D283 232

364 UK Developing D283 549

In the above table, the functional dependencies are as follows:

1. EMP_ID → EMP_COUNTRY
2. EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
Candidate key: {EMP_ID, EMP_DEPT}
The table is not in BCNF because neither EMP_DEPT nor
EMP_ID alone is a key.
To convert the given table into BCNF, we decompose it
into three tables:
EMP_COUNTRY table:
EMP_ID EMP_COUNTRY
264 India
364 UK

EMP_DEPT table:
EMP_DEPT DEPT_TYPE EMP_DEPT_NO
Designing D394 283
Testing D394 300
Stores D283 232
Developing D283 549

EMP_DEPT_MAPPING table:
EMP_ID EMP_DEPT
264 Designing
264 Testing
364 Stores
364 Developing

Functional dependencies:
1. EMP_ID → EMP_COUNTRY
2. EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
Candidate keys:
For the first table: EMP_ID
For the second table: EMP_DEPT
For the third table: {EMP_ID, EMP_DEPT}
Now, this is in BCNF because the left side of both
functional dependencies is a key.
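
The BCNF condition can be tested mechanically: for each non-trivial FD, compute the closure of its left side and check whether it covers every attribute of the table. This is a minimal sketch under the assumption that FDs are encoded as pairs of Python sets; the names `closure` and `bcnf_violations` are hypothetical.

```python
def closure(attrs, fds):
    """Attributes determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Non-trivial FDs whose left side is not a super key of the table."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not closure(lhs, fds) >= set(attributes)
    ]

fd1 = ({"EMP_ID"}, {"EMP_COUNTRY"})
fd2 = ({"EMP_DEPT"}, {"DEPT_TYPE", "EMP_DEPT_NO"})

employee = {"EMP_ID", "EMP_COUNTRY", "EMP_DEPT", "DEPT_TYPE", "EMP_DEPT_NO"}
print(len(bcnf_violations(employee, [fd1, fd2])))  # 2: both FDs violate BCNF

# After decomposition each table carries only its own FD.
print(bcnf_violations({"EMP_ID", "EMP_COUNTRY"}, [fd1]))                 # []
print(bcnf_violations({"EMP_DEPT", "DEPT_TYPE", "EMP_DEPT_NO"}, [fd2]))  # []
```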

Fourth normal form (4NF)

o A relation is in 4NF if it is in Boyce-Codd normal
form and has no multi-valued dependency.
o For a dependency A →→ B, if for a single value of A
multiple values of B exist, and these values are
independent of the other attributes, then the relation
has a multi-valued dependency.
Example: STUDENT
STU_ID COURSE HOBBY

21 Computer Dancing

21 Math Singing

34 Chemistry Dancing

74 Biology Cricket

59 Physics Hockey

The given STUDENT table is in 3NF, but COURSE and
HOBBY are two independent attributes. Hence, there is no
relationship between COURSE and HOBBY.
In the STUDENT relation, the student with
STU_ID 21 has two courses, Computer and Math, and
two hobbies, Dancing and Singing. So there is a
multi-valued dependency on STU_ID, which leads to
unnecessary repetition of data.
So to make the above table into 4NF, we can decompose it
into two tables:
STUDENT_COURSE
STU_ID COURSE

21 Computer

21 Math

34 Chemistry

74 Biology

59 Physics

STUDENT_HOBBY
STU_ID HOBBY

21 Dancing

21 Singing

34 Dancing

74 Cricket

59 Hockey
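
Because COURSE and HOBBY are independent, every course of a student pairs with every hobby of that student. Joining the two 4NF tables makes this explicit. The sketch below uses plain Python sets; the tuple encoding is an assumption. For student 21 the join yields all four course and hobby combinations, two of which the five-row STUDENT sample does not list.

```python
# STUDENT rows as (STU_ID, COURSE, HOBBY) tuples, as shown in the sample table.
student = [
    (21, "Computer", "Dancing"),
    (21, "Math", "Singing"),
    (34, "Chemistry", "Dancing"),
    (74, "Biology", "Cricket"),
    (59, "Physics", "Hockey"),
]

student_course = {(sid, course) for sid, course, _ in student}
student_hobby = {(sid, hobby) for sid, _, hobby in student}

# Natural join of the two tables on STU_ID.
joined = {
    (sid, course, hobby)
    for sid, course in student_course
    for sid2, hobby in student_hobby
    if sid == sid2
}

print(len(joined))                    # 7
print(sorted(joined - set(student)))  # the two combinations the sample omits
# [(21, 'Computer', 'Singing'), (21, 'Math', 'Dancing')]
```

Storing the combinations in one table would need four rows for student 21, while the two 4NF tables record the same facts with two rows each.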

Fifth normal form (5NF)

o A relation is in 5NF if it is in 4NF and does not contain any
join dependency, and joining should be lossless.
o 5NF is satisfied when all the tables are broken into as
many tables as possible in order to avoid
redundancy.
o 5NF is also known as Project-join normal form
(PJ/NF).
Example
SUBJECT LECTURER SEMESTER

Computer Anshika Semester 1

Computer John Semester 1

Math John Semester 1

Math Akash Semester 2

Chemistry Praveen Semester 1

In the above table, John takes both the Computer and Math
classes for Semester 1, but he doesn't take the Math class for
Semester 2. In this case, a combination of all these fields is
required to identify valid data.
Suppose we add a new semester, Semester 3, but do
not yet know the subject or who will be taking that
subject, so we would leave Lecturer and Subject as NULL. But all
three columns together act as the primary key, so we can't
leave the other two columns blank.
So to make the above table into 5NF, we can decompose
it into three relations P1, P2 and P3:
P1
SEMESTER SUBJECT

Semester 1 Computer

Semester 1 Math

Semester 1 Chemistry

Semester 2 Math

P2
SUBJECT LECTURER

Computer Anshika

Computer John

Math John

Math Akash

Chemistry Praveen

P3
SEMESTER LECTURER

Semester 1 Anshika

Semester 1 John

Semester 2 Akash

Semester 1 Praveen
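
That the three-way decomposition is lossless can be checked by joining P1, P2 and P3 and comparing the result with the original table. The sketch below is an illustration using Python sets; the tuple encoding is an assumption.

```python
# Original rows as (SUBJECT, LECTURER, SEMESTER) tuples.
original = {
    ("Computer", "Anshika", "Semester 1"),
    ("Computer", "John", "Semester 1"),
    ("Math", "John", "Semester 1"),
    ("Math", "Akash", "Semester 2"),
    ("Chemistry", "Praveen", "Semester 1"),
}

p1 = {(sem, subj) for subj, _, sem in original}   # SEMESTER, SUBJECT
p2 = {(subj, lect) for subj, lect, _ in original} # SUBJECT, LECTURER
p3 = {(sem, lect) for _, lect, sem in original}   # SEMESTER, LECTURER

# Join of all three projections: a combination survives only if
# each of its three pairs appears in the matching projection.
joined = {
    (subj, lect, sem)
    for sem, subj in p1
    for subj2, lect in p2
    if subj == subj2 and (sem, lect) in p3
}

print(joined == original)  # True: the decomposition is lossless
```

Joining only two of the projections is not enough here: P1 joined with P2 alone would also produce (Math, John, Semester 2), which is not a row of the original table.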
SQL
SQL stands for Structured Query Language. SQL is used to
query and manipulate the data in relational database
systems such as Oracle, MySQL, SQLite, etc.

Components of SQL
1. Keywords:
Keywords are reserved or non-reserved words. SQL
reserved keywords include INTO, UPDATE, SELECT, DELETE,
DROP, DESC, and ASC.
2. Identifiers:
The names of database objects, such as function names,
schema names, table names, etc., are called identifiers.
3. Clauses:
Clauses such as WHERE, GROUP BY, HAVING, and ORDER BY
are the components from which queries and SQL
statements are formed.
4. Expressions:
An expression produces either scalar values or tables
consisting of columns and rows of data.
5. Boolean Conditions:
A condition is an expression that evaluates to the boolean
value TRUE or FALSE. Conditions are used to limit the
effect of statements or queries.
6. Queries:
Queries retrieve data based on specific criteria. They are
statements that start with the SELECT clause, because
they retrieve data from the underlying database.
7. Statements:
SQL statements may persistently affect schema and data,
or control transactions, program flow, connections,
sessions, or diagnostics. Examples are the INSERT,
UPDATE, DROP, and DELETE statements, which modify the
database structure or data.

Basic use of SQL:

1. It modifies the database table and index structures.
2. It adds, updates, and deletes rows of data.
3. It retrieves subsets of information from within the
relational database management system. This
information can be used for analytical applications,
transaction processing, and other applications which
require communication with a relational database.

SQL Data Types

Data types are used to represent the nature of the data
that can be stored in a database table. For example, if
we want to store string data in a particular column of a
table, then we have to declare a string data type for
that column.
Data types are mainly classified into three categories in
every database:
o String data types
o Numeric data types
o Date and time data types

MySQL String Data Types

CHAR(Size)   It is used to specify a fixed length string that can contain numbers, letters, and special characters. Its size can be 0 to 255 characters. Default is 1.

VARCHAR(Size)   It is used to specify a variable length string that can contain numbers, letters, and special characters. Its size can be from 0 to 65535 characters.

BINARY(Size)   It is equal to CHAR() but stores binary byte strings. Its size parameter specifies the column length in bytes. Default is 1.

VARBINARY(Size)   It is equal to VARCHAR() but stores binary byte strings. Its size parameter specifies the maximum column length in bytes.

TEXT(Size)   It holds a string with a maximum length of 65,535 bytes.

TINYTEXT   It holds a string with a maximum length of 255 characters.

MEDIUMTEXT   It holds a string with a maximum length of 16,777,215 characters.

LONGTEXT   It holds a string with a maximum length of 4,294,967,295 characters.

ENUM(val1, val2, val3,...)   It is used for a string object that can have only one value, chosen from a list of possible values. An ENUM list can contain up to 65535 values. If you insert a value that is not in the list, a blank value will be inserted.

SET(val1, val2, val3,...)   It is used to specify a string that can have 0 or more values, chosen from a list of possible values. You can list up to 64 values in a SET list.

MySQL Numeric Data Types
BIT(Size)   It is used for a bit-value type. The number of bits per value is specified in size. Its size can be 1 to 64. The default value is 1.

INT(size)   It is used for the integer value. Its signed range varies from -2147483648 to 2147483647 and unsigned range varies from 0 to 4294967295. The size parameter specifies the maximum display width, which is 255.

INTEGER(size)   It is equal to INT(size).

FLOAT(size, d)   It is used to specify a floating point number. Its size parameter specifies the total number of digits. The number of digits after the decimal point is specified by the d parameter.

FLOAT(p)   It is used to specify a floating point number. MySQL uses the p parameter to determine whether to use FLOAT or DOUBLE. If p is between 0 and 24, the data type becomes FLOAT(). If p is from 25 to 53, the data type becomes DOUBLE().

DOUBLE(size, d)   It is a normal size floating point number. Its size parameter specifies the total number of digits. The number of digits after the decimal point is specified by the d parameter.

DECIMAL(size, d)   It is used to specify a fixed point number. Its size parameter specifies the total number of digits. The number of digits after the decimal point is specified by the d parameter. The maximum value for size is 65, and the default value is 10. The maximum value for d is 30, and the default value is 0.

DEC(size, d)   It is equal to DECIMAL(size, d).

BOOL   It is used to specify Boolean values true and false. Zero is considered as false, and nonzero values are considered as true.

MySQL Date and Time Data Types

DATE   It is used to specify a date in the format YYYY-MM-DD. Its supported range is from '1000-01-01' to '9999-12-31'.

DATETIME(fsp)   It is used to specify a date and time combination. Its format is YYYY-MM-DD hh:mm:ss. Its supported range is from '1000-01-01 00:00:00' to '9999-12-31 23:59:59'.

TIMESTAMP(fsp)   It is used to specify the timestamp. Its value is stored as the number of seconds since the Unix epoch ('1970-01-01 00:00:00' UTC). Its format is YYYY-MM-DD hh:mm:ss. Its supported range is from '1970-01-01 00:00:01' UTC to '2038-01-19 03:14:07' UTC.

TIME(fsp)   It is used to specify the time. Its format is hh:mm:ss. Its supported range is from '-838:59:59' to '838:59:59'.

YEAR   It is used to specify a year in four-digit format. The values allowed in four-digit format are 1901 to 2155, and 0000.

Basic Queries of SQL

1. INSERT INTO Statement
This SQL statement inserts data or records into an existing table of the SQL database.
It can insert single or multiple records in a single query statement.

Syntax to insert a single record:

INSERT INTO table_name
(
column_name1,
column_name2, ...,
column_nameN
)
VALUES
(value_1,
value_2, ...,
value_N
);

Example of inserting a single record:

INSERT INTO Employee_details
(
Emp_ID,
First_name,
Last_name,
Salary,
City
)
VALUES
(101,
'Akhil',
'Sharma',
40000,
'Bangalore'
);

This example inserts 101 in the first column, Akhil in the second column, Sharma in the
third column, 40000 in the fourth column, and Bangalore in the last column of the
table Employee_details. String values must be enclosed in single quotes.

2.UPDATE Statement
This SQL statement changes or modifies the stored data in the SQL database.

Syntax of UPDATE Statement:

UPDATE table_name
SET column_name1 = new_value_1, column_name2 = new_value_2, ...., column_nameN = new_value_N
[ WHERE CONDITION ];

Example of UPDATE Statement:

UPDATE Employee_details
SET Salary = 100000
WHERE Emp_ID = 10;

This example changes the Salary of the employee in the Employee_details table
whose Emp_ID is 10. If the WHERE clause is omitted, every row of the table is updated.

3. DELETE Statement
This SQL statement deletes stored data from the SQL database.
Syntax of DELETE Statement:

DELETE FROM table_name
[ WHERE CONDITION ];

Example of DELETE Statement:

DELETE FROM Employee_details
WHERE First_Name = 'Sumit';

This example deletes the records of those employees from the Employee_details table
whose First_Name is Sumit in the table.
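
The three statements can be tried together without a database server by using Python's built-in sqlite3 module. This is a sketch only: SQLite stands in for the DBMS, the column types are simplified, and the rows for Emp_ID 10 and for Sumit are hypothetical sample data added so that the UPDATE and DELETE have something to act on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE Employee_details ("
    "Emp_ID INTEGER PRIMARY KEY, First_name TEXT, Last_name TEXT, "
    "Salary INTEGER, City TEXT)"
)

# INSERT: placeholders (?) let the driver handle quoting of string values.
conn.executemany(
    "INSERT INTO Employee_details (Emp_ID, First_name, Last_name, Salary, City) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (101, "Akhil", "Sharma", 40000, "Bangalore"),
        (10, "Ravi", "Verma", 30000, "Pune"),   # hypothetical row
        (11, "Sumit", "Rao", 35000, "Delhi"),   # hypothetical row
    ],
)

# UPDATE and DELETE exactly as in the examples above.
conn.execute("UPDATE Employee_details SET Salary = 100000 WHERE Emp_ID = 10")
conn.execute("DELETE FROM Employee_details WHERE First_name = 'Sumit'")

rows = conn.execute(
    "SELECT Emp_ID, First_name, Salary FROM Employee_details ORDER BY Emp_ID"
).fetchall()
print(rows)  # [(10, 'Ravi', 100000), (101, 'Akhil', 40000)]
```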

4. View in SQL
A view is a SQL statement stored in the database with a name linked to it. It can show all
table rows or only a few selected rows from the table. The user can create a view in SQL
using single or multiple tables. Users create a view so that the data stored in a
specific table can be represented as a virtual table. It also enables the administrator to
restrict access to the data so that a user can view or edit only the particular
part of the table they need, without changing the rest.

Creating Views
If the user wants to create a view in the database, then the user can do so with the
CREATE VIEW statement. The user can use a single table or multiple tables to
create views. Views are mainly created by the database administrator.

Syntax to Implement VIEW

CREATE VIEW view_name AS
SELECT column1, column2.....
FROM table_name
WHERE [condition];

We have used a single table in the above syntax, but the user can include multiple tables
in the SELECT statement using the same syntax used in any other SQL SELECT query.
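
A small sqlite3 sketch shows that a view stores the query, not the rows. The view name Bangalore_employees and every row except Akhil's are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee_details "
    "(Emp_ID INTEGER, First_name TEXT, Salary INTEGER, City TEXT)"
)
conn.executemany(
    "INSERT INTO Employee_details VALUES (?, ?, ?, ?)",
    [
        (101, "Akhil", 40000, "Bangalore"),
        (102, "Meera", 52000, "Delhi"),       # hypothetical row
        (103, "Nikhil", 45000, "Bangalore"),  # hypothetical row
    ],
)

# The view exposes only two columns and only the Bangalore rows.
conn.execute(
    "CREATE VIEW Bangalore_employees AS "
    "SELECT Emp_ID, First_name FROM Employee_details WHERE City = 'Bangalore'"
)
before = conn.execute("SELECT * FROM Bangalore_employees ORDER BY Emp_ID").fetchall()
print(before)  # [(101, 'Akhil'), (103, 'Nikhil')]

# A row added to the base table afterwards shows up in the view automatically.
conn.execute("INSERT INTO Employee_details VALUES (104, 'Asha', 48000, 'Bangalore')")
after = conn.execute("SELECT * FROM Bangalore_employees ORDER BY Emp_ID").fetchall()
print(len(after))  # 3
```

Salary is not visible through the view, which is how a view can be used to restrict access to part of a table.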

Query Processing in DBMS

Query Processing is the activity performed in extracting
data from the database. In query processing, it takes
various steps for fetching the data from the database.
The steps involved are:
1. Parsing and translation
2. Optimization
3. Evaluation
1.Parsing and Translation
SQL (Structured Query Language) is well suited for
humans, but it is not well suited for the system's
internal representation of a query. Relational
algebra is well suited for the internal representation of a
query. When a user executes a query, the parser in the
system generates the internal form of the query: it checks
the syntax of the query and verifies the names of the
relations and attributes used in it against the database.
The parser creates a tree of the query, known as a
'parse tree', which is further translated into
relational algebra. During this translation, every use of a
view in the query is also replaced by the view's definition.
Thus, we can understand the working of a query
processing in the below-described diagram:

Suppose a user executes a query. In SQL, a user wants to
fetch the names of employees whose salary is greater than
10000. For this, the following query is executed:
select emp_name from Employee where salary>10000;
Thus, to make the system understand the user query, it
needs to be translated into the form of relational algebra.
We can bring this query into relational algebra form as
either of the following equivalent expressions:
o πemp_name (σsalary>10000 (Employee))
o πemp_name (σsalary>10000 (πemp_name, salary (Employee)))
After translating the given query, we can execute each
relational algebra operation by using different
algorithms. So, in this way, query processing begins its
working.
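
The point that one query has several equivalent relational algebra forms can be checked on a small example. The sketch below evaluates two forms of the salary query over a hypothetical Employee relation held as a list of dictionaries; the sample rows and the dept attribute are assumptions.

```python
# Hypothetical Employee relation.
employee = [
    {"emp_name": "Asha", "salary": 9000, "dept": "CO"},
    {"emp_name": "Ravi", "salary": 12000, "dept": "IT"},
    {"emp_name": "Meera", "salary": 15000, "dept": "CO"},
]

# Form 1: select first, then project emp_name.
selected = [t for t in employee if t["salary"] > 10000]
plan_1 = [{"emp_name": t["emp_name"]} for t in selected]

# Form 2: drop unneeded attributes first, then select, then project.
narrow = [{"emp_name": t["emp_name"], "salary": t["salary"]} for t in employee]
plan_2 = [{"emp_name": t["emp_name"]} for t in narrow if t["salary"] > 10000]

print(plan_1)            # [{'emp_name': 'Ravi'}, {'emp_name': 'Meera'}]
print(plan_1 == plan_2)  # True: same answer, different order of operations
```

Both forms return the same rows, but they do different amounts of intermediate work, which is what the optimizer compares.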
2.Optimization
o The cost of query evaluation can vary for different
types of queries. Because the system is responsible
for constructing the evaluation plan, the user does
not need to write the query efficiently.
o Usually, a database system generates an efficient
query evaluation plan, which minimizes its cost. This
task is performed by the database system and
is known as Query Optimization.
o For optimizing a query, the query optimizer should
have an estimated cost analysis of each operation.
This is because the overall operation cost depends on the
memory allocated to the operations, execution
costs, and so on.

3.Evaluation
In addition to the relational algebra translation, it is
required to annotate the translated relational algebra
expression with instructions that specify how to
evaluate each operation. Thus, after translating the user
query, the system executes a query evaluation plan.
Query Evaluation Plan
o In order to fully evaluate a query, the system needs
to construct a query evaluation plan.
o The annotations in the evaluation plan may refer to the
algorithm to be used for a specific operation or the
particular index to be used.
o Relational algebra operations with such annotations are known
as Evaluation Primitives. Evaluation primitives carry the
instructions needed for the evaluation of an operation.
o Thus, a query evaluation plan defines a sequence of
primitive operations used for evaluating a query. The
query evaluation plan is also known as the query
execution plan.
o A query execution engine is responsible for
generating the output of the given query. It takes
the query execution plan, executes it, and finally
produces the output for the user query.

Finally, after selecting an evaluation plan, the system
evaluates the query and produces the output of the query.
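
Most database systems can display the plan they have chosen. As one example, SQLite (through Python's sqlite3 module) supports the EXPLAIN QUERY PLAN prefix. The sketch below is only an illustration: the index name idx_salary is hypothetical, and the exact wording of the plan text differs between SQLite versions, so no expected output is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (emp_name TEXT, salary INTEGER)")
conn.execute("CREATE INDEX idx_salary ON Employee (salary)")  # hypothetical index

# Ask SQLite which evaluation plan it would use for the query.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT emp_name FROM Employee WHERE salary > 10000"
).fetchall()

for step in plan:
    # The last column of each row is the human-readable plan step,
    # i.e. the annotation saying whether a scan or an index search is used.
    print(step[-1])
```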

Concurrency Control
Concurrency Control is the management procedure that
is required for controlling the concurrent execution of the
operations that take place on a database.
But before knowing about concurrency control, we
should know about concurrent execution.
Concurrent Execution in DBMS
o In a multi-user system, multiple users can access and
use the same database at one time, which is known
as concurrent execution of the database. It means
that the same database is accessed simultaneously
on a multi-user system by different users.
o While working on database transactions, multiple
users may need to perform different operations at the
same time, and in that case the database operations are
executed concurrently.
o The simultaneous execution should be performed in
such a manner that no operation affects the other
executing operations, thus maintaining the consistency
of the database. Making the execution of transaction
operations concurrent therefore raises several
challenging problems that need to be solved.

Problems with Concurrent Execution

In a database transaction, the two main operations
are READ and WRITE. These two operations must be
managed during concurrent execution of transactions,
because if they are interleaved without any control,
the data may become inconsistent. The following
problems occur with the concurrent execution of the
operations:

1. Lost Update Problem (W-W Conflict)

The problem occurs when two different database
transactions perform read/write operations on the
same database item in an interleaved manner (i.e.,
concurrent execution), which makes the value of the
item incorrect and hence the database inconsistent.
For example:
Consider two transactions TX and TY that are performed
on the same account A, where the balance of account A
is $300.
o At time t1, transaction TX reads the value of account
A, i.e., $300 (only read).
o At time t2, transaction TX deducts $50 from account
A, which becomes $250 (only deducted, not yet
written).
o At time t3, transaction TY reads the value of account
A, which is still $300 because TX has not written
its update yet.
o At time t4, transaction TY adds $100 to account A,
which becomes $400 (only added, not yet written).
o At time t6, transaction TX writes the value of account
A, which is updated to $250, as TY has not written
its value yet.
o Similarly, at time t7, transaction TY writes the value
of account A that it computed at time t4, i.e., $400.
It means the value written by TX is lost, i.e., $250
is lost.
Hence the data becomes incorrect (the balance is $400
instead of the $350 that both updates together should
give), and the database becomes inconsistent.
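
The schedule above can be replayed as a small Python sketch. It is only an illustration: the dictionary db stands in for the database, the local variables stand in for each transaction's private workspace, and the function names are hypothetical.

```python
def lost_update_schedule(balance):
    """Replays the interleaved schedule t1..t7 described above."""
    db = {"A": balance}
    tx_local = db["A"]      # t1: TX reads A
    tx_local -= 50          # t2: TX deducts 50 locally (not written yet)
    ty_local = db["A"]      # t3: TY reads A, still the old value
    ty_local += 100         # t4: TY adds 100 locally (not written yet)
    db["A"] = tx_local      # t6: TX writes its value
    db["A"] = ty_local      # t7: TY overwrites it, so TX's update is lost
    return db["A"]

def serial_schedule(balance):
    """Runs TX completely, then TY, with no interleaving."""
    db = {"A": balance}
    db["A"] = db["A"] - 50   # TX: read, deduct, write
    db["A"] = db["A"] + 100  # TY: read, add, write
    return db["A"]

print(lost_update_schedule(300))  # 400: the deduction of 50 is lost
print(serial_schedule(300))       # 350: both updates are applied
```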

2. Dirty Read Problem (W-R Conflict)

The dirty read problem occurs when one transaction
updates an item of the database and then fails, and
before the update is rolled back, the updated database
item is accessed by another transaction. This creates
a write-read conflict between the two transactions.
For example:
Consider two transactions TX and TY performing
read/write operations on account A, where the
available balance in account A is $300:

o At time t1, transaction TX reads the value of account
A, i.e., $300.
o At time t2, transaction TX adds $50 to account A,
which becomes $350.
o At time t3, transaction TX writes the updated value in
account A, i.e., $350.
o Then at time t4, transaction TY reads account A,
which is read as $350.
o Then at time t5, transaction TX rolls back due to a
server problem, and the value changes back to $300
(as initially).
o But transaction TY still holds the value $350 for
account A, a value that was never committed. This is
the dirty read, and the situation is therefore known
as the Dirty Read Problem.
Thus, in order to maintain consistency in the database
and avoid such problems that take place in concurrent
execution, management is needed, and that is where the
concept of Concurrency Control comes into play.
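
The dirty read schedule (t1 to t5) can be replayed in the same way. This is a minimal sketch assuming a database with no isolation at all, where uncommitted writes are immediately visible to other transactions; the names committed, working and dirty_read_schedule are hypothetical.

```python
def dirty_read_schedule(balance):
    """Replays the dirty read schedule t1..t5 with no isolation."""
    committed = {"A": balance}   # last committed state
    working = dict(committed)    # state visible to all transactions

    tx_local = working["A"]      # t1: TX reads A
    tx_local += 50               # t2: TX adds 50
    working["A"] = tx_local      # t3: TX writes A (uncommitted)
    ty_seen = working["A"]       # t4: TY reads the uncommitted value
    working = dict(committed)    # t5: TX rolls back, A is restored

    # TY still holds a value that never existed in a committed state.
    return ty_seen, working["A"]

seen_by_ty, actual = dirty_read_schedule(300)
print(seen_by_ty, actual)  # 350 300
```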

Concurrency Control
Concurrency Control is required for controlling and
managing the concurrent execution of database
operations and thus avoiding inconsistencies in the
database. For maintaining the consistency of the
database under concurrent execution, we have the
concurrency control protocols.

Concurrency Control Protocols

The concurrency control protocols ensure the atomicity,
consistency, isolation, durability and serializability of the
concurrent execution of the database transactions.
These protocols are categorized as:
o Lock Based Concurrency Control Protocol
o Timestamp Concurrency Control Protocol
o Validation Based Concurrency Control Protocol

1. Lock-Based Protocol
In this type of protocol, a transaction cannot read
or write a data item until it acquires an appropriate
lock on it. There are two types of lock:

a) Shared lock:
o It is also known as a Read-only lock. Under a shared
lock, the data item can only be read by the
transaction.
o It can be shared between transactions, because a
transaction that holds a shared lock cannot update
the data item.
b) Exclusive lock:
o Under an exclusive lock, the data item can be both
read and written by the transaction.
o This lock is exclusive: only one transaction can hold
it at a time, so multiple transactions cannot modify
the same data simultaneously.
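
The compatibility of the two lock types can be sketched as a tiny lock table. This is a simplified illustration, not a full lock manager: the class name LockManager and the mode letters "S" (shared) and "X" (exclusive) are assumptions, and a refused request simply returns False instead of putting the transaction in a wait queue.

```python
class LockManager:
    """Toy lock table: item -> (mode, set of transactions holding it)."""

    def __init__(self):
        self._locks = {}

    def acquire(self, txn, item, mode):
        """Return True if the lock is granted, False if txn would wait."""
        if mode not in ("S", "X"):
            raise ValueError("mode must be 'S' or 'X'")
        held = self._locks.get(item)
        if held is None:
            self._locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        if holders == {txn}:
            # The sole holder may re-request or upgrade its own lock.
            new_mode = "X" if "X" in (mode, held_mode) else "S"
            self._locks[item] = (new_mode, holders)
            return True
        if mode == "S" and held_mode == "S":
            holders.add(txn)  # shared locks are compatible with each other
            return True
        return False          # any combination involving X conflicts

    def release(self, txn, item):
        held = self._locks.get(item)
        if held is None:
            return
        _, holders = held
        holders.discard(txn)
        if not holders:
            del self._locks[item]

lm = LockManager()
print(lm.acquire("T1", "A", "S"))  # True
print(lm.acquire("T2", "A", "S"))  # True: shared with T1
print(lm.acquire("T3", "A", "X"))  # False: readers still hold A
```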

2. Timestamp Ordering Protocol

o The Timestamp Ordering Protocol is used to order
the transactions based on their timestamps. The
order of the transactions is the ascending order of
their creation.
o An older transaction has higher priority, which is
why it executes first. To determine the timestamp of
a transaction, this protocol uses the system time or
a logical counter.
o The lock-based protocol manages the order between
conflicting pairs of transactions at execution
time, whereas timestamp-based protocols start
working as soon as a transaction is created.
o Let's assume there are two transactions T1 and T2.
Suppose transaction T1 entered the system at time
007 and transaction T2 entered the system at time
009. T1 has the higher priority, so it executes
first, as it entered the system first.
o The timestamp ordering protocol also maintains the
timestamps of the last 'read' and 'write' operations
on each data item.
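
The notes above do not spell out how the read and write timestamps are checked, so the sketch below assumes the standard textbook rules of basic timestamp ordering: a read is rejected if a younger transaction has already written the item, and a write is rejected if a younger transaction has already read or written it. The names Item, Aborted, to_read and to_write are hypothetical.

```python
class Aborted(Exception):
    """Raised when an operation arrives too late and must be rolled back."""

class Item:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0   # timestamp of the youngest transaction that read it
        self.write_ts = 0  # timestamp of the youngest transaction that wrote it

def to_read(ts, item):
    """Read by a transaction with timestamp ts (assumed textbook rule)."""
    if ts < item.write_ts:
        raise Aborted("a younger transaction already wrote this item")
    item.read_ts = max(item.read_ts, ts)
    return item.value

def to_write(ts, item, value):
    """Write by a transaction with timestamp ts (assumed textbook rule)."""
    if ts < item.read_ts or ts < item.write_ts:
        raise Aborted("a younger transaction already used this item")
    item.value = value
    item.write_ts = ts

a = Item(300)
print(to_read(9, a))        # T2 (timestamp 9) reads A first
try:
    to_write(7, a, 250)     # T1 (timestamp 7) now writes too late
except Aborted as exc:
    print("T1 rolled back:", exc)
```

Because a rejected operation causes a rollback instead of a wait, no transaction ever blocks on another.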

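Advantages of TO protocol
o This protocol ensures serializability, since every
edge in the precedence graph runs from a transaction
with a smaller timestamp to one with a larger
timestamp, so the graph cannot contain a cycle.
o This protocol ensures freedom from deadlock, since
no transaction ever waits.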

3. Validation Based Protocol

The validation based protocol is also known as the
optimistic concurrency control technique. In the
validation based protocol, a transaction is executed in
the following three phases:
1. Read phase: In this phase, transaction T is executed.
It reads the values of the various data items and
stores them in temporary local variables. It performs
all its write operations on the temporary variables,
without updating the actual database.
2. Validation phase: In this phase, the temporary
variable values are validated against the actual
data to see whether serializability is violated.
3. Write phase: If the transaction passes validation,
the temporary results are written to the database
or system; otherwise the transaction is rolled back.

Each transaction Ti has the following timestamps, one
for each phase:

Start(Ti): The time when Ti started its execution.
Validation(Ti): The time when Ti finished its read
phase and started its validation phase.
Finish(Ti): The time when Ti finished its write phase.
o Serializability is determined during the validation
process. It can't be decided in advance.
o While executing transactions, the protocol allows a
greater degree of concurrency when there are few
conflicts.
o Thus it suits workloads in which transactions need
only a small number of rollbacks.
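
The notes do not state the validation test itself. The sketch below assumes the usual textbook test: a transaction Tj passes if, for every transaction Ti that validated before it, either Ti finished before Tj started, or Ti finished before Tj began validating and nothing Ti wrote was read by Tj. The names Txn and passes_validation and the sample timestamps are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Txn:
    start: int        # Start(Ti)
    validation: int   # Validation(Ti)
    finish: int       # Finish(Ti)
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)

def passes_validation(tj, earlier):
    """Check tj against every transaction that validated before it."""
    for ti in earlier:
        if ti.finish < tj.start:
            continue  # ti finished before tj began: no overlap at all
        overlap_ok = (ti.finish < tj.validation
                      and not (ti.write_set & tj.read_set))
        if overlap_ok:
            continue  # ti wrote nothing that tj read during the overlap
        return False  # tj may have read stale data: roll it back
    return True

t1 = Txn(start=1, validation=3, finish=5, read_set={"A"}, write_set={"A"})
t2 = Txn(start=6, validation=8, finish=9, read_set={"A"}, write_set={"A"})
t3 = Txn(start=2, validation=6, finish=7, read_set={"A"})
print(passes_validation(t2, [t1]))  # True: t2 started after t1 finished
print(passes_validation(t3, [t1]))  # False: t3 read A while t1 was writing it
```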
