Unit-3 - DBMS
SCSA1301 DATABASE MANAGEMENT SYSTEMS
UNIT 1 INTRODUCTION TO DATABASES
UNIT 2 DATABASE DESIGN
UNIT 3 QUERY PROCESSING
UNIT 4 RECOVERY AND SECURITY
UNIT 5 OBJECT DATABASE AND CURRENT TREND
UNIT 3
QUERY PROCESSING
Structured Query Languages
Data Definition Language(DDL) – Consists of commands
which are used to define the database
Data Manipulation Language(DML) – Consists of
commands which are used to manipulate the data present
in the database
Data Control Language(DCL) – Consists of commands
which deal with the user permissions and controls of the
database system
Transaction Control Language(TCL) – Consists of commands
which deal with the transactions of the database
Data Definition Language (DDL)
SQL commands
CREATE - This command is used to create the database or its
objects, e.g. tables
DROP - This command is used to delete tables from the
database
TRUNCATE - This is used to remove all records from a table
and to release the space allocated for those records
ALTER - This is used to alter the structure of the database
RENAME - This is used to rename a table existing in the
database
DDL - CREATE
CREATE TABLE
This statement is used to create a table
Syntax
CREATE TABLE TableName ( Column1 datatype, Column2
datatype, Column3 datatype, .... ColumnN datatype );
Example
CREATE TABLE Employee_Info
(EmployeeID int, EmployeeName varchar(20),
EmergencyContactName varchar(20), PhoneNumber varchar(15),
Address varchar(50), City varchar(10), Country
varchar(10));
DDL - CREATE
CREATE TABLE using another TABLE
Example
CREATE TABLE ExampleTable AS
SELECT EmployeeName, PhoneNumber
FROM Employee_Info;
DDL - DROP
This statement is used to drop (delete) an existing table or
a database, along with the data it holds
Syntax
DROP DATABASE Database_Name;
Example
DROP DATABASE Employee;
Syntax
DROP TABLE Table_Name;
Example
DROP TABLE Employee_Info;
DDL - TRUNCATE
This command is used to delete the information(records)
present in the table but does not delete the structure of
the table. So, once you use this command, your
information will be lost, but not the table.
Syntax
TRUNCATE TABLE TableName;
Example
TRUNCATE TABLE Employee_Info;
DDL - ALTER
This command is used to delete, modify or add constraints or
columns in an existing table
The ‘ALTER TABLE’ Statement with ADD/DROP COLUMN
Syntax
ALTER TABLE TableName ADD ColumnName Datatype;
ALTER TABLE TableName DROP COLUMN ColumnName;
Example
ADD Column BloodGroup:
ALTER TABLE Employee_Info ADD BloodGroup varchar(255);
DROP Column BloodGroup:
ALTER TABLE Employee_Info DROP COLUMN BloodGroup;
DDL - RENAME
This command is used to rename a table, or a column of a
table, with the help of ALTER TABLE
Syntax
ALTER TABLE table_name RENAME TO new_table_name;
ALTER TABLE table_name RENAME COLUMN old_name TO new_name;
Example
ALTER TABLE Student RENAME TO Student_Details;
ALTER TABLE Student_Details RENAME COLUMN NAME TO
FIRST_NAME;
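The DDL statements covered so far can be tried without a database server. The following is a minimal sketch that assumes Python's built-in sqlite3 module and an in-memory database, whereas the slides target MySQL. SQLite has no TRUNCATE statement, so that step is left out. The helper name table_names and the table name Contact_List are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database


def table_names(connection):
    """Return the names of all user tables, sorted alphabetically."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


# CREATE: define a new table
conn.execute(
    "CREATE TABLE Employee_Info ("
    "EmployeeID int, EmployeeName varchar(20), PhoneNumber varchar(15))"
)
# CREATE TABLE ... AS SELECT: build a table from another table
conn.execute(
    "CREATE TABLE ExampleTable AS "
    "SELECT EmployeeName, PhoneNumber FROM Employee_Info"
)
# ALTER ... ADD: add a column to an existing table
conn.execute("ALTER TABLE Employee_Info ADD BloodGroup varchar(255)")
# ALTER ... RENAME TO: rename a table
conn.execute("ALTER TABLE ExampleTable RENAME TO Contact_List")
after_rename = table_names(conn)
# DROP: remove a table together with its data
conn.execute("DROP TABLE Contact_List")
after_drop = table_names(conn)

# List the column names of Employee_Info to confirm the ALTER took effect
columns = [row[1] for row in conn.execute("PRAGMA table_info(Employee_Info)")]
conn.close()
print(after_rename, after_drop, columns)
```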
Data Manipulation Language (DML)
SQL commands
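SHOW – This command is used to list the databases and the
tables of a database
USE – This command is used to select the database
(SHOW and USE are MySQL utility statements rather than
standard DML; they are listed here because they are normally
issued before DML statements)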
INSERT – This command is used to insert data into a table
UPDATE – This command is used to update existing data
within a table
DELETE – This command is used to delete records from a
table
SELECT – This command is used to retrieve data from the
database table
DML - SHOW
This command is used to list the available databases, or the
tables of the selected database
Syntax
SHOW databases;
SHOW tables;
DML - USE
The USE statement is used to select the database on which
you want to perform operations
Syntax
USE DatabaseName;
Example
USE Employee;
DML - INSERT
This statement is used to insert new records into the database
table
Syntax
INSERT INTO TableName (Column1, Column2,
Column3, ...,ColumnN) VALUES (value1, value2, value3, ...);
Example
INSERT INTO Employee_Info (EmployeeID, EmployeeName,
EmergencyContactName, PhoneNumber, Address, City, Country)
VALUES (10, 'Arun', 'Kamal', '9921321141', 'No 12 Rich Street',
'Chennai', 'India');
INSERT INTO Employee_Info VALUES (10, 'Arun', 'Kamal',
'9921321141', 'No 12 Rich Street', 'Chennai', 'India');
DML - UPDATE
This statement is used to modify the records already
present in the table
Syntax
UPDATE TableName SET Column1 = Value1, Column2 =
Value2, ... WHERE Condition;
Example
UPDATE Employee_Info SET EmployeeName = 'Asha',
City = 'Hyderabad' WHERE EmployeeID = 1;
DML - DELETE
This statement is used to delete the existing records in a
table
Syntax
DELETE FROM TableName WHERE Condition;
Example
DELETE FROM Employee_Info WHERE
EmployeeName = 'Preeti';
DML - SELECT
This statement is used to select data from a database and
the data returned is stored in a result table, called
the result-set.
Syntax
SELECT Column1, Column2, ..., ColumnN FROM TableName;
* is used to select all the columns from the table
SELECT * FROM table_name;
Example
SELECT EmployeeID, EmployeeName FROM Employee_Info;
* is used to select all the columns from the table
SELECT * FROM Employee_Info;
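The four DML statements can be seen working together in a short script. This is a minimal sketch that assumes Python's built-in sqlite3 module in place of MySQL. The table is a cut-down version of Employee_Info, and the three sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE Employee_Info ("
    "EmployeeID int, EmployeeName varchar(20), City varchar(10))"
)

# INSERT: add new records (each ? placeholder is filled from the tuple)
cur.executemany(
    "INSERT INTO Employee_Info (EmployeeID, EmployeeName, City) "
    "VALUES (?, ?, ?)",
    [(1, "Arun", "Chennai"), (2, "Preeti", "Mumbai"), (3, "Kamal", "Delhi")],
)
# UPDATE: modify the record whose EmployeeID is 1
cur.execute(
    "UPDATE Employee_Info SET EmployeeName = 'Asha', City = 'Hyderabad' "
    "WHERE EmployeeID = 1"
)
# DELETE: remove the records that match the condition
cur.execute("DELETE FROM Employee_Info WHERE EmployeeName = 'Preeti'")
conn.commit()

# SELECT: the rows returned form the result-set
cur.execute(
    "SELECT EmployeeID, EmployeeName, City "
    "FROM Employee_Info ORDER BY EmployeeID"
)
result_set = cur.fetchall()
conn.close()
print(result_set)
```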
Data Control Language(DCL)
SQL commands
GRANT - gives users access privileges to the database
REVOKE - withdraws the access privileges given to users with
the GRANT command
DCL - GRANT
This command is employed to grant a privilege to a user. GRANT
command allows specified users to perform specified tasks
Syntax
GRANT privilege_names ON object TO user;
Example
CREATE USER 'UserName';
USE DatabaseName;
GRANT SELECT ON TableName TO 'UserName';
GRANT SELECT, UPDATE ON TableName TO 'UserName';
GRANT ALL ON TableName TO 'UserName';
DCL - REVOKE
It is employed to remove a privilege from a user. REVOKE helps
the owner to cancel previously granted permissions
Syntax
REVOKE privilege_name ON object_name FROM user;
Example
REVOKE SELECT ON TableName FROM 'UserName';
REVOKE ALL ON TableName FROM 'UserName';
Transaction Control Language(TCL)
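SQL Commands
COMMIT – This command is used to save data permanently
ROLLBACK – This command is used to restore the data to the
last savepoint or the last committed state
SAVEPOINT – This command is used to mark a point to which
the data can later be rolled back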
TCL - COMMIT
This command is used to save the data permanently into
the database
Whenever any DML commands like INSERT, DELETE or
UPDATE are performed, they can be rolled back as long as the
data has not been stored permanently
Syntax
COMMIT;
TCL - ROLLBACK
This command is used to restore the data
to the last savepoint or the last committed state
If for some reason the data inserted, deleted or
updated is not correct, you can roll back the data to a
particular savepoint or, if no savepoint was set, to the
last committed state
Syntax
ROLLBACK;
ROLLBACK TO SAVEPOINT name;
TCL - SAVEPOINT
This command is used to mark a particular point in a
transaction temporarily, so that the data can be rolled back
to that point whenever needed
Syntax
SAVEPOINT name;
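How COMMIT, SAVEPOINT and ROLLBACK interact is easiest to see on a single row. The sketch below assumes Python's built-in sqlite3 module rather than MySQL. The account table and the savepoint name after_deposit are hypothetical.

```python
import sqlite3

# isolation_level=None stops the module from opening transactions implicitly,
# so BEGIN, SAVEPOINT, ROLLBACK and COMMIT below are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id int, balance int)")

conn.execute("BEGIN")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.execute("COMMIT")  # the row is now stored permanently

conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = 150 WHERE id = 1")
conn.execute("SAVEPOINT after_deposit")
conn.execute("UPDATE account SET balance = 0 WHERE id = 1")  # a mistake
conn.execute("ROLLBACK TO SAVEPOINT after_deposit")  # undo only the mistake
balance_at_savepoint = conn.execute(
    "SELECT balance FROM account"
).fetchone()[0]

conn.execute("ROLLBACK")  # undo everything since the last COMMIT
balance_after_rollback = conn.execute(
    "SELECT balance FROM account"
).fetchone()[0]
conn.close()
print(balance_at_savepoint, balance_after_rollback)
```

Rolling back to the savepoint keeps the deposit but discards the mistaken update; the plain ROLLBACK then returns the row to its last committed value.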
EMBEDDED SQL
Database languages are meant for working with databases,
viz. defining, constructing, manipulating and sharing them
All these operations can be performed by a good
database developer or an SQL programmer
But in practice, these developers or programmers are not the
ones who use the database. The actual use of the DB is by the
normal user, for whom the DB is a black box
To make it easy for the normal user, applications,
UIs or forms are created where the user can enter values or
requirements, and the application program handles
the request
EMBEDDED SQL
Applications are developed using some general-purpose
languages like C, C++, JAVA, PYTHON, PHP, etc.
These applications use a UI to receive the user's requests,
but the application language by itself has no connection to the DB
Both application language and DB language are completely
different from each other
This gap between application programs and SQL is bridged
by the use of embedded SQL
EMBEDDED SQL - C
In this example (a C program that uses the SQLite C library):
The code opens the database or creates it if it doesn't exist.
An SQL query is defined to retrieve names of employees with salaries
greater than $50,000.
The sqlite3_exec function is used to execute the query. It takes a callback
function (callback in this case) that handles the results.
The callback function is called for each row returned by the query. It prints
the column names and values for each row.
After executing the query, the database is closed.
Remember that this is a basic example to demonstrate the concept of
embedded SQL in C. In real-world scenarios, you would need to handle
error checking more comprehensively, manage memory properly, and
handle more complex queries and operations.
EMBEDDED SQL
Embedded SQL combines a high-level
language with the DB language. It allows application
languages to communicate with the DB and get the requested
result
A high-level language that supports embedding SQL
within it is known as the host language
Embedded SQL statements are SQL statements written
inline with the program source code of the host language
EMBEDDED SQL - PYTHON
Python needs a MySQL driver to access the MySQL
database
Download and install "MySQL Connector":
python -m pip install mysql-connector-python
EMBEDDED SQL - PYTHON
A cursor is a temporary memory area allocated by the database server when a user performs
operations on the database
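The connect, cursor, execute and fetch pattern can be sketched as follows. To keep the example runnable without a MySQL server, it assumes Python's built-in sqlite3 module instead of MySQL Connector; both follow the same Python database API pattern. The employee table, its rows and the function name high_earners are hypothetical.

```python
import sqlite3


def high_earners(connection, min_salary):
    """Return the names of employees earning more than min_salary."""
    cursor = connection.cursor()  # the cursor carries the query and its rows
    # min_salary is a host-language variable bound to the ? placeholder
    # (sqlite3 uses ? as the placeholder; MySQL Connector uses %s)
    cursor.execute(
        "SELECT name FROM employee WHERE salary > ? ORDER BY name",
        (min_salary,),
    )
    names = [name for (name,) in cursor.fetchall()]
    cursor.close()
    return names


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name varchar(20), salary int)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Arun", 62000), ("Asha", 48000), ("Kamal", 75000)],
)
conn.commit()

result = high_earners(conn, 50000)
conn.close()
print(result)
```

Passing the value through a placeholder, rather than pasting it into the SQL string, is what lets the host language hand its variables to the embedded statement safely.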
Query Processing and Optimization
Query processing
Query processing is the set of activities performed to extract
data from the database.
Fetching the data from the database takes several steps.
Query processing involves 3 basic steps: parsing and
translation, optimization, and evaluation
Query Processing & Optimization
Query processing – activities involved in retrieving
data from the database
It covers translating the SQL query from the high-level
language into a low-level form based on relational algebra,
and then executing the query
Query optimization – selection of an efficient query
execution plan
Query Processing & Optimization
Before processing the SQL query the system must
translate the query into a language that the system
understands
For this purpose, the more useful internal representation is
one based on relational algebra
Query Processing & Optimization
[Figure: steps in query processing, from the high-level query to an
internal form (after checking syntax and relations) to the best expression]
Query Processing & Optimization
Parser – checks the syntax and verifies that the relations
mentioned in the query exist in the actual database
Translator – translates the query into relational algebra
expressions
Optimizer – collects statistical information relevant to the query,
like CPU cycles, memory access time, etc., and estimates the
cost of the query
Execution plan – tells the evaluation engine how to
perform query execution, based on the inputs from the optimizer
Evaluation engine – accesses the data from the database and
generates the required result
Query Processing & Optimization
Example:
SELECT marks
FROM student
WHERE marks > 90;
This query can be translated into either of the following
equivalent relational algebra expressions
σ marks>90 (∏ marks (student))
∏ marks (σ marks>90 (student))
Query Processing & Optimization
Consider the relational algebra expressions for the query
Find the names of the bank customers who have an
account at any branch located in Chennai
Type-1
∏customer_name(σbranch_city='Chennai'(branch ⋈ (account ⋈ depositor)))
Type-2
∏customer_name((σbranch_city='Chennai'(branch)) ⋈ (account ⋈ depositor))
Both expressions give the same result, but Type-2 applies the
selection to branch before joining, so the join works on fewer tuples
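The two forms can be compared on a tiny data set. The sketch below is written under several assumptions: the relations are combined by a natural join on their shared attributes, the attribute names branch_name and account_number are invented to link the relations, and the rows are sample data. The helper functions are hypothetical stand-ins for the relational algebra operators.

```python
def natural_join(left, right):
    """Join two lists of dict rows on all attribute names they share."""
    joined = []
    for l_row in left:
        for r_row in right:
            shared = set(l_row) & set(r_row)
            if all(l_row[a] == r_row[a] for a in shared):
                joined.append({**l_row, **r_row})
    return joined


def select(rows, predicate):
    """Selection: keep only the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def project(rows, attribute):
    """Projection onto one attribute, duplicates removed, sorted."""
    return sorted({row[attribute] for row in rows})


def in_chennai(row):
    return row["branch_city"] == "Chennai"


branch = [
    {"branch_name": "Adyar", "branch_city": "Chennai"},
    {"branch_name": "Andheri", "branch_city": "Mumbai"},
]
account = [
    {"account_number": "A-101", "branch_name": "Adyar"},
    {"account_number": "A-102", "branch_name": "Andheri"},
    {"account_number": "A-103", "branch_name": "Adyar"},
]
depositor = [
    {"customer_name": "Arun", "account_number": "A-101"},
    {"customer_name": "Asha", "account_number": "A-102"},
    {"customer_name": "Kamal", "account_number": "A-103"},
]

# Type-1: join everything first, then select
type1_input = natural_join(branch, natural_join(account, depositor))
type1 = project(select(type1_input, in_chennai), "customer_name")

# Type-2: select on branch first, then join
type2_input = natural_join(
    select(branch, in_chennai), natural_join(account, depositor)
)
type2 = project(type2_input, "customer_name")

print(type1, type2, len(type1_input), len(type2_input))
```

Both forms return the same customers, but the Type-2 join produces a smaller intermediate result, which is the kind of difference an optimizer looks for.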
Transaction processing concepts
Transaction processing concept
A transaction is a program including a collection of database
operations, executed as a logical unit of data processing
The operations performed in a transaction include one or more
database operations like insert, delete, update or retrieve data
It is an atomic process that is either performed to completion
entirely or is not performed at all
A transaction involving only data retrieval without any data
update is called a read-only transaction
Transaction processing concept
Each high-level operation can be divided into a number of
low-level tasks or operations
read_item() − reads data item from storage to main
memory
modify_item() − change value of an item in the main
memory
write_item() − write the modified value from main memory
to storage
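The three low-level operations can be sketched with two dictionaries, one standing in for storage and one for main memory. This is an illustrative model only; the data item X and its values are hypothetical.

```python
storage = {"X": 100}  # stands in for the database on disk
buffer = {}           # stands in for main memory


def read_item(name):
    """Copy a data item from storage into main memory."""
    buffer[name] = storage[name]


def modify_item(name, new_value):
    """Change the in-memory copy only; storage is untouched."""
    buffer[name] = new_value


def write_item(name):
    """Write the in-memory value back to storage."""
    storage[name] = buffer[name]


read_item("X")
modify_item("X", buffer["X"] - 30)
value_on_disk_before_write = storage["X"]  # still the old value
write_item("X")
value_on_disk_after_write = storage["X"]   # now the modified value
print(value_on_disk_before_write, value_on_disk_after_write)
```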
Transaction operations
The low-level operations performed in a transaction are
begin_transaction − A marker that specifies the start of transaction
execution
read_item or write_item − Database operations that may be
interleaved with main memory operations as a part of the transaction
end_transaction − A marker that specifies the end of the transaction
commit − A signal to specify that the transaction has been
successfully completed in its entirety and will not be undone
rollback − A signal to specify that the transaction has been
unsuccessful and so all temporary changes in the database are
undone. A committed transaction cannot be rolled back
Transaction states
A transaction may go through a subset of five states:
active, partially committed, committed, failed, and aborted
Active − This is the initial state that a transaction
enters. The transaction remains in this state while
it is executing read, write, or other operations
Partially Committed − The transaction enters this state
after the last statement of the transaction has been
executed
Committed − The transaction enters this state after
it has completed successfully and the system
checks have issued a commit signal
Transaction states
Failed − The transaction goes from partially committed
state or active state to failed state when it is discovered
that normal execution can no longer proceed or system
checks fail
Aborted − This is the state after the transaction has been
rolled back after failure and the database has been
restored to the state it was in before the transaction began
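The state descriptions amount to a small state machine. The sketch below encodes the transitions they name; the class and method names are hypothetical, and restarting an aborted transaction is not modelled.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    COMMITTED = "committed"
    FAILED = "failed"
    ABORTED = "aborted"


# Allowed moves, taken from the state descriptions
TRANSITIONS = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},
    State.COMMITTED: set(),
    State.ABORTED: set(),
}


class Transaction:
    def __init__(self):
        self.state = State.ACTIVE  # every transaction starts as active

    def move_to(self, new_state):
        """Change state, rejecting any move the state diagram forbids."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"illegal transition: {self.state.value} -> {new_state.value}"
            )
        self.state = new_state


ok = Transaction()
ok.move_to(State.PARTIALLY_COMMITTED)
ok.move_to(State.COMMITTED)

bad = Transaction()
bad.move_to(State.FAILED)
bad.move_to(State.ABORTED)
print(ok.state.value, bad.state.value)
```

Because committed has no outgoing transitions, calling move_to on a committed transaction raises an error, which matches the rule that a committed transaction cannot be rolled back.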
ACID Properties
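Any transaction must maintain the ACID properties, viz. Atomicity,
Consistency, Isolation, and Durability
Atomicity − the transaction is performed entirely or not at all
Consistency − the transaction takes the database from one
consistent state to another
Isolation − concurrent transactions do not interfere with each other
Durability − the changes of a committed transaction are permanent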
Concurrency control techniques
Concurrency Control in DBMS is a procedure of managing
simultaneous transactions while ensuring their atomicity, isolation,
consistency, and serializability.
Several problems arise when numerous transactions access and
modify the same data at the same time. Concurrency control lets
such transactions execute without interfering with each other's
operations. It helps to prevent the data inconsistencies and
anomalies that can occur when multiple transactions try to access
and modify the same data concurrently.
Concurrency Control
Definition of Database Concurrency
Database concurrency is the ability of a database to let two or
more users retrieve information from it at the same
time without affecting data integrity.
Concurrency Control Protocols
Lock-Based Protocols
Two Phase Locking Protocol
Timestamp-Based Protocols
Validation-Based Protocols
Lock-based Protocols
Lock Based Protocol in DBMS is a mechanism in which a
transaction cannot read or write a data item until it acquires an
appropriate lock on it
Lock based protocols help to eliminate the concurrency problem
in DBMS for simultaneous transactions by restricting access to a
locked data item to the transaction or transactions holding the lock
A lock is a data variable that is associated with a data item
All lock requests are made to the concurrency-control manager.
A transaction proceeds only once its lock request is granted
Lock-based Protocols
Shared Lock (S) - A shared lock is also called a Read-only
lock. With the shared lock, the data item can be shared
between transactions.
This is because a transaction holding a shared lock never has
permission to update the data item
For example, consider a case where two transactions are
reading the account balance of a person. The database will
let them read by placing a shared lock. However, if another
transaction wants to update that account’s balance, the
shared lock prevents it until the reading process is over
Lock-based Protocols
Exclusive Lock (X) - With the exclusive lock, a data item can be
read as well as written. The lock is exclusive and can’t be held
concurrently with any other lock on the same data item
An X-lock is requested using the lock-x instruction. A transaction
may unlock the data item after finishing the ‘write’
operation
For example, consider a transaction that needs to update the
account balance of a person. The database allows this transaction
by placing an X lock on the balance. Therefore, when a second
transaction wants to read or write the balance, the exclusive lock
prevents this operation
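The compatibility rules for shared and exclusive locks can be sketched as a very small lock manager. This is a simplified, hypothetical model: a refused request just returns False instead of queueing the transaction, and lock upgrades and repeated requests by the same transaction are not handled.

```python
class LockManager:
    """Grants S and X locks on data items (illustrative sketch)."""

    def __init__(self):
        self.locks = {}  # item -> (mode, set of transactions holding it)

    def request(self, txn, item, mode):
        """Return True if the lock is granted, False if txn must wait."""
        if item not in self.locks:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = self.locks[item]
        if mode == "S" and held_mode == "S":
            holders.add(txn)  # shared locks are compatible with each other
            return True
        return False  # any combination involving an X lock conflicts

    def release(self, txn, item):
        """Give up txn's lock; drop the entry when nobody holds it."""
        _, holders = self.locks[item]
        holders.discard(txn)
        if not holders:
            del self.locks[item]


lm = LockManager()
r1 = lm.request("T1", "balance", "S")        # first reader
r2 = lm.request("T2", "balance", "S")        # second reader shares the lock
w3 = lm.request("T3", "balance", "X")        # writer is blocked by readers
lm.release("T1", "balance")
lm.release("T2", "balance")
w3_retry = lm.request("T3", "balance", "X")  # readers are done
r1_again = lm.request("T1", "balance", "S")  # blocked by the writer
print(r1, r2, w3, w3_retry, r1_again)
```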
Starvation and Deadlock
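Starvation is the situation when a transaction needs to
wait for an indefinite period to acquire a lock
Deadlock is the situation when two or more transactions each
wait for a lock held by another of them, so that none of them
can proceed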
Two-Phase Locking Protocol
Two Phase Locking Protocol, also known as the 2PL protocol, is a
method of concurrency control in DBMS that ensures
serializability by applying locks to the data a transaction uses,
which blocks other transactions from accessing the same data
simultaneously
This locking protocol divides the execution phase of a
transaction into three different parts
Two-Phase Locking Protocol
In the first part, when the transaction begins to execute,
it requests permission for the locks it needs
The second part is where the transaction obtains all the
locks. When the transaction releases its first lock, the third
part starts
In this third part, the transaction cannot demand any
new locks. Instead, it only releases the acquired locks
The first two parts make up the growing phase and the third part
is the shrinking phase, which gives the protocol its name
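The rule that no lock may be requested after the first release can be enforced with a single flag. The sketch below tracks the locks of one transaction only and does not model conflicts between transactions; the class name TwoPhaseTransaction is hypothetical.

```python
class TwoPhaseTransaction:
    """Tracks one transaction's locks and enforces the two-phase rule."""

    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False  # becomes True at the first release

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(
                f"{self.name}: cannot lock {item} after releasing a lock"
            )
        self.held.add(item)

    def unlock(self, item):
        self.held.remove(item)
        self.shrinking = True  # from now on, only releases are allowed


t1 = TwoPhaseTransaction("T1")
t1.lock("A")
t1.lock("B")    # still acquiring locks: allowed
t1.unlock("A")  # first release: no new locks from here on
try:
    t1.lock("C")
    violation = None
except RuntimeError as err:
    violation = str(err)
print(sorted(t1.held), violation)
```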
Two-Phase Locking Protocol
Strict Two-Phase Locking is almost the same as 2PL.
The only difference is that Strict-2PL does not release a lock
right after using it. It holds all the locks until the commit point
and releases them all in one go when the transaction is over
In Centralized 2PL, a single site is responsible for the lock
management process. It has only one lock manager for the
entire DBMS
In the Primary copy 2PL mechanism, many lock managers are
distributed to different sites. Each lock
manager is responsible for managing the locks for a set of data
items. When the primary copy has been updated, the change is
propagated to the slaves
Two-Phase Locking Protocol
Distributed 2PL - In this kind of two-phase locking
mechanism, lock managers are distributed to all sites.
Each one is responsible for managing the locks for the data at
its site. If no data is replicated, it is equivalent to primary
copy 2PL
Timestamp-based Protocol
Timestamp based Protocol in DBMS is an algorithm that
uses the System Time or a Logical Counter as a timestamp
to serialize the execution of concurrent transactions
The Timestamp-based protocol ensures that all
conflicting read and write operations are executed in
timestamp order
The older transaction is always given priority in this
method. The timestamp of a transaction is assigned when the
transaction starts. This is a commonly used
concurrency protocol
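The slide does not spell out the read and write rules, so the sketch below assumes the basic timestamp-ordering rules as they are usually presented in textbooks: each data item remembers the largest timestamps that have read and written it, and an operation that arrives too late is rejected. The names DataItem, read and write are hypothetical.

```python
class DataItem:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0   # largest timestamp of any reader so far
        self.write_ts = 0  # largest timestamp of any writer so far


def read(item, ts):
    """Return the value, or None if the reader must be rolled back."""
    if ts < item.write_ts:  # a younger transaction already overwrote it
        return None
    item.read_ts = max(item.read_ts, ts)
    return item.value


def write(item, ts, value):
    """Return True if the write is applied, False if it must be rolled back."""
    if ts < item.read_ts or ts < item.write_ts:
        return False  # a younger transaction already read or wrote the item
    item.value = value
    item.write_ts = ts
    return True


x = DataItem(10)
first = read(x, ts=2)                  # allowed; read_ts becomes 2
late_write = write(x, ts=1, value=99)  # rejected: ts 1 is older than read_ts 2
ok_write = write(x, ts=3, value=20)    # allowed; write_ts becomes 3
stale_read = read(x, ts=2)             # rejected: ts 2 is older than write_ts 3
print(first, late_write, ok_write, stale_read, x.value)
```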
Validation based Protocol
Validation based Protocol in DBMS, also known as the
Optimistic Concurrency Control Technique, is a method to
avoid conflicts between concurrent transactions
In this protocol, the local copies of the transaction data are
updated rather than the data itself, which results in less
interference during execution of the transaction
Validation based Protocol
The Validation based Protocol is performed in the following
three phases:
Read Phase - the data values from the database can be
read by a transaction but the write operation or updates are
only applied to the local data copies, not the actual database
Validation Phase - the data is checked to ensure that there
is no violation of serializability while applying the transaction
updates to the database
Write Phase - the updates are applied to the database if
the validation is successful; otherwise, the updates are not
applied, and the transaction is rolled back
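The three phases can be sketched with local copies and a simple validation check. The version counters used for validation here are an assumption made to keep the example short; they are one possible way to detect a conflict, not the specific test the slides refer to. All names are hypothetical.

```python
database = {"X": 100, "Y": 50}
versions = {"X": 0, "Y": 0}  # bumped on every committed write


class OptimisticTransaction:
    def __init__(self):
        self.seen = {}   # item -> version observed during the read phase
        self.local = {}  # item -> locally updated value

    def read(self, item):
        if item in self.local:
            return self.local[item]
        self.seen[item] = versions[item]
        return database[item]

    def write(self, item, value):
        self.local[item] = value  # read phase: update the local copy only

    def commit(self):
        # Validation phase: fail if anything that was read has changed since
        if any(versions[item] != v for item, v in self.seen.items()):
            self.local.clear()  # rolled back: discard the local copies
            return False
        # Write phase: apply the local copies to the database
        for item, value in self.local.items():
            database[item] = value
            versions[item] += 1
        return True


t1 = OptimisticTransaction()
t2 = OptimisticTransaction()
t1.write("X", t1.read("X") - 10)  # local copy of X becomes 90
t2.write("X", t2.read("X") + 5)   # local copy of X becomes 105
t1_committed = t1.commit()        # validates and writes X = 90
t2_committed = t2.commit()        # fails: X changed after t2 read it
print(t1_committed, t2_committed, database["X"])
```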