Module4 PartA

Uploaded by

justice.chitra.v
Copyright
© All Rights Reserved
CSE2005

Database Management Systems


Module 4:
QUERY PROCESSING AND
TRANSACTION PROCESSING

1
Module 4
Module:4 QUERY PROCESSING AND
TRANSACTION PROCESSING
Translating SQL Queries into Relational Algebra –
heuristic query optimization – Introduction to
Transaction Processing – Transaction and System
concepts - Desirable properties of Transactions –
Characterizing schedules based on recoverability –
Characterizing schedules based on serializability

2
outline

 Relational Algebra
 Unary Relational Operations
 Relational Algebra Operations
From Set Theory
 Binary Relational Operations
 Additional Relational
Operations
 Examples of Queries in
Relational Algebra

3
Relational Algebra

 The basic set of operations for the


relational model is known as the
relational algebra.
 These operations enable a user to
specify basic retrieval requests.
• The result of a retrieval is a new
relation, which may have been
formed from one or more
relations.
 A sequence of relational algebra
operations forms a relational
algebra expression

4
Unary Relational Operations

 SELECT (symbol: σ)
 PROJECT (symbol: π)
 RENAME (symbol: ρ)

5
Example Schema for illustration

6
SELECT (σ)

• The SELECT operation selects the subset of the tuples of a relation that satisfy a given selection condition.
• It is denoted by the symbol σ (sigma).

7
Select Operation - Example

The select operator selects the tuples that satisfy a given predicate.
Notation: σp(r)
σ is the selection operator
r is the relation (the name of the table)
p is the selection predicate, a formula in propositional logic
Example: σ SALARY > 30,000 (EMPLOYEE)
Output: the tuples of the EMPLOYEE relation where SALARY > 30,000.

8
Projection(π)

• The projection operation keeps only the attributes named in the projection list and discards all other attributes of the input relation.
• Because the result is a relation (a set of tuples), duplicate tuples are eliminated.
• It is denoted by the symbol π (pi).
Example: To list each employee's first and last name and salary, the following is used:

π LNAME, FNAME, SALARY (EMPLOYEE)
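The two unary operations above can be sketched in a few lines of Python. This is only an illustration: relations are modelled as lists of dictionaries, the helper names `select` and `project` are made up for this sketch, and the EMPLOYEE rows are invented sample data.

```python
# Invented sample rows; only the attribute names come from the slides.
EMPLOYEE = [
    {"FNAME": "John", "LNAME": "Smith", "SALARY": 30000, "DNO": 5},
    {"FNAME": "Alicia", "LNAME": "Zelaya", "SALARY": 25000, "DNO": 4},
    {"FNAME": "Ramesh", "LNAME": "Narayan", "SALARY": 38000, "DNO": 5},
    {"FNAME": "Joyce", "LNAME": "English", "SALARY": 25000, "DNO": 5},
]


def select(relation, predicate):
    """sigma_p(r): keep the tuples of r that satisfy the predicate p."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """pi_attrs(r): keep only the listed attributes and drop duplicate tuples."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:  # a relation is a set, so duplicates are removed
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


# sigma SALARY > 30000 (EMPLOYEE)
high_paid = select(EMPLOYEE, lambda t: t["SALARY"] > 30000)
# pi SALARY (EMPLOYEE): the two 25000 rows collapse into one
salaries = project(EMPLOYEE, ["SALARY"])
```

Note that `project` returns three tuples for four input rows, which is the duplicate elimination described in the projection slide.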

9
Illustration – Select and Project
Operations for Company Schema

10
Rename (ρ)

 Rename is a unary operation


used for renaming attributes of a
relation.

• ρ a/b (R) renames the attribute 'b' of relation R to 'a'.

11
Rename (ρ)
(contd.)

• Example:
π FNAME, LNAME, SALARY (σ DNO=5 (EMPLOYEE))
Or we can show the sequence of operations explicitly, giving a name to each
intermediate relation:

DEP5_EMPS ← σ DNO=5 (EMPLOYEE)
RESULT ← π FNAME, LNAME, SALARY (DEP5_EMPS)

12
Relational Algebra Operations From
Set Theory

• UNION (∪)
• INTERSECTION (∩)
• DIFFERENCE (−)
• CARTESIAN PRODUCT (×)

13
Union operation (∪)

• UNION is denoted by the symbol ∪.
• The result includes all tuples that are in A, in B, or in both.
• Duplicate tuples are eliminated.
• The union of A and B is expressed as:
Result ← A ∪ B

14
Union operation (∪)
contd.

• For a union operation R ∪ S to be valid, the following conditions must hold:
• R and S must have the same number of attributes.
• The domains of corresponding attributes must be compatible.
• Duplicate tuples are automatically removed from the result.

15
Example
 To retrieve the social security
numbers of all employees who
work in department 5 (Result 1
below) or directly supervise an
employee who works in
department 5 (Result 2 below)

16
Set Difference (-)

• It is denoted by the symbol −. The result of A − B is a relation that includes
all tuples that are in A but not in B.
• A and B must be union compatible: they must have the same number of
attributes, and corresponding attributes must have compatible domains.

17
Intersection

• Intersection is denoted by the symbol ∩:
A ∩ B
• It defines a relation consisting of all tuples that are in both A and B.
• A and B must be union compatible.
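The three set operations can be tried out with Python's built-in sets, treating each relation as a set of tuples. This is a sketch under assumptions: the SSN values are invented, `RESULT1` and `RESULT2` stand in for the two intermediate results of the union example, and `union_compatible` checks only the number of attributes, not their domains.

```python
def union_compatible(schema_a, schema_b):
    """Simplified check: same number of attributes (domains are not compared)."""
    return len(schema_a) == len(schema_b)


SCHEMA = ("SSN",)

# Invented SSNs: employees who work in department 5 ...
RESULT1 = {("123456789",), ("333445555",), ("453453453",)}
# ... and employees who directly supervise someone in department 5.
RESULT2 = {("333445555",), ("888665555",)}

if union_compatible(SCHEMA, SCHEMA):
    union = RESULT1 | RESULT2         # in RESULT1, in RESULT2, or in both
    intersection = RESULT1 & RESULT2  # in both
    difference = RESULT1 - RESULT2    # in RESULT1 but not in RESULT2
```

Because Python sets never hold duplicates, the tuple that appears in both inputs shows up only once in `union`.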

18
Relational Algebra Operations From
Set Theory (cont.)

19
Cartesian product (×)

• This operation combines every tuple of one relation with every tuple of the
other, merging the columns of the two relations.
• It becomes meaningful when it is followed by other operations.
Example:

FEMALE_EMPS ← σ SEX='F' (EMPLOYEE)
EMPNAMES ← π FNAME, LNAME, SSN (FEMALE_EMPS)

EMP_DEPENDENTS ← EMPNAMES × DEPENDENT

20
Join Operations

• A join operation is essentially a Cartesian product followed by a selection criterion.

• The join operation is denoted by ⋈.

• JOIN combines related tuples from different relations into single tuples.
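The statement that a join is a Cartesian product followed by a selection can be checked directly. The sketch below assumes relations are lists of dictionaries with distinct attribute names; `cartesian_product` and `theta_join` are illustrative helper names, and the DEPARTMENT and EMPLOYEE rows are invented.

```python
from itertools import product


def cartesian_product(r, s):
    """r x s: pair every tuple of r with every tuple of s."""
    return [{**a, **b} for a, b in product(r, s)]


def theta_join(r, s, condition):
    """r join s: Cartesian product followed by a selection on the condition."""
    return [t for t in cartesian_product(r, s) if condition(t)]


# Invented sample data; attribute names must not clash between the relations.
DEPARTMENT = [
    {"DNAME": "Research", "MGRSSN": "333"},
    {"DNAME": "Administration", "MGRSSN": "987"},
]
EMPLOYEE = [
    {"SSN": "333", "LNAME": "Wong"},
    {"SSN": "987", "LNAME": "Wallace"},
    {"SSN": "123", "LNAME": "Smith"},
]

# DEPARTMENT joined with EMPLOYEE on MGRSSN = SSN
DEPT_MGR = theta_join(DEPARTMENT, EMPLOYEE, lambda t: t["MGRSSN"] == t["SSN"])
```

The product has 2 × 3 = 6 tuples, of which the join condition keeps the two that pair a department with its manager.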

21
Types of joins
 Various forms of join
operation are:
 Inner Joins
 Theta join
 EQUI join
 Natural join
 Outer joins
 Left Outer Join
 Right Outer Join
 Full Outer Join

22
SQL Joins
• An SQL join fetches data from two or more tables and presents it as a single set of data.
• It combines columns from two or more tables by using values common to the tables.
• Types of Join
• Inner
• Outer (Left, Right)
• Cross
• Natural
• Cartesian
Different Types of SQL JOINs
Inner Join
• The INNER JOIN keyword selects records that have matching values in
both tables.
Explicit Inner Join
• An explicit inner join names the join in the FROM clause, using
INNER JOIN ... ON with the join condition.
Implicit Inner Join
• An implicit inner join lists the tables in the FROM clause (a Cartesian
product) and gives the join condition in the WHERE clause.

SELECT * FROM CLASS, CLASSINFO
WHERE CLASS.ID = CLASSINFO.ID;
Equi Join
• An equi join is an inner join whose condition uses only equality comparisons.
Natural Join
• A natural join matches rows on the columns that have the same name in
both tables.
Cross Join
• A CROSS JOIN returns the Cartesian product: every row of the first table
paired with every row of the second.

Result rows (CLASS columns first, then CLASSINFO columns):
2  BBB  1  CHENNAI
2  BBB  2  MUMBAI
2  BBB  3  DELHI
4  DDD  1  CHENNAI
4  DDD  2  MUMBAI
4  DDD  3  DELHI
Cross Join (with Equality)
• Adding the equality condition CLASS.ID = CLASSINFO.ID to the cross join
keeps only the matching rows, which gives the same result as the inner join.

2  BBB  2  MUMBAI
Left Outer Join
• LEFT (OUTER) JOIN: Returns all records from the left table, and the
matched records from the right table.

2  BBB  MUMBAI
4  DDD  (null)
Right Outer Join
• RIGHT (OUTER) JOIN: Returns all records from the right table, and the
matched records from the left table.

2  BBB     MUMBAI
3  (null)  DELHI
Full Outer Join
• FULL (OUTER) JOIN: Returns all records from both tables, combining the
matched records and padding the unmatched ones with nulls.
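The SQL join variants can be run against an in-memory SQLite database from Python's standard `sqlite3` module. This is a sketch, not the slides' own setup: the column names NAME and ADDRESS are assumptions (only ID appears in the slides), and the rows are the values that survive in the result tables. RIGHT and FULL OUTER JOIN are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NAME and ADDRESS are assumed column names; ID is taken from the slides.
conn.executescript("""
    CREATE TABLE CLASS (ID INTEGER, NAME TEXT);
    CREATE TABLE CLASSINFO (ID INTEGER, ADDRESS TEXT);
    INSERT INTO CLASS VALUES (2, 'BBB'), (4, 'DDD');
    INSERT INTO CLASSINFO VALUES (1, 'CHENNAI'), (2, 'MUMBAI'), (3, 'DELHI');
""")


def run(sql):
    """Execute a query and return all result rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# Implicit inner join: tables in FROM, join condition in WHERE.
implicit_inner = run(
    "SELECT * FROM CLASS, CLASSINFO WHERE CLASS.ID = CLASSINFO.ID")
# Explicit inner join: INNER JOIN ... ON.
explicit_inner = run(
    "SELECT * FROM CLASS INNER JOIN CLASSINFO ON CLASS.ID = CLASSINFO.ID")
# Cross join: every CLASS row paired with every CLASSINFO row.
cross = run("SELECT * FROM CLASS CROSS JOIN CLASSINFO")
# Left outer join: unmatched CLASS rows are padded with NULL (None in Python).
left_outer = run(
    "SELECT * FROM CLASS LEFT OUTER JOIN CLASSINFO "
    "ON CLASS.ID = CLASSINFO.ID ORDER BY CLASS.ID")
```

The implicit and explicit forms return the same single row, while the left outer join adds the unmatched CLASS row with nulls in the CLASSINFO columns.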
Inner Join

 In an inner join, only those tuples that


satisfy the matching criteria are included,
while the rest are excluded.

35
Inner join Type : Theta Join

 The general case of JOIN operation is called


a Theta join.

 It is denoted by symbol θ

 Example: A ⋈θ B

Theta join can use any conditions in the


selection criteria.

36
Join operation on Department and
Employee relations

DEPT_MGR ← DEPARTMENT ⋈ MGRSSN=SSN EMPLOYEE

37
Inner join Type : EQUI join

• When a theta join uses only equality comparisons in its condition, it
becomes an equi join.
• For example: A ⋈ A.column2 = B.column2 B

38
Natural join (⋈)

 Natural join can only be performed if there is a


common attribute (column) between the relations.

• The name and type of the attribute must be the same in both relations.

39
Natural join - Example

40
OUTER JOIN

• In an outer join, along with the tuples that satisfy the matching criteria,
tuples that have no match are also included (from the left relation, the right
relation, or both, depending on the type of outer join).

41
Left Outer Join

• The left outer join keeps every tuple of the left relation.

• If no matching tuple is found in the right relation, the attributes of the
right relation in the join result are filled with null values.

42
Left Outer Join- Example

43
Right Outer Join

 In the right outer join operation, all tuples in the


right relation are retained.

 If there is no matching tuple found in the left


relation, then the attributes of the left relation in
the join result are filled with null values.
Right Outer Join -
Example

45
Full Outer Join

 In a full outer join, all tuples from both


relations are included in the result,
irrespective of the matching condition.

46
Full Outer Join - Example

47
Additional Relational
Operations

 Aggregate Functions and Grouping

• Aggregate functions are applied to collections of numeric values.

• Common ones include SUM, AVERAGE, MAXIMUM, and MINIMUM.

 The COUNT function is used for counting tuples or values.

48
Additional Relational
Operations (cont.)
Use of the Functional operator ℱ

ℱMAX Salary (Employee)


retrieves the maximum salary value from the
Employee relation

ℱMIN Salary (Employee)


retrieves the minimum Salary value from the
Employee relation
ℱSUM Salary (Employee)
retrieves the sum of the Salary from the Employee
relation
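The ℱ expressions above map directly onto ordinary aggregate computations. The sketch below is only illustrative: the EMPLOYEE rows are invented, and `aggregate` and `group_aggregate` are made-up helper names. The grouped variant corresponds to writing grouping attributes in front of ℱ.

```python
from collections import defaultdict

# Invented sample rows.
EMPLOYEE = [
    {"SSN": "1", "DNO": 5, "SALARY": 30000},
    {"SSN": "2", "DNO": 5, "SALARY": 40000},
    {"SSN": "3", "DNO": 4, "SALARY": 25000},
    {"SSN": "4", "DNO": 4, "SALARY": 43000},
]


def aggregate(relation, attribute, func):
    """Apply an aggregate function to one attribute of the whole relation."""
    return func([row[attribute] for row in relation])


def group_aggregate(relation, group_attr, attribute, func):
    """Apply the aggregate function separately within each group."""
    groups = defaultdict(list)
    for row in relation:
        groups[row[group_attr]].append(row[attribute])
    return {key: func(values) for key, values in groups.items()}


max_salary = aggregate(EMPLOYEE, "SALARY", max)    # F MAX Salary (Employee)
min_salary = aggregate(EMPLOYEE, "SALARY", min)    # F MIN Salary (Employee)
sum_salary = aggregate(EMPLOYEE, "SALARY", sum)    # F SUM Salary (Employee)
count_salary = aggregate(EMPLOYEE, "SALARY", len)  # COUNT of salary values
sum_by_dept = group_aggregate(EMPLOYEE, "DNO", "SALARY", sum)
```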
49
Query Optimization

50
Introduction to Query
Processing

 Query optimization: the process of choosing a


suitable execution strategy for processing a
query.

 Two internal representations of a query


 Query Tree
 Query Graph

51
Basic Steps in Query Processing

1. Parsing and
translation
2. Optimization
3. Evaluation

52
Basic Steps in Query Processing
(Cont.)
 Process for heuristics optimization

1. The parser of a high-level query generates an initial


internal representation;
2. Apply heuristics rules to optimize the internal
representation.
3. A query execution plan is generated to execute
groups of operations based on the access paths
available on the files involved in the query.

 The main heuristic is to apply first the operations


that reduce the size of intermediate results.
E.g., Apply SELECT and PROJECT operations before
applying the JOIN or other binary operations.

53
Query Representation
 Query tree:
 A tree data structure that corresponds to a relational
algebra expression.
 It represents the input relations of the query as leaf nodes
of the tree and the relational algebra operations as
internal nodes.
 An execution of the query tree consists of executing an
internal node operation whenever its operands are
available and then replacing that internal node by the
relation that results from executing the operation.
 Query graph:
 A graph data structure that corresponds to a relational
calculus expression.
• It does not indicate the order in which the operations are performed.
There is only a single graph corresponding to each query.
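A query tree and its bottom-up execution can be sketched with a few small classes. Everything here is an assumption made for illustration: the node classes, the `evaluate` function, and the sample PROJECT and DEPARTMENT rows. The sketch builds two trees for the same query, one with the SELECT above the JOIN and one with the SELECT pushed down to the leaf, to show that the heuristic changes the work done but not the result.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Relation:            # leaf node: an input relation
    rows: list


@dataclass
class Select:              # internal node: sigma
    predicate: Callable
    child: object


@dataclass
class Project:             # internal node: pi
    attributes: list
    child: object


@dataclass
class Join:                # internal node: theta join
    condition: Callable
    left: object
    right: object


def evaluate(node):
    """Execute an internal node once its operands have been evaluated."""
    if isinstance(node, Relation):
        return node.rows
    if isinstance(node, Select):
        return [t for t in evaluate(node.child) if node.predicate(t)]
    if isinstance(node, Project):
        seen, out = set(), []
        for t in evaluate(node.child):
            key = tuple(t[a] for a in node.attributes)
            if key not in seen:
                seen.add(key)
                out.append(dict(zip(node.attributes, key)))
        return out
    if isinstance(node, Join):
        left, right = evaluate(node.left), evaluate(node.right)
        return [{**a, **b} for a in left for b in right
                if node.condition({**a, **b})]
    raise TypeError(f"unknown node type: {type(node).__name__}")


# Invented sample data.
PROJECT = Relation([
    {"PNUMBER": 10, "PLOCATION": "Stafford", "DNUM": 4},
    {"PNUMBER": 30, "PLOCATION": "Stafford", "DNUM": 4},
    {"PNUMBER": 1, "PLOCATION": "Bellaire", "DNUM": 5},
])
DEPARTMENT = Relation([
    {"DNUMBER": 4, "MGRSSN": "987"},
    {"DNUMBER": 5, "MGRSSN": "333"},
])

in_stafford = lambda t: t["PLOCATION"] == "Stafford"
same_dept = lambda t: t["DNUM"] == t["DNUMBER"]
wanted = ["PNUMBER", "DNUM", "MGRSSN"]

# Initial tree: join everything first, then select.
initial = Project(wanted, Select(in_stafford, Join(same_dept, PROJECT, DEPARTMENT)))
# Heuristic tree: select first, so the join sees fewer tuples.
optimized = Project(wanted, Join(same_dept, Select(in_stafford, PROJECT), DEPARTMENT))

result = evaluate(optimized)
```

In the optimized tree the join receives two PROJECT tuples instead of three; on real tables the reduction in intermediate result size is what makes the heuristic worthwhile.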
54
1. Translating SQL Queries into
Relational Algebra
 Query block: the basic unit that can be
translated into the algebraic operators
and optimized.
 A query block contains a single SELECT-
FROM-WHERE expression, as well as
GROUP BY and HAVING clause if these
are part of the block.
 Nested queries within a query are
identified as separate query blocks.
 Aggregate operators in SQL must be
included in the extended algebra.

55
Translating SQL Queries into
Relational Algebra- Example

56
Example
 Example:
For every project located in ‘Stafford’,
retrieve the project number, the
controlling department number and
the department manager’s last name,
address and birthdate.

Relational algebra:
π PNUMBER, DNUM, LNAME, ADDRESS, BDATE
(((σ PLOCATION='STAFFORD' (PROJECT))
⋈ DNUM=DNUMBER (DEPARTMENT))
⋈ MGRSSN=SSN (EMPLOYEE))

SQL query:
Q2: SELECT P.PNUMBER, P.DNUM, E.LNAME, E.ADDRESS, E.BDATE
FROM PROJECT AS P, DEPARTMENT AS D, EMPLOYEE AS E
WHERE P.DNUM = D.DNUMBER AND D.MGRSSN = E.SSN AND
P.PLOCATION = 'STAFFORD';

57
Query tree

58
Query graph

59
2. Heuristic Optimization
Heuristic Optimization of Query Trees:
 The same query could correspond to
many different relational algebra
expressions — and hence many
different query trees.

 The task of heuristic optimization of


query trees is to find a final query tree
that is efficient to execute.

60
Steps in optimizing a Query tree
1. Moving SELECT operations down the query tree
2. Applying the more restrictive SELECT operation
first
3. Replacing CARTESIAN PRODUCT and
SELECT with JOIN operations
4. Moving the PROJECT operations down the tree
Example:
Q: SELECT LNAME
FROM EMPLOYEE, WORKS_ON, PROJECT
WHERE PNAME = 'AQUARIUS' AND PNUMBER = PNO AND
ESSN = SSN AND BDATE > '1957-12-31';

61
Steps in optimizing a Query tree-
Illustration

62
Steps in optimizing a Query tree
(Contd.)

63
Steps in optimizing a Query tree
(Contd.)

64
Using Heuristics in Query
Optimization –Transformation rules

65
Transformation rules (Cont.)

66
Transformation rules (Cont.)

67
Transformation rules (Cont.)

68
Transaction Processing

69
Transaction - Definition

 A transaction is a unit of program


execution that accesses and possibly
updates various data items.

Transaction - Example

 E.g. transaction to transfer $50 from account A to account B:


1.read(A)
2.A := A – 50
3.write(A)
4.read(B)
5.B := B + 50
6.write(B)

Transaction - Example

 Two main issues to deal with:


 Failures of various kinds, such as hardware
failures and system crashes
 Concurrent execution of multiple
transactions

Transaction Properties –
ACID Properties
 Atomicity. Either all operations of the transaction are
properly reflected in the database or none are.
 Consistency. Execution of a transaction in isolation
preserves the consistency of the database.
 Isolation. Although multiple transactions may execute
concurrently, each transaction must be unaware of other
concurrently executing transactions. Intermediate
transaction results must be hidden from other concurrently
executed transactions.
 That is, for every pair of transactions Ti and Tj, it
appears to Ti that either Tj, finished execution before Ti
started, or Tj started execution after Ti finished.
 Durability. After a transaction completes successfully, the
changes it has made to the database persist, even if there
are system failures.

73
Transaction Properties –
ACID Properties

Transaction Properties -
Example of Fund Transfer
 Transaction to transfer $50 from account A to account B:
1.read(A)
2.A := A – 50
3.write(A)
4.read(B)
5.B := B + 50
6.write(B)
 Atomicity requirement:
 If the transaction fails after step 3 and before step 6,
money will be “lost” leading to an inconsistent database
state.
 Failure could be due to software or hardware.
 The system should ensure that updates of a partially
executed transaction are not reflected in the database.
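The atomicity requirement can be illustrated with a small simulation. This is a sketch under assumptions: the database is a plain dictionary, `SimulatedCrash` and the `crash_after_step` parameter are invented to model a failure after step 3, and the undo is done by restoring a snapshot rather than by any real recovery mechanism.

```python
class SimulatedCrash(Exception):
    """Stands in for a software or hardware failure (hypothetical)."""


def transfer(db, amount, crash_after_step=None):
    """Move `amount` from A to B; undo all updates if a failure occurs."""
    snapshot = dict(db)              # state before the transaction started
    try:
        a = db["A"]                  # 1. read(A)
        a = a - amount               # 2. A := A - amount
        db["A"] = a                  # 3. write(A)
        if crash_after_step == 3:
            raise SimulatedCrash()
        b = db["B"]                  # 4. read(B)
        b = b + amount               # 5. B := B + amount
        db["B"] = b                  # 6. write(B)
        return True
    except SimulatedCrash:
        db.clear()                   # roll back: none of the partial
        db.update(snapshot)          # updates remain in the database
        return False


db = {"A": 100, "B": 200}
failed = transfer(db, 50, crash_after_step=3)
after_failure = dict(db)             # still A=100, B=200: no money is lost
succeeded = transfer(db, 50)
```

Without the rollback, the failed run would leave A at 50 and B at 200, so the sum A + B would have dropped by 50.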

75
Transaction Properties -
Example of Fund Transfer
 Transaction to transfer $50 from account A to account B:
1.read(A)
2.A := A – 50
3.write(A)
4.read(B)
5.B := B + 50
6.write(B)
 Durability requirement:
 Once the user has been notified that the transaction
has been completed (i.e., the transfer of the $50 has
taken place), the updates to the database by the
transaction must persist even if there are software or
hardware failures.

76
Transaction Properties -
Example of Fund Transfer
 Consistency requirement:
 The sum of A and B is unchanged by the execution
of the transaction
 In general, consistency requirements include
 Explicitly specified integrity constraints such as
primary keys and foreign keys
 Implicit integrity constraints
 e.g. sum of balances of all accounts, minus sum
of loan amounts must equal value of cash-in-
hand
 A transaction must see a consistent database.
 During transaction execution the database may be
temporarily inconsistent.
 When the transaction completes successfully the
database must be consistent
 Erroneous transaction logic can lead to
inconsistency

77
Transaction Properties -
Example of Fund Transfer
 Isolation requirement — if between steps 3 and 6, another
transaction T2 is allowed to access the partially updated
database, it will see an inconsistent database (the sum A + B
will be less than it should be).
T1                     T2
1. read(A)
2. A := A – 50
3. write(A)
                       read(A), read(B), print(A+B)
4. read(B)
5. B := B + 50
6. write(B)
 Isolation can be ensured trivially by running transactions
serially
 that is, one after the other.
 However, executing multiple transactions concurrently has
significant benefits.
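The inconsistent read in the interleaving above can be reproduced with a short simulation. The function names and the starting balances (A = 100, B = 200) are assumptions made for this sketch; T1 is split into its first three and last three steps so that T2 can run in between.

```python
def t1_steps_1_to_3(db):
    a = db["A"]        # 1. read(A)
    a = a - 50         # 2. A := A - 50
    db["A"] = a        # 3. write(A)


def t1_steps_4_to_6(db):
    b = db["B"]        # 4. read(B)
    b = b + 50         # 5. B := B + 50
    db["B"] = b        # 6. write(B)


def t2(db):
    """read(A), read(B), print(A+B); returned instead of printed."""
    return db["A"] + db["B"]


def interleaved(db):
    """T2 runs between steps 3 and 4 of T1."""
    t1_steps_1_to_3(db)
    seen = t2(db)
    t1_steps_4_to_6(db)
    return seen


def serial(db):
    """T1 runs to completion, then T2."""
    t1_steps_1_to_3(db)
    t1_steps_4_to_6(db)
    return t2(db)


sum_interleaved = interleaved({"A": 100, "B": 200})
sum_serial = serial({"A": 100, "B": 200})
```

In the interleaved run T2 sees a total that is 50 too low, while the serial run sees the correct total.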

78
Transaction States
 Active – The initial state; the transaction stays in this
state while it is executing
 Partially committed – After the final statement has
been executed.
 Failed -- After the discovery that normal execution can
no longer proceed.
 Aborted – After the transaction has been rolled back
and the database restored to its state prior to the start of
the transaction. Two options after it has been aborted:
 Restart the transaction
 Can be done only if no internal logical error
 Kill the transaction
 Committed – after successful completion.
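The states listed above form a small state machine. The sketch below encodes the transitions of the standard state diagram from the textbook cited on these slides (active to partially committed or failed, partially committed to committed or failed, failed to aborted); the `Transaction` class and `move_to` method are hypothetical names for illustration.

```python
# Allowed transitions between transaction states.
TRANSITIONS = {
    "active": {"partially committed", "failed"},
    "partially committed": {"committed", "failed"},
    "failed": {"aborted"},
    "committed": set(),   # terminal state
    "aborted": set(),     # terminal state (a restart is a new transaction)
}


class Transaction:
    def __init__(self):
        self.state = "active"   # the initial state

    def move_to(self, new_state):
        """Change state, rejecting transitions the diagram does not allow."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition: {self.state} -> {new_state}")
        self.state = new_state


ok = Transaction()
ok.move_to("partially committed")   # final statement executed
ok.move_to("committed")             # successful completion

bad = Transaction()
bad.move_to("failed")               # normal execution cannot proceed
bad.move_to("aborted")              # rolled back
```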

79
Transaction States

[Figure: transaction state diagram. Source: Database System Concepts by Abraham Silberschatz, Henry F. Korth and S. Sudarshan, Tata McGraw Hill, 2011]
Implementation of Atomicity
& Durability

 The recovery-management component of a database system


implements the support for atomicity and durability.
 E.g. the shadow-database scheme:
 All updates are made on a shadow copy of the database
 db_pointer is made to point to the updated shadow copy after
 The transaction reaches partial commit and
 All updated pages have been flushed to disk.

[Figure: shadow-database scheme. Source: Database System Concepts by Abraham Silberschatz, Henry F. Korth and S. Sudarshan, Tata McGraw Hill, 2011]
Concurrent Executions
 Multiple transactions are allowed to run concurrently in
the system. Advantages are:
 Increased processor and disk utilization,
leading to better transaction throughput
 E.g. one transaction can be using the CPU while
another is reading from or writing to the disk
 Reduced average response time for transactions:
short transactions need not wait behind long ones.
 Concurrency control schemes – mechanisms to
achieve isolation
 That is, to control the interaction among the
concurrent transactions in order to prevent them
from destroying the consistency of the database

82
Schedules
 Schedule – A sequence of instructions that specify the
chronological order in which instructions of concurrent
transactions are executed
 A schedule for a set of transactions must consist of all
instructions of those transactions
 Must preserve the order in which the instructions
appear in each individual transaction.
• A transaction that successfully completes its execution
will have a commit instruction as its last statement
• By default, a transaction is assumed to execute a commit
instruction as its last step
 A transaction that fails to successfully complete its
execution will have an abort instruction as the last
statement

83
Schedule - 1
 Let T1 transfer $50 from A to B, and T2 transfer 10%
of the balance from A to B.
 A serial schedule in which T1 is followed by T2 :

[Figure: Schedule 1. Source: Database System Concepts by Abraham Silberschatz, Henry F. Korth and S. Sudarshan, Tata McGraw Hill, 2011]
Schedule - 2
• A serial schedule where T2 is followed by T1 is shown
below:

[Figure: Schedule 2. Source: Database System Concepts by Abraham Silberschatz, Henry F. Korth and S. Sudarshan, Tata McGraw Hill, 2011]
Schedule - 3
 Let T1 and T2 be the transactions defined previously.
The following schedule is not a serial schedule, but it
is equivalent to Schedule 1.

 In Schedules 1, 2 and 3, the sum A + B is preserved.


86
Schedule - 4
 The following concurrent schedule does not preserve
the value of (A + B ).

[Figure: Schedule 4. Source: Database System Concepts by Abraham Silberschatz, Henry F. Korth and S. Sudarshan, Tata McGraw Hill, 2011]
Serializability
 Basic Assumption – Each transaction preserves database
consistency.
 Thus serial execution of a set of transactions preserves
database consistency.
 A (possibly concurrent) schedule is serializable if it is
equivalent to a serial schedule. Different forms of schedule
equivalence give rise to the notions of:
1.conflict serializability
2.view serializability
 Simplified view of transactions
 We ignore operations other than read and write
instructions
 We assume that transactions may perform arbitrary
computations on data in local buffers in between reads
and writes.
 Our simplified schedules consist of only read and write
instructions.

89
Conflicting Instructions
 Instructions li and lj of transactions Ti and Tj
respectively, conflict if and only if there exists some
item Q accessed by both li and lj, and at least one of
these instructions wrote Q.
1. li = read(Q), lj = read(Q). li and lj don’t conflict.
2. li = read(Q), lj = write(Q). They conflict.
3. li = write(Q), lj = read(Q). They conflict
4. li = write(Q), lj = write(Q). They conflict
 Intuitively, a conflict between li and lj forces a (logical)
temporal order between them.
 If li and lj are consecutive in a schedule and they
do not conflict, their results would remain the
same even if they had been interchanged in the
schedule.
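The four cases above reduce to one test: two instructions conflict when they belong to different transactions, access the same item, and at least one of them is a write. The tuple layout `(transaction, operation, item)` and the function name `conflicts` are assumptions made for this sketch.

```python
def conflicts(li, lj):
    """True if instructions li and lj conflict.

    Each instruction is a (transaction, operation, item) tuple,
    where operation is "read" or "write".
    """
    ti, op_i, q_i = li
    tj, op_j, q_j = lj
    return ti != tj and q_i == q_j and "write" in (op_i, op_j)


case1 = conflicts(("Ti", "read", "Q"), ("Tj", "read", "Q"))    # no conflict
case2 = conflicts(("Ti", "read", "Q"), ("Tj", "write", "Q"))   # conflict
case3 = conflicts(("Ti", "write", "Q"), ("Tj", "read", "Q"))   # conflict
case4 = conflicts(("Ti", "write", "Q"), ("Tj", "write", "Q"))  # conflict
# Instructions on different items never conflict, so they can be swapped.
different_items = conflicts(("Ti", "write", "Q"), ("Tj", "write", "P"))
```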

90
Conflict Serializability
• If a schedule S can be transformed into a schedule S´ by a series of swaps
of non-conflicting instructions, we say that S and S´ are conflict equivalent.
 We say that a schedule S is conflict serializable if it
is conflict equivalent to a serial schedule

91
Conflict Serializability
• Schedule 3 can be transformed into Schedule 6, a serial schedule where T2
follows T1, by a series of swaps of non-conflicting instructions.
 Therefore Schedule 3 is conflict serializable.

[Figure: Schedule 3 and Schedule 6. Source: Database System Concepts by Abraham Silberschatz, Henry F. Korth and S. Sudarshan, Tata McGraw Hill, 2011]
Conflict Serializability
 Example of a schedule that is not conflict serializable:

 We are unable to swap instructions in the above schedule


to obtain either the serial schedule < T3, T4 >, or the serial
schedule < T4, T3 >.

[Figure: a schedule of T3 and T4 that is not conflict serializable. Source: Database System Concepts by Abraham Silberschatz, Henry F. Korth and S. Sudarshan, Tata McGraw Hill, 2011]
View Serializability
 Let S and S´ be two schedules with the same set of
transactions. S and S´ are view equivalent if the
following three conditions are met, for each data item
Q,
1. If in schedule S, transaction Ti reads the initial
value of Q, then in schedule S’ also transaction
Ti must read the initial value of Q.
2. If in schedule S transaction Ti executes
read(Q), and that value was produced by
transaction Tj (if any), then in schedule S’ also
transaction Ti must read the value of Q that
was produced by the same write(Q) operation
of transaction Tj .

95
View Serializability
3. The transaction (if any) that performs the final
write(Q) operation in schedule S must also perform
the final write(Q) operation in schedule S’.
• Like conflict equivalence, view equivalence is based purely on reads and
writes.
 A schedule S is view serializable if it is view equivalent to
a serial schedule.
 Every conflict serializable schedule is also view
serializable.
 Below is a schedule which is view-serializable but not
conflict serializable.
 Every view serializable schedule that is not conflict
serializable has blind writes.

[Figure: a schedule that is view serializable but not conflict serializable. Source: Database System Concepts by Abraham Silberschatz, Henry F. Korth and S. Sudarshan, Tata McGraw Hill, 2011]
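The three view-equivalence conditions can be checked mechanically. The sketch below is an illustration under assumptions: a schedule is a list of `(transaction, "r" or "w", item)` tuples, the function names are invented, and view serializability is tested by brute force over every serial order, which is only practical for a handful of transactions. The example schedules are small illustrative ones with blind writes, not copies of the slide figures.

```python
from itertools import permutations


def view_profile(schedule):
    """Record what each read sees and which write is final for each item."""
    last_write = {}    # item -> (txn, n): the n-th write of txn on that item
    write_count = {}
    read_count = {}
    reads_from = {}    # (txn, item, k-th read) -> source write, None = initial
    for txn, op, item in schedule:
        if op == "w":
            n = write_count.get((txn, item), 0) + 1
            write_count[(txn, item)] = n
            last_write[item] = (txn, n)
        else:
            k = read_count.get((txn, item), 0) + 1
            read_count[(txn, item)] = k
            reads_from[(txn, item, k)] = last_write.get(item)
    # After the loop, last_write holds the final write on each item.
    return reads_from, last_write


def view_equivalent(s1, s2):
    """Conditions 1 and 2 compare reads_from; condition 3 the final writes."""
    return view_profile(s1) == view_profile(s2)


def view_serial_order(schedule):
    """Return a serial order that is view equivalent, or None if none exists."""
    txns = sorted({t for t, _, _ in schedule})
    for order in permutations(txns):
        serial = [step for t in order for step in schedule if step[0] == t]
        if view_equivalent(schedule, serial):
            return order
    return None


# T2 and T3 write Q without reading it first (blind writes).
with_blind_writes = [
    ("T1", "r", "Q"), ("T2", "w", "Q"), ("T1", "w", "Q"), ("T3", "w", "Q"),
]
not_view_serializable = [
    ("T1", "r", "Q"), ("T2", "w", "Q"), ("T1", "w", "Q"),
]
order = view_serial_order(with_blind_writes)
no_order = view_serial_order(not_view_serializable)
```

The first schedule is view equivalent to the serial order T1, T2, T3 because T1 still reads the initial value of Q and T3 still performs the final write; removing T3 leaves no serial order that satisfies all three conditions.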
Other Notions of
Serializability
 The schedule below produces same outcome as the serial
schedule < T1, T5 >, yet is not conflict equivalent or view
equivalent to it.

 Determining such equivalence requires analysis of


operations other than read and write.

97
Recoverable Schedules
• Need to address the effect of transaction failures on concurrently
running transactions.
 Recoverable schedule — if a transaction Tj reads a data
item previously written by a transaction Ti , then the commit
operation of Ti appears before the commit operation of Tj.
 The following schedule (Schedule 11) is not recoverable if
T9 commits immediately after the read.

• If T8 should abort, T9 would have read (and possibly shown to the
user) an inconsistent database state. Hence, the database must
ensure that schedules are recoverable.

[Figure: Schedule 11. Source: Database System Concepts by Abraham Silberschatz, Henry F. Korth and S. Sudarshan, Tata McGraw Hill, 2011]
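The recoverability condition can be expressed as a short check. This is a simplified sketch: a schedule is a list of `(transaction, operation, item)` tuples with operations "r", "w" and "c" (commit, with item `None`), the function name is invented, and aborts are not modelled. The first example follows the description of Schedule 11 in the text (T9 reads what T8 wrote and commits immediately); it is written from that description, not copied from the figure.

```python
def is_recoverable(schedule):
    """True if every reader commits only after the writers it read from."""
    last_writer = {}     # item -> transaction that wrote it most recently
    reads_from = set()   # (reader, writer) pairs
    commit_pos = {}      # transaction -> position of its commit
    for pos, (txn, op, item) in enumerate(schedule):
        if op == "w":
            last_writer[item] = txn
        elif op == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.add((txn, writer))
        elif op == "c":
            commit_pos[txn] = pos
    for reader, writer in reads_from:
        if reader in commit_pos:
            # The writer must have committed, and before the reader did.
            if writer not in commit_pos or commit_pos[writer] > commit_pos[reader]:
                return False
    return True


# T9 commits right after reading A, while T8 is still running.
schedule_11 = [
    ("T8", "r", "A"), ("T8", "w", "A"),
    ("T9", "r", "A"), ("T9", "c", None),
    ("T8", "r", "B"),
]
# Same operations, but T9 delays its commit until T8 has committed.
delayed_commit = [
    ("T8", "r", "A"), ("T8", "w", "A"),
    ("T9", "r", "A"),
    ("T8", "r", "B"), ("T8", "c", None),
    ("T9", "c", None),
]
```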
Cascading Rollbacks
 Cascading rollback – a single transaction failure leads to
a series of transaction rollbacks. Consider the following
schedule where none of the transactions has yet committed
(so the schedule is recoverable)

If T10 fails, T11 and T12 must also be rolled back.


 Can lead to the undoing of a significant amount of work

[Figure: schedule of T10, T11 and T12. Source: Database System Concepts by Abraham Silberschatz, Henry F. Korth and S. Sudarshan, Tata McGraw Hill, 2011]
Cascadeless Schedules
 Cascadeless schedules — cascading rollbacks cannot
occur; for each pair of transactions Ti and Tj such that Tj
reads a data item previously written by Ti, the commit
operation of Ti appears before the read operation of Tj.
 Every cascadeless schedule is also recoverable
 It is desirable to restrict the schedules to those that are
cascadeless
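The cascadeless condition is stricter than recoverability: a transaction may read an item only after the transaction that wrote it has committed. The sketch below uses an assumed representation, a list of `(transaction, operation, item)` tuples with operations "r", "w" and "c" (commit), and an invented function name. The first example is modelled on the T10, T11, T12 situation described on the previous slide.

```python
def is_cascadeless(schedule):
    """True if no transaction reads an item written by an uncommitted one."""
    last_writer = {}    # item -> transaction that wrote it most recently
    committed = set()
    for txn, op, item in schedule:
        if op == "w":
            last_writer[item] = txn
        elif op == "c":
            committed.add(txn)
        elif op == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn and writer not in committed:
                return False    # a dirty read: rollbacks could cascade
    return True


# T11 reads A from uncommitted T10, and T12 reads A from uncommitted T11.
cascading = [
    ("T10", "r", "A"), ("T10", "r", "B"), ("T10", "w", "A"),
    ("T11", "r", "A"), ("T11", "w", "A"),
    ("T12", "r", "A"),
]
# Each writer commits before the next transaction reads A.
no_cascade = [
    ("T10", "w", "A"), ("T10", "c", None),
    ("T11", "r", "A"), ("T11", "w", "A"), ("T11", "c", None),
    ("T12", "r", "A"),
]
```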

100
Deadlocks
• Occur when two transactions exist in the following mode:
T1 = access data items X and Y
T2 = access data items Y and X

If T1 has not unlocked data item Y, T2 cannot begin
If T2 has not unlocked data item X, T1 cannot continue

T1 and T2 wait indefinitely for each other to unlock the data
• Deadlocks are possible only if at least one of the transactions wants an
exclusive lock on a data item (no deadlock can arise among shared locks alone)

101
Concurrency Control
 A database must provide a mechanism that will ensure that
all possible schedules are
 either conflict or view serializable, and
 are recoverable and preferably cascadeless
 A policy in which only one transaction can execute at a time
generates serial schedules, but provides a poor degree of
concurrency
 Are serial schedules recoverable/cascadeless?
 Testing a schedule for serializability after it has executed is
a little too late!
 Goal – to develop concurrency control protocols that will
assure serializability.

102
Concurrency Control vs.
Serializability Tests
 Concurrency-control protocols allow concurrent schedules,
but ensure that the schedules are conflict/view serializable,
and are recoverable and cascadeless .
 Concurrency control protocols generally do not examine the
precedence graph as it is being created
• Instead, a protocol imposes a discipline that avoids
nonserializable schedules.
 Different concurrency control protocols provide different
tradeoffs between the amount of concurrency they allow
and the amount of overhead that they incur.
 Tests for serializability help us understand why a
concurrency control protocol is correct.

103
Weak Levels of Consistency
 Some applications are willing to live with weak
levels of consistency, allowing schedules that
are not serializable
 E.g. a read-only transaction that wants to get an
approximate total balance of all accounts
 E.g. database statistics computed for query
optimization can be approximate (why?)
 Such transactions need not be serializable with
respect to other transactions
• Trade off accuracy for performance

104
Levels of Consistency
 Serializable — default
• Repeatable read: only committed records can be read, and repeated reads of
the same record must return the same value. However, the transaction may not
be serializable: it may find some records inserted by another transaction but
not find others.
 Read committed — only committed records can be read, but
successive reads of record may return different (but committed)
values.
 Read uncommitted — even uncommitted records may be
read.
• Lower degrees of consistency useful for gathering approximate
information about the database
• Warning: some database systems do not ensure serializable
schedules by default
– E.g. Oracle and PostgreSQL by default support a level of
consistency called snapshot isolation (not part of the SQL
standard)

105
Transaction Definition in SQL
 Data manipulation language must include a construct for
specifying the set of actions that comprise a transaction.
 In SQL, a transaction begins implicitly.
 A transaction in SQL ends by:
 Commit work commits current transaction and begins a
new one.
 Rollback work causes current transaction to abort.
 In almost all database systems, by default, every SQL
statement also commits implicitly if it executes successfully
 Implicit commit can be turned off by a database directive
 E.g. in JDBC, connection.setAutoCommit(false);
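Commit and rollback can be tried from Python with the standard `sqlite3` module, whose default mode opens a transaction implicitly before data-modifying statements. This is a sketch under assumptions: the `account` table, the `transfer` function and the insufficient-funds rule are invented for illustration and are not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 200)])
conn.commit()   # ends the implicitly started transaction


def balances(conn):
    """Return the current balances as a dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account"))


def transfer(conn, src, dst, amount):
    """Transfer `amount`; commit on success, roll back if funds are short."""
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                     (amount, src))
        (left,) = conn.execute("SELECT balance FROM account WHERE name = ?",
                               (src,)).fetchone()
        if left < 0:
            raise ValueError("insufficient funds")
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                     (amount, dst))
        conn.commit()      # both updates become permanent together
        return True
    except ValueError:
        conn.rollback()    # the debit made above is undone
        return False


first = transfer(conn, "A", "B", 50)
after_first = balances(conn)
second = transfer(conn, "A", "B", 500)   # would overdraw A, so it is rolled back
after_second = balances(conn)
```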

106
Implementation of Isolation
 Schedules must be conflict or view serializable, and
recoverable, for the sake of database consistency, and
preferably cascadeless.
 A policy in which only one transaction can execute at a time
generates serial schedules, but provides a poor degree of
concurrency.
• Concurrency-control schemes trade off the amount of concurrency they allow
against the amount of overhead that they incur.
 Some schemes allow only conflict-serializable schedules to
be generated, while others allow view-serializable
schedules that are not conflict-serializable.

107
Implementation of Isolation

[Figure not reproduced. Source: Database System Concepts by Abraham Silberschatz, Henry F. Korth and S. Sudarshan, Tata McGraw Hill, 2011]
Testing for Serializability
 Consider some schedule of a set of transactions T1, T2, ...,
Tn
• Precedence graph: a directed graph where the vertices are
the transactions (names).
• We draw an arc from Ti to Tj if the two transactions conflict,
and Ti accessed the data item on which the conflict arose
earlier.
 We may label the arc by the item that was accessed.
 Example 1

[Figure: example precedence graph. Source: Database System Concepts by Abraham Silberschatz, Henry F. Korth and S. Sudarshan, Tata McGraw Hill, 2011]
Example Schedule (Schedule
A) + Precedence Graph

[Figure: Schedule A and its precedence graph. Source: Database System Concepts by Abraham Silberschatz, Henry F. Korth and S. Sudarshan, Tata McGraw Hill, 2011]
Test for Conflict Serializability
 A schedule is conflict serializable if and only if its
precedence graph is acyclic.
• Cycle-detection algorithms exist which take order n² time,
where n is the number of vertices in the graph.
• (Better algorithms take order n + e, where e is the
number of edges.)
 If precedence graph is acyclic, the serializability order can
be obtained by a topological sorting of the graph.
 This is a linear order consistent with the partial order of
the graph.
• For example, a serializability order for Schedule A would
be
T5 → T1 → T3 → T2 → T4
 Are there others?
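The test described above can be sketched in two steps: build the precedence graph from the conflicting pairs, then attempt a topological sort. The representation (a list of `(transaction, "r" or "w", item)` tuples) and the function names are assumptions for this sketch, and the two example schedules are small illustrative ones rather than the Schedule A of the figure. The sort uses Kahn's algorithm, which runs in order n + e time and fails exactly when the graph has a cycle.

```python
from collections import defaultdict


def precedence_graph(schedule):
    """Edges (Ti, Tj): Ti accessed a conflicting item before Tj did."""
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            if ti != tj and item_i == item_j and "w" in (op_i, op_j):
                edges.add((ti, tj))
    return edges


def serializability_order(schedule):
    """Topological order of the precedence graph, or None if it has a cycle."""
    transactions = sorted({t for t, _, _ in schedule})
    indegree = {t: 0 for t in transactions}
    successors = defaultdict(list)
    for a, b in precedence_graph(schedule):
        successors[a].append(b)
        indegree[b] += 1
    ready = [t for t in transactions if indegree[t] == 0]
    order = []
    while ready:
        t = ready.pop(0)
        order.append(t)
        for nxt in successors[t]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    # If a cycle exists, some transactions never reach indegree 0.
    return order if len(order) == len(transactions) else None


# Interleaved, but every conflict orders T1 before T2.
conflict_serializable = [
    ("T1", "r", "A"), ("T1", "w", "A"), ("T2", "r", "A"), ("T2", "w", "A"),
    ("T1", "r", "B"), ("T1", "w", "B"), ("T2", "r", "B"), ("T2", "w", "B"),
]
# T3 reads Q before T4 writes it, but T4 writes Q before T3 does: a cycle.
not_conflict_serializable = [
    ("T3", "r", "Q"), ("T4", "w", "Q"), ("T3", "w", "Q"),
]
```

When the graph is acyclic there can be more than one valid topological order; this sketch returns just one of them.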

111
Test for Conflict Serializability

[Figure not reproduced. Source: Database System Concepts by Abraham Silberschatz, Henry F. Korth and S. Sudarshan, Tata McGraw Hill, 2011]
Test for View Serializability
 The precedence graph test for conflict serializability cannot
be used directly to test for view serializability.
 Extension to test for view serializability has cost
exponential in the size of the precedence graph.
 The problem of checking if a schedule is view serializable
falls in the class of NP-complete problems.
 Thus existence of an efficient algorithm is extremely
unlikely.
 However practical algorithms that just check some
sufficient conditions for view serializability can still be
used.

113
