
Unit-IV Dbms


Unit-III

• Database Management System
• (KCS-501)
• UNIT-IV
• Mr. Pradeep Kumar Tripathi


Outline
• Transaction Concept
• Transaction State
• Concurrent Executions
• Serializability
• Recoverability
• Implementation of Isolation
• Transaction Definition in SQL
• Testing for Serializability
Transaction Concept
• A transaction is a unit of program execution that accesses and possibly updates various data items.
• E.g., transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
• Two main issues to deal with:
– Failures of various kinds, such as hardware failures and system crashes
– Concurrent execution of multiple transactions
Required Properties of a Transaction
• Consider a transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
• Atomicity requirement
– If the transaction fails after step 3 and before step 6, money will be “lost”, leading to an inconsistent database state
• Failure could be due to software or hardware
– The system should ensure that updates of a partially executed transaction are not reflected in the database
• Durability requirement – once the user has been notified that the transaction has completed (i.e., the transfer of the $50 has taken place), the updates to the database by the transaction must persist even if there are software or hardware failures.
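The atomicity requirement can be illustrated with a minimal sketch. It assumes an in-memory dictionary standing in for the database and a hypothetical `SimulatedCrash` exception standing in for a failure between steps 3 and 6; a real system would use a log rather than a full snapshot, and durability is not modelled here.

```python
class SimulatedCrash(Exception):
    """Hypothetical exception standing in for a software or hardware failure."""


def transfer(db, src, dst, amount, fail_after_debit=False):
    """Move `amount` from `src` to `dst`; undo partial updates if a failure occurs."""
    snapshot = dict(db)              # state before the transaction started
    try:
        a = db[src]                  # 1. read(A)
        a = a - amount               # 2. A := A - 50
        db[src] = a                  # 3. write(A)
        if fail_after_debit:
            raise SimulatedCrash("failure after step 3 and before step 6")
        b = db[dst]                  # 4. read(B)
        b = b + amount               # 5. B := B + 50
        db[dst] = b                  # 6. write(B)
    except SimulatedCrash:
        db.clear()
        db.update(snapshot)          # roll back: no partial update remains visible
        return False
    return True


accounts = {"A": 1000, "B": 2000}    # starting balances are assumed for illustration
ok_failed = transfer(accounts, "A", "B", 50, fail_after_debit=True)
state_after_failure = dict(accounts)
ok_success = transfer(accounts, "A", "B", 50)
```

After the failed attempt the balances are unchanged, so no money is lost; after the successful attempt both updates are reflected together.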
Required Properties of a Transaction (Cont.)
• Consistency requirement in above example:
– The sum of A and B is unchanged by the execution of the transaction
• In general, consistency requirements include
• Explicitly specified integrity constraints such as primary keys and foreign keys
• Implicit integrity constraints
– e.g., sum of balances of all accounts, minus sum of loan amounts must equal value of cash-in-hand
• A transaction, when starting to execute, must see a consistent database.
• During transaction execution the database may be temporarily inconsistent.
• When the transaction completes successfully the database must be consistent
– Erroneous transaction logic can lead to inconsistency
Required Properties of a Transaction (Cont.)
• Isolation requirement – if between steps 3 and 6 (of the fund transfer transaction), another transaction T2 is allowed to access the partially updated database, it will see an inconsistent database (the sum A + B will be less than it should be).

T1                    T2
1. read(A)
2. A := A – 50
3. write(A)
                      read(A), read(B), print(A+B)
4. read(B)
5. B := B + 50
6. write(B)
• Isolation can be ensured trivially by running transactions serially
– That is, one after the other.
• However, executing multiple transactions concurrently has significant benefits, as we will see later.
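The interleaving in the T1/T2 table above can be replayed step by step. This is a small sketch, assuming starting balances of 1000 and 2000 (not given on the slide) and a plain dictionary as the database.

```python
db = {"A": 1000, "B": 2000}          # assumed starting balances

# T1, steps 1-3: debit A
a = db["A"]                          # 1. read(A)
a = a - 50                           # 2. A := A - 50
db["A"] = a                          # 3. write(A)

# T2 runs here and sees the partially updated database
observed_sum = db["A"] + db["B"]     # read(A), read(B), print(A+B)

# T1, steps 4-6: credit B
b = db["B"]                          # 4. read(B)
b = b + 50                           # 5. B := B + 50
db["B"] = b                          # 6. write(B)

final_sum = db["A"] + db["B"]
```

T2 observes a sum that is 50 less than the true total, while the sum after T1 finishes is correct again.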
ACID Properties
A transaction is a unit of program execution that accesses and possibly updates various data items. To preserve the integrity of data the database system must ensure:

• Atomicity. Either all operations of the transaction are properly reflected in the database or none are.
• Consistency. Execution of a transaction in isolation preserves the consistency of the database.
• Isolation. Although multiple transactions may execute concurrently, each transaction must be unaware of other concurrently executing transactions. Intermediate transaction results must be hidden from other concurrently executed transactions.
– That is, for every pair of transactions Ti and Tj, it appears to Ti that either Tj finished execution before Ti started, or Tj started execution after Ti finished.
• Durability. After a transaction completes successfully, the changes it has made to the database persist, even if there are system failures.
Transaction State
• Active – the initial state; the transaction stays in this state while it is executing
• Partially committed – after the final statement has been executed.
• Failed – after the discovery that normal execution can no longer proceed.
• Aborted – after the transaction has been rolled back and the database restored to its state prior to the start of the transaction. Two options after it has been aborted:
– Restart the transaction
• can be done only if no internal logical error
– Kill the transaction
• Committed – after successful completion.
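The states above can be sketched as a small state machine. The allowed transitions below follow the usual textbook state diagram, which is an assumption here because the diagram itself is not reproduced in this text; `State`, `ALLOWED` and `advance` are illustrative names.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    FAILED = "failed"
    ABORTED = "aborted"
    COMMITTED = "committed"


# Assumed transitions (standard state diagram): a transaction can fail while
# active or after its final statement; aborted and committed are terminal.
ALLOWED = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},
    State.ABORTED: set(),
    State.COMMITTED: set(),
}


def advance(current, target):
    """Return `target` if the transition is allowed, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


# A successful run: active -> partially committed -> committed
state = State.ACTIVE
state = advance(state, State.PARTIALLY_COMMITTED)
state = advance(state, State.COMMITTED)
```

Restarting an aborted transaction is modelled as creating a new transaction in the active state rather than as a transition out of the aborted state.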
Transaction State (Cont.)
Concurrent Executions
• Multiple transactions are allowed to run concurrently in the system. Advantages are:
– Increased processor and disk utilization, leading to better transaction throughput
• E.g. one transaction can be using the CPU while another is reading from or writing to the disk
– Reduced average response time for transactions: short transactions need not wait behind long ones.
• Concurrency control schemes – mechanisms to achieve isolation
– That is, to control the interaction among the concurrent transactions in order to prevent them from destroying the consistency of the database
• These schemes are studied in Chapter 15, after the notion of correctness of concurrent executions.
Schedules
• Schedule – a sequence of instructions that specifies the chronological order in which instructions of concurrent transactions are executed
– A schedule for a set of transactions must consist of all instructions of those transactions
– Must preserve the order in which the instructions appear in each individual transaction.
• A transaction that successfully completes its execution will have a commit instruction as the last statement
– By default, a transaction is assumed to execute a commit instruction as its last step
• A transaction that fails to successfully complete its execution will have an abort instruction as the last statement
Schedule 1
• Let T1 transfer $50 from A to B, and T2 transfer 10% of the balance from A to B.
• An example of a serial schedule in which T1 is followed by T2:
Schedule 2
• A serial schedule in which T2 is followed by T1:
Schedule 3
• Let T1 and T2 be the transactions defined previously. The following schedule is not a serial schedule, but it is equivalent to Schedule 1.

Note – In Schedules 1, 2 and 3, the sum “A + B” is preserved.

Schedule 4
• The following concurrent schedule does not preserve the sum “A + B”.
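The effect of different schedules on A + B can be checked by replaying T1 and T2 one step at a time. The sketch below is illustrative only: the schedule figures are not reproduced in this text, so the two interleavings and the starting balances (1000 and 2000) are assumptions chosen to match the descriptions above, and `run` is a hypothetical helper.

```python
def t1(db):
    """T1: transfer $50 from A to B; pauses after each read or write."""
    a = db["A"]; yield               # read(A)
    db["A"] = a - 50; yield          # A := A - 50; write(A)
    b = db["B"]; yield               # read(B)
    db["B"] = b + 50; yield          # B := B + 50; write(B)


def t2(db):
    """T2: transfer 10% of the balance of A to B; pauses after each read or write."""
    a = db["A"]; yield               # read(A)
    temp = a // 10                   # temp := A * 0.1
    db["A"] = a - temp; yield        # A := A - temp; write(A)
    b = db["B"]; yield               # read(B)
    db["B"] = b + temp; yield        # B := B + temp; write(B)


def run(order, start):
    """Execute one step of the named transaction for each entry in `order`."""
    db = dict(start)
    txns = {"T1": t1(db), "T2": t2(db)}
    for name in order:
        next(txns[name])
    return db


start = {"A": 1000, "B": 2000}       # assumed starting balances
serial_t1_t2 = run(["T1"] * 4 + ["T2"] * 4, start)
serial_t2_t1 = run(["T2"] * 4 + ["T1"] * 4, start)
# Assumed interleaving that behaves like the serial order T1, T2
interleaved_ok = run(["T1", "T1", "T2", "T2", "T1", "T1", "T2", "T2"], start)
# Assumed interleaving in which T1 overwrites the value of A written by T2
interleaved_bad = run(["T1", "T2", "T2", "T2", "T1", "T1", "T1", "T2"], start)
```

With these assumptions both serial orders and the first interleaving keep A + B at 3000, while the last interleaving ends with A = 950 and B = 2100, a sum of 3050.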
Serializability
• Basic Assumption – Each transaction preserves database consistency.
• Thus, serial execution of a set of transactions preserves database consistency.
• A (possibly concurrent) schedule is serializable if it is equivalent to a serial schedule. Different forms of schedule equivalence give rise to the notions of:
1. conflict serializability
2. view serializability
Simplified view of transactions
• We ignore operations other than read and write instructions
• We assume that transactions may perform arbitrary computations on data in local buffers in between reads and writes.
• Our simplified schedules consist of only read and write instructions.
Conflicting Instructions
• Let li and lj be two instructions of transactions Ti and Tj respectively. Instructions li and lj conflict if and only if there exists some item Q accessed by both li and lj, and at least one of these instructions wrote Q.
1. li = read(Q), lj = read(Q). li and lj don’t conflict.
2. li = read(Q), lj = write(Q). They conflict.
3. li = write(Q), lj = read(Q). They conflict.
4. li = write(Q), lj = write(Q). They conflict.
• Intuitively, a conflict between li and lj forces a (logical) temporal order between them.
– If li and lj are consecutive in a schedule and they do not conflict, their results would remain the same even if they had been interchanged in the schedule.
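The four cases above reduce to one test. This is a minimal sketch; the `Op` tuple and the `conflicts` function are illustrative names, not part of the slides.

```python
from collections import namedtuple

# One instruction of a schedule: owning transaction, "read" or "write", data item
Op = namedtuple("Op", ["txn", "kind", "item"])


def conflicts(x, y):
    """True if x and y belong to different transactions, access the same item,
    and at least one of them is a write."""
    return x.txn != y.txn and x.item == y.item and "write" in (x.kind, y.kind)


read_read = conflicts(Op("Ti", "read", "Q"), Op("Tj", "read", "Q"))
read_write = conflicts(Op("Ti", "read", "Q"), Op("Tj", "write", "Q"))
write_read = conflicts(Op("Ti", "write", "Q"), Op("Tj", "read", "Q"))
write_write = conflicts(Op("Ti", "write", "Q"), Op("Tj", "write", "Q"))
different_items = conflicts(Op("Ti", "write", "Q"), Op("Tj", "write", "R"))
```

Only the read/read pair and the pair on different items are free to be swapped when they are adjacent in a schedule.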
Conflict Serializability
• If a schedule S can be transformed into a schedule S´ by a series of swaps of non-conflicting instructions, we say that S and S´ are conflict equivalent.
• We say that a schedule S is conflict serializable if it is conflict equivalent to a serial schedule
Conflict Serializability (Cont.)
• Schedule 3 can be transformed into Schedule 6 – a serial schedule where T2 follows T1 – by a series of swaps of non-conflicting instructions. Therefore, Schedule 3 is conflict serializable.

(Figures: Schedule 3 and Schedule 6)
Conflict Serializability (Cont.)
• Example of a schedule that is not conflict serializable:

• We are unable to swap instructions in the above schedule to obtain either the serial schedule < T3, T4 >, or the serial schedule < T4, T3 >.
Precedence Graph
• Consider some schedule of a set of transactions T1, T2, ..., Tn
• Precedence graph – a directed graph where the vertices are the transactions (names).
• We draw an arc from Ti to Tj if the two transactions conflict, and Ti accessed the data item on which the conflict arose earlier.
• We may label the arc by the item that was accessed.
• Example
Testing for Conflict Serializability
• A schedule is conflict serializable if and only if its precedence graph is acyclic.
• Cycle-detection algorithms exist which take order n² time, where n is the number of vertices in the graph.
– (Better algorithms take order n + e where e is the number of edges.)
• If the precedence graph is acyclic, the serializability order can be obtained by a topological sorting of the graph.
– That is, a linear order consistent with the partial order of the graph.
– For example, a serializability order for the schedule (a) would be one of either (b) or (c)
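The test above can be sketched in a few lines: build the precedence graph from conflicting pairs, then attempt a topological sort. This is an illustrative sketch using Kahn's algorithm; the function names are hypothetical, and the two example schedules are assumptions because the original figures are not reproduced here.

```python
from collections import deque


def precedence_graph(schedule):
    """schedule: list of (txn, kind, item). Returns {txn: set of successor txns}."""
    graph = {txn: set() for txn, _, _ in schedule}
    for i, (ti, ki, qi) in enumerate(schedule):
        for tj, kj, qj in schedule[i + 1:]:
            # Arc Ti -> Tj when Ti's earlier instruction conflicts with Tj's later one
            if ti != tj and qi == qj and "write" in (ki, kj):
                graph[ti].add(tj)
    return graph


def serial_order(graph):
    """Topological sort (Kahn's algorithm); returns None if the graph has a cycle."""
    indegree = {t: 0 for t in graph}
    for successors in graph.values():
        for t in successors:
            indegree[t] += 1
    ready = deque(sorted(t for t, d in indegree.items() if d == 0))
    order = []
    while ready:
        t = ready.popleft()
        order.append(t)
        for s in sorted(graph[t]):
            indegree[s] -= 1
            if indegree[s] == 0:
                ready.append(s)
    return order if len(order) == len(graph) else None


# Assumed interleaving of the two transfer transactions, reads and writes only
interleaved = [
    ("T1", "read", "A"), ("T1", "write", "A"),
    ("T2", "read", "A"), ("T2", "write", "A"),
    ("T1", "read", "B"), ("T1", "write", "B"),
    ("T2", "read", "B"), ("T2", "write", "B"),
]
order_ok = serial_order(precedence_graph(interleaved))

# Assumed non-serializable example: T3 reads Q, T4 writes Q, then T3 writes Q
cyclic = [("T3", "read", "Q"), ("T4", "write", "Q"), ("T3", "write", "Q")]
order_cyclic = serial_order(precedence_graph(cyclic))
```

Building the graph this way compares every pair of instructions, which is simple but quadratic in the schedule length; the sort itself runs in order n + e.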
Recoverable Schedules
• Recoverable schedule – if a transaction Tj reads a data item previously written by a transaction Ti, then the commit operation of Ti must appear before the commit operation of Tj.
• The following schedule is not recoverable if T9 commits immediately after the read(A) operation.

• If T8 should abort, T9 would have read (and possibly shown to the user) an inconsistent database state. Hence, the database must ensure that schedules are recoverable.
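The recoverability condition can be checked mechanically. The sketch below is a simplification: it ignores abort instructions, and the example schedule involving T8 and T9 is an assumption, since the slide's figure is not reproduced here.

```python
def is_recoverable(schedule):
    """schedule: list of (txn, action, item) with action in
    {"read", "write", "commit"}; item is None for commit.
    Simplification: abort instructions are not modelled."""
    last_writer = {}       # item -> transaction that most recently wrote it
    reads_from = set()     # (reader, writer) pairs
    committed = set()
    for txn, action, item in schedule:
        if action == "write":
            last_writer[item] = txn
        elif action == "read":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.add((txn, writer))
        elif action == "commit":
            # Every transaction this one read from must already have committed
            for reader, writer in reads_from:
                if reader == txn and writer not in committed:
                    return False
            committed.add(txn)
    return True


# Assumed schedule: T9 reads A written by T8 and commits before T8 does
t9_commits_first = [
    ("T8", "read", "A"), ("T8", "write", "A"),
    ("T9", "read", "A"), ("T9", "commit", None),
    ("T8", "read", "B"), ("T8", "commit", None),
]
t8_commits_first = [
    ("T8", "read", "A"), ("T8", "write", "A"),
    ("T9", "read", "A"),
    ("T8", "read", "B"), ("T8", "commit", None),
    ("T9", "commit", None),
]
recoverable_bad = is_recoverable(t9_commits_first)
recoverable_good = is_recoverable(t8_commits_first)
```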
Cascading Rollbacks
• Cascading rollback – a single transaction failure leads to a series of transaction rollbacks. Consider the following schedule where none of the transactions has yet committed (so the schedule is recoverable)

If T10 fails, T11 and T12 must also be rolled back.

• Can lead to the undoing of a significant amount of work
Cascadeless Schedules
• Cascadeless schedules — for each pair of
transactions Ti and Tj such that Tj reads a
data item previously written by Ti, the
commit operation of Ti appears before the
read operation of Tj.
• Every cascadeless schedule is also
recoverable
• It is desirable to restrict the schedules to
those that are cascadeless
• Example of a schedule that is NOT
cascadeless
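The cascadeless condition can also be checked mechanically. The sketch below is illustrative: it ignores aborts, and the schedule over T10, T11 and T12 is an assumed example rather than the slide's original figure.

```python
def is_cascadeless(schedule):
    """schedule: list of (txn, action, item) with action in
    {"read", "write", "commit"}; item is None for commit.
    Simplification: abort instructions are not modelled."""
    uncommitted_writer = {}   # item -> transaction whose latest write is uncommitted
    for txn, action, item in schedule:
        if action == "write":
            uncommitted_writer[item] = txn
        elif action == "read":
            writer = uncommitted_writer.get(item)
            if writer is not None and writer != txn:
                return False  # reading a value whose writer has not committed
        elif action == "commit":
            for q in [q for q, w in uncommitted_writer.items() if w == txn]:
                del uncommitted_writer[q]
    return True


# Assumed schedule: T11 and T12 read A before the writer has committed
dirty_reads = [
    ("T10", "read", "A"), ("T10", "read", "B"), ("T10", "write", "A"),
    ("T11", "read", "A"), ("T11", "write", "A"),
    ("T12", "read", "A"),
]
# Same operations, but each writer commits before the next transaction reads
commit_before_read = [
    ("T10", "read", "A"), ("T10", "read", "B"), ("T10", "write", "A"),
    ("T10", "commit", None),
    ("T11", "read", "A"), ("T11", "write", "A"), ("T11", "commit", None),
    ("T12", "read", "A"), ("T12", "commit", None),
]
cascadeless_bad = is_cascadeless(dirty_reads)
cascadeless_good = is_cascadeless(commit_before_read)
```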
Concurrency Control
• A database must provide a mechanism that will ensure that all possible schedules are both:
– Conflict serializable.
– Recoverable and preferably cascadeless
• A policy in which only one transaction can execute at a time generates serial schedules, but provides a poor degree of concurrency
• Concurrency-control schemes trade off between the amount of concurrency they allow and the amount of overhead that they incur
• Testing a schedule for serializability after it has executed is a little too late!
– Tests for serializability help us understand why a concurrency control protocol is correct
• Goal – to develop concurrency control protocols that will assure serializability.
Weak Levels of Consistency
• Some applications are willing to live with weak levels of consistency, allowing schedules that are not serializable
– E.g., a read-only transaction that wants to get an approximate total balance of all accounts
– E.g., database statistics computed for query optimization can be approximate (why?)
– Such transactions need not be serializable with respect to other transactions
• Trade off accuracy for performance
Levels of Consistency in SQL-92
• Serializable – default
• Repeatable read – only committed records may be read, and repeated reads of the same record must return the same value. However, a transaction may not be serializable – it may find some records inserted by a transaction but not find others.
• Read committed – only committed records can be read, but successive reads of a record may return different (but committed) values.
• Read uncommitted – even uncommitted records may be read.

• Lower degrees of consistency are useful for gathering approximate information about the database
• Warning: some database systems do not ensure serializable schedules by default
– E.g., Oracle and PostgreSQL by default support a level of consistency called snapshot isolation (not part of the SQL standard)
Transaction Definition in SQL
• Data manipulation language must include a construct for specifying the set of actions that comprise a transaction.
• In SQL, a transaction begins implicitly.
• A transaction in SQL ends by:
– Commit work commits the current transaction and begins a new one.
– Rollback work causes the current transaction to abort.
• In almost all database systems, by default, every SQL statement also commits implicitly if it executes successfully
– Implicit commit can be turned off by a database directive
• E.g. in JDBC, connection.setAutoCommit(false);
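Commit and rollback can be tried out with a short sketch. It uses Python's built-in `sqlite3` module and an in-memory database instead of JDBC, which is a substitution made here for illustration; the `account` table and its contents are assumptions. In its default mode the module opens a transaction implicitly before a data-modifying statement, and `commit()` or `rollback()` ends it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory database
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('A', 1000), ('B', 2000)")
conn.commit()

# Transfer $50 from A to B as one transaction
conn.execute("UPDATE account SET balance = balance - 50 WHERE name = 'A'")
conn.execute("UPDATE account SET balance = balance + 50 WHERE name = 'B'")
conn.commit()                        # both updates become permanent together
after_commit = dict(conn.execute("SELECT name, balance FROM account"))

# Start a second transfer, then abort it
conn.execute("UPDATE account SET balance = balance - 50 WHERE name = 'A'")
conn.rollback()                      # the partial update is undone
after_rollback = dict(conn.execute("SELECT name, balance FROM account"))
conn.close()
```

The balances after the rollback match those after the commit, so the aborted debit leaves no trace.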
Other Notions of Serializability
View Serializability
• Let S and S´ be two schedules with the same set of transactions. S and S´ are view equivalent if the following three conditions are met, for each data item Q:
1. If in schedule S, transaction Ti reads the initial value of Q, then in schedule S´ transaction Ti must also read the initial value of Q.
2. If in schedule S transaction Ti executes read(Q), and that value was produced by transaction Tj (if any), then in schedule S´ transaction Ti must also read the value of Q that was produced by the same write(Q) operation of transaction Tj.
3. The transaction (if any) that performs the final write(Q) operation in schedule S must also perform the final write(Q) operation in schedule S´.

• As can be seen, view equivalence is also based purely on reads and writes.
View Serializability (Cont.)
• A schedule S is view serializable if it is view equivalent to a serial schedule.
• Every conflict serializable schedule is also view serializable.
• Below is a schedule which is view serializable but not conflict serializable.

• What serial schedule is the above equivalent to?

• Every view serializable schedule that is not conflict serializable has blind writes.
Test for View Serializability
• The precedence graph test for conflict serializability cannot be used directly to test for view serializability.
– Extension to test for view serializability has cost exponential in the size of the precedence graph.
• The problem of checking if a schedule is view serializable falls in the class of NP-complete problems.
– Thus, existence of an efficient algorithm is extremely unlikely.
• However, practical algorithms that just check some sufficient conditions for view serializability can still be used.
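For small schedules, view serializability can still be tested by brute force: compare the schedule with every serial order under the three view-equivalence conditions. The sketch below does exactly that and is exponential in the number of transactions, in line with the cost noted above. The function names are hypothetical, and the example schedule with blind writes (T27, T28, T29) is an assumption, since the slide's figure is not reproduced here.

```python
from collections import Counter
from itertools import permutations


def view_signature(schedule):
    """Return (reads-from facts, final write per item) for a schedule given
    as a list of (txn, kind, item) with kind "read" or "write"."""
    last_write = {}        # item -> (txn, k): the k-th write of item by txn
    seen = Counter()       # occurrences so far of each (txn, kind, item)
    reads_from = set()
    for txn, kind, item in schedule:
        seen[(txn, kind, item)] += 1
        k = seen[(txn, kind, item)]
        if kind == "read":
            # None as the source means the read sees the initial value
            reads_from.add((txn, item, k, last_write.get(item)))
        else:
            last_write[item] = (txn, k)
    return reads_from, last_write


def view_serial_orders(schedule):
    """All serial orders that are view equivalent to `schedule` (brute force)."""
    txns = sorted({t for t, _, _ in schedule})
    target = view_signature(schedule)
    matches = []
    for order in permutations(txns):
        serial = [op for t in order for op in schedule if op[0] == t]
        if view_signature(serial) == target:
            matches.append(order)
    return matches


# Assumed example with blind writes by T28 and T29
blind_writes = [
    ("T27", "read", "Q"), ("T28", "write", "Q"),
    ("T27", "write", "Q"), ("T29", "write", "Q"),
]
equivalent_orders = view_serial_orders(blind_writes)
```

For this assumed schedule the only view-equivalent serial order is T27, T28, T29, even though the schedule is not conflict serializable.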
More Complex Notions of Serializability
• The schedule below produces the same outcome as the serial schedule < T1, T5 >, yet is not conflict equivalent or view equivalent to it.

• If we start with A = 1000 and B = 2000, the final result is 960 and 2040

• Determining such equivalence requires analysis of operations other than read and write.
End of Chapter 14
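The arithmetic behind the final values can be replayed directly. The sketch assumes that T1 transfers $50 from A to B (as defined earlier) and that T5 transfers $10 from B to A; the second amount and the interleaving are inferred from the stated result, because the schedule figure is not reproduced here.

```python
db = {"A": 1000, "B": 2000}

# Assumed interleaving: T1 debits A, T5 debits B, T1 credits B, T5 credits A
db["A"] = db["A"] - 50     # T1: read(A); A := A - 50; write(A)
db["B"] = db["B"] - 10     # T5: read(B); B := B - 10; write(B)
db["B"] = db["B"] + 50     # T1: read(B); B := B + 50; write(B)
db["A"] = db["A"] + 10     # T5: read(A); A := A + 10; write(A)
interleaved_result = dict(db)

# Serial schedule < T1, T5 > for comparison
serial = {"A": 1000, "B": 2000}
serial["A"] -= 50          # T1 debits A
serial["B"] += 50          # T1 credits B
serial["B"] -= 10          # T5 debits B
serial["A"] += 10          # T5 credits A
```

Both runs end with A = 960 and B = 2040 because addition and subtraction commute, which is a property of the operations and not of the read/write pattern.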