Module-4 Dbms Cs208 Notes
Module IV Notes
Transaction processing
What is a transaction?
A transaction is a logical unit of database processing that includes one or more database access operations.
o read_item(X): Reads a database item named X into a program variable also named X.
o write_item(X): Writes the value of program variable X into the database item
named X.
KTU STUDENTS
Transaction states:
o Active state
o Partially committed state
o Committed state
o Failed state
o Terminated state
Transaction operations
o read or write: These specify read or write operations on the database items that
are executed as part of a transaction.
o end_transaction: This specifies that read and write transaction operations have
ended and marks the end of transaction execution.
o rollback (or abort): This signals that the transaction has ended unsuccessfully, so
that any changes or effects that the transaction may have applied to the
database must be undone.
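The operations above can be made concrete with a toy transaction object. This is an illustrative Python sketch, not a real DBMS API: read_item and write_item act on an in-memory dictionary, and rollback undoes writes using saved before-images.

```python
# Toy model of the transaction operations: read_item / write_item against an
# in-memory "database", with rollback restoring saved before-images.

class Transaction:
    def __init__(self, db):
        self.db = db              # shared dict acting as the database
        self.before_images = {}   # old values saved for rollback
        self.state = "active"

    def read_item(self, x):
        return self.db[x]

    def write_item(self, x, value):
        # save the before-image only the first time this item is written
        self.before_images.setdefault(x, self.db[x])
        self.db[x] = value

    def commit(self):
        self.state = "committed"

    def rollback(self):
        # undo every change this transaction applied to the database
        for x, old in self.before_images.items():
            self.db[x] = old
        self.state = "aborted"

db = {"X": 100, "Y": 50}
t = Transaction(db)
t.write_item("X", t.read_item("X") - 10)
t.rollback()   # X is restored to its original value
```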
I Isolation/Independence: the updates of a transaction must not be made visible to other transactions
until it is committed (solves the temporary update problem)
D Durability (or Permanency): if a transaction changes the database and is committed, the changes
must never be lost because of subsequent failure
The isolation property is enforced by the concurrency control subsystem of the DBMS.
If every transaction does not make its updates (write operations) visible to other
transactions until it is committed, one form of isolation is enforced that solves the
temporary update problem and eliminates the cascading rollback problem.
Level of isolation of a transaction
o A transaction is said to have level 0 (zero) isolation if it does not overwrite the dirty
reads of higher-level transactions.
o Level 1 (one) isolation has no lost updates.
o Level 2 isolation has no lost updates and no dirty reads.
o Level 3 isolation (also called true isolation) has, in addition to level 2 properties,
repeatable reads.
CHARACTERIZING SCHEDULES
o A schedule for a set of transactions must consist of all the instructions of those transactions.
o It must preserve the order in which the instructions appear in each individual transaction.
Recoverable schedule:
o One where no transaction needs to be rolled back
o A schedule S is recoverable if no transaction T in S commits until all transactions T′ that
have written an item that T reads have committed.
www.ktustudents.in
Strict Schedules
o A schedule in which a transaction can neither read nor write an item X until the last
transaction that wrote X has committed (or aborted).
Serial schedule: A schedule S is serial if, for every transaction T participating in the schedule, all
the operations of T are executed consecutively in the schedule. Otherwise, the schedule is called
a nonserial schedule.
Serializable schedule: A schedule S is serializable if it is equivalent to some serial schedule of
the same n transactions.
Result equivalent: Two schedules are called result equivalent if they produce the same final
state of the database.
Conflict equivalent: Two schedules are said to be conflict equivalent if the order of any two
conflicting operations is the same in both schedules.
Conflict serializable: A schedule S is said to be conflict serializable if it is conflict equivalent to
some serial schedule S.
Being serializable is not the same as being serial. Being serializable implies that the
schedule is a correct schedule: it will leave the database in a consistent state. The
interleaving is appropriate and will result in a state as if the transactions were serially
executed, yet it achieves efficiency due to concurrent execution.
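One standard way to test conflict serializability is to build a precedence graph and check it for cycles. The sketch below assumes a schedule represented as (transaction, operation, item) triples; that representation is chosen here purely for illustration.

```python
# A schedule is a list of (transaction, op, item) triples with op "r" or "w".
# Two operations conflict if they are from different transactions, touch the
# same item, and at least one is a write. The schedule is conflict
# serializable iff the precedence graph built from conflicts is acyclic.

def conflict_serializable(schedule):
    edges = set()
    for i, (ti, op_i, x) in enumerate(schedule):
        for tj, op_j, y in schedule[i + 1:]:
            if ti != tj and x == y and "w" in (op_i, op_j):
                edges.add((ti, tj))        # Ti must precede Tj

    nodes = {t for t, _, _ in schedule}

    def has_cycle(node, path):
        # depth-first search: a node revisited along the current path = cycle
        if node in path:
            return True
        return any(has_cycle(b, path | {node}) for a, b in edges if a == node)

    return not any(has_cycle(n, set()) for n in nodes)

# s1 is equivalent to the serial order T1, T2; s2 has a cycle T1 -> T2 -> T1.
s1 = [("T1", "r", "X"), ("T1", "w", "X"), ("T2", "r", "X"), ("T2", "w", "X")]
s2 = [("T1", "r", "X"), ("T2", "w", "X"), ("T1", "w", "X")]
```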
View equivalence: A less restrictive definition of equivalence of schedules.
View serializability: Definition of serializability based on view equivalence.
A schedule is view serializable if it is view equivalent to a serial schedule.
Locking is an operation which secures permission to read or to write a data item. Example: Lock (X). Data item X is locked on behalf of the requesting transaction.
Unlocking is an operation which removes these permissions from the data item. Example: Unlock (X).
Data item X is made available to all other transactions. Lock and Unlock are Atomic operations.
Two-Phase Locking
Two phase locking is a process used to gain ownership of shared resources without
creating the possibility for deadlock. It breaks up the modification of shared data into
"two phases".
There are actually three activities that take place in the "two phase" update algorithm:
1. Lock Acquisition
2. Modification of Data
3. Release Locks
The modification of data, and the subsequent release of the locks that protected the
data, are generally grouped together and called the second phase.
The resource (or lock) acquisition phase of a "two phase" shared data access protocol
is usually implemented as a loop within which all the locks required to access the
shared data are acquired one by one. If any lock is not acquired on the first attempt the
algorithm gives up all the locks it had previously been able to get, and starts to try to
get all the locks again.
The Two Phase Locking Protocol assumes that a transaction can only be in one of two phases.
Growing Phase: In this phase the transaction can only acquire locks, but cannot release any
lock. The transaction enters the growing phase as soon as it acquires the first lock it wants.
From now on it has no option but to keep acquiring all the locks it would need. It cannot release
any lock at this phase even if it has finished working with a locked data item. Ultimately the
transaction reaches a point where all the locks it may need have been acquired. This point is
called the Lock Point.
Shrinking Phase: After Lock Point has been reached, the transaction enters the shrinking
phase. In this phase the transaction can only release locks, but cannot acquire any new lock.
The transaction enters the shrinking phase as soon as it releases the first lock after crossing the
Lock Point. From now on it has no option but to keep releasing all the acquired locks.
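The two phases can be made concrete with a small sketch: the transaction records whether it has crossed the lock point (its first release), after which any further acquisition violates the protocol. Class and method names here are illustrative, not a real lock manager.

```python
# Enforcing the two-phase rule: once any lock is released (shrinking phase),
# acquiring a new lock is a protocol violation.

class TwoPhaseTransaction:
    def __init__(self):
        self.locks = set()
        self.shrinking = False    # True once the lock point has been crossed

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock requested in shrinking phase")
        self.locks.add(item)

    def unlock(self, item):
        self.shrinking = True     # first release: shrinking phase begins
        self.locks.discard(item)

t = TwoPhaseTransaction()
t.lock("X")
t.lock("Y")        # growing phase: acquisitions only
t.unlock("X")      # lock point crossed, shrinking phase begins
try:
    t.lock("Z")    # illegal under 2PL
    violated = False
except RuntimeError:
    violated = True
```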
Timestamp based protocols
Recall the concepts of Date Created or Last Modified properties of files and folders; timestamps are things like that.
A timestamp can be implemented in two ways. The simplest one is to directly assign the current value of the clock to
the transaction or the data item. The other policy is to attach the value of a logical counter that keeps incrementing as
new timestamps are required.
The timestamp of a transaction denotes the time when it was first activated. The timestamp of a data item can be of
the following two types:
W-timestamp (Q): This means the latest time when the data item Q has been written into.
R-timestamp (Q): This means the latest time when the data item Q has been read from.
These two timestamps are updated each time a successful read/write operation is performed on the data item Q.
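The basic timestamp-ordering rules built on these two timestamps can be sketched as follows. This is a simplified model: a rejected operation would in practice cause the transaction to be rolled back and restarted with a new timestamp.

```python
# Basic timestamp ordering for a single item Q, using R-timestamp(Q) and
# W-timestamp(Q) as defined above. A transaction with timestamp ts is
# rejected when its operation arrives "too late".

class Item:
    def __init__(self):
        self.r_ts = 0   # R-timestamp(Q): latest read of Q
        self.w_ts = 0   # W-timestamp(Q): latest write of Q

def read(item, ts):
    if ts < item.w_ts:               # a younger transaction already wrote Q
        return False                 # reject: roll back the reader
    item.r_ts = max(item.r_ts, ts)
    return True

def write(item, ts):
    if ts < item.r_ts or ts < item.w_ts:
        return False                 # reject: the write arrives too late
    item.w_ts = ts
    return True

q = Item()
assert write(q, 5)       # W-timestamp(Q) becomes 5
assert not read(q, 3)    # ts=3 < W-timestamp(Q): rejected
assert read(q, 7)        # ok; R-timestamp(Q) becomes 7
assert not write(q, 6)   # rejected: a transaction with ts=7 already read Q
```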
Multiversion techniques: Each successful write results in the creation of a new version of the data item written.
When a read(Q) operation is issued, select an appropriate version of Q based on the timestamp
of the transaction, and return the value of the selected version.
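Version selection for a read can be sketched like this, assuming each version is stored as a (W-timestamp, value) pair; the representation is illustrative.

```python
# Multiversion read: each write creates a new (w_timestamp, value) version,
# and a read with timestamp ts returns the version with the largest
# W-timestamp not exceeding ts.

versions = []   # list of (w_timestamp, value) pairs for item Q

def write_version(ts, value):
    versions.append((ts, value))

def read_version(ts):
    eligible = [(w, v) for w, v in versions if w <= ts]
    return max(eligible)[1]   # value of the latest eligible version

write_version(1, "v1")
write_version(5, "v5")
```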
The optimistic (validation-based) approach uses neither locking nor timestamping
techniques. Instead, a transaction is executed without restrictions until it is
committed. Using an optimistic approach, each transaction moves through two or
three phases, referred to as read, validation, and write.
During the read phase, the transaction reads the database, executes the
needed computations, and makes the updates to a private copy of the
database values. All update operations of the transaction are recorded in a
temporary update file, which is not accessed by the remaining transactions.
During the validation phase, the transaction is validated to ensure that the
changes made will not affect the integrity and consistency of the database.
If the validation test is positive, the transaction goes to the write phase. If
the validation test is negative, the transaction is restarted and the changes
are discarded.
During the write phase, the changes are permanently applied to the
database.
The cost of implementing locks depends on the size of data items. There are two types of
lock granularity:
Fine granularity
Coarse granularity
Fine granularity refers to small item sizes and coarse granularity refers to large item
sizes.
The item size may be chosen at any of the following levels:
a database record
a field value of a database record
a disk block
a whole file
the whole database
What is needed is a mechanism to allow the system to define multiple levels of granularity.
We can make one by allowing data items to be of various sizes and defining a hierarchy of
data granularities, where the small granularities are nested within larger ones.
Consider the tree of Figure, which consists of four levels of nodes. The highest level
represents the entire database. Below it are nodes of type area; the database consists of
exactly these areas. Each area in turn has nodes of type file as its children. Each area
contains exactly those files that are its child nodes. No file is in more than one area. Finally,
each file has nodes of type record. As before, the file consists of exactly those records that
are its child nodes, and no record can be present in more than one file.
Each node in the tree can be locked individually. As we did in the two-phase locking
protocol, we shall use shared and exclusive lock modes. When a transaction locks a node,
in either shared or exclusive mode, the transaction also has implicitly locked all the
descendants of that node in the same lock mode. For example, if transaction Ti gets
an explicit lock on file Fc of Figure 16.16, in exclusive mode, then it has an implicit lock in
exclusive mode on all the records belonging to that file. It does not need to lock the individual
records of Fc explicitly.
If a node is locked in an intention mode, explicit locking is being done at a lower level of
the tree (that is, at a finer granularity). Intention locks are put on all the ancestors of a node
before that node is locked explicitly. Thus, a transaction does not need to search the entire
tree to determine whether it can lock a node successfully.
There is an intention mode associated with shared mode, and there is one with exclusive
mode. If a node is locked in intention-shared (IS) mode, explicit locking is being done at a
lower level of the tree, but with only shared-mode locks. Similarly, if a node is locked
in intention-exclusive (IX) mode, then explicit locking is being done at a lower level, with
exclusive-mode or shared-mode locks. Finally, if a node is locked in shared and
intention-exclusive (SIX) mode, the subtree rooted by that node is locked explicitly in
shared mode, and that explicit locking is being done at a lower level with exclusive-mode
locks.
A transaction Ti can lock a node Q using the following rules:
1. It must observe the lock compatibility function of the lock modes.
2. It must lock the root of the tree first, and can lock it in any mode.
3. It can lock a node Q in S or IS mode only if it currently has the parent of Q locked in
either IX or IS mode.
4. It can lock a node Q in X, SIX, or IX mode only if it currently has the parent of Q locked in
either IX or SIX mode.
5. It can lock a node only if it has not previously unlocked any node (that is, Ti is two
phase).
6. It can unlock a node Q only if it currently has none of the children of Q locked.
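The compatibility between these lock modes is usually given as a matrix. The table below encodes the standard IS/IX/S/SIX/X compatibility function as a Python lookup, for illustration:

```python
# Lock-compatibility matrix for multiple-granularity locking:
# compatible(held, requested) says whether a lock in `requested` mode can be
# granted on a node already locked in `held` mode by another transaction.

COMPAT = {
    "IS":  {"IS": True,  "IX": True,  "S": True,  "SIX": True,  "X": False},
    "IX":  {"IS": True,  "IX": True,  "S": False, "SIX": False, "X": False},
    "S":   {"IS": True,  "IX": False, "S": True,  "SIX": False, "X": False},
    "SIX": {"IS": True,  "IX": False, "S": False, "SIX": False, "X": False},
    "X":   {"IS": False, "IX": False, "S": False, "SIX": False, "X": False},
}

def compatible(held, requested):
    return COMPAT[held][requested]
```

For example, a node locked in IX mode still admits another IS or IX lock (explicit locking continues below), but not S, SIX, or X.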
Log Based Recovery: - In this method, a log of each transaction is maintained in some
stable storage, so that in case of any failure the database can be recovered from there.
Storing the logs must be done before applying the actual transaction on the database.
Every log record will have information such as which transaction is being executed, which values
have been modified to which value, and the state of the transaction. All this log information is
stored in the order of execution.
There are two methods of creating these log files and updating the database:
o Deferred database modification: - In this method, all the logs for the transaction are
created and stored in a stable storage system first. Once they are stored, the database is
updated with the changes. In the above example, after all the three log records are created
and stored in some storage system, the database will be updated with those steps.
o Immediate database modification: - After creating each log record, the database is modified
for each step of the log entry immediately. In the above example, the database is modified at
each step of the log entry; i.e., after the first log entry, the transaction will hit the database to fetch the
record, then the second log will be entered followed by updating the address, then the third
log followed by committing the database changes.
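Deferred modification can be sketched as follows: writes only append log records, and the database is touched only after the commit record is logged. This is a toy model; the "stable" log here is a plain Python list.

```python
# Deferred database modification: write operations only append
# (txn, item, new_value) records to the log; the database itself is updated
# only after the commit record has been logged.

log = []                      # stands in for the stable-storage log
db = {"address": "old"}

def write(txn, item, new_value):
    log.append((txn, item, new_value))   # log first, database later

def commit(txn):
    log.append((txn, "commit"))
    # now replay this transaction's logged writes against the real database
    for rec in log:
        if len(rec) == 3 and rec[0] == txn:
            _, item, value = rec
            db[item] = value

write("T1", "address", "new")
# db is still unchanged at this point: a crash here loses nothing
commit("T1")
```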
Shadow paging: - This is a method where all the transactions are executed in
primary memory. Once all the transactions are completely executed, the result is updated to the
database. Hence, if there is any failure in the middle of a transaction, it will not be reflected
in the database.
Shadow paging
In a multiuser environment, a log may be needed for the concurrency control method.
Shadow paging considers the database to be made up of a number of fixed-size disk
pages. A directory with n entries is constructed, where the ith entry points to the
ith database page on disk.
The directory is kept in main memory if it is not too large, and all references (reads or
writes) to database pages on disk go through it.
When a transaction begins executing, the current directory (whose entries point to
the most recent or current database pages on disk) is copied into a shadow
directory. The shadow directory is then saved on disk while the current directory is
used by the transaction.
During transaction execution, the shadow directory is never modified.
When a write_item operation is performed, a new copy of the modified database page
is created, but the old copy of that page is not overwritten. Instead, the new page is
written elsewhere on some previously unused disk block.
The current directory entry is modified to point to the new disk block, whereas the
shadow directory is not modified and continues to point to the old unmodified disk
block. For pages updated by the transaction, two versions are kept. The old version is
referenced by the shadow directory, and the new version by the current directory.
To recover from a failure during transaction execution, it is sufficient to free the
modified database pages and to discard the current directory. The state of the
database before transaction execution is available through the shadow directory, and
that state is recovered by reinstating the shadow directory. The database thus is
returned to its state prior to the transaction.
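The mechanism can be sketched with two small directories; block numbers and names are illustrative:

```python
# Shadow paging in miniature: pages live in a block store; the current
# directory is copied to a shadow directory at transaction start; a write
# allocates a new block and repoints only the current directory.
# Recovery simply reinstates the shadow directory.

pages = {0: "A0", 1: "B0"}     # disk blocks, keyed by block number
current = {0: 0, 1: 1}         # directory: page i -> block number
shadow = dict(current)         # saved on disk when the transaction starts
next_block = 2

def write_page(i, data):
    global next_block
    pages[next_block] = data   # the old copy is NOT overwritten
    current[i] = next_block    # only the current directory is repointed
    next_block += 1

write_page(0, "A1")
# crash before commit: discard the current directory, reinstate the shadow
current = dict(shadow)
```

After recovery, page 0 still reads "A0" through the reinstated directory, even though the new block containing "A1" exists on disk.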
1. An undo-only log record: Only the before image is logged. Thus, an undo
operation can be done to retrieve the old data.
2. A redo-only log record: Only the after image is logged. Thus, a redo operation
can be attempted.
3. An undo-redo log record. Both before image and after images are logged.
Every log record is assigned a unique and monotonically increasing log sequence
number (LSN). Every data page has a page LSN field that is set to the LSN of the log
record corresponding to the last update on the page. WAL requires that the log
record corresponding to an update make it to stable storage before the data page
corresponding to that update is written to disk. For performance reasons, each log
write is not immediately forced to disk. A log tail is maintained in main memory to
buffer log writes. The log tail is flushed to disk when it gets full. A transaction cannot
be declared committed until the commit log record makes it to disk.
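The WAL rule itself reduces to a comparison between a page's pageLSN and the highest log LSN known to be on stable storage. A minimal sketch, with illustrative names:

```python
# Write-ahead logging check: a data page may be flushed to disk only if the
# log has been flushed at least up to that page's pageLSN.

flushed_lsn = 0     # highest LSN known to be on stable storage

def flush_log(up_to_lsn):
    global flushed_lsn
    flushed_lsn = max(flushed_lsn, up_to_lsn)

def can_flush_page(page_lsn):
    # the log record for the page's last update must already be durable
    return page_lsn <= flushed_lsn

flush_log(10)
```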
Once in a while the recovery subsystem writes a checkpoint record to the log. The
checkpoint record contains the transaction table (which gives the list of active
transactions) and the dirty page table (the list of data pages in the buffer pool that have
not yet made it to disk). A master log record is maintained separately, in stable
storage, to store the LSN of the latest checkpoint record that made it to disk. On
restart, the recovery subsystem reads the master log record to find the checkpoint's
LSN, reads the checkpoint record, and starts recovery from there on.
1. Analysis. The recovery subsystem determines the earliest log record from
which the next pass must start. It also scans the log forward from the
checkpoint record to construct a snapshot of what the system looked like at
the instant of the crash.
2. Redo. Starting at the earliest LSN determined in pass (1) above, the log is read
forward and each update redone.
3. Undo. The log is scanned backward and updates corresponding to loser
transactions are undone.
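A much-simplified version of the redo and undo passes over a flat log, ignoring LSNs, checkpoints, and compensation records, might look like:

```python
# Restart recovery in miniature: redo every update in forward order, then
# undo updates of "loser" transactions (no commit record) in backward order.

log = [
    ("T1", "X", "old_x", "new_x"),   # (txn, item, before-image, after-image)
    ("T2", "Y", "old_y", "new_y"),
    ("T1", "commit"),
]
db = {"X": "?", "Y": "?"}            # post-crash state is irrelevant: we redo

committed = {rec[0] for rec in log if rec[1:] == ("commit",)}

for rec in log:                      # redo pass: reapply after-images
    if len(rec) == 4:
        db[rec[1]] = rec[3]

for rec in reversed(log):            # undo pass: restore losers' before-images
    if len(rec) == 4 and rec[0] not in committed:
        db[rec[1]] = rec[2]
```

T1 committed, so its update to X survives; T2 is a loser, so Y is rolled back to its before-image.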
Database security
There are three main aspects of database protection: security, integrity and availability. What do these terms mean?
At this moment we have a basic image of information system security, and we can take a
look at concrete aspects that should be covered by DBMS security mechanisms.
At least five aspects from the previous list must be ensured with special techniques
that do not exist in unsecure DBMSs. There are three basic ways to do it:
flow control - we control information flows within the DBMS
inference control - control of dependencies among data
access control - access to the information in the DBMS is restricted
1. Legal and ethical issues regarding the right to access certain information. Some information may be
deemed to be private and cannot be accessed by unauthorized persons.
2. Policy issues at the governmental, institutional, or corporate level as to what kinds of information
should not be made publicly available, for example credit ratings.
3. System-related issues, such as the system levels at which various security functions should be enforced, for
example whether a security function should be handled at the physical hardware level, the operating system
level, or the DBMS level.
4. The need in some organizations to identify multiple security levels and to categorize the data and
users based on these classifications, for example top secret, secret, confidential, and unclassified.
1. Loss of integrity: Database integrity refers to the requirement that information be protected from
improper modification, which includes creation, insertion, modification, changing the status of data, and
deletion. Integrity is lost if unauthorized changes are made to the data by either intentional or accidental
acts. If the loss of system or data integrity is not corrected, continued use of the contaminated system or
corrupted data could result in inaccuracy, fraud, or erroneous decisions.
2. Loss of availability: Database availability refers to making objects available to the human users or
programs that have a legitimate right to them.
3. Loss of confidentiality: Data confidentiality refers to the protection of data from unauthorized
disclosure. The impact of unauthorized disclosure of confidential information can range from violation of
the Data Privacy Act to jeopardization of national security. Unauthorized, unanticipated or unintentional
disclosure could result in loss of public confidence or legal action against the organization.
Database Security Control Measures There are four main control measures used to provide security of
data in databases. They are :
1. Access control The security mechanism of a DBMS must include provisions for restricting access to
the database as a whole. This function is called access control and is handled by creating user accounts
and passwords to control the login process by the DBMS.
2. Inference control: Statistical databases are used to provide statistical information or summaries of
values based on various criteria. Security for statistical databases must ensure that information about
individuals cannot be accessed. It is possible to deduce or infer certain facts concerning individuals from
queries that involve only summary statistics on groups; consequently, this must not be permitted either.
This problem is called statistical database security, and the corresponding control measures are called
inference control measures.
3. Flow control It prevents information from flowing in such a way that it reaches unauthorized users.
Channels that are pathways for information to flow implicitly in ways that violate security policy of an
organization are called covert channels.
4. Data encryption: It is used to protect sensitive data that is transmitted via some type of
communication network. Encryption can also be used to provide additional protection for sensitive portions
of a database. The data is encoded using some coding algorithm. An unauthorized user who accesses the
encoded data will have difficulty deciphering it, but authorized users are given decoding or decryption
algorithms (or keys) to decipher the data. Encryption techniques that are very difficult to decipher
without a key have been developed for military applications.
You grant privileges to users so these users can accomplish tasks required for their
job. You should grant a privilege only to a user who absolutely requires the privilege
to accomplish necessary work. Excessive granting of unnecessary privileges can
compromise security. A user can receive a privilege in two different ways:
You can grant privileges to users explicitly. For example, you can explicitly
grant the privilege to insert records into the EMP table to the user SCOTT.
You can also grant privileges to a role (a named group of privileges), and then
grant the role to one or more users. For example, you can grant the privileges to
select, insert, update, and delete records from the EMP table to the role named
CLERK, which in turn you can grant to the users SCOTT and BRIAN.
Because roles allow for easier and better management of privileges, you should
normally grant privileges to roles and not to specific users.
There are two categories of privileges:
system privileges
schema object privileges
System Privileges
A system privilege is the right to perform a particular action, or to perform an action
on any schema objects of a particular type. For example, the privileges to create
tablespaces and to delete the rows of any table in a database are system privileges.
There are over 60 distinct system privileges.
Granting and Revoking System Privileges
You can grant or revoke system privileges to users and roles. If you grant system
privileges to roles, you can use the roles to manage system privileges (for example,
roles permit privileges to be made selectively available).
You grant or revoke roles from users or other roles using the following options: