DBMS Material
Data & Information: - Raw facts and figures that have no meaning by themselves are called
data. Data does not have any meaning until it is processed and turned into information; the result
of data processing is information.
Data elements (or fields) are the smallest units of meaningful information. A record is a logical
collection of data items; a record may contain one or more fields. A file is a logical collection of
records.
A data dictionary is a data structure that stores metadata. The data dictionary is a table in a
database that stores the names, data types, lengths and other characteristics of the fields in the
database tables.
DBMS is a collection of interrelated data and a set of programs to access, update and manage
those data.
Applications of Database System: Banking, Airlines/railways, Universities, Credit card
transactions
The collection of information stored in the database at a particular moment is called an instance
of the database. The overall logical design of the database is called the database schema.
Database Languages
(1) Data Definition Language – DDL is a language used to define and modify the database
schema (e.g., Create, Alter, Drop).
(2) Data Manipulation Language – DML is a language that enables users to access, modify,
delete or retrieve data. DML provides the following commands:
Insert - Inserts new data into a table
Update - Updates or modifies existing data into a table
Delete - Deletes records from a table
Select - Retrieves data from a table
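The four DML commands above can be sketched with Python's built-in sqlite3 module; the student table, its columns and the sample rows are invented for this demo.

```python
# Illustrative sketch of the four DML commands using Python's built-in
# sqlite3 module; the table and column names are made up for this demo.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")

# INSERT - inserts new data into a table
cur.execute("INSERT INTO student VALUES (1, 'Asha')")
cur.execute("INSERT INTO student VALUES (2, 'Ravi')")

# UPDATE - modifies existing data in a table
cur.execute("UPDATE student SET name = 'Ravi K' WHERE roll_no = 2")

# DELETE - deletes records from a table
cur.execute("DELETE FROM student WHERE roll_no = 1")

# SELECT - retrieves data from a table
rows = cur.execute("SELECT roll_no, name FROM student").fetchall()
print(rows)  # [(2, 'Ravi K')]
```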
The system hides certain details of how the data are stored and maintained. This is called data
abstraction. There are three levels of data abstraction:
(1) Physical Level: The lowest level of abstraction describes how the data are actually stored.
(2) Logical Level: The next higher level of abstraction describes what data are stored in the
database and what relationships exist among those data.
(3) View Level: The highest level of abstraction describes only part of the entire database.
For example, consider customer table with fields customer_name , customer_id and address. At
the physical level, a customer record can be described as bytes. The language compiler hides this
level of details from programmers. At the logical level, each record is specified by type
definition:
struct customer
{
int customer_id;
char customer_name[20];
char customer_address[30];
};
Finally at the view level, computer users see a set of application programs that hide details of the
data types.
Data Independence
Each higher level of the data architecture is immune to changes of the next lower level of the
architecture.
(1) Physical Data Independence: is the ability to modify the physical schema without causing
application programs to be rewritten. Modifications at the physical level are occasionally
necessary to improve performance. It means we change the physical storage without
affecting the conceptual or external view of the data.
(2) Logical Data Independence: is the ability to modify the logical schema without causing
application programs to be rewritten. Modifications at the logical level are necessary
whenever the logical structure of the database is altered. Logical data independence
means if we add some new columns or remove some columns from table then the user
view and programs should not change.
(3) View Level Data Independence: the view level is always independent, because no other
level exists above the view level.
Database Administrator
A person who has central control over the system is called database administrator. The functions
of a DBA include the following:
(1) Schema definition – The DBA creates the original database schema by executing a set of
DDL statements.
(2) Schema modification – The DBA carries out changes to the schema.
(3) Granting of authorization for data access – DBA can regulate different users accessing
different parts of database.
(4) Routine maintenance - (1) Periodic backup to prevent loss of data in case of disasters, (2)
ensuring that the performance of the system is not degraded, and (3) ensuring that enough
free disk space is available for normal operations.
Database System Structure
(1) Storage Manager – The storage manager is responsible for storing, retrieving and
updating data in the database. The various components of the storage manager are:
• Transaction Manager – It ensures that database remains in consistent state.
• Authorization and Integrity Manager – It tests for satisfaction of various integrity
constraints and checks the authority of the users accessing the data.
• Buffer Manager – It is responsible for fetching data from disk storage.
• File Manager – It manages the allocation of space on disk.
(2) Query Processor – It includes the following components:
• DDL interpreter – which interprets DDL statements.
• DML compiler – which translates DML statements into low-level instructions.
• Query evaluation engine – which executes low level instructions generated by the
DML compiler.
(3) Database Users – There are four different types of database users:
• Naïve users – They are unsophisticated users who interact with the system by invoking
one of the application programs that have been written previously.
• Application programmers – They are computer professionals who write application
programs.
• Sophisticated users – They interact with the system without writing programs. Instead,
they form their requests in a database query language.
• Specialized users – They write special applications such as knowledge base and expert
systems.
Unit 2
Data Models
An entity is a real-world object, for example a student or a customer. Entities are described in a
database by a set of attributes. A relationship is an association among several entities. An E-R
diagram graphically represents the overall logical structure of a database.
Major components of an E-R diagram are as follow:
Rectangles – which represent entity sets
Ellipses – which represent attributes
Diamonds – which represent relationship sets
Lines – which link attributes to entity sets and entity sets to relationship sets
Double ellipses – which represent multi-valued attributes
Dashed ellipses – which denote derived attributes
Double lines – which indicate total participation of an entity in a relationship set
Double rectangles – which represent weak entity sets
Types of Attributes
(1) Simple and composite attributes – Simple attributes cannot be divided into subparts.
Composite attributes can be divided into subparts. For example, an attribute name can be
divided into first_name, middle_name and last_name, and an attribute address can be
divided into customer_street, customer_city, state and zip_code.
(2) Single-valued and multi-valued attributes – The loan_number attribute of a specific loan
entity refers to only one loan number; such an attribute is single-valued. In an employee
entity set, the attribute phone_no is multi-valued, since an employee may have several
phone numbers.
(3) Derived attributes – The value for this type of attribute can be derived from the values of
other related attributes. For example, an attribute age can be derived from date_of_birth
and the current_date.
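The derived attribute above can be sketched as follows; the attribute and function names are illustrative, not from the original notes.

```python
# A derived attribute is computed rather than stored: here age is derived
# from date_of_birth and the current date (names are illustrative).
from datetime import date

def derive_age(date_of_birth, current_date):
    """Years completed between date_of_birth and current_date."""
    years = current_date.year - date_of_birth.year
    # Subtract one year if the birthday has not yet occurred this year.
    if (current_date.month, current_date.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years

print(derive_age(date(2000, 6, 15), date(2024, 6, 14)))  # 23
print(derive_age(date(2000, 6, 15), date(2024, 6, 15)))  # 24
```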
Keys
(1) Super key – A super key is a set of one or more attributes that, taken collectively allow
us to identify uniquely an entity in the entity set. For example, customer_id attribute of
the entity set customer is sufficient to distinguish one customer entity from another. Thus,
customer_id is a super key. Similarly, the combination of customer_name and customer_id is
a super key for the entity set customer. The customer_name attribute of customer is not a
super key, because several people might have the same name.
(2) Candidate key – Super key for which no proper subset is a super key is called candidate
key. Minimal super key is called candidate key. Both {customer_id} and
{customer_name, customer_address} are candidate keys. {customer_id,
customer_name}does not form a candidate key.
(3) Primary key – The primary key is a candidate key chosen by the designer as the principal
means of identifying entities; all candidate keys are equally eligible to become the
primary key. A primary key is unique and not null.
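The super-key test can be sketched over a sample relation instance; the customer rows below are invented for illustration, and the check simply asks whether the projection on the chosen attributes is unique.

```python
# Sketch: a set of attributes is a super key of a relation instance if no two
# tuples agree on all of those attributes. Sample customer rows are made up.
customers = [
    {"customer_id": 1, "customer_name": "Mehta", "customer_address": "Unjha"},
    {"customer_id": 2, "customer_name": "Mehta", "customer_address": "Patan"},
    {"customer_id": 3, "customer_name": "Shah",  "customer_address": "Unjha"},
]

def is_super_key(rows, attrs):
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)   # unique projection => identifies tuples

print(is_super_key(customers, ["customer_id"]))                        # True
print(is_super_key(customers, ["customer_name"]))                      # False
print(is_super_key(customers, ["customer_name", "customer_address"]))  # True
```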
Aggregation
Aggregation allows us to model a relationship that itself involves another relationship set; it is
like a structure within a structure (a nested structure). For example, suppose we want to model
"the set of books used by a teacher to teach a subject."
Mapping Cardinalities
It represents the number of entities to which another entity can be associated via a relationship
set. The mapping cardinalities must be one of the following:
(1) One-to-one – An entity in A is associated with at most one entity in B, and an entity in B
is associated with at most one entity in A.
(2) One-to-many – An entity in A is associated with any number (zero or more) of entities in
B, but an entity in B is associated with at most one entity in A.
(3) Many-to-one – An entity in A is associated with at most one entity in B, but an entity in
B is associated with any number (zero or more) of entities in A.
(4) Many-to-many – An entity in A is associated with any number (zero or more) of entities
in B, and an entity in B is associated with any number (zero or more) of entities in A.
Relational Algebra
Operations: Select, Project, Cartesian product, Union, Set difference, Natural Join.
(1) Select (σ) Operation –
The purpose is to identify the set of tuples of a relation that satisfy a given condition and to
extract only those tuples.
For example, find information about all books in the library having yr_pub = 1992:
σyr_pub=1992(book)
(2) Project (Π) Operation –
For example, find the titles of all books in the library published after 1992:
Πtitle(σyr_pub>1992(book))
(3) Cartesian product (×) Operation – Consider the relations r and s:
r:
A B
a 1
b 2
a 2
s:
B C
3 1a
2 2b
r × s pairs every tuple of r with every tuple of s, giving 3 × 2 = 6 tuples.
Set difference (−) operation –
Find the accession numbers of all books which are available in the library:
Πacc_no(book) − Πacc_no(borrow)
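The select, project and set difference operations can be sketched over relations held as Python sets of tuples; the book and borrow data below is invented for illustration.

```python
# Minimal sketch of select, project and set difference over relations held as
# sets of tuples; the book/borrow data below is invented for illustration.
book = {                      # (acc_no, yr_pub, title)
    (101, 1992, "DBMS"),
    (102, 1995, "Networks"),
    (103, 1992, "OS"),
}
borrow = {(102,)}             # accession numbers currently issued

def select(rel, pred):                 # sigma
    return {t for t in rel if pred(t)}

def project(rel, *idx):                # pi (duplicates removed, as in sets)
    return {tuple(t[i] for i in idx) for t in rel}

# sigma_{yr_pub=1992}(book)
print(select(book, lambda t: t[1] == 1992))
# pi_{acc_no}(book) - pi_{acc_no}(borrow): books still available in the library
print(project(book, 0) - borrow)       # {(101,), (103,)}
```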
Union operation –
Find out all books which are either issued or have been supplied by a supplier.
Natural join is useful where there is some commonality between attributes. Consider the
following relations r and s:
r:
A B C
a b c
d e f
g h i
s:
B D
b q
p r
e t
r ⨝ s:
A B C D
a b c q
d e f t
Now consider:
r:
A B C
a b c
a c d
p q r
s:
B C D
q s t
b c m
q r s
b r m
r ⨝ s:
A B C D
a b c m
p q r s
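The second natural-join example can be sketched as follows, joining on the shared attributes B and C; the helper function name is ours, not from the notes.

```python
# Sketch of natural join on the common attributes B and C, reproducing the
# second r |><| s example from the text.
r = [("a", "b", "c"), ("a", "c", "d"), ("p", "q", "r")]   # schema (A, B, C)
s = [("q", "s", "t"), ("b", "c", "m"), ("q", "r", "s"),   # schema (B, C, D)
     ("b", "r", "m")]

def natural_join(r_rows, s_rows):
    # Tuples combine when they agree on the shared attributes B and C.
    return [(a, b, c, d)
            for (a, b, c) in r_rows
            for (b2, c2, d) in s_rows
            if (b, c) == (b2, c2)]

print(natural_join(r, s))   # [('a', 'b', 'c', 'm'), ('p', 'q', 'r', 's')]
```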
Left outer join (⟕), Right outer join (⟖) and Full outer join (⟗) –
The result of left outer join is the set of all combinations of tuples in R and S that are equal on
their common attributes names, in addition to tuples in R that have no matching tuples in S.
The result of right outer join is the set of all combinations of tuples in R and S that are equal on
their common attributes names, in addition to tuples in S that have no matching tuples in R.
Employee
Dept
Dept_name Manager
Sales Rao
Production Bhaskar
Employee ⟕ Dept (Left outer join)
Functional Dependency
X →Y means
• Y is functionally dependent on X
• X functionally determines Y.
Consider the relation book(acc_no, yr_pub, title) and the FDs, acc_no → yr_pub.
X Y Z W
x1 y1 z1 w1
x1 y2 z1 w2
x2 y2 z2 w2
x2 y3 z2 w3
x3 y3 z2 w4
Armstrong's axioms:
Reflexivity: if ß ⊆ α, then α → ß.
Augmentation: if α → ß, then γα → γß.
Transitivity: if α → ß and ß → γ, then α → γ.
Union rule: if α → ß and α → γ, then α → ßγ. Proof: from α → ß, augmentation with α gives
αα → ßα; since αα is the same as α, α → ßα. From α → γ, augmentation with ß gives
ßα → ßγ. By transitivity, α → ßγ holds.
Decomposition rule: if α → ßγ, then α → ß (by reflexivity ßγ → ß and transitivity) and
α → γ (by reflexivity ßγ → γ and transitivity).
Compute the closure of the following set of FDs for relation R= (A, B, C, D, E)
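The standard attribute-closure algorithm can be sketched as below. Since the original list of FDs is not given in the text, the FD set used here is an assumed example.

```python
# The closure of an attribute set X under a set of FDs is computed by
# repeatedly adding the right side of every FD whose left side is already
# contained in the result. The FD set below is an assumed example, since the
# original list of FDs is not given in the text.
def closure(attrs, fds):
    """fds: list of (lhs, rhs) pairs of frozensets; returns closure of attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

fds = [(frozenset("A"), frozenset("B")),    # A -> B
       (frozenset("B"), frozenset("C")),    # B -> C
       (frozenset("CD"), frozenset("E"))]   # CD -> E

print(sorted(closure({"A"}, fds)))          # ['A', 'B', 'C']
print(sorted(closure({"A", "D"}, fds)))     # ['A', 'B', 'C', 'D', 'E']
```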
Normalization
It is the process of organizing data to prevent redundancy. The purpose is to simplify the data.
1NF – A relation is in 1NF if it has no repeating groups; all attributes are simple (atomic).
Complex attributes are like a structure within a structure. For example, complex attributes may
be defined like this:
Name Address
Table 1
Table 1 represents nested definition of attributes Name and Address. Therefore, the given
relation is not in 1NF.
Title F_Name Street City State
Table 2
Table 2 represents no nested attributes. Therefore, the given relation is in 1NF. 1NF does not
check functional dependency.
2NF – A relation is in 2NF if it is in 1NF and every non-prime attribute is fully functionally
dependent on every candidate key.
Consider Loan_sch(c_name, loan_no, amount) for a single branch, with the FD c_name,
loan_no → amount.
Non-prime attribute – an attribute which is not an element of any candidate key.
Here, loan_no alone does not determine amount, and c_name alone does not determine amount.
Therefore, amount is fully functionally dependent on the candidate key.
3NF –
A relation is in 3NF if it is in 2NF and no non prime attribute is functionally dependent on other
non prime attributes.
A relation R is in BCNF if for all FDs, α →ß, which hold on R, then α must be a super key.
Loan_sch(c_name, loan_no, amount) for single branch and let us have c_name, loan_no →
amount
4NF -
Given a relation R with attributes A, B and C, the multi-valued dependency R.A →→ R.B holds
in R if and only if the set of B-values depends only on the A-values and is independent of the
C-values. An MVD exists only if the relation R has at least three attributes. Given a relation
R(A, B, C), the MVD R.A →→ R.B holds iff the MVD R.A →→ R.C also holds; MVDs always
occur in pairs. This is generally expressed as R.A →→ R.B / R.C.
Vehicle Model
Maruti Maruti 800
Maruti Baleno
Scooter Bajaj
Scooter LML
VDM.Vehicle →→ VDM.Dealer
VDM.Vehicle →→ VDM.Model
5NF
There are some relations, which cannot be decomposed into two projections. But these relations
can be decomposed into three projections, this is in 5NF.
Dealer Parts
D1 P1
D1 P2
D2 P1
Table 2 [DP relation]
Parts Customer
P1 C1
P1 C2
P2 C1
Table 3 [PC relation]
Customer Dealer
C1 D1
C2 D1
C1 D2
Table 4 [CD relation]
Joining Table-2 and Table-3 gives a table containing one extra (spurious) tuple, so we do not get
back the original relation. If we then join that result with Table-4, the spurious tuple disappears
and we get the original relation back. Thus, when the relation is decomposed into the three
projections shown in the tables, joining Table-2, Table-3 and Table-4 reproduces the original
relation (Table-1).
DPC is obtainable only if all three projections are joined; any two projections joined do not give
back the original relation. This is called join dependency.
Show that the following decomposition is not a loss-less decomposition:
R:
A B C D E
a1 b1 c1 d1 e1
a2 b2 c1 d2 e2
R1(A, B, C):
A B C
a1 b1 c1
a2 b2 c1
R2(C, D, E):
C D E
c1 d1 e1
c1 d2 e2
R1 ⨝ R2:
A B C D E
a1 b1 c1 d1 e1
a1 b1 c1 d2 e2
a2 b2 c1 d1 e1
a2 b2 c1 d2 e2
The join contains spurious tuples not present in R, so the decomposition is lossy.
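The lossy join above can be verified mechanically; the sketch assumes both tuples of R share the C-value c1, as the projected tables show.

```python
# Sketch verifying the decomposition above is lossy: projecting R onto (A,B,C)
# and (C,D,E) and joining back on C yields extra (spurious) tuples.
R = {("a1", "b1", "c1", "d1", "e1"),
     ("a2", "b2", "c1", "d2", "e2")}

abc = {(a, b, c) for (a, b, c, d, e) in R}
cde = {(c, d, e) for (a, b, c, d, e) in R}

joined = {(a, b, c, d, e)
          for (a, b, c) in abc
          for (c2, d, e) in cde
          if c == c2}

print(len(R), len(joined))   # 2 4  -> the join has spurious tuples
print(joined > R)            # True: strictly contains the original relation
```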
Query processing refers to the range of activities involved in extracting data from a database.
(1)Parsing and translation: - The first step is to translate a given query into its internal form. The
parser checks the syntax of the user’s query, verifies that the relation names appearing in the
query are names of the relations in the database, and so on. Then it is translated into a relational
algebra expression.
(2)Optimization:- Query optimization is the process of selecting the most efficient query
evaluation plan from among the many strategies usually possible for processing a given query.
(3)Evaluation engine: - A sequence of primitive operations that can be used to evaluate a query is
a query evaluation plan. The query evaluation engine takes a query evaluation plan, executes that
plan and returns the answer to the query. Fig. 5.2 illustrates an evaluation plan for our query.
Πbalance
|
σbalance<2500
|
account
Fig. 5.2 A query evaluation plan
Evaluation of Expressions
There are two approaches for expression evaluation: materialization approach and pipelining
approach.
Πcustomer-name
|
⨝
/ \
σbalance<2500 customer
|
account
Query Optimization
Query optimization is the process of selecting the most efficient query evaluation plan from
among the many strategies usually possible for processing a given query. We do not expect
users to write their queries so that they can be processed efficiently. Rather, we expect the
system to construct a query evaluation plan that minimizes the cost of query evaluation. This is
where query optimization comes into play.
Consider the relational algebra expression for the query “Find the names of all customers who
have an account at any branch located in Unjha.”
which is equivalent to our original algebra expression, but which generates smaller intermediate
relations.
Unit 6
Storage Strategies
Indices – An index is a data structure that organizes data records on the disk to make the
retrieval of data efficient. There are two types of indices:
(1) Ordered Indices – based on a sorted ordering of the search-key values. An index on a set
of fields that includes the primary key is called a primary index.
(2) Hash Indices – based on hash addresses generated using hash functions.
For example, if we want to search for the record with roll no 18CE32, we need not search the
entire data file. With the help of the primary index structure, we find the block containing the
nearby indexed roll no 18CE30, and from there we can easily locate the entry for 18CE32.
In order to identify the records faster, we will group two or more columns together to get the
unique values and create index out of them. This method is known as clustering index.
For example, students studying in each semester are grouped together. 1st sem students, 2nd sem
students, 3rd sem students etc are grouped.
Dense Index – An index record appears for every search key value in file.
Sparse Index – Index records are created only for some of the records.
(3)Single & Multilevel Indices –
The single index file occupies considerably less disk blocks than the data file.
Multilevel index helps in breaking down the index into several smaller indices in order to make
outermost level so small, which can easily be accommodated anywhere in the main memory.
B-Tree
B-Tree is a specialized multiway tree used to store records on disk. The goal is fast access to
data. A B-tree of order m is a tree which satisfies the following properties:
(1) Every node has at most m children.
(2) Every internal node, except the root, has at least ⌈m/2⌉ children.
(3) The root has at least two children, unless it is a leaf.
(4) All leaves appear at the same level.
Hashing
Hashing is a technique to directly search the location of desired data on the disk without using
index structure.
A hash function is a simple mathematical function. The example of a hash function is a book call
number. Each book in the library has a unique call number. A call number is like an address, it
tells us where the book is located in the library.
(1)Static Hashing – In this method, the resultant data bucket address always remains the same.
Static hashing is divided into open hashing and close hashing.
In open hashing, instead of overwriting the older record, the next available data block is used to
store the new record. In close hashing, when a bucket is full, a new (overflow) bucket is
allocated for the same hash address and linked after the previous one.
(2)Dynamic Hashing – It offers a mechanism in which data buckets are added and removed
dynamically and on demand.
Collision - A hash collision occurs when two or more data items in the data set map to the same
place in the hash table.
Hash table – is a data structure used for storing and retrieving data quickly.
Hash function – is a function used to place data and to retrieve data from hash table.
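The ideas above can be sketched with a simple mod hash function; the bucket count and the sample keys are made up for the demo.

```python
# Sketch of static hashing: a record's bucket address is computed directly
# from its key by a hash function, with no index structure. Bucket count and
# keys are made up for the demo.
N_BUCKETS = 4

def hash_fn(key):
    return key % N_BUCKETS      # a simple mod hash function

buckets = [[] for _ in range(N_BUCKETS)]

for roll_no in [18, 25, 32, 21, 29]:
    buckets[hash_fn(roll_no)].append(roll_no)

print(buckets)            # [[32], [25, 21, 29], [18], []]
# 25, 21 and 29 collide on bucket 1: the overflow records are chained
# after the first one, as in close hashing.
```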
Transaction – Collection of operations that forms a single logical unit of work is called
transaction.
Ti: read(A);
A=A-50;
write(A);
read(B);
B=B+50;
write(B);
Atomicity – Either all operations should be reflected in the database or none are. Suppose initial
value of account A = 1000 and B = 2000. The task is to transfer Rs. 50 from account A to
account B. After execution of write(A) instruction, suppose power failure occur or system failure
due to any hardware or software error. Therefore, Rs. 50 is debited from account A, but not
credited to account B.
Isolation – Each transaction appears to execute alone. Even though multiple transactions execute
concurrently, each transaction is unaware of the other transactions executing in the system.
Suppose, three transactions Ti, Tj and Tk executing in the system concurrently, but Ti is not
aware about Tj and Tk. Similarly, Tj is not aware about Ti and Tk and Tk is not aware about Ti
and Tj.
Durability - After successful completion of a transaction, the changes it has made to the
database persist, even if a power failure occurs.
State Diagram of a transaction –
Active – the initial state, the transaction stays in this state while it is executing.
Partially committed – after the final statement has been executed.
Failed – after the discovery that normal execution can no longer proceed.
Aborted – after the transaction has been rolled back and the database has been restored to its
state prior to the start of the transaction.
Committed – after successful completion.
Advantages of Concurrent Execution
(1)Improved throughput and resource utilization – I/O activity can be done in parallel with
processing at the CPU. While one transaction is reading from the disk, another can be doing
some modification in main memory and a third can be updating a value on the disk. All of this
increases the throughput of the system - that is, the number of transactions executed in a given
amount of time.
(2)Reduced waiting time – If a transaction runs serially, a short transaction may have to wait for
a preceding long transaction to complete, which can lead to unpredictable delays in running a
transaction.
Serializability –
Two types of serializability – (1) Conflict serializability and (2) View serializability.
(1)Conflict serializability
T1              T2
read(A)
write(A)
                read(A)
read(B)
                write(A)
write(B)
                read(B)
                write(B)
Swap read(A) instruction of T2 with read(B) instruction of T1 (they access different data items,
so they do not conflict):
T1              T2
read(A)
write(A)
read(B)
                read(A)
                write(A)
write(B)
                read(B)
                write(B)
Swap write(A) instruction of T2 with write(B) instruction of T1:
T1              T2
read(A)
write(A)
read(B)
                read(A)
write(B)
                write(A)
                read(B)
                write(B)
Swap write(B) instruction of T1 with read(A) instruction of T2:
T1              T2
read(A)
write(A)
read(B)
write(B)
                read(A)
                write(A)
                read(B)
                write(B)
The result is the serial schedule <T1, T2>, so the original schedule is conflict serializable.
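The swap argument above has a mechanical counterpart: a schedule is conflict serializable iff its precedence graph is acyclic. The sketch below builds that graph for the first (non-serial) schedule shown above; the tuple encoding of operations is ours.

```python
# Sketch: a schedule is conflict serializable iff its precedence graph is
# acyclic. Ops are (transaction, action, item); the schedule below is the
# first (non-serial) schedule shown above.
schedule = [("T1", "r", "A"), ("T1", "w", "A"), ("T2", "r", "A"),
            ("T1", "r", "B"), ("T2", "w", "A"), ("T1", "w", "B"),
            ("T2", "r", "B"), ("T2", "w", "B")]

def precedence_edges(sched):
    edges = set()
    for i, (ti, ai, xi) in enumerate(sched):
        for tj, aj, xj in sched[i + 1:]:
            # Two ops conflict if different transactions touch the same item
            # and at least one of them is a write.
            if ti != tj and xi == xj and ("w" in (ai, aj)):
                edges.add((ti, tj))
    return edges

def has_cycle(edges):
    # With only two transactions, a cycle means an edge in both directions.
    return any((b, a) in edges for (a, b) in edges)

edges = precedence_edges(schedule)
print(edges)                      # {('T1', 'T2')}
print(not has_cycle(edges))       # True -> conflict serializable
```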
The schedules S and S’ are said to be view equivalent if three conditions are met:
(1) For each data item Q, if transaction Ti reads initial value of Q in schedule S, then
transaction Ti must, in schedule S’ also read the initial value of Q.
(2) For each data item Q, if transaction Ti executes read(Q) in schedule S and if that value
was produced by a write(Q) operation executed by transaction Tj, then the same scenario
must occur in schedule S’.
(3) For each data item Q, the transaction that performs the final write(Q) operation in
schedule S, must perform the final write(Q) operation in schedule S’.
Schedule 1 is not view equivalent to schedule 2. Since, in schedule 1, the value of account A
read by transaction T2 was produced by T1, whereas this case does not hold in schedule 2.
However, schedule 1 is view equivalent to schedule 3 because the values of account A and B
read by transaction T2 were produced by T1 in both schedules.
Every conflict serializable schedule is also view serializable, but there are view serializable
schedules that are not conflict serializable. Blind writes (writes performed without a prior read)
appear in every view serializable schedule that is not conflict serializable.
Recoverable Schedule
T1              T2
read(A)
write(A)
                read(A)
read(B)
Suppose that the system allows T2 to commit immediately after executing the read(A)
instruction; thus T2 commits before T1 does. Now suppose that T1 fails before it commits. Since
T2 has read the value of data item A written by T1, we must abort T2. However, T2 has already
committed and cannot be aborted. Thus we have a situation where it is impossible to recover
correctly from the failure of T1. This is an example of a non-recoverable schedule, which should
not be allowed.
A recoverable schedule is one where for every pair of transaction Ti and Tj such that Tj reads a
data item previously written by Ti, the commit operation of Ti appears before the commit
operation of Tj.
Cascading Rollback
T1              T2              T3
read(A)
read(B)
write(A)
                read(A)
                write(A)
                                read(A)
Transaction T1 writes a value of A that is read by transaction T2. T2 writes a value of A that is
read by T3. Suppose that, at this point, T1 fails. T1 must be rolled back. Since T2 is dependent
on T1, T2 must be rolled back. Since T3 is dependent on T2, T3 must be rolled back. A situation
in which a single transaction failure leads to a series of transaction rollbacks is called cascading
rollback.
Cascadeless schedule is one where, for each pair of transaction Ti and Tj, such that Tj reads a
data item previously written by Ti, the commit operation of Ti appears before the read operation
of Tj.
(1) Lost update problem – The update made by one transaction is overwritten by another
transaction.
Suppose T1 credits $100 to account A and T2 debits $50 from account A. The initial
value of account A is 500. If the credit and the debit are both applied correctly, the final
value of the account should be 550. We run T1 and T2 concurrently as follows:
T1(credit)                    T2(debit)
read(A) {A=500}               read(A) {A=500}
A = A+100 {A=600}             A = A-50 {A=450}
write(A) {A=600}              write(A) {A=450}
Final value of A = 450. The credit of T1 is missing (lost update) from the account.
(2) Dirty read problem –
T1(credit)                    T2(debit)
read(A) {A=500}
A = A+100 {A=600}
write(A) {A=600}
                              read(A) {A=600}
                              A = A-50 {A=550}
                              write(A) {A=550}
T1 failed to commit
T1 modified A = 600 and T2 read A = 600, but T1 failed and its effect was removed from the
database. Therefore, A is restored to its old value, A = 500. A = 600 is a non-existent value,
but it was read by T2.
Locks
A lock allows a transaction to access a data item only if the transaction is currently holding a
lock on that item. There are two types of locks:
(1) Shared – If a transaction Ti has obtained a shared mode lock (S) on item Q, then Ti can
read, but cannot write Q.
(2) Exclusive – If a transaction Ti has obtained an exclusive mode lock(X) on item Q, then
Ti can both read and write Q.
S X
S ✔ ×
X × ×
Two phase locking protocol ensures serializability. This protocol requires that each transaction
issues lock and unlock requests in two phases:
(1) Growing phase – A transaction may obtain locks, but may not release any lock.
(2) Shrinking phase – A transaction may release locks, but may not obtain any new locks.
Transaction T3 and T4 are two phase. Transaction T1 and T2 are not in two phase.
Two phase locking does not ensure freedom from deadlock. It also suffers from cascading
rollback. Cascading rollback can be avoided by modification of two phase locking called strict
two phase locking protocol. This protocol requires that not only locking be two phase, but also
all exclusive mode locks, taken by a transaction be held until the transaction commit.
Variation of two phase locking is the rigorous two phase locking protocol, which requires that all
locks be held until the transaction commits.
Upgrading can take place in only the growing phase and downgrading can take place in only the
shrinking phase.
Deadlock handling
A system is in a deadlock state if there exists a set of transactions such that every transaction in
the set is waiting for another transaction in the set. There exists a set of waiting transactions {T0,
T1,….Tn} such that T0 is waiting for a data item that T1 holds, T1 is waiting for a data item that
T2 holds, Tn is waiting for a data item that T0 holds. None of the transactions can make progress
in this situation.
There are two methods for dealing with deadlocks. Deadlock prevention ensures that the system
will never enter a deadlock state. Alternatively, we can allow the system to enter a deadlock
state and then try to recover from it using a deadlock detection and recovery scheme.
(1)Deadlock Prevention –
The first approach ensures that no cyclic waits can occur: each transaction locks all data items
before it starts execution. However, it is often hard to predict, before the transaction begins,
what data items need to be locked, and data-item utilization may be very low, since many of the
data items may be locked but unused for a long time.
Second approach concerned with transaction rollback. There are two methods:
Wait-die scheme (non-preemptive) – When a transaction Ti requests a data item currently held
by Tj, Ti is allowed to wait only if it has a timestamp smaller than that of Tj (Ti is older);
otherwise Ti is rolled back.
Suppose transactions T2, T3, T4 have timestamps 5, 10 and 15. If T2 requests a data item held
by T3, then T2 will wait. If T4 requests a data item held by T3, then T4 will be rolled back.
Wound wait scheme (Preemptive) – When transaction Ti requests a data item currently held by
Tj, Ti is allowed to wait only if it has a timestamp larger than that of Tj, otherwise Tj is rolled
back.
If T2 requests a data item held by T3, then data item will be preempted from T3 and T3 will be
rolled back. If T4 requests a data item held by T3, then T4 will wait.
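Both rules can be sketched as decision functions over the example timestamps above; the return values are our own encoding of "wait" versus "roll back".

```python
# Sketch of the two timestamp-based prevention rules. Older = smaller
# timestamp. Each function says what happens when Ti requests an item
# currently held by Tj.
ts = {"T2": 5, "T3": 10, "T4": 15}

def wait_die(ti, tj):
    # Ti may wait only if it is older than Tj; otherwise Ti dies.
    return "wait" if ts[ti] < ts[tj] else f"rollback {ti}"

def wound_wait(ti, tj):
    # Ti may wait only if it is younger than Tj; otherwise Tj is wounded.
    return "wait" if ts[ti] > ts[tj] else f"rollback {tj}"

print(wait_die("T2", "T3"))    # wait
print(wait_die("T4", "T3"))    # rollback T4
print(wound_wait("T2", "T3"))  # rollback T3
print(wound_wait("T4", "T3"))  # wait
```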
(2)Deadlock Detection –
Deadlock can be described in terms of directed graph known as wait for graph.
Suppose transaction T25 is waiting for transactions T26 and T27, transaction T27 is waiting for
transaction T26, and transaction T26 is waiting for transaction T28.
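Deadlock detection then amounts to cycle detection in the wait-for graph; the sketch below encodes the example above as a dictionary and uses a standard depth-first search.

```python
# Sketch: deadlock exists iff the wait-for graph contains a cycle. The graph
# below encodes the example: T25 waits for T26 and T27, T27 waits for T26,
# and T26 waits for T28.
waits_for = {"T25": ["T26", "T27"], "T27": ["T26"], "T26": ["T28"], "T28": []}

def deadlocked(graph):
    visited, on_stack = set(), set()
    def dfs(node):
        visited.add(node)
        on_stack.add(node)
        for nxt in graph.get(node, []):
            if nxt in on_stack or (nxt not in visited and dfs(nxt)):
                return True
        on_stack.discard(node)
        return False
    return any(node not in visited and dfs(node) for node in graph)

print(deadlocked(waits_for))      # False -> no cycle, no deadlock
waits_for["T28"] = ["T25"]        # suppose T28 now waits for T25
print(deadlocked(waits_for))      # True  -> cycle T25->T26->T28->T25
```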
For recovery from any type of failure, data values prior to modification and the new values after
modification are required. These values and other information is stored in a sequential file called
transaction log. The log must reside in stable storage. Recovery techniques are as follow:
Log Database
T0: read(A) <T0, start>
A=A-50 <T0, A, 950>
write(A)
read(B)
B=B+50 <T0, B, 2050>
write(B) <T0, commit> A=950
B=2050
Suppose the system crashes; we consult the log file and find the log records shown in
fig. (a). Here the redo operation is not required, since no commit record appears in the log.
For log records as shown in fig. (b), redo(T0) is performed, since the record <T0, commit>
appears in the log on the disk.
For log records as shown in fig. (c), redo(T0) and redo(T1) are performed.
undo(Ti) – restores the value of all data items updated by transaction Ti to the old values.
redo(Ti) – sets the value of all data items updated by transaction Ti to the new values.
Log Database
<T0, start>
<T0, A, 1000, 950>
<T0, B, 2000, 2050>
A=950
B=2050
<T0, commit>
Suppose the system crashes; we consult the log file and find the log records shown in
fig. (a). Here a <T0, start> record is present, but no corresponding <T0, commit> record.
Therefore, undo(T0) is performed.
For log records as shown in fig. (b), redo(T0) and undo(T1) are performed.
For log records as shown in fig. (c), redo(T0) and redo(T1) are performed.
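The redo/undo decision described above can be sketched as follows; the log format is simplified here to (record_type, transaction) pairs, which is our own encoding.

```python
# Sketch of the recovery decision: a transaction with both <start> and
# <commit> records in the log is redone; one with <start> but no <commit>
# is undone. Log records are simplified to (record_type, txn) pairs.
log = [("start", "T0"), ("update", "T0"),
       ("commit", "T0"),
       ("start", "T1"), ("update", "T1")]   # crash happens here

def recovery_actions(log_records):
    started = {t for kind, t in log_records if kind == "start"}
    committed = {t for kind, t in log_records if kind == "commit"}
    redo = sorted(started & committed)
    undo = sorted(started - committed)
    return redo, undo

print(recovery_actions(log))   # (['T0'], ['T1'])
```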
Significance of Checkpoints
When a system failure occurs, we must consult to the log file to determine those transactions
need to be redone and those need to be undone. We need to search the entire log to determine
this information. The search process is time consuming. To reduce these overhead, checkpoints
was introduced.
Consider the set of transactions {T0, T1, …, T100} executed in that order. Suppose that the
most recent checkpoint took place during the execution of transaction T67. Then only
transactions T67, T68, …, T100 need to be considered during the recovery scheme.
Each of them needs to be redone if it has committed, otherwise it needs to be undone.
Unit 8
Database Security
Security Requirements
Encryption is the process of converting plaintext into cipher text (using encryption algorithm and
key). Decryption is the process of converting cipher text into plaintext (using decryption
algorithm and key).
Encryption is used to protect data in transit, for example, data being transferred via networks,
mobile telephones, wireless microphones, Bluetooth devices and so on. Encryption can protect
the confidentiality of a message. A substitution cipher simply exchanges one letter with another.
For example, encrypting the word "secret" with the alphabet shifted by 3 letters to the right
produces "vhfuhw"; this algorithm is called the "Caesar cipher".
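The Caesar cipher just described can be sketched as follows; the function name and the negative-shift decryption trick are ours.

```python
# Sketch of the Caesar cipher described above: each letter is shifted three
# places to the right in the alphabet (a negative shift decrypts).
def caesar(text, shift=3):
    out = []
    for ch in text:
        if ch.isalpha():
            base = ord("a") if ch.islower() else ord("A")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)

print(caesar("secret"))        # vhfuhw
print(caesar("vhfuhw", -3))    # secret
```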
In a multiple user environment, it is important that restrictions are placed in order to ensure that
people can only access what they need. MAC and DAC are two popular access control methods.
(1) The main difference between them is in how they provide access to users. MAC provides
access based on levels while DAC provides access based on identity. With MAC, admin
creates a set of levels and each user is linked with a specific level.
(2) A good example of a MAC is the access levels of windows for admin, ordinary users and
guests. For DAC, the permissions for Linux operating system is a good example.
(3) MAC is an easier way in establishing and maintaining access, especially when dealing
with a great number of users because you just need to establish a single level for each
resource. With DAC, you need to know each person who needs the resource.
(4) In DAC, access control list (ACL) is used, while in MAC, security labels are used.
(5) DAC is less secure compared to MAC.
Role-Based Access Control (RBAC)
• It is based on the concept that privileges and other permissions are associated with
organizational roles, rather than individual users. Individual users are then assigned to
appropriate roles.
• For example, an accountant in a company will be assigned to the accountant role, gaining
access to all the resources permitted for all accountants on the system. Similarly, a
software engineer might be assigned to the developer role.
• In an RBAC system, the roles are centrally managed by the administrator. The
administrator determines what roles exist within the company and then maps these
roles to job functions and tasks.
Intrusion Detection
An intrusion detection system (IDS) detects malicious activity in the database and notifies the
system administrator accordingly.
(1) Anomaly Detection Model - It analyzes a set of characteristics of the system and
compares their behavior with a set of expected values. It uses the assumption that
unexpected behavior is evidence of an intrusion.
(2) Misuse Detection Model – It determines whether a sequence of actions being executed
is known to violate the site’s security policy. If so, it reports a potential intrusion.
Features of IDS
SQL Injection
SQL injection is a type of an injection attack that makes it possible to execute malicious SQL
statements. These statements control a database server behind a web application.
Attackers can use SQL injection vulnerabilities to bypass application security measures. Consider
the following query:
SELECT id FROM user WHERE username='user1' AND password='password';
Here, the two input fields – one for the user name and another for the password – are vulnerable
to SQL injection. The attacker alters the SQL query to gain access to the database. For example,
entering ' OR '1'='1 in the password field turns the query into:
SELECT id FROM user WHERE username='user1' AND password='' OR '1'='1';
Because the evaluation of ‘1’ = ‘1’ is always true, every data field is selected from all users
rather than from one specific user name.
The main protections against SQL injection are validating every input field and using
parameterized (prepared) statements.
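The difference between a vulnerable query and a parameterized one can be demonstrated with Python's sqlite3 module. The table, column names and password value below are made up for the demo:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER, username TEXT, password TEXT)")
conn.execute("INSERT INTO user VALUES (1, 'user1', 'topsecret')")

# Attacker-supplied "password" that closes the quoted string and appends a tautology
evil = "' OR '1'='1"

# UNSAFE: building the query by string formatting lets the input become SQL
query = "SELECT id FROM user WHERE username='%s' AND password='%s'" % ("user1", evil)
print(conn.execute(query).fetchall())   # returns the row despite a wrong password

# SAFE: a parameterized query treats the input as a plain value, not as SQL
safe = conn.execute(
    "SELECT id FROM user WHERE username=? AND password=?", ("user1", evil)
).fetchall()
print(safe)   # [] -- no row matches, the injection fails
```

The unsafe query becomes `... password='' OR '1'='1'`, and since AND binds tighter than OR, the tautology makes the whole WHERE clause true for every row.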
Unit 10
PL/SQL Concepts
Cursors
• When an SQL statement is processed, Oracle creates a memory area known as the context
area. A cursor is a pointer to this context area.
• It contains all the information needed for processing the statement.
• In PL/SQL, the context area is controlled by the cursor.
• The cursor is used to fetch and process the rows returned by an SQL statement one at a time.
• There are two types of cursors: Implicit cursor and Explicit cursor
Implicit cursor
Whenever Oracle executes an SQL statement such as SELECT, INSERT INTO, UPDATE or
DELETE, it creates an implicit cursor. The implicit cursor is the default cursor in a PL/SQL
block. The implicit cursor’s attributes are as follows:
Explicit cursor
Explicit cursors are used when you are executing a SELECT statement that will return more
than one row. An explicit cursor is defined in the declaration section of the PL/SQL block.
(1) Declare the cursor: It defines the cursor with a name and the associated SELECT statement.
Syntax: CURSOR cursor_name IS select_statement;
For example,
CURSOR Mycursor
IS
SELECT * FROM Student;
(2) Open the cursor: It means allocating the memory for the cursor in the context area which
thereby makes it sufficient to fetch and store records in it.
Syntax: OPEN cursor-name;
(3) Fetch the cursor: It involves retrieval of data using the fetch statement.
Syntax: FETCH cursor_name INTO variable_list;
(4) Close the cursor: To release the allocated memory of the cursor.
Syntax: Close cursor_name;
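The declare/open/fetch/close cycle above is not unique to PL/SQL: most database APIs follow the same pattern. As an illustrative analogy (not Oracle syntax), here is the same loop using Python's sqlite3 DB-API cursors; the Student table and its rows are made up for the demo:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (rno INTEGER, name TEXT)")
conn.executemany("INSERT INTO Student VALUES (?, ?)",
                 [(101, "Ramesh"), (108, "Mahesh")])

cur = conn.cursor()                   # roughly: DECLARE the cursor
cur.execute("SELECT * FROM Student")  # roughly: OPEN it for the query
rows = []
while True:
    row = cur.fetchone()              # FETCH one row at a time
    if row is None:                   # no more rows to fetch
        break
    rows.append(row)
cur.close()                           # CLOSE releases the cursor
print(rows)   # [(101, 'Ramesh'), (108, 'Mahesh')]
```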
Database Triggers
Syntax
Data Integrity
Data in database must be correct and consistent.
So, data stored in database must satisfy certain types of constraints (rules).
DBMS provides different ways to implement such type of constraints (rules).
This improves data integrity in a database.
Data Security
Database should be accessible to users only in a limited way.
DBMS provides ways to control the access to data for different users according to their
requirements.
It prevents unauthorized access to data.
Thus, security can be improved.
Concurrent Access
Multiple users are allowed to access data simultaneously.
Concurrent access to centralized data can be allowed under some supervision.
This results in better performance of system and faster response.
Guaranteed Atomicity
Any operation on the database must be atomic. This means an operation must be executed
either completely (100%) or not at all (0%).
This type of atomicity is guaranteed in DBMS.
List and explain the applications of DBMS.
Airlines and railways
Airlines and railways use online databases for reservation, and for displaying the schedule
information.
Banking
Banks use databases for customer inquiry, accounts, loans, and other transactions.
Education
Schools and colleges use databases for course registration, result, and other information.
Telecommunications
Telecommunication departments use databases to store information about the
communication network, telephone numbers, record of calls, for generating monthly
bills, etc.
Credit card transactions
Databases are used for keeping track of purchases on credit cards in order to generate
monthly statements.
E-commerce
Integration of heterogeneous information sources (for example, catalogs) for business
activity such as online shopping, booking of holiday package, consulting a doctor, etc.
Health care information systems and electronic patient record
Databases are used for maintaining the patient health care details in hospitals.
Monitoring Performance
The DBA monitors performance of the system.
The DBA ensures that better performance is maintained by making change in physical or
logical schema if required.
Backup and Recovery
Database should not be lost or damaged.
The task of the DBA is to back up the database on storage devices such as DVD, CD,
magnetic tape or remote servers.
In case of failures, such as a flood or a virus attack, the database is recovered from this backup.
Explain three levels ANSI SPARC Database System. OR
Explain three level Data abstraction.
The ANSI SPARC architecture divided into three levels:
1) External level
2) Conceptual level
3) Internal level
[Figure: Three-level ANSI-SPARC architecture – External Level (View A, View B, View C),
Conceptual Level (Conceptual View), Internal Level (Internal View)]
Internal Level
This is the lowest level of the data abstraction.
It describes how the data are actually stored on storage devices.
It is also known as a physical level.
The internal view is described by internal schema.
Internal schema consists of the definition of stored records, the method of representing the
data fields and the access methods used.
Conceptual Level
This is the next higher level of the data abstraction.
[Figure: DBMS structure – application, DML compiler and organizer, program object code,
query evaluation engine (query processor), storage manager, disk storage]
[Figure: Architecture tiers – application client, application server, database system
(database tier)]
Degree of relationship
The degree of a relationship is the number of entity types that participate in the
relationship.
The three most common relationships in ER models are Unary, Binary and Ternary.
A unary relationship is when both participant entities in the relationship are the same
entity.
Example: Subjects may be prerequisites for other subjects.
[Figure: Unary relationship – subject ‘is prerequisite for’ subject]
One-to-one relationship
[Figure: one-to-one mapping between entity sets A and B]
An entity in A is associated with at most (only) one entity in B and an entity in B is
associated with at most (only) one entity in A.
One-to-many relationship
[Figure: one-to-many mapping between entity sets A and B]
An entity in A is associated with any number (zero or more) of entities in B and an entity
in B is associated with at most (only) one entity in A.
In the one-to-many relationship a loan is connected with only one customer using
borrower and a customer is connected with more than one loan using borrower.
Many-to-one relationship
[Figure: many-to-one mapping between entity sets A and B]
In a many-to-one relationship a loan is connected with more than one customer using
borrower and a customer is connected with only one loan using borrower.
Many-to-many relationship
[Figure: many-to-many mapping between entity sets A and B]
A customer is connected with more than one loan using borrower and a loan is
connected with more than one customer using borrower.
[Figure: A weak entity set linked to a strong entity set through an identifying relationship]
[Figure: ISA (specialization) hierarchy – person specialized into employee and customer,
with attributes such as officer-number, station-number and hours-worked]
Participation Constraint
Determines whether every member in super class must participate as a member of a
subclass or not.
It may be total (mandatory) or partial (optional).
1. Total (Mandatory)
Total specifies that every entity in the superclass must be a member of some subclass in
the specialization.
Specified by a double line in EER diagram.
2. Partial (Optional)
Partial specifies that an entity in the superclass need not belong to any subclass of the
specialization.
Specified by a single line in EER diagram.
Based on these two different kinds of constraints, a specialization or generalization can
be one of four types
Total, Disjoint
Total, Overlapping
Partial, Disjoint
Partial, Overlapping.
[Figure: Fig. A – employee (name, id) and project (number, hours) related by work, with
uses relating to machinery (id); Fig. B – the same, with work treated as an aggregated
higher-level entity]
Relationship sets work and uses could be combined into a single set. We can combine
them by using aggregation.
Aggregation is an abstraction through which relationships are treated as higher-level
entities.
For our example, we treat the relationship set work and the entity sets employee and
project as a higher-level entity set called work.
Transforming an E-R diagram with aggregation into tabular form is easy. We create a
table for each entity and relationship set as before.
The table for relationship set uses contains a column for each attribute in the primary
key of machinery and work.
[Figure: ER diagram – Person entity (personid) with attributes name, address, email and phone]
The initial relational schema is expressed in the following format, writing the table
names with the attributes list inside parentheses as shown below
Persons( personid, name, address, email )
Person
personid name address Email
Persons and Phones are Tables and personid, name, address and email are Columns
(Attributes).
personid is the primary key for the table : Person
[Figure: Person entity with multi-valued attribute phone]
If you have a multi-valued attribute, take that multi-valued attribute and turn it into a
new entity or table of its own.
Then make a 1:N relationship between the new entity and the existing one.
In simple words.
1. Create a table for that multi-valued attribute.
2. Add the primary (id) column of the parent entity as a foreign key within
the new table as shown below:
First table is Persons ( personid, name, address, email )
Second table is Phones ( phoneid , personid, phone )
personid within the table Phones is a foreign key referring to the personid of Persons
Phone
phoneid personid phone
[Figure: Person related to Wife through a one-to-one ‘have’ relationship]
Let us consider the case where the Person has one wife. You can place the primary key
of the wife table, wifeid, in the table Persons; in this case we call it a foreign key, as
shown below.
Persons( personid, name, address, email , wifeid )
Wife ( wifeid , name )
Or vice versa to put the personid as a foreign key within the wife table as shown below:
Persons( personid, name, address, email )
Wife ( wifeid , name , personid)
For cases when the Person is not married, i.e. has no wifeid, the attribute can be set to
NULL
Persons Wife
personid name address email wifeid wifeid name
OR
Persons Wife
personid name address email wifeid name personid
[Figure: Person related to House through a one-to-many ‘has’ relationship]
For instance, the Person can have a House from zero to many, but a House can have
only one Person.
In such a relationship, place the primary key attribute of the table on the ‘1’ side into the
table having many cardinality as a foreign key.
To represent such a relationship, the personid of the parent table must be placed within
the child table as a foreign key.
It should convert to :
Persons( personid, name, address, email )
House ( houseid, name , address, personid)
Persons House
personid name address email houseid name address personid
[Figure: Person related to Country through a many-to-many ‘has’ relationship]
HasRelat
hasrelatid personid countryid
1. Hierarchical Model
The hierarchical model organizes data into a tree-like structure, where each record has a
single parent or root.
[Figure: hierarchical model – department as parent of student and professor]
2. Network Model
This is an extension of the hierarchical model, allowing many-to-many relationships in a
tree-like structure that allows multiple parents.
[Figure: network model – record types linked so that a record can have multiple parents]
3. Entity-relationship Model
In this database model, relationships are created by modeling objects of interest as
entities and their characteristics as attributes.
4. Relational Model
In this model, data is organized in two-dimensional tables and the relationship is
maintained by storing a common attribute.
Explain keys.
Super key
A super key is a set of one or more attributes (columns) that allow us to identify each
tuple (records) uniquely in a relation (table).
For example, the enrollment_no, roll_no, semester with department_name of a student
is sufficient to distinguish one student tuple from another. So {enrollment_no} and
{roll_no, semester, department_name} both are super key.
Candidate key
Candidate key is a super key for which no proper subset is a super key.
For example, combination of roll_no, semester and department_name is sufficient to
distinguish one student tuple from another. But either roll_no or semester or
department_name alone or combination of any two columns is not sufficient to
distinguish one student tuple from another. So {roll_no, semester, department_name} is
candidate key.
Every candidate key is a super key, but every super key may not be a candidate key.
Primary key
A Primary key is a candidate key that is chosen by database designer to identify tuples
uniquely in a relation.
Alternate key
An Alternate key is a candidate key that is not chosen by database designer to identify
tuples uniquely in a relation.
Foreign key
A foreign key is a set of one or more attributes whose values are derived from the
primary key attribute of another relation.
What is relational algebra? Explain relational algebraic
operation.
Relational algebra is a language for expressing relational database queries.
Relational algebra is a procedural query language.
Relational algebraic operations are as follows:
Selection:-
Operation: Selects tuples from a relation that satisfy a given condition.
It is used to select particular tuples from a relation.
It selects particular tuples but all attributes from a relation.
Symbol: σ (Sigma)
Notation: σ(condition) <Relation>
Operators: The following operators can be used in a condition.
=, !=, <, >, <=,>=, Λ(AND), ∨(OR)
Student
Rno Name Dept CPI
101 Ramesh CE 8
108 Mahesh EC 6
109 Amit CE 7
125 Chetan CI 8
138 Mukesh ME 7
128 Reeta EC 6
133 Anita CE 9
Example: List out students of CE department.
σDept=“CE” (Student)
Output of above query is as follows
Student
Rno Name Dept CPI
101 Ramesh CE 8
109 Amit CE 7
133 Anita CE 9
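The selection operator can be sketched in Python by representing a relation as a list of tuples (dicts) and σ as a filter over them; the data is the Student table above:

```python
# Each tuple is a dict; the relation is a list of tuples.
student = [
    {"Rno": 101, "Name": "Ramesh", "Dept": "CE", "CPI": 8},
    {"Rno": 108, "Name": "Mahesh", "Dept": "EC", "CPI": 6},
    {"Rno": 109, "Name": "Amit",   "Dept": "CE", "CPI": 7},
    {"Rno": 125, "Name": "Chetan", "Dept": "CI", "CPI": 8},
    {"Rno": 138, "Name": "Mukesh", "Dept": "ME", "CPI": 7},
    {"Rno": 128, "Name": "Reeta",  "Dept": "EC", "CPI": 6},
    {"Rno": 133, "Name": "Anita",  "Dept": "CE", "CPI": 9},
]

def select(condition, relation):
    """sigma(condition)(relation): keep only the tuples satisfying the condition."""
    return [t for t in relation if condition(t)]

# sigma Dept='CE' (Student)
ce_students = select(lambda t: t["Dept"] == "CE", student)
print([t["Name"] for t in ce_students])   # ['Ramesh', 'Amit', 'Anita']
```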
Projection:-
Operation: Selects specified attributes of a relation.
It selects particular attributes but all unique tuples from a relation.
Symbol: ∏ (Pi)
Notation: ∏ (attribute set) <Relation>
Consider following table
Student
Rno Name Dept CPI
101 Ramesh CE 8
108 Mahesh EC 6
109 Amit CE 7
125 Chetan CI 8
138 Mukesh ME 7
128 Reeta EC 6
133 Anita CE 9
Example: List out all students with their roll no, name and department name.
∏Rno, Name, Dept (Student)
Firoz A. Sherasiya, CE Department | 3130703 – Database Management System (DBMS) 2
3 – Relational Query Language
Output: The above query returns all tuples with three attributes roll no, name and
department name.
Output of above query is as follows
Student
Rno Name Dept
101 Ramesh CE
108 Mahesh EC
109 Amit CE
125 Chetan CI
138 Mukesh ME
128 Reeta EC
133 Anita CE
Example: List out students of CE department with their roll no, name and department.
∏Rno, Name, Dept (σDept=“CE” (Student))
Output: The above query returns tuples which contain CE as department with three
attributes roll no, name and department name.
Output of above query is as follows
Student
Rno Name Dept
101 Ramesh CE
109 Amit CE
133 Anita CE
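Projection, and its composition with selection as in the query above, can be sketched the same way; note that projection drops duplicate tuples:

```python
student = [
    {"Rno": 101, "Name": "Ramesh", "Dept": "CE", "CPI": 8},
    {"Rno": 108, "Name": "Mahesh", "Dept": "EC", "CPI": 6},
    {"Rno": 109, "Name": "Amit",   "Dept": "CE", "CPI": 7},
    {"Rno": 125, "Name": "Chetan", "Dept": "CI", "CPI": 8},
    {"Rno": 138, "Name": "Mukesh", "Dept": "ME", "CPI": 7},
    {"Rno": 128, "Name": "Reeta",  "Dept": "EC", "CPI": 6},
    {"Rno": 133, "Name": "Anita",  "Dept": "CE", "CPI": 9},
]

def project(attributes, relation):
    """pi(attributes)(relation): keep only the named attributes, removing duplicates."""
    seen, result = set(), []
    for t in relation:
        reduced = tuple(t[a] for a in attributes)
        if reduced not in seen:          # projection removes duplicate tuples
            seen.add(reduced)
            result.append(reduced)
    return result

# pi Rno,Name,Dept ( sigma Dept='CE' (Student) )
ce = [t for t in student if t["Dept"] == "CE"]
print(project(("Rno", "Name", "Dept"), ce))
# [(101, 'Ramesh', 'CE'), (109, 'Amit', 'CE'), (133, 'Anita', 'CE')]
```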
Division:-
Operation: The division is a binary operation that is written as R1 ÷ R2.
Condition to perform the operation: The attributes of R2 must be a proper subset of the
attributes of R1.
The output of the division operator will have attributes =
All attributes of R1 – All attributes of R2
The output of the division operator will have tuples =
Tuples in R1 that are associated with all the tuples of R2
Symbol: ÷
Notation: R1 ÷ R2
Consider following table
Work
Student Task
Shah Database1
Shah Database2
Shah Compiler1
Vyas Database1
Vyas Compiler1
Patel Database1
Patel Database2
Project
Task
Database1
Database2
Work ÷ Project gives the students associated with all the tasks of Project:
Student
Shah
Patel
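Division can be sketched in Python with sets, using the Work and Project data above: a student qualifies only if the pair (student, task) appears in Work for every task of Project.

```python
# Work(Student, Task) and Project(Task) from the tables above
work = {("Shah", "Database1"), ("Shah", "Database2"), ("Shah", "Compiler1"),
        ("Vyas", "Database1"), ("Vyas", "Compiler1"),
        ("Patel", "Database1"), ("Patel", "Database2")}
project_tasks = {"Database1", "Database2"}

def divide(r1, r2):
    """R1 / R2: values that are associated in R1 with every tuple of R2."""
    candidates = {s for (s, _) in r1}
    return {s for s in candidates
            if all((s, t) in r1 for t in r2)}

print(divide(work, project_tasks))   # {'Shah', 'Patel'} (Vyas lacks Database2)
```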
Cartesian product:-
Operation: Combines information of two relations.
It will combine each tuple of the first relation with each tuple of the second relation.
It is also known as Cross product operation and similar to mathematical
Cartesian product operation.
Symbol: X (Cross)
Notation: Relation1 X Relation2
Resultant Relation :
If relation1 and relation2 have n1 and n2 attributes respectively, then resultant
relation will have n1 + n2 attributes from both the input relations.
If both relations have some attribute with the same name, it can be distinguished
by writing relation-name.attribute-name.
If relation1 and relation2 have n1 and n2 tuples respectively, then resultant
relation will have n1*n2 tuples, combining each possible pair of tuples from both
the input relations.
R
A 1
B 2
D 3
S
A 1
D 2
E 3
R×S
A 1 A 1
A 1 D 2
A 1 E 3
B 2 A 1
B 2 D 2
B 2 E 3
D 3 A 1
D 3 D 2
D 3 E 3
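The Cartesian product above can be sketched with itertools.product, confirming the n1 × n2 tuple count:

```python
from itertools import product

R = [("A", 1), ("B", 2), ("D", 3)]
S = [("A", 1), ("D", 2), ("E", 3)]

# R x S pairs every tuple of R with every tuple of S: n1 * n2 tuples in total
cross = [r + s for r, s in product(R, S)]
print(len(cross))    # 9
print(cross[0])      # ('A', 1, 'A', 1)
print(cross[-1])     # ('D', 3, 'E', 3)
```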
Consider following table
Emp Dept
Empid Empname Deptname Deptname Manager
S01 Manisha Finance Finance Arun
S02 Anisha Sales Sales Rohit
S03 Nisha Finance Production Kishan
Join:-
Natural Join Operation (⋈)
Operation: Natural join will retrieve information from multiple relations. It works in
three steps.
1. It performs Cartesian product
2. Then it finds consistent tuples and inconsistent tuples are deleted
3. Then it deletes duplicate attributes
Symbol: ⋈
Notation: Relation1 ⋈ Relation2
To perform a natural join there must be one common attribute (column) between the two
relations.
Consider following table
Emp
Empid Empname Deptname
S01 Manisha Finance
S02 Anisha Sales
S03 Nisha Finance
Dept
Deptname Manager
Finance Arun
Sales Rohit
Production Kishan
Example:
Emp ⋈ Dept
Empid Empname Deptname Manager
S01 Manisha Finance Arun
S02 Anisha Sales Rohit
S03 Nisha Finance Arun
∏Empname, Manager (Emp ⋈ Dept)
Empname Manager
Manisha Arun
Anisha Rohit
Nisha Arun
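The three steps of the natural join (product, keep consistent pairs, drop the duplicate column) can be sketched over the Emp and Dept data:

```python
emp = [
    {"Empid": "S01", "Empname": "Manisha", "Deptname": "Finance"},
    {"Empid": "S02", "Empname": "Anisha",  "Deptname": "Sales"},
    {"Empid": "S03", "Empname": "Nisha",   "Deptname": "Finance"},
]
dept = [
    {"Deptname": "Finance",    "Manager": "Arun"},
    {"Deptname": "Sales",      "Manager": "Rohit"},
    {"Deptname": "Production", "Manager": "Kishan"},
]

def natural_join(r1, r2):
    """Pair tuples that agree on all shared attributes; merging the dicts
    keeps only one copy of each common attribute."""
    common = set(r1[0]) & set(r2[0])
    return [{**t1, **t2} for t1 in r1 for t2 in r2
            if all(t1[a] == t2[a] for a in common)]

joined = natural_join(emp, dept)
for row in joined:
    print(row["Empname"], row["Manager"])
# Manisha Arun / Anisha Rohit / Nisha Arun
```

Note that the Production department joins with no employee, so it simply drops out of the result.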
Right outer join (College ⟖ Hostel):
Name Id Department Hostel_name Room_no
Anisha S02 Computer Kaveri hostel K01
Nisha S03 I.T. Godavari hostel G07
Isha Null Null Kaveri hostel K02
Full outer join (College ⟗ Hostel):
Name Id Department Hostel_name Room_no
Manisha S01 Computer Null Null
Anisha S02 Computer Kaveri hostel K01
Nisha S03 I.T. Godavari hostel G07
Isha Null Null Kaveri hostel K02
Set Operators
Set operators combine the results of two or more queries into a single result.
Condition to perform set operation:
Both relations (queries) must be union compatible :
Relations R and S are union compatible, if
Both queries should have same (equal) number of columns, and
Corresponding attributes should have the same data type.
Types of set operators:
1. Union
2. Intersect (Intersection)
3. Minus (Set Difference)
Union
Operation: Selects tuples that are in either or both of the relations.
Symbol : U (Union)
Notation : Relation1 U Relation2
Example :
R
A 1
B 2
D 3
F 4
E 5
S
A 1
C 2
D 3
E 4
R∪S
A 1
B 2
C 2
D 3
F 4
E 5
E 4
Intersection
Operation: Selects tuples that are common to both relations.
Symbol : ∩ (Intersection)
Notation : Relation1 ∩ Relation2
Example
R
A 1
B 2
D 3
F 4
E 5
S
A 1
C 2
D 3
E 4
R∩S
A 1
D 3
Difference:-
Operation: Selects tuples that are in the first (left) relation but not in the second (right)
relation.
Symbol : — (Minus)
Notation : Relation1 — Relation2
Example :
R
A 1
B 2
D 3
F 4
E 5
S
A 1
C 2
D 3
E 4
R—S
B 2
F 4
E 5
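Because union-compatible relations behave like sets of tuples, the three set operators map directly onto Python's set operators, using the R and S data above:

```python
R = {("A", 1), ("B", 2), ("D", 3), ("F", 4), ("E", 5)}
S = {("A", 1), ("C", 2), ("D", 3), ("E", 4)}

print(R | S)   # union: tuples in either or both relations (7 tuples)
print(R & S)   # intersection: {('A', 1), ('D', 3)}
print(R - S)   # difference: {('B', 2), ('F', 4), ('E', 5)}
```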
Rename:-
Operation: It is used to rename a relation or attributes.
Symbol: ρ (Rho)
Notation: ρA(B) Rename relation B to A.
ρA(X1,X2….Xn)(B) Rename relation B to A and its attributes to X1, X2, …., Xn.
Student
Rno Name Dept CPI
101 Ramesh CE 8
108 Mahesh EC 6
109 Amit CE 7
125 Chetan CI 8
138 Mukesh ME 7
128 Reeta EC 6
133 Anita CE 9
CPI
9
Aggregate Function:-
Operation: It takes a collection of values as input and returns a single value as output
(result).
Symbol: G
Notation: G function (attribute) (relation)
Aggregate functions: Sum, Count, Max, Min, Avg.
Consider following table
Student
Rno Name Dept CPI
101 Ramesh CE 8
108 Mahesh EC 6
109 Amit CE 7
125 Chetan CI 8
138 Mukesh ME 7
128 Reeta EC 6
133 Anita CE 9
max(CPI) min(CPI)
9 6
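Aggregate functions collapse a whole column into one value. A sketch over the CPI column of the Student table above:

```python
student_cpi = [8, 6, 7, 8, 7, 6, 9]   # CPI column of the Student relation

# G max(CPI) and G min(CPI) return a single value for the whole column
print(max(student_cpi), min(student_cpi))    # 9 6
print(sum(student_cpi))                      # G sum(CPI)
print(len(student_cpi))                      # G count(CPI)
```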
(ii) Find out all customer who have an account in ‘Ahmedabad’ city and balance
is greater than 10,000.
∏customer_name (σBranch.branch_city=“Ahmedabad” Λ Account.balance>10000 (Branch ⋈ Account ⋈ Depositor))
(iii) Find out list of all branch names with their maximum balance.
∏branch_name , G max (balance) (Account)
Example
Consider the relation Account(ano, balance, bname).
In this relation ano determines balance and bname. So, there is a functional
dependency from ano to balance and bname.
This can be denoted by ano → {balance, bname}.
Account:
ano balance bname
Example-1
Suppose a relation R is given with attributes A, B, C, G, H and I.
Also, a set of functional dependencies F is given with following FDs.
F = {A → B, A → C, CG → H, CG → I, B → H}
Find Closure of F.
F+ = { A → H, CG → HI, AG → I, AG → H }
Example-2
Compute the closure of the following set F of functional dependencies for relational schema
R = (A, B, C, D, E, F ):
F = (A → B, A → C, CD → E, CD → F, B → E).
F+ = { A → BC, CD → EF, A → E, AD → E, AD → F }
Example-3
Compute the closure of the following set F of functional dependencies for relational schema
R = (A, B, C, D, E ):
F = (AB → C, D → AC, D → E).
F+ = { D → A, D → C, D → ACE }
result before step2 is AB and after step 2 is ABCE which is different so repeat same as
step 2.
Step-3: Second loop
result = ABCE    # for A → BC, A ⊆ result so result = result ∪ BC
result = ABCEF   # for E → CF, E ⊆ result so result = result ∪ CF
result = ABCEF   # for B → E, B ⊆ result so result = result ∪ E
result = ABCEF   # for CD → EF, CD ⊄ result so result is unchanged
result before step 3 is ABCE and after step 3 is ABCEF which is different so repeat same
as step 3.
Step-4: Third loop
result = ABCEF   # for A → BC, A ⊆ result so result = result ∪ BC
result = ABCEF   # for E → CF, E ⊆ result so result = result ∪ CF
result = ABCEF   # for B → E, B ⊆ result so result = result ∪ E
result = ABCEF   # for CD → EF, CD ⊄ result so result is unchanged
result before step 4 is ABCEF and after step 4 is ABCEF which is same so stop.
So Closure of {A, B}+ is {A, B, C, E, F}.
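The loop traced above is easy to implement directly. A Python sketch of the attribute-closure algorithm, run on the same FDs (A → BC, E → CF, B → E, CD → EF) and the same starting set {A, B}:

```python
def attribute_closure(attrs, fds):
    """Repeatedly add the right side of every FD whose left side is already
    contained in the result, until a full pass changes nothing."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            if set(left) <= result and not set(right) <= result:
                result |= set(right)
                changed = True
    return result

# Each FD is (left side, right side), both written as attribute strings
fds = [("A", "BC"), ("E", "CF"), ("B", "E"), ("CD", "EF")]
print(sorted(attribute_closure("AB", fds)))   # ['A', 'B', 'C', 'E', 'F']
```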
Account_Branch
Ano Balance Bname Baddress
A01 5000 Vvn Mota bazaar, VVNagar
A02 6000 Ksad Chhota bazaar, Karamsad
A03 7000 Anand Nana bazaar, Anand
A04 8000 Ksad Chhota bazaar, Karamsad
A05 6000 Vvn Mota bazaar, VVNagar
This relation can be divided with two different relations
1. Account (Ano, Balance, Bname)
2. Branch (Bname, Baddress)
These two relations are shown in below figure
Account
Ano Balance Bname
A01 5000 Vvn
A02 6000 Ksad
A03 7000 Anand
A04 8000 Ksad
A05 6000 Vvn
Branch
Bname Baddress
Vvn Mota bazaar, VVNagar
Ksad Chhota bazaar, Karamsad
Anand Nana bazaar, Anand
Example
A figure shows a relation Account. This relation is decomposed into two relations
Acc_Bal and Bal_Branch.
Now, when these two relations are joined on the common attributeBalance, the
resultant relation will look like Acct_Joined. This Acct_Joined relation contains rows in
addition to those in original relation Account.
Here, it is not possible to specify that in which branch account A01 or A02 belongs.
So, information has been lost by this decomposition and then join operation.
Account
Ano Balance Bname
A01 5000 Vvn
A02 5000 Ksad
Acct_Bal
Ano Balance
A01 5000
A02 5000
Bal_Branch
Balance Bname
5000 Vvn
5000 Ksad
Acct_Joined
Ano Balance Bname
A01 5000 Vvn
A01 5000 Ksad
A02 5000 Vvn
A02 5000 Ksad
Acct_Joined is not the same as the original Account relation.
In other words, a decomposition of R into R1 and R2 is lossy if, when we combine (join)
R1 and R2 again over the common attribute X, we cannot get the original relation back,
i.e. R ≠ R1 ⋈ R2. If joining R1 and R2 gives back exactly R, the decomposition is
lossless.
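The lossy decomposition above can be reproduced with sets, using the Account data from the example: joining back on the non-key attribute Balance manufactures two spurious tuples.

```python
# Original relation and its (bad) decomposition on the non-key attribute Balance
account = {("A01", 5000, "Vvn"), ("A02", 5000, "Ksad")}
acct_bal = {(a, b) for (a, b, _) in account}        # Acct_Bal(Ano, Balance)
bal_branch = {(b, br) for (_, b, br) in account}    # Bal_Branch(Balance, Bname)

# Natural join of the two pieces on the common attribute Balance
joined = {(a, b1, br) for (a, b1) in acct_bal
                      for (b2, br) in bal_branch if b1 == b2}

print(joined == account)   # False -- extra tuples appeared
print(len(joined))         # 4: the decomposition is lossy
```

Had the decomposition been on the key Ano (or any attribute that functionally determines the rest), the join would have returned exactly the original two tuples.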
Above relation has four attributes Cid, Name, Address, Contact_no. Here address is
composite attribute which is further divided in to sub attributes as Society and City.
Another attribute TypeofAccountHold is a multi-valued attribute which can store more
than one value. So the above relation is not in 1NF.
Problem
Suppose we want to find all customers for some particular city then it is difficult to
retrieve. Reason is city name is combined with society name and stored whole as
address.
Solution
Divide composite attributes into a number of sub-attributes and insert values into the
proper sub-attributes. AND
Split the table into two tables in such a way that
o first table contains all attributes except multi-valued attribute and
o other table contains multi-valued attribute and
o insert primary key of first table in second table as a foreign key.
So above table can be created as follows.
2NF
A relation R is in second normal form (2NF) if and only if it is in 1NF and every non-key
attribute is fully dependent on the primary key. OR
A relation R is in second normal form (2NF) if and only if it is in 1NF and no any non-key
attribute is partially dependent on the primary key.
Example
cid ano acess_date balance bname
Above relation has five attributes cid, ano, acess_date, balance, bname and two FDs
FD1: {cid, ano} → {acess_date, balance, bname} and
FD2: ano → {balance, bname}
We have {cid, ano} as the primary key. As per FD2, balance and bname depend only on
ano, not on cid. In the above table balance and bname are not fully dependent on the
primary key but are partially dependent on it. So the above relation is not in
2NF.
Problem
For example, in case of a joint account multiple customers share a common account. If
some account, say ‘A02’, is held jointly by two customers, say ‘C02’ and ‘C04’, then the
data values for attributes balance and bname will be duplicated in the two different
tuples of customers ‘C02’ and ‘C04’.
Solution
Decompose the relation in such a way that the resultant relations do not have any partial FD.
For this purpose remove the partially dependent attributes that violate 2NF from the relation.
Place them in a separate new relation along with the prime attribute on which they are
fully dependent.
The primary key of the new relation will be the attribute on which they are fully dependent.
Keep the other attributes in the same table with the same primary key.
So above table can be decomposed as per following.
3NF
A relation R is in third normal form (3NF) if and only if it is in 2NF and every non-key
attribute is non-transitively dependent on the primary key.
An attribute C is transitively dependent on attribute A if there exists an attribute B such
that: A → B and B → C.
Example
ano balance bname baddress
Above relation has four attributes ano, balance, bname, baddress and two FDs
FD1: ano → {balance, bname, baddress} and
FD2: bname → baddress
So from FD1 and FD2, using the transitivity rule, we get ano → baddress.
So there is a transitive dependency from ano to baddress via bname, in which
baddress is a non-prime attribute.
So there is a non-prime attribute baddress which is transitively dependent on the primary
key ano.
So above relation is not in 3NF.
Problem
Transitive dependency results in data redundancy.
In this relation the branch address will be stored repeatedly for each account of the same
branch, which occupies more space.
Solution
Decompose the relation in such a way that the resultant relations do not have any non-prime
attribute that is transitively dependent on the primary key.
For this purpose remove the transitively dependent attributes that violate 3NF from the relation.
Place them in a separate new relation along with the non-prime attribute due to which
the transitive dependency occurred. The primary key of the new relation will be this non-prime
attribute.
Keep the other attributes in the same table with the same primary key.
So above table can be decomposed as per following.
bname baddress
BCNF
A relation R is in BCNF if and only if it is in 3NF and no any prime attribute is transitively
dependent on the primary key. OR
A relation R is in BCNF if and only if it is in 3NF and for every functional dependency X →
Y, X should be the super key of the table OR
A relation R is in BCNF if and only if it is in 3NF and for every functional dependency X → Y, X
should be the primary key of the table.
An attribute C is transitively dependent on attribute A if there exists an attribute B such
that A → B and B → C.
Example
Student_Project
Student Language Guide
Mita JAVA Patel
Nita VB Shah
Sita JAVA Jadeja
Gita VB Dave
Rita VB Shah
Nita JAVA Patel
Mita VB Dave
Rita JAVA Jadeja
Above relation has three attributes student, language, guide and two FDs
FD1: {student, language} → guide and
FD2: guide → language
So from FD1 and FD2, using the transitivity rule, we get student → language.
So there is a transitive dependency from student to language, in which language is a prime
attribute.
So there is a prime attribute language which is transitively dependent on the primary key
student.
So above relation is not in BCNF.
Problem
Transitive dependency results in data redundancy.
In this relation, if one student has more than one project with different guides, then
records will be stored repeatedly for each student, language and guide
combination, which occupies more space.
Solution
Decompose the relation in such a way that the resultant relations do not have any prime
attribute transitively dependent on the primary key.
For this purpose remove the transitively dependent prime attribute that violates BCNF from
the relation. Place it in a separate new relation along with the non-prime attribute due to
which the transitive dependency occurred. The primary key of the new relation will be this
non-prime attribute.
So above table can be decomposed as per following.
4NF
A table is in the 4NF if it is in BCNF and has no non-trivial multi-valued dependencies.
Example
The multi-valued dependency X →→ Y holds in a relation R if, for a single value of X,
multiple (more than one) values of Y exist.
Suppose a student can have more than one subject and more than one activity.
Student_Info
Student_Id Subject Activity
100 Music Swimming
100 Accounting Swimming
100 Music Tennis
100 Accounting Tennis
150 Math Jogging
Note that all three attributes make up the Primary Key.
Note that Student_Id can be associated with many subject as well as many activities
(multi-valued dependency).
Suppose student 100 signs up for skiing. Then we would insert (100, Music, Skiing). This
row implies that student 100 skis while studying Music but not while studying Accounting,
so in order to keep the data consistent we must add one more row (100, Accounting,
Skiing). This is an insertion anomaly.
Suppose we have a relation R(A) with a multi-valued dependency X →→ Y. The MVD can
be removed by decomposing R into R1(R − Y) and R2(X ∪ Y).
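That decomposition rule can be sketched on the Student_Info data above: splitting on the MVD Student_Id →→ Subject gives one relation for subjects and one for activities, and adding an activity then needs only a single row.

```python
# Student_Info with independent multi-valued facts Subject and Activity
student_info = {
    (100, "Music", "Swimming"), (100, "Accounting", "Swimming"),
    (100, "Music", "Tennis"),   (100, "Accounting", "Tennis"),
    (150, "Math", "Jogging"),
}

# Decompose on Student_Id ->-> Subject:
# R1 = (Student_Id, Subject), R2 = (Student_Id, Activity)
subjects   = {(sid, sub) for (sid, sub, _) in student_info}
activities = {(sid, act) for (sid, _, act) in student_info}

# Student 100 takes up skiing: one new row, no insertion anomaly
activities.add((100, "Skiing"))
print(len(subjects), len(activities))   # 3 4
```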
Here are the tables Normalized
RollNo StudentName
101 Raj
102 Meet
103 Suresh
SubjectID SubjectName
1 DBMS
2 DS
3 DE
We cannot decompose any of the above three tables into sub-tables, so the above three
tables are in 5NF.
1NF
Employee Number, Employee Name, Date of Birth, Department Code, Department
Name
Employee Number, Project Code, Project Description, Project Supervisor
2NF
Employee Number, Employee Name, Date of Birth, Department Code, Department
Name
Employee Number, Project Code,
Project Code, Project Description, Project Supervisor
3NF
Employee Number, Employee Name, Date of Birth, Department Code
Department Code, Department Name
Employee Number, Project Code
Project Code, Project Description, Project Supervisor
[Figure: Query processing – the execution plan is passed to the query code generator,
which produces the result of the query]
[Figure: Query evaluation tree – ∏customer-name over the join of customer with
σbalance<2500 (account)]
In our example, there is only one such operation: the selection operation on account.
The inputs to the lowest level operation are relations in the database.
We execute these operations and we store the results in temporary relations.
We can use these temporary relations to execute the operation at the next level up in
the tree, where the inputs now are either temporary relations or relations stored in the
database.
In our example the inputs to join are the customer relation and the temporary relation
created by the selection on account.
The join can now be evaluated, creating another temporary relation.
By repeating the process, we will finally evaluate the operation at the root of the tree,
giving the final result of the expression.
In our example, we get the final result by executing the projection operation at the root
of the tree, using as input the temporary relation created by the join. Evaluation just
described is called materialized evaluation, since the results of each intermediate
operation are created and then are used for evaluation of the next level operations.
The cost of a materialized evaluation is not simply the sum of the costs of the operations
involved: to compute the cost of evaluating an expression we must add the costs of all the
operations as well as the cost of writing the intermediate results to disk.
The disadvantage of this method is that it creates temporary relations (tables), which are
stored on disk and consume disk space.
It evaluates one operation at a time, starting at the lowest level.
Pipelining
We can reduce the number of temporary files that are produced by combining several
relational operations into a pipeline of operations, in which the result of one operation is
passed along to the next operation in the pipeline. Combining operations into a pipeline
eliminates the cost of reading and writing temporary relations.
In this method, several expressions are evaluated simultaneously in a pipeline, by passing
the result of one operation to the next without storing it in a temporary relation.
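The two strategies can be sketched in Python over an assumed toy account table (the account numbers, branches and balances are made up); the pipelined version uses a generator, so each qualifying row flows straight to the next operator without a temporary relation:

```python
# Toy 'account' rows: (account_no, branch_city, balance) -- assumed data.
account = [("A-101", "pune", 2000), ("A-102", "mumbai", 3000),
           ("A-103", "pune", 1500)]

# Materialized evaluation: the selection writes a full temporary result,
# which the projection then reads back.
temp1 = [row for row in account if row[2] < 2500]   # temporary relation
materialized = [row[0] for row in temp1]            # projection over temp1

# Pipelined evaluation: the selection is a generator; each qualifying
# row is passed directly to the projection, no temporary relation.
def select_low_balance(rows):
    for row in rows:
        if row[2] < 2500:
            yield row

pipelined = [row[0] for row in select_low_balance(account)]
```

Both strategies produce the same answer; they differ only in whether the intermediate result is materialized.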
[Figure: pipelined evaluation — the selection σbranch-city='pune' is applied to account, and its output is passed directly into the join with depositor without being materialized.]
Structure of an index: Search-key | Pointer
The first column is the Search key that contains a copy of the primary key or candidate
key of the table. These values are stored in sorted order so that the corresponding data
can be accessed quickly.
The second column is the Data Reference or Pointer which contains a set of pointers
holding the address of the disk block where that particular key value can be found.
Explain different attributes of Indexing.
The indexing has various attributes:
Access Types: This refers to the type of access such as value based search, range access,
etc.
Access Time: It refers to the time needed to find a particular data element or set of
elements.
Insertion Time: It refers to the time taken to find the appropriate space and insert new
data.
Firoz A. Sherasiya, CE Department | 3130703 – Database Management System (DBMS) 1
6 – Storage Strategies
Deletion Time: Time taken to find an item and delete it as well as update the index
structure.
Space Overhead: It refers to the additional space required by the index.
Explain different Indexing Methods (Types).
Different indexing methods are:
Primary Indexing
Dense Indexing
Sparse Indexing
Secondary Indexing
Clustering Indexing
Primary Indexing
If the index is created on the primary key of the table, then it is known as primary index.
These primary keys are unique to each record.
As primary keys are stored in sorted order, the performance of the searching operation is
quite efficient.
Student (RollNo, Name, Address, City, MobileNo) [RollNo is primary key]
CREATE INDEX idx_StudentRno
ON Student (RollNo);
The primary index can be classified into two types:
Dense index
Sparse index
Dense Index
In a dense index, an index record is created for every search-key value in the data file.
Sparse Index
In a sparse index, index records are not created for every search key.
The index record appears only for a few items in the data file.
It requires less space, less maintenance overhead for insertion, and deletions but is slower
compared to the dense index for locating records.
To search a record using a sparse index, we find the index entry with the largest search-key
value that is less than or equal to the value we are looking for.
Starting from the record pointed to by that entry, a linear search is performed to retrieve
the desired record.
In the sparse indexing, as the size of the main table grows, the size of index table also
grows.
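A sparse-index lookup can be sketched as follows; the block layout and key values are assumed for illustration:

```python
import bisect

# Sorted data file split into blocks; the sparse index keeps only the
# first key of each block.
blocks = [[3, 7, 9], [12, 15, 18], [21, 25, 30]]
index = [blk[0] for blk in blocks]          # sparse index: 3, 12, 21

def sparse_search(key):
    # Find the largest index entry <= key, then scan that block linearly.
    i = bisect.bisect_right(index, key) - 1
    if i < 0:
        return False                        # key smaller than every entry
    return key in blocks[i]                 # linear scan within one block
```

Only one block is scanned per lookup, which is why the index can afford to skip most keys.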
Secondary Index
Clustering Index
Sometimes the index is created on non-primary key columns which may not be unique
for each record.
In this case, to identify the record faster, we will group two or more columns to get the
unique value and create index out of them. This method is called a clustering index.
The records which have similar characteristics are grouped, and indexes are created for
these group.
Explain B-tree.
B-tree is a data structure that stores data in its nodes in sorted order.
We can represent sample B-tree as follows.
[Figure: a sample B-tree — the root node holds key 11; its left child node holds keys 3, 6 and its right child node holds keys 16, 20; the children are leaf nodes.]
B-tree stores data in such a way that each node contains keys in ascending order.
Each of these keys has two references to another two child nodes.
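As a sketch, the sample tree above can be modelled as nested (keys, children) pairs; an in-order walk then visits the keys in ascending order (this node layout is illustrative, not a full B-tree implementation):

```python
# A node is (keys, children); a leaf has an empty children list.
def inorder(node):
    keys, children = node
    out = []
    for i, key in enumerate(keys):
        if children:
            out += inorder(children[i])   # subtree left of this key
        out.append(key)
    if children:
        out += inorder(children[-1])      # rightmost subtree
    return out

# Root key 11, left child [3, 6], right child [16, 20].
tree = ([11], [([3, 6], []), ([16, 20], [])])
sorted_keys = inorder(tree)
```

The traversal yields the keys in ascending order, which is the property the text describes.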
Dynamic hashing
The drawback of static hashing is that it does not expand or shrink dynamically as the
size of the database grows or shrinks.
In dynamic hashing, data buckets grow or shrink (are added or removed dynamically) as
the number of records increases or decreases.
Dynamic hashing is also known as extended hashing.
In dynamic hashing, the hash function is made to produce a large number of values.
For Example, there are three data records D1, D2 and D3 .
The hash function generates three addresses 1001, 0101 and 1010 respectively.
This method of storing considers only part of this address – initially only the first bit – to
store the data.
So it tries to load all three of them at addresses 0 and 1.
[Figure: two buckets addressed by the first bit — bucket 0 holds D2 (0101); bucket 1 holds D1 (1001), and D3 (1010) also maps to bucket 1, which is already full.]
But the problem is that no bucket address is remaining for D3.
The bucket has to grow dynamically to accommodate D3.
So it changes the addresses to have 2 bits rather than 1 bit, and then it updates the
existing data to have 2-bit addresses.
Then it tries to accommodate D3.
[Figure: after doubling, four buckets addressed by the first two bits — 00, 01, 10, 11 — and D1, D2 and D3 are redistributed among them according to the first two bits of their hash addresses.]
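A minimal sketch of this growth process, assuming a bucket capacity of one record. Note that with these particular addresses, D1 (1001) and D3 (1010) also share their first two bits, so the sketch keeps doubling until three bits separate all the keys:

```python
BUCKET_SIZE = 1   # assumed capacity: one record per bucket

def insert_all(records, depth=1):
    """Place records into buckets keyed by the first `depth` bits;
    on overflow, double the directory (depth + 1) and rehash."""
    buckets = {}
    for name, addr in records:
        key = addr[:depth]
        buckets.setdefault(key, []).append(name)
        if len(buckets[key]) > BUCKET_SIZE:
            return insert_all(records, depth + 1)   # grow and retry
    return depth, buckets

# The three records from the text, with their hash addresses.
records = [("D1", "1001"), ("D2", "0101"), ("D3", "1010")]
depth, buckets = insert_all(records)
```

Each overflow doubles the address space, which is exactly the dynamic growth static hashing lacks.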
[Figure: transaction state diagram — Active → Partially Committed → Committed, and Active → Failed → Aborted.]
T1:
read(A)
A := A − 50
write(A)
read(B)
B := B + 50
write(B)
T2:
read(A)
temp := A * 0.1
A := A − temp
write(A)
read(B)
B := B + temp
write(B)
Serial schedule
Schedule that does not interleave the actions of different transactions.
In schedule 1, all the instructions of T1 are grouped and run together, then all the
instructions of T2 are grouped and run together.
That is, T2 will not start until all the instructions of T1 are complete. This type of
schedule is called a serial schedule.
T1: read(A)
T1: A := A − 50
T1: write(A)
T2: read(A)
T2: temp := A * 0.1
T2: A := A − temp
T2: write(A)
T1: read(B)
T1: B := B + 50
T1: write(B)
T2: read(B)
T2: B := B + temp
T2: write(B)
Equivalent schedules
Two schedules are equivalent if the effect of executing the first schedule is identical to the
effect of executing the second schedule.
We can also say that two schedules are equivalent if the output of executing the first
schedule is identical to the output of executing the second schedule.
Serializable schedule
A schedule that is equivalent (in its outcome) to a serial schedule has the serializability
property.
Example of serializable schedule
[Table: Schedule 1 and Schedule 2, each over transactions T1, T2 and T3 — both contain the same operations read(X), write(X), read(Y), write(Y), read(Z), write(Z), but interleaved in a different order.]
In above example there are two schedules as schedule 1 and schedule 2.
In schedule 1 and schedule 2 the order in which the instructions of transaction are
executed is not the same but whatever the result we get is same. So this is known as
serializability of transaction.
Schedule S
T3: read(Q)
T4: write(Q)
T3: write(Q)
T6: write(Q)
The above schedule is view serializable but not conflict serializable, because all the
transactions use the same data item (Q) and every pair of operations conflicts (at least
one operation in each pair is a write on Q), so we cannot interchange any operations
between the transactions.
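A conflict-serializability check can be sketched by building a precedence graph from a schedule and testing it for a cycle; representing a schedule as (transaction, operation, item) triples is an assumption of this sketch. The schedule S above yields a cycle, confirming it is not conflict serializable:

```python
def precedence_edges(schedule):
    """Edge Ti -> Tj when an operation of Ti conflicts with a later
    operation of Tj on the same item (at least one is a write)."""
    edges = set()
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and (op1 == "w" or op2 == "w"):
                edges.add((t1, t2))
    return edges

def has_cycle(edges):
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
    def reachable(start, goal, seen):
        for nxt in graph.get(start, ()):
            if nxt == goal or (nxt not in seen and
                               reachable(nxt, goal, seen | {nxt})):
                return True
        return False
    return any(reachable(n, n, set()) for n in graph)

# Schedule S from the text: T3 reads then writes Q; T4 and T6 write Q.
S = [("T3", "r", "Q"), ("T4", "w", "Q"), ("T3", "w", "Q"), ("T6", "w", "Q")]
```

A schedule is conflict serializable exactly when its precedence graph is acyclic; here T3 and T4 point at each other, so S is not.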
[Figure: recovery with checkpoints — a checkpoint occurs at time Tc and a failure at time Tf; transactions T1, T2, T3 and T4 start and finish at different points relative to Tc and Tf.]
[Figure: shadow paging — a page table whose entries 1–6 point to pages 5, 1, 4, 2, 3 and 6. After pages 2 and 5 are updated, the current page table points to Page 2(new) and Page 5(new), while the shadow page table still points to Page 2(old) and Page 5(old).]
Advantages
No overhead of maintaining transaction log.
Recovery is quite fast, as no redo or undo operations are required.
Disadvantages
Copying the entire page table is very expensive.
Data are scattered or fragmented.
After each transaction, free pages need to be collected by a garbage collector.
It is difficult to extend this technique to allow concurrent transactions.
[Figure: wait-for graph — Transaction 1 holds Table 1 and waits for Table 2; Transaction 2 holds Table 2 and waits for Table 1.]
In the above figure there are two transactions (transaction 1 and transaction 2) and two
tables (table 1 and table 2).
Transaction 1 holds table 1 and waits for table 2; transaction 2 holds table 2 and waits for
table 1.
Now table 1 is wanted by transaction 2 but is held by transaction 1, and in the same way
table 2 is wanted by transaction 1 but is held by transaction 2. Until one of them gets the
table it is waiting for, neither can proceed. The graph that shows this is called a wait-for
graph, because both of these transactions have to wait for some resource.
When deadlock occurs
A deadlock occurs when two separate processes compete for resources that are held by
one another.
Deadlocks can occur in any concurrent system where processes wait for each other and
a cyclic chain can arise with each process waiting for the next one in the chain.
Deadlock can occur in any system that satisfies the four conditions:
1. Mutual Exclusion Condition: only one process at a time can use a resource; each
resource is either assigned to one process or is available.
2. Hold and Wait Condition: processes already holding resources may request new
resources.
3. No Preemption Condition: only a process holding a resource can release it,
voluntarily, after that process has completed its task; previously granted
resources cannot be forcibly taken away from a process.
4. Circular Wait Condition: two or more processes form a circular chain where each
process requests a resource that the next process in the chain holds.
Explain deadlock detection and recovery.
Resource-Allocation Graph
A set of vertices V and a set of edges E.
V is partitioned into two types:
1. P = {P1, P2, …, Pn}, the set consisting of all the processes in the
system.
2. R = {R1, R2, …, Rm}, the set consisting of all the resource types in the
system.
[Figure: two resource-allocation graphs over processes P1, P2, P3 and resources R1–R4 — one without a cycle and one containing a cycle, indicating deadlock.]
In the above figure, the sender has data that he/she wants to send; this data is known as
plaintext.
In the first step the sender encrypts the data (plaintext) using an encryption algorithm and
some key.
After encryption the plaintext becomes ciphertext.
This ciphertext cannot be read by any unauthorized person.
This ciphertext is sent to the receiver.
The sender sends the key separately to the receiver.
Once the receiver receives this ciphertext, he/she decrypts it using the key sent by the
sender and a decryption algorithm.
After applying the decryption algorithm and key, the receiver gets the original data
(plaintext) that was sent by the sender.
This technique is used to protect data when there is a chance of data theft.
In such a situation, if the encrypted data is stolen, it cannot be read directly without
knowing the encryption technique and the key.
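The encrypt/decrypt round trip described above can be illustrated with a toy XOR keystream. This is not a secure cipher; it only demonstrates that the same shared key recovers the plaintext (the message and key below are invented):

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the repeating key (toy cipher only)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

plaintext = b"transfer 500 to account A-101"   # assumed message
key = b"secret-key"                            # assumed shared key

ciphertext = xor_cipher(plaintext, key)   # sender encrypts
recovered = xor_cipher(ciphertext, key)   # receiver decrypts with same key
```

Applying the same key twice cancels the XOR, which is the symmetric-key property the text describes.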
Check constraint
The check constraint is used to implement a business rule, so it is also called a business
rule constraint.
Example of a business rule: the balance in any account should not be negative.
A business rule defines a domain for a particular column.
9 – SQL Concepts
The check constraint is bound to a particular column.
Once a check constraint is implemented, any insert or update operation on that table
must follow this constraint.
If any operation violates the condition, it is rejected.
Syntax : ColumnName datatype (size) check(condition)
Example :
create table Account (ano int,
Balance decimal(8,2) CHECK (balance >= 0),
Branch varchar(10));
Any business rule validations can be applied using this constraint.
A condition must be some valid logical expression.
A check constraint takes a longer time to execute.
On violation of this constraint, Oracle displays an error message like "check constraint
violated".
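The Account example above can be tried with Python's built-in sqlite3, which also enforces CHECK constraints:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE Account (
    ano INTEGER,
    Balance DECIMAL(8,2) CHECK (Balance >= 0),
    Branch VARCHAR(10))""")

# A non-negative balance satisfies the business rule.
con.execute("INSERT INTO Account VALUES (1, 500.00, 'Rajkot')")

# A negative balance violates the CHECK constraint and is rejected.
try:
    con.execute("INSERT INTO Account VALUES (2, -10.00, 'Rajkot')")
    violated = False
except sqlite3.IntegrityError:
    violated = True
```

Only the first row survives, matching the behaviour the text describes for insert/update operations.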
Unique Constraint
Sometimes there may be a requirement that a column cannot contain duplicate values.
A column defined as unique cannot have duplicate values across all records.
Syntax : ColumnName datatype (size) UNIQUE
Example :
create table Account (ano int UNIQUE,
Balance decimal(8,2),
Branch varchar(10));
Though, a unique constraint does not allow duplicate values, NULL values can be
duplicated in a column defined as a UNIQUE column.
A table can have more than one column defined as a unique column.
If multiple columns need to be defined as composite unique column, then only table
level definition is applicable.
Maximum 16 columns can be combined as a composite unique key in a table.
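The behaviour of UNIQUE, including the repeated NULLs mentioned above, can be demonstrated with sqlite3:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Account (ano INTEGER UNIQUE, Balance REAL, Branch TEXT)")

con.execute("INSERT INTO Account VALUES (1, 100.0, 'Rajkot')")

# A duplicate ano violates the UNIQUE constraint.
try:
    con.execute("INSERT INTO Account VALUES (1, 200.0, 'Baroda')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# NULL values, however, may repeat in a UNIQUE column.
con.execute("INSERT INTO Account VALUES (NULL, 50.0, 'Surat')")
con.execute("INSERT INTO Account VALUES (NULL, 60.0, 'Surat')")
row_count = con.execute("SELECT COUNT(*) FROM Account").fetchone()[0]
```

The duplicate value is rejected while both NULL rows are accepted, exactly as the text states.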
CREATE: Create is used to create the database or its objects like table, view, index etc.
Create Table
The CREATE TABLE statement is used to create a new table in a database.
Syntax:
CREATE TABLE table_name
(
Column1 Datatype(Size) [ NULL | NOT NULL ],
Column2 Datatype(Size) [ NULL | NOT NULL ],
...
);
Example:
CREATE TABLE Students
(
Roll_No int(3) NOT NULL,
Name varchar(20),
Subject varchar(20)
);
Explanation:
The column should either be defined as NULL or NOT NULL. By default, a column can
hold NULL values.
The NOT NULL constraint enforces a column to NOT accept NULL values. This enforces a
field to always contain a value, which means that you cannot insert a new record, or
update a record without adding a value to this field.
ALTER: ALTER TABLE statement is used to add, modify, or drop columns in a table.
Add Column
The ALTER TABLE statement in SQL is used to add new columns to a table.
Syntax:
ALTER TABLE table_name
ADD Column1 Datatype(Size), Column2 Datatype(Size), … ;
Example:
ALTER TABLE Students
ADD Marks int;
Drop Column
The ALTER TABLE statement in SQL is used to drop a column from a table.
Syntax:
ALTER TABLE table_name
DROP COLUMN column_name;
Example:
ALTER TABLE Students
DROP COLUMN Subject;
Modify Column
The ALTER TABLE statement in SQL is used to change the data type/size of a column in a table.
Syntax:
ALTER TABLE table_name
ALTER COLUMN column_name datatype(size);
Example:
ALTER TABLE Students
ALTER COLUMN Roll_No float;
DROP: Drop is used to drop the database or its objects like table, view, index etc.
Drop Table
The DROP TABLE statement is used to drop an existing table in a database.
Syntax:
DROP TABLE table_name;
Example:
DROP TABLE Students;
SELECT: The SELECT statement is used to select data from a database. The data returned is
stored in a result table, called the result-set.
Syntax:
SELECT column1, column2, ...
FROM table_name
WHERE condition;
OR
SELECT *
FROM table_name
WHERE condition;
BEGIN TRANSACTION t1
DELETE FROM STUDENT WHERE SPI <8;
COMMIT TRANSACTION t1;
Output:
Student
Rollno Name SPI
1 Raju 8
2 Hari 9
Note:
A transaction is a set of operations performed so that all operations are
guaranteed to succeed or fail as one unit.
If you place BEGIN TRANSACTION before your SQL statement, the transaction
automatically becomes an explicit transaction, and it locks the table until the
transaction is committed or rolled back.
2. ROLLBACK:
The ROLLBACK command is the transactional control command used to undo
transactions that have not already been saved to the database.
It rolls back a transaction to the beginning of the transaction.
It is also used with SAVEPOINT command to jump to a savepoint in an ongoing
transaction.
You can use ROLLBACK TRANSACTION to erase all data modifications made from
the start of the transaction or to a savepoint.
Student
Rollno Name SPI
1 Raju 8
2 Hari 9
3 Mahesh 7
BEGIN TRANSACTION t1
DELETE FROM STUDENT WHERE SPI <8;
ROLLBACK TRANSACTION t1;
3. SAVEPOINT:
A SAVEPOINT is a point in a transaction when you can roll the transaction back to a
certain point without rolling back the entire transaction.
The ROLLBACK command is used to undo a group of transactions.
Student
Rollno Name SPI
1 Raju 8
2 Hari 9
3 Mahesh 7
BEGIN TRANSACTION t1
SAVE TRANSACTION s1
INSERT INTO Student Values (4, 'Anil', 6);
SAVE TRANSACTION s2
INSERT INTO Student Values (5, 'Gita', 9);
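The savepoint behaviour can be sketched with sqlite3; note that SQLite spells the commands SAVEPOINT and ROLLBACK TO rather than T-SQL's SAVE TRANSACTION / ROLLBACK TRANSACTION:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.isolation_level = None    # manage transactions manually
con.execute("CREATE TABLE Student (Rollno INTEGER, Name TEXT, SPI INTEGER)")
con.executemany("INSERT INTO Student VALUES (?,?,?)",
                [(1, "Raju", 8), (2, "Hari", 9), (3, "Mahesh", 7)])

con.execute("BEGIN")
con.execute("INSERT INTO Student VALUES (4, 'Anil', 6)")
con.execute("SAVEPOINT s2")
con.execute("INSERT INTO Student VALUES (5, 'Gita', 9)")
con.execute("ROLLBACK TO s2")   # undo only the insert after the savepoint
con.execute("COMMIT")

names = [r[0] for r in con.execute("SELECT Name FROM Student ORDER BY Rollno")]
```

Rolling back to the savepoint discards Gita's insert while Anil's, made before the savepoint, is kept and committed.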
Explanation:
privilege_name is the access right or privilege you want to take back from the
user. Some of the access rights are ALL, EXECUTE, and SELECT.
CONSTRAINT fk_column
FOREIGN KEY (column1, column2, ... column_n)
REFERENCES parent_table (column1, column2, ... column_n)
ON DELETE CASCADE
);
Math function
Abs(n) Returns the absolute value of n. Select Abs(-15);
O/P : 15
Sign(n) Returns the sign of x as -1,0,1 Select Sign(-15);
O/P : -1
Date function
Getdate() Returns current date and time. Select Getdate();
O/P : 2018-09-08 10:42:02.113
Day() Returns day of a given date. Select Day(‘23/JAN/2018’);
O/P : 23
Month() Returns month of a given date. Select Month(‘23/JAN/2018’);
O/P : 1
Year() Returns year of a given date. Select Year(‘23/JAN/2018’);
O/P : 2018
Isdate() Returns 1 if the expression is a valid Select Isdate(‘31/FEB/2018’);
date, otherwise 0. O/P : 0
Datename() Returns the specified part of a given Select Datename(month,‘1-23-2018’);
date as varchar value. O/P : January
Datepart() Returns the specified part of a given Select Datepart(month,‘1-23-2018’);
date as int value. O/P : 1
Dateadd() Returns datetime after adding n Select Dateadd(day,5,‘23/JAN/2018’);
numbers of datepart to a given O/P : 2018-01-28 00:00:00.000
date.
Datediff() Returns the difference between two Select Datediff(day,‘23/JAN/2018’,
date values, based on the interval ’23/FEB/2018’);
specified. O/P : 31
Eomonth() Returns the last day of the month. Select Eomonth(’23/FEB/2018’);
O/P : 2018-02-28
Student
Rollno Name SPI
1 Raju 8
2 Hari 9
3 Mahesh 7
4 NULL 9
5 Anil 5
1. Avg() : It returns the average of the data values.
Select Avg(SPI) FROM Student;
Output: 7
2. Sum() : It returns the addition of the data values.
Select Sum(SPI) FROM Student;
Output: 38
3. Max() : It returns maximum value for a column.
Select Max(SPI) FROM Student;
Output: 9
4. Min() : It returns Minimum value for a column.
Select Min(SPI) FROM Student;
Output: 5
5. Count() : It returns total number of values in a given column.
Select Count(Name) FROM Student;
Output: 4
6. Count(*) : It returns the number of rows in a table.
Select Count(*) FROM Student;
Output: 5
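These results can be reproduced with sqlite3; note that AVG returns 7.6 here, whereas the output 7 shown above reflects SQL Server's integer arithmetic on an int column:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Student (Rollno INTEGER, Name TEXT, SPI INTEGER)")
con.executemany("INSERT INTO Student VALUES (?,?,?)",
                [(1, "Raju", 8), (2, "Hari", 9), (3, "Mahesh", 7),
                 (4, None, 9), (5, "Anil", 5)])

q = lambda sql: con.execute(sql).fetchone()[0]
avg_spi  = q("SELECT AVG(SPI) FROM Student")     # 38 / 5 = 7.6 in SQLite
sum_spi  = q("SELECT SUM(SPI) FROM Student")
max_spi  = q("SELECT MAX(SPI) FROM Student")
min_spi  = q("SELECT MIN(SPI) FROM Student")
cnt_name = q("SELECT COUNT(Name) FROM Student")  # NULL name not counted
cnt_rows = q("SELECT COUNT(*) FROM Student")     # every row counted
```

The COUNT(Name) versus COUNT(*) difference shows how aggregates skip NULLs in a named column.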
INNER JOIN
It returns records that have matching values in both tables.
Syntax:
SELECT columns
FROM table1 INNER JOIN table2
ON table1.column = table2.column;
Example:
Consider the following tables:
Student Result
RNO Name Branch RNO SPI
101 Raju CE 101 8.8
102 Amit CE 102 9.2
103 Sanjay ME 104 8.2
104 Neha EC 105 7
105 Meera EE 107 8.9
106 Mahesh ME
Output:
Inner Join
RNO Name Branch SPI
101 Raju CE 8.8
102 Amit CE 9.2
104 Neha EC 8.2
105 Meera EE 7
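The inner-join output above can be reproduced with sqlite3:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Student (RNO INTEGER, Name TEXT, Branch TEXT)")
con.execute("CREATE TABLE Result (RNO INTEGER, SPI REAL)")
con.executemany("INSERT INTO Student VALUES (?,?,?)",
                [(101, "Raju", "CE"), (102, "Amit", "CE"), (103, "Sanjay", "ME"),
                 (104, "Neha", "EC"), (105, "Meera", "EE"), (106, "Mahesh", "ME")])
con.executemany("INSERT INTO Result VALUES (?,?)",
                [(101, 8.8), (102, 9.2), (104, 8.2), (105, 7), (107, 8.9)])

# Only RNO values present in BOTH tables survive the inner join.
rows = con.execute("""SELECT s.RNO, s.Name, s.Branch, r.SPI
                      FROM Student s INNER JOIN Result r ON s.RNO = r.RNO
                      ORDER BY s.RNO""").fetchall()
```

Sanjay (103), Mahesh (106) and result 107 have no match on the other side, so they are absent from the output.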
CROSS JOIN
When each row of the first table is combined with each row of the second table, it is known
as a Cartesian join or cross join.
SQL CROSS JOIN returns a number of rows equal to the number of rows in the first table
multiplied by the number of rows in the second table.
Syntax:
SELECT columns
FROM table1 CROSS JOIN table2;
Example:
Consider the following tables:
Color: (Code, Name) — (1, Red), (2, Blue)
Size: (Amount) — Small, Large
SELF JOIN
A self join is a regular join, but the table is joined with itself.
Here, we need to use aliases for the same table to set a self join between single table.
Syntax:
SELECT a.column, b.column
FROM tablename a INNER JOIN tablename b
ON a.column = b.column;
Example:
Consider the following table:
Employee
EmpNo Name MngrNo
E01 Tarun E02
E02 Rohan E05
E03 Priya E04
E04 Milan NULL
E05 Jay NULL
E06 Anjana E03
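As a sketch, the self join below pairs each employee in the table above with his or her manager's name using two aliases of the one Employee table (the exact column list is an illustrative choice):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Employee (EmpNo TEXT, Name TEXT, MngrNo TEXT)")
con.executemany("INSERT INTO Employee VALUES (?,?,?)",
                [("E01", "Tarun", "E02"), ("E02", "Rohan", "E05"),
                 ("E03", "Priya", "E04"), ("E04", "Milan", None),
                 ("E05", "Jay", None), ("E06", "Anjana", "E03")])

# Alias e is the employee, alias m is the same table acting as managers.
rows = con.execute("""SELECT e.Name, m.Name
                      FROM Employee e INNER JOIN Employee m
                      ON e.MngrNo = m.EmpNo
                      ORDER BY e.EmpNo""").fetchall()
```

Employees whose MngrNo is NULL (Milan, Jay) have no matching manager row and drop out of the result.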
Define view. What are the types of view? Write syntax to create
view of each type. Give an example of view.
Views are virtual tables that are compiled at runtime.
The data associated with views are not physically stored in the view, but it is stored in
the base tables of the view.
A view can be made over one or more database tables.
Generally, we put those columns in view that we need to retrieve/query again and
again.
Once you have created the view, you can query view like as table.
TYPES OF VIEW
1. Simple View
2. Complex View
Syntax:
CREATE VIEW view_name
AS
SELECT column1, column2...
FROM table_name
[WHERE condition];
Simple View
When we create a view on a single table, it is called simple view.
In a simple view we can delete, update and insert data, and those changes are applied to
the base table.
Insert operations can be performed on a simple view only if the view contains the primary
key and all NOT NULL fields of the base table.
Example:
Consider following table:
Employee
Eid Ename Salary Department
101 Raju 5000 Admin
102 Amit 8000 HR
103 Sanjay 3000 IT
104 Neha 7000 Sales
--Create View
CREATE VIEW EmpSelect
AS
SELECT Eid, Ename, Department
FROM Employee;
--Display View
Select * from EmpSelect;
Output
Eid Ename Department
101 Raju Admin
102 Amit HR
103 Sanjay IT
104 Neha Sales
Complex View
When we create a view on more than one table, it is called complex view.
We can only update data through a complex view; we cannot insert data through a
complex view.
In particular, complex views can contain join conditions, a GROUP BY clause, an ORDER BY
clause, etc.
Example:
Consider following table:
Employee ContactDetails
Eid Ename Salary Department Eid City Mobile
101 Raju 5000 Admin 101 Rajkot 1234567890
102 Amit 8000 HR 102 Ahmedabad 2345678901
103 Sanjay 3000 IT 103 Baroda 3456789120
104 Neha 7000 Sales 104 Rajkot 4567891230
--Create View
CREATE VIEW Empview
AS
SELECT Employee.Eid, Employee.Ename, ContactDetails.City
FROM Employee Inner Join ContactDetails
On Employee.Eid = ContactDetails.Eid;
--Display View
Select * from Empview;
Output
Eid Ename City
101 Raju Rajkot
102 Amit Ahmedabad
103 Sanjay Baroda
104 Neha Rajkot
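The simple view example can be tried with sqlite3; note that SQLite views are read-only, so only the query side of the discussion is demonstrated here:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE Employee
               (Eid INTEGER, Ename TEXT, Salary INTEGER, Department TEXT)""")
con.executemany("INSERT INTO Employee VALUES (?,?,?,?)",
                [(101, "Raju", 5000, "Admin"), (102, "Amit", 8000, "HR"),
                 (103, "Sanjay", 3000, "IT"), (104, "Neha", 7000, "Sales")])

# The view stores no data of its own; it is a stored query over Employee.
con.execute("""CREATE VIEW EmpSelect AS
               SELECT Eid, Ename, Department FROM Employee""")

rows = con.execute("SELECT * FROM EmpSelect ORDER BY Eid").fetchall()
```

Querying the view returns the base-table data with the Salary column hidden, which is the usual reason to define such a view.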
Execution section
Explanation
Create:-It will create a procedure.
Alter:- It will re-create a procedure if it already exists.
We can pass parameters to the procedures in three ways.
1. IN-parameters: - These types of parameters are used to send values to stored procedures.
2. OUT-parameters: - These types of parameters are used to get values from stored
procedures. This is similar to a return type in functions but procedure can return values
for more than one parameters.
3. IN OUT-parameters: - This type of parameter allows us to pass values into a procedure
and get output values from the procedure.
AS indicates the beginning of the body of the procedure.
The syntax within the brackets [ ] indicates that they are optional.
10 – PL/SQL Concepts
By using CREATE OR ALTER together the procedure is created if it does not exist and if it exists
then it is replaced with the current code (The only disadvantage of CREATE OR ALTER is that it
does not work in SQL Server versions prior to SQL Server 2016).
Advantages of procedure
Security:- We can improve security by giving rights to selected persons only.
Faster Execution:- It is precompiled so compilation of procedure is not required every
time you call it.
Sharing of code:- Once procedure is created and stored, it can be used by more than one
user.
Productivity:- Code written in procedure is shared by all programmers. This eliminates
redundant coding by multiple programmers so overall improvement in productivity.
Syntax of Trigger
CREATE [OR ALTER] TRIGGER trigger_name
ON table_name
{ FOR | AFTER | INSTEAD OF }
{ [ INSERT ] [ , ] [ UPDATE ] [ , ] [ DELETE ] }
AS
BEGIN
Executable statements
END;
CREATE [OR ALTER] TRIGGER trigger_name:- This clause creates a trigger with the given name
or overwrites an existing trigger.
[ON table_name]:- This clause identifies the name of the table to which the trigger is related.
[FOR | AFTER | INSTEAD OF]:- This clause indicates at what time the trigger should be fired. FOR
and AFTER are similar.
[INSERT / UPDATE / DELETE]:- This clause determines on which kind of statement the trigger
should be fired: on insert, update, delete, or a combination of any or all of them. More than
one statement can be used together, separated by commas. The trigger gets fired on all the
specified triggering events.
Example 1
Trigger to display a message when we perform insert operation on student table.
Example 2
Trigger to insert history into Audit table when we perform insert operation on student table.
CREATE TRIGGER tgr_student_forinsert
ON Student
FOR INSERT
AS
BEGIN
DECLARE @rno int
SELECT @rno = rno FROM INSERTED
INSERT INTO Audit VALUES
('New student with rno = ' + cast(@rno as varchar(10)) +
' is added in student table')
END
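A comparable audit trigger can be written in sqlite3; SQLite uses AFTER INSERT and the NEW row reference instead of T-SQL's FOR INSERT and the INSERTED table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Student (rno INTEGER, name TEXT)")
con.execute("CREATE TABLE Audit (message TEXT)")

# AFTER INSERT trigger: NEW.rno refers to the row just inserted.
con.execute("""CREATE TRIGGER tgr_student_forinsert
               AFTER INSERT ON Student
               BEGIN
                   INSERT INTO Audit VALUES
                       ('New student with rno=' || NEW.rno ||
                        ' is added in student table');
               END""")

con.execute("INSERT INTO Student VALUES (1, 'Raju')")
audit = con.execute("SELECT message FROM Audit").fetchone()[0]
```

Inserting into Student fires the trigger automatically, and the history row appears in Audit without any explicit call.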
3) Fetching data:-
We cannot process a selected row directly. We have to fetch the column values of a row
into memory variables; this is done by the FETCH statement.
Syntax:-
FETCH NEXT FROM cursorname INTO variable1, variable2………
4) Processing data:-
This step involves actual processing of current row.
6) Deallocating cursor:-
It is used to delete a cursor and releases all resources used by cursor.
Syntax:-
DEALLOCATE cursorname;
Example 1:- Cursor to insert record from student table to student1 table if branch is CE.
Example 2:- Cursor to update SPI (SPI=SPI-7) if SPI remains greater than or equal to ZERO after
update.
DECLARE
@rno int, @spi decimal(8,2);
DECLARE cursor_student CURSOR
FOR SELECT rno, spi
FROM student;
OPEN cursor_student;
FETCH NEXT FROM cursor_student INTO
@rno, @spi;
WHILE @@FETCH_STATUS = 0
BEGIN
set @spi=@spi-7
if (@spi<0)
print 'SPI must be greater than 0'
else
update student
set spi=@spi
where rno=@rno
FETCH NEXT FROM cursor_student INTO
@rno, @spi;
END;
CLOSE cursor_student;
DEALLOCATE cursor_student;