
UNIT – III

Structure of Relational Database – Introduction to Relational Database Design – Objectives – Tools – Redundancy and Data Anomaly – Functional Dependency – Normalization – 1NF – 2NF – 3NF – BCNF – Transaction Processing – Database Security
1. Relational Database Design
 A database of an organization is an information repository that represents facts
about the organization.
 It is manipulated by some software to incorporate the changes that take place in
the organization.
 The database design is a complex process.

Feasibility Study
 When designing a database, the purpose for which the database is being
designed must be clearly defined.
 In other words the objective of creating the database must be crystal clear.
Requirement Collection and Analysis
 In requirement collection, one has to decide what data are to be stored, and to
some extent, how that data will be used.
 The people who are going to use the database must be interviewed repeatedly.
 Assumptions about the stated relationships between various parts of the data
must be questioned again and again.
Prototyping and Design
 Design implies a procedure for analyzing and organizing data into a form that supports the business requirements and makes use of strategic technology.
 The three phases in relational database design are conceptual design, logical
design, and physical design.



Implementation
 Database implementation involves development of code for database processing, and also the installation of new database contents, usually from existing data sources.
1.1 Objectives of Database Design
 The objectives of database design vary from implementation to implementation.
 Some of the important factors like efficiency, integrity, privacy, security, implementability, and flexibility have to be considered in the design of the database.
Efficiency
 Efficiency is generally considered to be the most important. Given a piece of
hardware on which the database will run and a piece of software (DBMS) to run it,
the design should make full and efficient use of the facilities provided.
 If the database is made online, then the users should interact with the database
without any time delay.
Integrity
 The term integrity means that the database should be as accurate as possible. The
problem of preserving the integrity of data in a database can be viewed at a number
of levels.
 At a low level it concerns ensuring that the data are not corrupted by hardware or
software errors.
 At a higher level, the problem of preserving database integrity concerns maintaining
an accurate representation of the real world.
Privacy
 The database should not allow unauthorized access to files.
 This is very important in the case of financial data.
 For example the bank balance of one customer should not be revealed to other
customers.
Security
 The database, once loaded, should be safe from physical corruption whether from
hardware or software failure or from unauthorized access.
 This is a general requirement of most databases.
Implementability
 The conceptual model should be simple and effective so that mapping from
conceptual model to logical model is easy.
 Moreover, while designing the database, care has to be taken so that application programs can interact effectively with the database.



Flexibility
 The database should not be implemented in a rigid way that assumes the business will
remain constant forever.
 Changes will occur and the database must be capable of responding readily to such
change.
 Apart from the factors mentioned above, the design of the database should ensure that data redundancy is avoided.
1.2.Database Design Tools
 Once the objectives of database design and the various steps in database design are known, it is essential to know the database design tools which are used to automate the task of designing a business system.
 Using automated design tools is the process of using a GUI tool to assist in the design
of a database or database application. Many database design tools are available with a
variety of features. The design tools are vendor-specific.
 CASE tools are software that provides automated support for some portion of the
systems development process.
 Database drawing tools are used in enterprise modeling, conceptual data modeling,
logical database design, and physical data modeling.
1.2.1 Need for Database Design Tool
 Database design tools increase overall productivity because the manual tasks are automated; less time is spent performing tedious tasks and more time is spent thinking about the actual design of the database.
 The quality of the end product is improved when database design tools are used, because the design tool automates much of the design process and, as a result, the time taken to design a database is reduced.
1.2.2 Desired Features of Database Design Tools
 The database design tools should help the developer to complete the database model
of database application in a timely fashion. Some of the features of the database
design tools are given below:
– The database design tools should capture the user needs.
– The capability to model the flow of data in an organization.
– The database design tool should have the capability to model entities and their
relationships.
– The database design tool should have the capability to generate Data Definition Language (DDL) to create database objects.
– The database design tool should provide full life-cycle database support.
– Database and application version control.



– The database design tool should generate reports for documentation and
user-feedback sessions.
1.2.3 Advantages of Database Design Tools
 Some of the advantages of using database design tools for system design or
application development are given as:
– The amount of code to be written is reduced; as a result, the database design time is reduced.
– Chances of errors because of manual work are reduced.
– It is easy to convert the business model into a working database model.
– It is easy to ensure that all business requirements are met.
– A higher quality, more accurate product is produced.
1.2.4. Disadvantages of Database Design Tools
 Some of the disadvantages of database design tools are given below:
– More expenses involved for the tool itself.
– Developers might require special training to use the tool.
1.2.5. Commercial Database Design Tools
The database design tools which are commercially popular are given along with their
websites.
1. CASE Studio 2 – Powerful database modeling, management, and reporting tool.
http://www.casestudio.com/enu/default.aspx
2. Design for Databases V3 – Database development tool using an entity
relationship diagram.
http://www.datanamic.com/dezign
3. DBDesigner4 – Visual database design system that integrates database design and modeling.
4. ER/Studio – Multilevel data modeling application for logical and physical
database design and construction.
http://www.embarcadero.com/products/erstudio/index.html
5. Happy Fish Database Designer – Visual database design tool supporting
multiple database platforms. Happy Fish generates complete DDL scripts, defining
metadata with table creates, indexes, foreign keys.
http://www.embarcadero.com/products/erstudio/index.html
6. Oracle Designer 2000 – Provides complete toolset to model, generate, and
capture the requirements and design of enterprise applications.
http://www.Oracle.com/technology/products/designer/index.html
7. QDesigner – QDesigner is an enterprise modeling and design solution that



empowers architects, DBAs, developers, and business analysts to produce IT solutions.
http://www.quest.com/QDesigner
8. PowerDesigner – The PowerDesigner product family offers a modeling solution that analysts, DBAs, designers, and developers can tailor. Its modular structure offers affordability and expandability, so the tools can be applied according to the size and scope of the project. http://www.sybase.com/products/powerdesigner/
9. WebObjects – A product from Apple. WebObjects helps to develop and deploy enterprise-level web services and Java server applications.
http://www.apple.com/webobjects/
10. xCase – Database design tool which provides a data modeling environment.
www.xcase.com
2. Redundancy and Data Anomaly
 Redundant data means storing the same information more than once, i.e., redundant
data could be removed without the loss of information.
 Redundancy can lead to anomalies. The different anomalies are insertion, deletion,
and updation anomalies.
2.1.Problems of Redundancy
 Redundancy can cause problems during normal database operations.
 For example, when data are inserted into the database, the data must be duplicated
wherever redundant versions of that data exist.
 Also when the data are updated, all redundant data must be simultaneously updated
to reflect that change.
2.2 Insertion, Deletion, and Updation Anomaly
 A table anomaly is a structure for which a normal database operation cannot be
executed without information loss or full search of the data table.
 The table anomaly can be broadly classified into (1) Insertion Anomaly, (2)
Deletion Anomaly, and (3) Update or Modification Anomaly.

Insertion Anomaly
 We cannot insert a department without inserting a member of staff that works in that
department.



Update Anomaly
 We could change the name of the department that employee “100” works in without simultaneously changing the department name for employee “102”, leaving the data inconsistent.
Deletion Anomaly
 By removing employee 100, we have removed all information pertaining to the sales department.
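These three anomalies assume a single denormalized STAFF table in which each row stores an employee together with his or her department. A minimal SQL sketch of that situation (table name, column names, and sample values are assumed for illustration):

    -- Denormalized table: department facts are repeated for every employee.
    CREATE TABLE STAFF (
        staff_no   INT PRIMARY KEY,
        staff_name VARCHAR(50),
        dept_no    INT,
        dept_name  VARCHAR(50)
    );

    INSERT INTO STAFF VALUES (100, 'Arun', 10, 'Sales');
    INSERT INTO STAFF VALUES (102, 'Banu', 10, 'Sales');

    -- Update anomaly: renaming the department must touch every matching row;
    -- updating only staff 100 leaves staff 102 with a stale department name.
    UPDATE STAFF SET dept_name = 'Marketing' WHERE staff_no = 100;

    -- Deletion anomaly: removing the last employee of a department removes
    -- all knowledge of that department as well.
    DELETE FROM STAFF WHERE staff_no = 100;

Splitting the data into separate STAFF and DEPARTMENT tables removes all three anomalies.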
Repeating Group
 A repeating group is an attribute (or set of attributes) that can have more than one value
for a primary key value.
 To understand the concept of repeating group, consider the example of the table
STAFF.
 A staff can have more than one contact number. For each contact number, we
have to store the data of the STAFF which leads to more storage space (more
memory).

 Repeating groups are not allowed in a relational design, since all attributes have to be atomic, i.e., there can be only one value per cell in a table.

3. Functional dependency in DBMS

 The attributes of a table are said to be dependent on each other when an attribute of the table uniquely identifies another attribute of the same table.
 For example: Suppose we have a student table with attributes: Stu_Id,
Stu_Name, Stu_Age.
 Here Stu_Id attribute uniquely identifies the Stu_Name attribute of
student table because if we know the student id we can tell the student
name associated with it.
 This is known as functional dependency and can be written as Stu_Id -> Stu_Name; in words, we can say Stu_Name is functionally dependent on Stu_Id.
Formally:

 If column A of a table uniquely identifies column B of the same table, then it can be represented as A -> B (attribute B is functionally dependent on attribute A).
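As a quick illustration, the dependency Stu_Id -> Stu_Name can be checked against the data in the student table described above (the table name and columns Stu_Id, Stu_Name are assumed to exist as shown). If the query below returns no rows, every Stu_Id maps to exactly one Stu_Name, so the dependency holds for the current data:

    -- Any Stu_Id returned here violates Stu_Id -> Stu_Name.
    SELECT Stu_Id
    FROM   student
    GROUP  BY Stu_Id
    HAVING COUNT(DISTINCT Stu_Name) > 1;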



 Advantages of Functional Dependency
 Functional dependency avoids data redundancy; therefore the same data do not repeat at multiple locations in the database.
 It helps you to maintain the quality of data in the database.
 It helps you to define the meanings and constraints of databases.
 It helps you to identify bad designs.
 It helps you to find facts regarding the database design.

3.1. Types of Functional Dependencies


1. Trivial functional dependency
2. non-trivial functional dependency
3. Multivalued dependency
4. Transitive dependency
1. Trivial functional dependency in DBMS with example
 The dependency of an attribute on a set of attributes is known as a trivial functional dependency if the set of attributes includes that attribute.
 Symbolically: A -> B is a trivial functional dependency if B is a subset of A.
 The following dependencies are also trivial: A -> A and B -> B.
 For example: Consider a table with two columns Student_id and Student_Name.
 {Student_Id, Student_Name} -> Student_Id is a trivial functional dependency as
Student_Id is a subset of {Student_Id, Student_Name}.
 That makes sense because if we know the values of Student_Id and
Student_Name then the value of Student_Id can be uniquely determined.
 Also, Student_Id -> Student_Id & Student_Name -> Student_Name are trivial
dependencies too.
2. Non trivial functional dependency in DBMS
 If a functional dependency X -> Y holds true where Y is not a subset of X, then this dependency is called a non-trivial functional dependency.
 For example: Consider an employee table with three attributes: emp_id, emp_name, emp_address. The following functional dependencies are non-trivial:
 emp_id -> emp_name (emp_name is not a subset of emp_id)
 emp_id -> emp_address (emp_address is not a subset of emp_id)
 On the other hand, the following dependency is trivial:
 {emp_id, emp_name} -> emp_name [emp_name is a subset of {emp_id, emp_name}] (see trivial functional dependency above).



Completely non-trivial FD:
 If a functional dependency X -> Y holds true where the intersection of X and Y is empty, then this dependency is said to be a completely non-trivial functional dependency.
3. Multivalued dependency in DBMS

 A multivalued dependency occurs when there are two or more independent multivalued attributes in a table.
 For example: Consider a bike manufacturing company which produces two colors (Black and Red) in each model every year.
bike_model manuf_year color

M1001 2007 Black


M1001 2007 Red
M2012 2008 Black
M2012 2008 Red
M2222 2009 Black
M2222 2009 Red
 Here the columns manuf_year and color are independent of each other and dependent on bike_model.
 In this case these two columns are said to be multivalued dependent on bike_model. These dependencies can be represented like this:
 bike_model ->> manuf_year
 bike_model ->> color
4. Transitive dependency in DBMS

 A functional dependency is said to be transitive if it is indirectly formed by two functional dependencies.
 For example, X -> Z is a transitive dependency if the following three conditions hold true:
 X -> Y
 Y does not determine X (Y -/-> X)
 Y -> Z
 Note: A transitive dependency can only occur in a relation of three or more attributes. This dependency helps us in normalizing the database to 3NF (Third Normal Form).
Example: Let’s take an example to understand it better:



Book Author Author_age
Game of Thrones George R. R. Martin 66
Harry Potter J. K. Rowling 49
Dying of the Light George R. R. Martin 66
 {Book} -> {Author} (if we know the book, we know the author's name)
 {Author} does not determine {Book}
 {Author} -> {Author_age}
 Therefore, as per the rule of transitive dependency, {Book} -> {Author_age} should hold; that makes sense because if we know the book name we can know the author's age.

4. Normalization
 Normalization allows us to minimize insert, update, and delete anomalies and helps maintain data consistency in the database.
1. To avoid redundancy by storing each fact within the database only once
2. To put data into the form that is more able to accurately accommodate change
3. To avoid certain updating “anomalies”
4. To facilitate the enforcement of data constraints
5. To avoid unnecessary coding.
 Extra programming in triggers, stored procedures can be required to handle the non-
normalized data and this in turn can impair performance significantly.
 Here are the most commonly used normal forms:
 First normal form(1NF)
 Second normal form(2NF)
 Third normal form(3NF)
 Boyce & Codd normal form (BCNF)



First normal form (1NF)
 A table is in first normal form (1NF) if and only if all columns contain only
atomic values; that is, there are no repeating groups (columns) within a row.
 It is to be noted that all entries in a field must be of the same kind and each field must have a unique name, but the order of the fields (columns) is irrelevant.
 Each record must be unique and the order of the rows is irrelevant.
 As per the rule of first normal form, an attribute (column) of a table cannot hold
multiple values.
 It should hold only atomic values.
 Example: Suppose a company wants to store the names and contact details of its employees. It creates a table that looks like this:

emp_id emp_name emp_address emp_mobile

101 Herschel New Delhi 8912312390

102 Jon Kanpur 8812121212, 9900012222

103 Ron Chennai 7778881212

104 Lester Bangalore 9990000123, 8123450987



 Two employees (Jon & Lester) have two mobile numbers each, so the company stored them in the same field as you can see in the table above.
 This table is not in 1NF as the rule says “each attribute of a table must have atomic (single) values”; the emp_mobile values for employees Jon & Lester violate that rule.
 To make the table comply with 1NF we should have the data like this:
emp_id emp_name emp_address emp_mobile

101 Herschel New Delhi 8912312390

102 Jon Kanpur 8812121212

102 Jon Kanpur 9900012222

103 Ron Chennai 7778881212

104 Lester Bangalore 9990000123

104 Lester Bangalore 8123450987
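One possible way to declare the 1NF version of this table is sketched below; the composite primary key (emp_id, emp_mobile) allows one row per employee per mobile number while keeping every column atomic (column types are assumed for illustration):

    CREATE TABLE employee (
        emp_id      INT          NOT NULL,
        emp_name    VARCHAR(50)  NOT NULL,
        emp_address VARCHAR(100) NOT NULL,
        emp_mobile  VARCHAR(15)  NOT NULL,
        PRIMARY KEY (emp_id, emp_mobile)   -- one row per employee per number
    );

    INSERT INTO employee VALUES (102, 'Jon', 'Kanpur', '8812121212');
    INSERT INTO employee VALUES (102, 'Jon', 'Kanpur', '9900012222');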

Second normal form (2NF)

A table is in second normal form (2NF) if and only if it is in 1NF and every non-key attribute is
fully dependent on the primary key.

A table is said to be in 2NF if both the following conditions hold:


 Table is in 1NF (First normal form)
 No non-prime attribute is dependent on the proper subset of any candidate key of
table. An attribute that is not part of any candidate key is known as non-prime attribute.

Example: Suppose a school wants to store the data of teachers and the subjects they teach. They create a table that looks like this. Since a teacher can teach more than one subject, the table can have multiple rows for the same teacher.



teacher_id subject teacher_age

111 Maths 38

111 Physics 38

222 Biology 38

333 Physics 40

333 Chemistry 40
Candidate Keys: {teacher_id, subject}
Non prime attribute: teacher_age
The table is in 1NF because each attribute has atomic values. However, it is not in 2NF because the non-prime attribute teacher_age is dependent on teacher_id alone, which is a proper subset of the candidate key.
This violates the rule for 2NF, as the rule says “no non-prime attribute is dependent on the proper subset of any candidate key of the table”.
To make the table comply with 2NF, we can break it into two tables:
Teacher details table:

teacher_id teacher_age

111 38

222 38

333 40
Teacher subject table:

teacher_id subject

111 Maths

111 Physics

222 Biology

333 Physics

333 Chemistry



Now the tables comply with Second normal form (2NF).
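A sketch of the decomposed schema, assuming the table names shown above and simple column types; teacher_age now depends on the whole key of its own table, and the subject table references the teacher table:

    CREATE TABLE teacher_details (
        teacher_id  INT PRIMARY KEY,
        teacher_age INT
    );

    CREATE TABLE teacher_subject (
        teacher_id INT REFERENCES teacher_details(teacher_id),
        subject    VARCHAR(30),
        PRIMARY KEY (teacher_id, subject)   -- one row per teacher per subject
    );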

Third Normal form (3NF)


To be in Third Normal Form (3NF) the relation must be in 2NF and no transitive dependencies
may exist within the relation. A transitive dependency is when an attribute is indirectly
functionally dependent on the key (that is, the dependency is through another non-key attribute).
A table design is said to be in 3NF if both the following conditions hold:
 Table must be in 2NF
 Transitive functional dependency of non-prime attribute on any super key should be
removed.
An attribute that is not part of any candidate key is known as non-prime attribute.
In other words 3NF can be explained like this: A table is in 3NF if it is in 2NF and, for each functional dependency X -> Y, at least one of the following conditions holds:

 X is a super key of table


 Y is a prime attribute of table
An attribute that is a part of one of the candidate keys is known as prime attribute.

Example: Suppose a company wants to store the complete address of each employee. They create a table named employee_details that looks like this:

emp_id emp_name emp_zip emp_state emp_city emp_district

1001 John 282005 UP Agra Dayal Bagh

1002 Ajeet 222008 TN Chennai M-City

1006 Lora 282007 TN Chennai Urrapakkam

1101 Lilly 292008 UK Pauri Bhagwan

1201 Steve 222999 MP Gwalior Ratan

 Super keys: {emp_id}, {emp_id, emp_name}, {emp_id, emp_name, emp_zip}, and so on
 Candidate keys: {emp_id}
 Non-prime attributes: all attributes except emp_id are non-prime as they are not part of any candidate key.



 Here, emp_state, emp_city and emp_district are dependent on emp_zip, and emp_zip is dependent on emp_id. That makes the non-prime attributes (emp_state, emp_city and emp_district) transitively dependent on the super key (emp_id). This violates the rule of 3NF.
 To make this table comply with 3NF we have to break the table into two tables to remove the transitive dependency:
Employee table:

emp_id emp_name emp_zip

1001 John 282005

1002 Ajeet 222008

1006 Lora 282007

1101 Lilly 292008

1201 Steve 222999


Employee zip table:

emp_zip emp_state emp_city emp_district

282005 UP Agra Dayal Bagh

222008 TN Chennai M-City

282007 TN Chennai Urrapakkam

292008 UK Pauri Bhagwan

222999 MP Gwalior Ratan
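This decomposition is lossless: joining the two tables on emp_zip reproduces the original employee_details rows. A sketch of the reconstruction query (table names employee and employee_zip are assumed from the headings above):

    SELECT e.emp_id, e.emp_name, e.emp_zip,
           z.emp_state, z.emp_city, z.emp_district
    FROM   employee     AS e
    JOIN   employee_zip AS z ON z.emp_zip = e.emp_zip;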

Boyce Codd normal form (BCNF)


 It is an advanced version of 3NF, which is why it is also referred to as 3.5NF. BCNF is stricter than 3NF. A table complies with BCNF if it is in 3NF and, for every functional dependency X -> Y, X is a super key of the table.
 Example: Suppose there is a company wherein employees work in more than one
department. They store the data like this:



emp_id emp_nationality emp_dept dept_type dept_no_of_emp

1001 Austrian Production and planning D001 200

1001 Austrian Stores D001 250

1002 American Design and technical support D134 100

1002 American Purchasing department D134 600

Functional dependencies in the table above:


 emp_id -> emp_nationality
 emp_dept -> {dept_type, dept_no_of_emp}
 Candidate key: {emp_id, emp_dept}
 The table is not in BCNF as neither emp_id nor emp_dept alone are keys.
 To make the table comply with BCNF we can break the table in three tables
like this:
emp_nationality table:

emp_id emp_nationality

1001 Austrian

1002 American



emp_dept table:

emp_dept dept_type dept_no_of_emp

Production and planning D001 200

stores D001 250

design and technical support D134 100

Purchasing department D134 600

emp_dept_mapping table:

emp_id emp_dept

1001 Production and planning

1001 stores

1002 design and technical support

1002 Purchasing department
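A sketch of how these three tables could be declared, with each functional dependency's determinant becoming the key of its own table (column types are assumed for illustration):

    CREATE TABLE emp_nationality (
        emp_id          INT PRIMARY KEY,          -- emp_id -> emp_nationality
        emp_nationality VARCHAR(30)
    );

    CREATE TABLE emp_dept (
        emp_dept       VARCHAR(60) PRIMARY KEY,   -- emp_dept -> dept_type, dept_no_of_emp
        dept_type      CHAR(4),
        dept_no_of_emp INT
    );

    CREATE TABLE emp_dept_mapping (
        emp_id   INT         REFERENCES emp_nationality(emp_id),
        emp_dept VARCHAR(60) REFERENCES emp_dept(emp_dept),
        PRIMARY KEY (emp_id, emp_dept)
    );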

Functional dependencies in the decomposed tables:
 emp_id -> emp_nationality
 emp_dept -> {dept_type, dept_no_of_emp}
Candidate keys:
 For the first table: emp_id
 For the second table: emp_dept
 For the third table: {emp_id, emp_dept}
The decomposition is now in BCNF, since in both functional dependencies the left-hand side is a key.

Fifth Normal Form (5NF)
 The Fifth Normal Form concerns join dependencies that are obscure.
Domain/Key Normal Form (DK/NF)
 To be in Domain/Key Normal Form (DK/NF), every constraint on the relation must be a logical consequence of the definition of keys and domains.



5. TRANSACTION PROCESSING
 A transaction is a logical unit of work (comprising one or more SQL statements) performed on
the database to complete a common task and maintain data consistency.
 Transaction statements are closely related and perform interdependent actions. Each statement
performs part of the task, but all of them are required for the complete task.
 Transaction processing ensures that related data is added to or deleted from the database
simultaneously, thus preserving data integrity in your application.
 In transaction processing, data is not written to the database until a commit command is issued.
When this happens, data is permanently written to the database.
 For example, if a transaction comprises database operations to update two database tables,
either all updates are made to both tables, or no updates are made to either table.
 This condition guarantees that the data remains in a consistent state and the integrity of the data
is maintained.
 You see a consistent view of the database during a transaction. You do not see changes from
other users during a transaction.
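For example, a funds transfer that must update two tables together can be wrapped in a single transaction. A sketch follows; table names, column names, and account numbers are assumed, and the exact transaction syntax (BEGIN / START TRANSACTION) varies slightly between DBMSs:

    BEGIN TRANSACTION;

    UPDATE savings_account
    SET    balance = balance - 500
    WHERE  account_no = 1234;

    UPDATE current_account
    SET    balance = balance + 500
    WHERE  account_no = 5678;

    -- Both updates become permanent only now; if anything failed above,
    -- a ROLLBACK would undo the partial work instead.
    COMMIT;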

Key Notations in Transaction Management


 The key notations in transaction management are as follows:
 Object. The smallest data item which is read or updated by a transaction is called an object.
 Transaction. A transaction is represented by the symbol T. It is the execution of a query (or a set of queries) in the DBMS.
 Read Operation. A read operation on a particular object is denoted by R(object-name).
 Write Operation. A write operation on a particular object is denoted by W(object-name).
 Commit. This term is used to denote the successful completion of one transaction.
 Abort. This term is used to denote an unsuccessful, interrupted transaction.
 ACID Properties of DBMS
 Every transaction must preserve four properties, called the ACID properties.
 ACID is an acronym for Atomicity, Consistency, Isolation, and Durability.
 A – Atomicity
 C – Consistency
 I – Isolation
 D – Durability
 Atomicity and Durability are closely related.
 Consistency and Isolation are closely related.



Atomicity and Durability
Atomicity
 Atomicity means that the transaction cannot be subdivided; hence, it must be processed in its entirety or not at all.
 Users should not have to worry about the effect of incomplete transactions in case a system crash occurs.
 Transactions can be incomplete for three kinds of reasons:
 A transaction can be aborted, or terminated unsuccessfully. This happens due to anomalies that arise during execution. If a transaction is aborted by the DBMS for some internal reason, it is automatically restarted and executed as new.
 Due to a system crash. This may happen due to a power supply failure while one or more transactions are in execution.
 Due to unexpected situations. This may happen due to an unexpected data value or an inability to access some disk, so the transaction decides to abort (terminate itself).
Durability
 If the system crashes before the changes made by a completed transaction are written to disk, then those changes should be remembered and restored during the system restart phase.
Consistency
 Users are responsible for ensuring transaction consistency. The user who submits the transaction should make sure that the transaction will leave the database in a consistent state.

Isolation
 In a DBMS, many transactions may be executed simultaneously.
 These transactions should be isolated from each other. One transaction's execution should not affect the execution of the other transactions.
 To enforce this, the DBMS has to maintain certain scheduling algorithms. One of the scheduling methods used is serial scheduling.
Serial Scheduling
 In this scheduling method, transactions are executed one by one from start to finish, without interleaving their operations.
 The alternative to serial scheduling is interleaved execution, in which the operations of different transactions are interleaved.

Types of Failures:
 Failures are generally classified as transaction, system, and media failures. There are several
possible reasons for a transaction to fail in the middle of execution:
1. A computer failure (system crash): A hardware, software, or network error occurs in the computer
system during transaction execution. Hardware crashes are usually media failures – for example, main
memory failure.
2. A transaction or system error: Some operations in the transaction may cause it to fail, such as
integer overflow or division by zero. Transaction failure may also occur because of erroneous
parameter values or because of a logical programming error.



3. Local errors or exception conditions detected by the transaction: During transaction execution,
certain conditions may occur that necessitate cancellation of the transaction. For example, data for the
transaction may not be found, insufficient balance in bank account, etc.
4. Concurrency control environment: The concurrency control method may decide to abort the
transaction, to be restarted later, because several transactions are in a state of deadlock.
5. Disk failure: Some disk blocks may lose their data because of a read or write malfunction or because of a disk read/write head crash.
6. Physical problems and catastrophes: This refers to an endless list of problems that includes power
or air-conditioning failure, fire, theft, overwriting disks or tapes by mistake, etc.

Transaction States:
 A transaction is an atomic unit of work that is either completed in its entirety or not done at all.
 For recovery purposes, the system needs to keep track of when the transaction starts,
terminates, and commits or aborts.
 Therefore, the recovery manager keeps track of the following operations:
1. Begin transaction: This marks the beginning of transaction execution.
2. Read or write: These specify read or write operations on the database items that are executed
as part of a transaction.
3. End transaction: This specifies that read and write transaction operations have ended and
marks the end of transaction execution.
4. Commit transaction: This signals a successful end of the transaction so that any changes
(updates) executed by the transaction can be safely committed to the database and will not be
undone.
5. Rollback (or abort): This signals that the transaction has ended unsuccessfully; so that any
changes or effects that the transaction may have applied to the database must be undone.

Types of locks:
 Several types of locks are used in concurrency control such as binary locks and
shared/exclusive locks.
 Binary Locks: A binary lock can have two states or values: locked and unlocked (or 1 and 0, for simplicity). A distinct lock is associated with each
database item X. If the value of the lock on X is 1, item X cannot be accessed by a database
operation that requests the item. If the value of the lock on X is 0, the item can be accessed



when requested. We refer to the current value (or state) of the lock associated with item X as
lock(X).
 Two operations, lock_item and unlock_item, are used with binary locking.
 Lock_item(X): A transaction requests access to an item X by first issuing a lock_item(X)
operation. If LOCK(X) = 1, the transaction is forced to wait. If LOCK(X) = 0, it is set to 1
(the transaction locks the item) and the transaction is allowed to access item X.
 Unlock_item (X): When the transaction is through using the item, it issues an
unlock_item(X) operation, which sets LOCK(X) to 0 (unlocks the item) so that X may be
accessed by other transactions. Hence, a binary lock enforces mutual exclusion on the data
item; i.e., at a time only one transaction can hold a lock.
Shared/Exclusive (or Read/Write) Lock:
Shared lock:
 These locks are referred to as read locks. If a transaction T has obtained a shared lock on data item X, then T can read X but cannot write X. Multiple shared locks can be placed simultaneously on a data item.

Deadlocks:
 A deadlock is a condition in which two (or more) transactions in a set are waiting
simultaneously for locks held by some other transaction in the set.
 Neither transaction can continue because each transaction in the set is on a waiting queue,
waiting for one of the other transactions in the set to release the lock on an item.
 Thus, a deadlock is an impasse that may result when two or more transactions are each waiting
for locks to be released that are held by the other.
 Transactions whose lock requests have been refused are queued until the lock can be granted.
 A deadlock is also called a circular waiting condition where two transactions are waiting
(directly or indirectly) for each other.
 Thus in a deadlock, two transactions are mutually excluded from accessing the next record
required to complete their transactions.
 Example: A deadlock exists when two transactions A and B behave as in the following example:
Transaction A: access data items X and Y
Transaction B: access data items Y and X
Here, Transaction A has acquired a lock on X and is waiting to acquire a lock on Y, while Transaction B has acquired a lock on Y and is waiting to acquire a lock on X. Neither of them can execute further.
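A sketch of how such a deadlock can arise with row-level locks, written as two interleaved sessions; the table name, column name, and key values are assumed, and SELECT ... FOR UPDATE (which acquires an exclusive row lock in most DBMSs) is used for illustration:

    -- Session A:
    BEGIN TRANSACTION;
    SELECT * FROM item WHERE item_id = 'X' FOR UPDATE;   -- A locks X

    -- Session B:
    BEGIN TRANSACTION;
    SELECT * FROM item WHERE item_id = 'Y' FOR UPDATE;   -- B locks Y

    -- Session A:
    SELECT * FROM item WHERE item_id = 'Y' FOR UPDATE;   -- A now waits for B

    -- Session B:
    SELECT * FROM item WHERE item_id = 'X' FOR UPDATE;   -- B now waits for A: deadlock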
 Deadlock Detection and Prevention:
 Deadlock detection:
 This technique allows deadlock to occur, but then, it detects it and solves it. Here, a
database is periodically checked for deadlocks.
 If a deadlock is detected, one of the transactions, involved in deadlock cycle, is aborted.
Other transactions continue their execution.
 An aborted transaction is rolled back and restarted.



 Deadlock Prevention:
 Deadlock prevention technique avoids the conditions that lead to deadlocking. It
requires that every transaction lock all data items it needs in advance. If any of the items
cannot be obtained, none of the items are locked.
 In other words, a transaction requesting a new lock is aborted if there is the possibility
that a deadlock can occur. Thus, a timeout may be used to abort transactions that have
been idle for too long.
 This is a simple but indiscriminate approach. If the transaction is aborted, all the
changes made by this transaction are rolled back and all locks obtained by the
transaction are released.
 The transaction is then rescheduled for execution. Deadlock prevention technique is
used in two-phase locking.
 Time-Stamp Methods for Concurrency control:
 Timestamp is a unique identifier created by the DBMS to identify the relative starting
time of transaction.
 Typically, timestamp values are assigned in the order in which the transactions are
submitted to the system.
 So, a timestamp can be thought of as the transaction start time.
 Therefore, time stamping is a method of concurrency control in which each transaction
is assigned a transaction timestamp.



 Exclusive lock: These locks are referred to as write locks. If a transaction T has obtained an exclusive lock on data item X, then T can read as well as write X. Only one exclusive lock can be placed on a data item at a time. This means that a single transaction exclusively holds the lock on the item.
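Most SQL dialects expose these two lock modes through statements such as the following sketch; the table name and account number are assumed, and the exact syntax differs between systems (PostgreSQL/Oracle-style shown):

    -- Shared (read) lock on the whole table: other readers are allowed,
    -- writers must wait until the transaction ends.
    LOCK TABLE account IN SHARE MODE;

    -- Exclusive (write) lock on selected rows: no other transaction may
    -- lock or modify these rows until the transaction ends.
    SELECT balance
    FROM   account
    WHERE  account_no = 1234
    FOR UPDATE;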
 Two-Phase Locking (2PL): A transaction is said to follow the two-phase locking protocol if
all locking operations (read_lock, write_lock) precede the first unlock operation in the
transaction. Such a transaction can be divided into two phases: an expanding or growing (first)
phase, during which new locks on items can be acquired but none can be released; and a
shrinking (second) phase, during which existing locks can be released but no new locks can be
acquired.

6. Database Security
 Security refers to the protection of data against unauthorized disclosure, alteration, or
destruction; integrity refers to the accuracy or validity of that data.
 To put it a little glibly:
– Security means protecting the data against unauthorized users.
– Integrity means protecting the data against authorized users.

 The database security system stores authorization rules and enforces them for each database
access.
 The authorization rules define authorized users, allowable operations, and accessible parts of a
database.
 When a group of users access the data in the database, then privileges can be assigned to
groups rather than individual users. Users are assigned to groups and given passwords.

Need for Database Security


 The need for database security is given below:
 In the case of shared data, multiple users try to access the data at the same time. In order to
maintain the consistency of the data in the database, database security is needed.
 Due to the advancement of the Internet, data are accessed through the World Wide Web; to protect the data against hackers, database security is needed.
 Plastic money (credit cards) has become popular, and money transactions have to be safe.
 More specialized software is available, both to enter the system illegally to extract data and to analyze the information obtained.
 Hence, it is necessary to protect the data/money.
General Considerations
 There are numerous aspects to the security problem, some of them are:
– Legal, social, and ethical aspects
– Physical controls
– Policy questions
– Operational problems



– Hardware control
– Operating system support
– Issues that are the specific concern of the database system itself
 There are two broad approaches to data security. The approaches are known as discretionary
and mandatory control, respectively.
 In both cases, the unit of data or “data object” that might need to be protected can range all the
way from an entire database on the one hand to a specific component within a specific tuple on
the other.

Database Security Goals and Threats


 Some of the goals and threats of database security are given below:
 Goal. Confidentiality (secrecy or privacy). Data are only accessible (read-type access) by authorized subjects (users or processes).
 Threat. Improper release of information caused by reading of data through intentional or
accidental access by improper users.
 This includes inferring of unauthorized data from authorized observations from data.
 Goal. To ensure data integrity which means data can only be modified by authorized subjects.
 Threat. Improper handling or modification of data.
 Goal. Availability (denial of service). Data are accessible to authorized subjects.
 Threat. Action could prevent subjects from accessing data for which they are authorized.

Classification of Database Security


 The database security can be broadly classified into physical and logical security.
 Database recovery refers to the process of restoring the database to a correct state in the event of a failure.
 Physical security. Physical security refers to the security of the hardware associated with the
system and the protection of the site where the computer resides. Natural events such as fire,
floods, and earthquakes can be considered as some of the physical threats. It is advisable to
have backup copies of databases in the face of massive disasters.
 Logical security. Logical security refers to the security measures residing in the operating
system or the DBMS designed to handle threats to the data. Logical security is far more
difficult to accomplish.



Database Security at Design Level
 It is necessary to take care of the database security at the stage of database design. Few
guidelines to build the most secure system are:
 The database design should be simple. If the database is simple and easy to use, the possibility of the data being corrupted by an authorized user is lower.
 The database has to be normalized. The normalized database is almost free from update
anomalies. It is harder to impose normalization on the relations after the database is in use.
 Hence, it is necessary to normalize the database at the design stage itself.
 The designer of the database should decide the privileges for each group of users.
 If no privileges are assumed by any user, there is less likelihood that a user will be able to gain
illegal access.
 Create a unique view for each user or group of users. Although views promote security by restricting user access to data, they are not adequate security measures on their own, because unauthorized persons may gain knowledge of or access to a particular view.
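A sketch of this idea, assuming an employee table with a salary column and a group (role) named clerk_group: the group is given access only through a view that hides salaries.

    CREATE VIEW emp_public AS
        SELECT emp_id, emp_name, emp_dept
        FROM   employee;            -- the salary column is deliberately excluded

    GRANT SELECT ON emp_public TO clerk_group;   -- no direct grant on employee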

Database Security at the Maintenance Level


 Once the database is designed, the database administrator is playing a crucial role in the
maintenance of the database. The security issues with respect to maintenance can be classified
into:
1. Operating system issues and availability
2. Confidentiality and accountability through authorization rules
3. Encryption
4. Authentication schemes
(1) Operating System Issues and Availability
 The system administrator normally takes care of operating system security. The database administrator plays a key role in the physical security issues.
 The operating system should verify that users and application programs attempting to access
the system are authorized.
 Accounts and passwords for the entire database system are handled by the database
administrator.
(2) Confidentiality and Accountability
 Accountability means that the system does not allow illegal entry. Accountability is related to
both prevention and detection of illegal actions.
 Accountability is assured by monitoring the authentication and authorization of users.
 Authorization rules are controls incorporated in the data management system that restrict
access to data and also restrict the actions that people may take when they access data.
 Authentication can be carried out by the operating system level or by the relational database
management system (RDBMS).
 In either case, the system administrator or the database administrator creates an individual account or username for every user. In addition to these accounts, users are also assigned passwords.



Authentication Schemes
 Authentication schemes are the mechanisms that determine whether a user is who he or she
claims to be. Authentication can be carried out at the operating system level or by the RDBMS.
 The database administrator creates for every user an individual account or user name. In
addition to these accounts, users are also assigned passwords.
 A password is a sequence of characters, numbers, or a combination of both which is known
only to the system and its legitimate user.
 Since the password is the first line of defense against unauthorized use by outsiders, it needs to
be kept confidential by its legitimate user.
 It is highly recommended that users change their passwords frequently.
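Account and password creation is usually done with DBMS-specific statements; a sketch in Oracle-style SQL is shown below, where the user name and password are placeholders:

    CREATE USER staff_user IDENTIFIED BY "Str0ngPass";   -- create the account
    GRANT CREATE SESSION TO staff_user;                  -- allow the user to log in
    ALTER USER staff_user PASSWORD EXPIRE;               -- force a change at first login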

Database Security through Access Control


 A database for an enterprise contains a great deal of information and usually has several groups
of users.
 Most users need to access only the small portion of the database which is allocated to them. Allowing users unrestricted access to all the data can be undesirable, and a DBMS should provide mechanisms to control access to the data.
 In particular, access control is a way to control the data accessible by a given user.
 Two main mechanisms of access control at the DBMS level are:
– Discretionary access control
– Mandatory access control

Discretionary Access Control


 Discretionary access control regulates all user access to named objects through privileges. It is based on the concept of access rights or privileges for objects (tables and views), and mechanisms for giving users privileges (and revoking them).
 A privilege allows a user to access some data object in a certain manner (e.g., to read or to modify it).
 The creator of a table or a view automatically gets all privileges on it.
 DBMS keeps track of who subsequently gains and loses privileges, and ensures that only
requests from users who have the necessary privileges (at the time the request is issued) are
allowed.
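In SQL, discretionary privileges are managed with GRANT and REVOKE; a brief sketch, in which the table, user, and role names are assumed:

    -- Give one user read access, and a role read/update access that it may pass on.
    GRANT SELECT ON employee TO priya;
    GRANT SELECT, UPDATE ON employee TO hr_role WITH GRANT OPTION;

    -- Take a privilege back later.
    REVOKE UPDATE ON employee FROM hr_role;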
Mandatory Access Control
 It is based on system-wide policies that cannot be changed by individual users. In this approach, each DB object is assigned a security class.
 Each subject (user or user program) is assigned a clearance for a security class.
 Rules based on security classes and clearances govern who can read/write which objects.
 Most commercial systems do not support mandatory access control.
 Versions of some DBMSs do support it; used for specialized (e.g., military) applications.
 Mandatory controls are applicable to databases in which the data have a rather static and rigid
classification structure, as might be the case in certain military or government environments.



Discretionary Protection
 Class C is divided into two subclasses C1 and C2 (where C1 is less secure than C2); each supports discretionary controls, meaning that access is subject to the discretion of the data owner. In addition:
1. Class C1 distinguishes between ownership and access, i.e., it supports the concept of
shared data, while allowing users to have private data of their own as well.
2. Class C2 additionally requires accountability support through sign-on procedures,
auditing, and resource isolation.
Mandatory Protection
 Class B is the class that deals with mandatory controls. It is further divided into subclasses B1, B2, and B3, as follows:
1. Class B1 requires “labeled security protection” (i.e., it requires each data object to be
labeled with its classification level – secret, confidential, etc.). It also requires an informal
statement of the security policy in effect.
2. Class B2 additionally requires a formal statement of the same thing. It also requires that
covert channels be identified and eliminated.
An example of a covert channel might be the possibility of inferring the answer to an illegal query from the answer to a legal one.
3. Class B3 specifically requires audit and recovery support as well as a designated security
administrator.
Verified Protection
 Class A, the most secure, requires a mathematical proof that the security mechanism is
consistent and that it is adequate to support the specified security policy.
 Several commercial DBMS products currently provide mandatory controls at the B1 level.
They also typically provide discretionary controls at the C2 level.

Terminology: DBMS’s that support mandatory controls are sometimes called multilevel secure
systems. The term trusted system is also used with much the same meaning.
