Database Management System 1

This document provides an introduction to database systems. It discusses what a database and DBMS are, the limitations of file processing systems, different types of databases, and advantages of DBMS over file systems. Specifically, it covers centralized and distributed databases, single-user and multi-user databases, and how DBMS addresses issues like redundancy, data sharing, security, and data integrity. It also introduces the concept of data modeling and different data models like hierarchical, network, relational, and entity-relationship models.

Uploaded by

Gourav

Database Management System 1
Introduction to Database System

Chittaranjan Pradhan
School of Computer Engineering,
KIIT University

Outline:
• Data & Information
• Limitations of File-Processing Systems
• Database
• DBMS
• Database Types
• Advantages of DBMS over File System
Data & Information

• Data
  • Raw facts, unprocessed facts
  • Refers to what is actually stored
• Information
  • Result of processing raw data
  • Refers to meaning of the data, understood by the user

Data management focuses on the generation, storage & retrieval of data
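The distinction between data and information can be illustrated with a minimal sketch. The marks are made-up values and all names are hypothetical; the point is only that stored facts become information once they are processed.

```python
from statistics import mean

# Data: raw, unprocessed facts as they might be stored (hypothetical exam marks).
raw_marks = [
    ("Asha", 72),
    ("Ravi", 58),
    ("Meera", 91),
]

# Information: the result of processing the raw data into something
# that carries meaning for the user.
class_average = mean(mark for _, mark in raw_marks)
top_student = max(raw_marks, key=lambda row: row[1])[0]

print(f"Class average: {class_average:.1f}")
print(f"Top scorer: {top_student}")
```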
Limitations of File-Processing Systems

• Redundancy problem
  • Repetitive data
• Data-inconsistency problem
  • Incorrectness of data
• Lack of data integration
  • Complex and time consuming
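A rough sketch of how redundancy leads to inconsistency in a file-processing setting. The two "files" are plain Python lists standing in for separate departmental files, and the records and the helper name find_inconsistencies are hypothetical.

```python
# Two departments keep their own "files", so the same customer's address
# is stored twice (redundancy). The records are hypothetical.
billing_file = [{"customer": "C101", "address": "12 Park Road"}]
shipping_file = [{"customer": "C101", "address": "12 Park Road"}]

# The customer moves, but only the billing file is updated.
billing_file[0]["address"] = "7 Lake View"


def find_inconsistencies(file_a, file_b):
    """Return customers whose address differs between the two files."""
    addresses_b = {rec["customer"]: rec["address"] for rec in file_b}
    return [
        rec["customer"]
        for rec in file_a
        if rec["customer"] in addresses_b
        and addresses_b[rec["customer"]] != rec["address"]
    ]


inconsistent = find_inconsistencies(billing_file, shipping_file)
print(inconsistent)
```

Because nothing ties the two copies together, the mismatch has to be detected by extra application code, which is the kind of effort a DBMS is meant to remove.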
Database

• A database is a collection of interrelated data
• A database is a shared, integrated computer structure that stores:
  • End-user data: raw facts of interest to the end user
  • Metadata: data through which the end-user data are integrated and managed. The metadata provide a description of the data characteristics and the set of relationships that link the data found within the database
• A database is an organized collection of data of an organization or enterprise
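As an illustration of end-user data versus metadata, the sketch below uses SQLite through Python's standard sqlite3 module. SQLite is chosen here only for convenience and is not named in the slides; the student table is hypothetical. SQLite keeps its schema descriptions in a catalog table called sqlite_master.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("INSERT INTO student VALUES (1, 'Asha')")

# End-user data: the raw facts of interest to the end user.
end_user_data = conn.execute("SELECT roll_no, name FROM student").fetchall()

# Metadata: the description of the data, stored by the DBMS itself.
metadata = conn.execute(
    "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(end_user_data)
print(metadata)
```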
DBMS

• A DBMS (Database Management System) is a collection of programs that manages the database structure and controls access to the data stored in the database
• It includes tools to add, modify or delete data from the database, ask questions (or queries) about the data stored in the database, and produce reports
• The DBMS serves as the intermediary between the user and the database
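The operations listed above (add, modify, delete, query, report) might look as follows against a small DBMS. This is a sketch using SQLite via Python's sqlite3 module, which is an assumption of the example and not the system the slides have in mind; the book table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, copies INTEGER)"
)

# Add data.
conn.executemany(
    "INSERT INTO book (id, title, copies) VALUES (?, ?, ?)",
    [(1, "Database Basics", 3), (2, "Data Models", 1), (3, "Old Manual", 0)],
)

# Modify data.
conn.execute("UPDATE book SET copies = copies + 2 WHERE id = 2")

# Delete data.
conn.execute("DELETE FROM book WHERE copies = 0")

# Ask a question (query) about the stored data.
available = conn.execute(
    "SELECT title, copies FROM book ORDER BY id"
).fetchall()

# Produce a one-line report.
total_copies = conn.execute("SELECT SUM(copies) FROM book").fetchone()[0]
print(available, total_copies)
```

The application never touches the storage file directly; every request goes through the DBMS, which acts as the intermediary.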
Database Types

• Depending on the number of users accessing the database, a database system may be classified as:
  • Single-user database system: It supports only one user at a time. When a single-user database runs on a personal computer, it is also called a desktop database system
  • Multi-user database system: It supports multiple users at the same time. When a multi-user database supports a relatively small number of users, it is called a workgroup database system. If the database is used by many users across the globe, it is known as an enterprise database system
• Depending on the location of the database, a database system may be classified as:
  • Centralized database system: It supports data located at a single site
  • Distributed database system: It supports data distributed across several different sites. Here, the same database can be replicated and stored on another computer so that whenever the original server goes down, the data remain available to the user from the replicas on other servers
Advantages of DBMS over File System

• Controlling Redundancy & Inconsistency
• Allowing Data Sharing
• Restricting Unauthorized Access
• Providing Storage Structures for efficient query processing
• Providing Backup & Recovery
• Providing multiple user interfaces
• Enforcing Integrity Constraints
• Solving data isolation
• Providing economies of scale
Database Management System 2
Data Model

Chittaranjan Pradhan
School of Computer Engineering,
KIIT University

Outline:
• Data Model
• Data Model Basic Building Blocks
• Hierarchical Model
• Network Model
• Relational Model
• Entity-Relationship (ER) Model
• Object-Oriented (OO) Model
• Object-Relational (OR) Model
• Semi-structured Model
Data Model

A data model is a collection of conceptual tools for describing data, data relationships, data semantics and consistency constraints. That is, a data model provides a way to describe the design of a database

• It is a relatively simple representation, usually graphical, of complex real-world data structures
• Data modeling is considered the most important part of the database design process
Data Model Basic Building Blocks

Entity
An entity is anything about which data are to be collected and stored. An entity represents a particular type of object in the real world

Entity Set
A set of entities of the same type that share the same properties is called an entity set

Attribute
An attribute is a characteristic of an entity

Constraints
A constraint is a restriction placed on the data. Constraints are important because they help to ensure data integrity
Data Model Basic Building Blocks...

Relationship
A relationship describes an association among entities. The different types of relationship are:

• One-to-One (1:1) Relationship
• One-to-Many (1:M) Relationship
• Many-to-Many (M:N) Relationship
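One common way to realise the three relationship types is with keys in relational tables. The sketch below does this in SQLite via Python's sqlite3 module. The choice of SQLite and every table and column name are assumptions of this example, not something taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT);

-- 1:M: one department has many employees; each employee has one department.
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT,
    dept_id INTEGER REFERENCES department(dept_id)
);

-- 1:1: a department has at most one head (dept_id is the key) and an
-- employee heads at most one department (emp_id is UNIQUE).
CREATE TABLE dept_head (
    dept_id INTEGER PRIMARY KEY REFERENCES department(dept_id),
    emp_id  INTEGER UNIQUE REFERENCES employee(emp_id)
);

-- M:N: employees work on many projects and projects have many employees,
-- so the relationship gets its own table.
CREATE TABLE project (proj_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE works_on (
    emp_id  INTEGER REFERENCES employee(emp_id),
    proj_id INTEGER REFERENCES project(proj_id),
    PRIMARY KEY (emp_id, proj_id)
);

INSERT INTO department VALUES (1, 'Sales'), (2, 'Research');
INSERT INTO employee VALUES (10, 'Asha', 1), (11, 'Ravi', 1);
INSERT INTO project VALUES (100, 'Audit'), (101, 'Launch');
INSERT INTO works_on VALUES (10, 100), (10, 101), (11, 100);
INSERT INTO dept_head VALUES (1, 10);
""")


def count(sql):
    return conn.execute(sql).fetchone()[0]


employees_in_sales = count("SELECT COUNT(*) FROM employee WHERE dept_id = 1")
projects_of_asha = count("SELECT COUNT(*) FROM works_on WHERE emp_id = 10")
staff_on_audit = count("SELECT COUNT(*) FROM works_on WHERE proj_id = 100")

# The 1:1 rule rejects a second headship for the same employee.
try:
    conn.execute("INSERT INTO dept_head VALUES (2, 10)")
    second_headship_allowed = True
except sqlite3.IntegrityError:
    second_headship_allowed = False

print(employees_in_sales, projects_of_asha, staff_on_audit, second_headship_allowed)
```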
Hierarchical Model

The hierarchical model was developed in the 1960s to manage large amounts of data for complex manufacturing projects. Its basic logical structure is represented by an upside-down tree. The hierarchical structure contains levels of segments. It depicts a set of 1:M relationships between a parent and its children segments
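The upside-down tree can be sketched in a few lines. The Segment class and the segment names below are hypothetical and only mimic the parent/child structure; this is not how any particular hierarchical DBMS stores its data.

```python
class Segment:
    """A node in the hierarchy: exactly one parent, any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []

    def add_child(self, name):
        child = Segment(name, parent=self)
        self.children.append(child)
        return child

    def path(self):
        """Walk from this segment up to the root and report the path."""
        node, parts = self, []
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return " > ".join(reversed(parts))


root = Segment("Final assembly")
component_a = root.add_child("Component A")
component_b = root.add_child("Component B")
part_a1 = component_a.add_child("Part A1")

print(part_a1.path())
```

Each child stores a single parent reference, so a 1:M relationship is natural here while an M:N relationship has no direct representation.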
Hierarchical Model...

• Advantages:
  • Efficient storage for data that have a clear hierarchy
  • Parent/child relationship promotes conceptual simplicity & data integrity
  • It is efficient with 1:M relationships
  • It promotes data sharing
• Disadvantages:
  • It is complex to implement
  • It is difficult to manage
  • There are implementation limitations: it cannot represent M:N relationships
  • There is no DDL or DML
  • There is a lack of standards
Network Model

The network model was created to represent complex data relationships more effectively than the hierarchical model, to improve database performance, and to impose a database standard. A user perceives the network model as a collection of records in 1:M relationships
Network Model...

• Advantages:
  • It represents complex data relationships better than hierarchical models
  • It handles more relationship types, such as M:N and multi-parent
  • Data access is more flexible than in the hierarchical model
  • Improved database performance
  • It includes DDL and DML
• Disadvantages:
  • System complexity limits efficiency
  • Navigational system yields complex implementation and management
  • Structural changes require changes in all application programs
  • Database contains a complex array of pointers that thread through a set of records
  • It puts heavy pressure on programmers due to the complex structure
  • Networks can become chaotic unless planned carefully
Relational Model

The relational model was introduced by E. F. Codd in 1970. This data model is implemented through a relational database management system (RDBMS), which is easier to understand and implement than the hierarchical and network systems that preceded it

The most important advantage of the RDBMS is its ability to hide the complexities of the relational model from the user. Another reason for the relational data model’s rise to dominance is its powerful and flexible query language. Generally, SQL is used for this purpose
Relational Model...

• Advantages:
  • Changes in a table’s structure do not affect data access or application programs
  • Tabular view substantially improves conceptual simplicity, thereby promoting easier database design, implementation, management and use
  • Referential integrity controls help ensure data consistency
  • RDBMS isolates the end users from physical-level details and improves implementation and management simplicity
• Disadvantages:
  • Conceptual simplicity gives relatively untrained people the tools to use a good system poorly
  • It may promote islands-of-information problems, as individuals and departments can easily develop their own applications
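Referential integrity and a simple SQL query can be sketched with SQLite through Python's sqlite3 module. SQLite is used here only as a convenient RDBMS; the customer and invoice tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute(
    "CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """CREATE TABLE invoice (
           inv_id  INTEGER PRIMARY KEY,
           cust_id INTEGER NOT NULL REFERENCES customer(cust_id),
           amount  REAL NOT NULL
       )"""
)
conn.execute("INSERT INTO customer VALUES (1, 'Asha')")
conn.execute("INSERT INTO invoice VALUES (500, 1, 250.0)")

# An invoice for a customer that does not exist violates referential integrity.
try:
    conn.execute("INSERT INTO invoice VALUES (501, 99, 80.0)")
    orphan_accepted = True
except sqlite3.IntegrityError:
    orphan_accepted = False

# The query states what is wanted; how the rows are located is left to the RDBMS.
rows = conn.execute(
    """SELECT c.name, i.amount
       FROM customer AS c
       JOIN invoice AS i ON i.cust_id = c.cust_id"""
).fetchall()

print(orphan_accepted, rows)
```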
Entity-Relationship (ER) Model

Peter Chen first introduced the ER data model in 1976. Its graphical representation of entities and their relationships in a database structure quickly became popular, and the ER model has become a widely accepted standard for data modeling

ER models are normally represented in an ER diagram
Entity-Relationship (ER) Model...

• Advantages:
  • Visual modeling yields exceptional conceptual simplicity
  • Visual representation makes it an effective communication tool
  • It is integrated with the dominant relational model
• Disadvantages:
  • There is limited constraint representation
  • There is limited relationship representation
  • There is no DML
  • Loss of information content when attributes are removed from entities to avoid crowded displays
Object-Oriented (OO) Model

In the object-oriented data model, both data and their relationships are contained in a single structure called an object. Like the relational model’s entity, an object is described by its factual content. But quite unlike an entity, an object includes information about relationships between the facts within the object, as well as information about its relationships with other objects

Attributes describe the properties of an object. Objects that share similar characteristics are grouped in classes. Thus, a class is a collection of similar objects with shared structure (attributes) and methods
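The ideas of object, class, attributes, methods and relationships held inside the object can be sketched with ordinary Python classes. This only illustrates the concepts and is not an object-oriented DBMS; all class and attribute names are hypothetical.

```python
class Course:
    def __init__(self, code, title):
        self.code = code
        self.title = title
        self.students = []  # relationship to other objects


class Person:
    def __init__(self, name):
        self.name = name  # attribute

    def describe(self):  # method shared by every object of the class
        return f"{self.name} ({type(self).__name__})"


class Student(Person):  # Student inherits structure and methods from Person
    def __init__(self, name, roll_no):
        super().__init__(name)
        self.roll_no = roll_no
        self.courses = []  # the object carries its own relationships

    def enroll(self, course):
        self.courses.append(course)
        course.students.append(self)


dbms = Course("CS301", "Database Systems")
asha = Student("Asha", 1)
asha.enroll(dbms)

print(asha.describe())
print([course.code for course in asha.courses])
```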
Object-Oriented (OO) Model...

• Advantages:
  • Semantic content is added
  • Support for complex objects
  • Visual representation includes semantic content
  • Inheritance promotes data integrity
• Disadvantages:
  • It is a complex navigational system
  • High system overheads slow transactions
  • Slow development of standards caused vendors to supply their own enhancements, thus eliminating a widely accepted standard
Object-Relational (OR) Model

The object-oriented data model is somewhat spherical in nature, allowing access to unique elements anywhere within a database structure with extremely high performance. But it performs extremely poorly when retrieving more than a single data item

The relational data model is best suited for retrieval of groups of data, but can also be used to access unique data items fairly efficiently

Thus, by combining the features of the relational data model and the object-oriented data model, the object-relational data model was created
Semi-structured Model

The semi-structured data model permits the specification of data where individual data items of the same type may have different sets of attributes. XML (Extensible Markup Language) is widely used to represent semi-structured data. It supports unstructured data
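A small XML fragment shows items of the same type carrying different sets of attributes. The sketch parses it with Python's standard xml.etree.ElementTree module; the library data are made up.

```python
import xml.etree.ElementTree as ET

# Two <book> items of the same type with different sets of fields.
document = """
<library>
  <book>
    <title>Database Basics</title>
    <author>A. Writer</author>
  </book>
  <book>
    <title>Data Models</title>
    <edition>2</edition>
    <isbn>000-0</isbn>
  </book>
</library>
""".strip()

root = ET.fromstring(document)
fields_per_book = [
    sorted(child.tag for child in book) for book in root.findall("book")
]
print(fields_per_book)
```

A relational table would need a fixed column list with NULLs for the missing fields, whereas each item here simply lists what it has.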
Database Management System 3
3-Level Abstraction of Database

Chittaranjan Pradhan
School of Computer Engineering,
KIIT University

Outline:
• 3-Level Abstraction of Database
• Mapping and Data Independence
• Database Users
3-Level Abstraction of Database

The goal of the ANSI/SPARC 3-level abstraction is to separate the user applications from the physical database. It deals with the data, the relationships between them and the different access methods implemented on the database. The logical design of a database is called a schema
3-Level Abstraction of Database...

External/View Level
The external level includes a number of external schemas or user views. Each external schema or user view describes the part of the database that a particular user group is interested in and hides the rest of the database from that user group

Conceptual Level
The conceptual level has a conceptual schema, which describes the structure of the whole database for a community of users. The conceptual schema hides the details of physical storage structures and concentrates on describing entities, data types, relationships and constraints

It represents the global view of the entire database. Thus, a database has only one conceptual schema
3-Level Abstraction of Database...

Internal Level
The internal level has an internal schema, which describes the physical storage structure of the database system. As with the conceptual schema, there is only one internal schema for a database. It is the level closest to physical storage

The internal schema not only defines the various stored record types, but also specifies which indices exist and how stored fields are represented
Mapping and Data Independence

In a database system based on the 3-level architecture, each user group refers only to its own external schema. The process of transforming requests and results between different levels is called mapping

Conceptual/Internal Mapping
It defines the correspondence between the conceptual view and the stored database. Physical Data Independence indicates that the internal schema can be changed without any change to the conceptual schema

External/Conceptual Mapping
It defines the correspondence between a particular external view and the conceptual view. Logical Data Independence indicates that the conceptual schema can be changed without affecting the existing external schemas
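The two kinds of data independence can be approximated in SQLite via Python's sqlite3 module. In this sketch, which rests on that assumption, a view plays the role of an external schema, the table definition stands for the conceptual schema, and an index stands for an internal-level detail. All names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, salary REAL);
INSERT INTO employee VALUES (1, 'Asha', 52000), (2, 'Ravi', 48000);

-- External schema: one user group sees only identifiers and names.
CREATE VIEW staff_directory AS SELECT emp_id, name FROM employee;
""")


def directory():
    """What the user group sees through its external view."""
    return conn.execute(
        "SELECT emp_id, name FROM staff_directory ORDER BY emp_id"
    ).fetchall()


before = directory()

# Internal-level change: add an index. The table definition is untouched
# (physical data independence).
conn.execute("CREATE INDEX idx_employee_name ON employee(name)")
after_internal_change = directory()

# Conceptual-level change: add a column. The existing external view is
# unaffected (logical data independence).
conn.execute("ALTER TABLE employee ADD COLUMN phone TEXT")
after_conceptual_change = directory()

print(before == after_internal_change == after_conceptual_change)
```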
Database Users

Different database users are:

Naive Users
They are the normal or unsophisticated users who interact with the system by invoking application programs that have been written previously. The typical user interface for naive users is a form interface, where the user can fill in appropriate fields of the form

Application Programmers
They are computer professionals who write application programs to access data from the database. Application programmers can use different tools to develop user interfaces
Database Users...

Sophisticated Users
They interact with the system without creating any application program. Rather, they form their requests in a database query language and submit each such query to a query processor. Analysts who submit queries to explore data in the database fall in this category

Specialized Users
They are sophisticated users who write specialized database applications that don’t fit into the traditional data processing framework

Database Administrator (DBA)
The person who has central control of the whole database system is called the DBA. The DBA coordinates all the activities of the database system
Database Users...

The roles of the DBA are:

• The DBA creates the original database schema by executing a set of DDL statements
• The DBA defines and controls the access methods for the different users
• The DBA carries out changes to the schema and physical organization to reflect the changing needs of the organization, or to alter the physical organization to improve performance
• By granting different types of authorization, the DBA can regulate which parts of the database various users can access
• The DBA specifies the different types of constraints on different tables or objects
• The DBA is responsible for periodically backing up the database
• The DBA ensures that enough free disk space is available for normal operations and upgrades disk space as required
• The DBA monitors the jobs running on the database and ensures that performance is not degraded by very expensive tasks submitted by some users
Database Management System 4
Database Architecture

Chittaranjan Pradhan
School of Computer Engineering,
KIIT University

Outline:
• Data Storage and Querying
  • Storage Manager
  • Query Processor
• Database Architecture
• Application Architecture
• Disadvantages of Database Processing
Data Storage and Querying

The functional components of a database system can be broadly divided into:

• The storage manager, which is important because databases typically require a large amount of storage space
• The query processor, which is important because it helps the database system simplify and facilitate access to data

The overall computer system consists of four modules: hardware, operating system, file management system and application program
Storage Manager

A storage manager is a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system. The storage manager is responsible for the interaction with the file manager, and for storing, retrieving and updating data in the database

The storage manager components include:

• Authorization and Integrity Manager: This module tests for the satisfaction of integrity constraints and checks the authority of users to access data
• Transaction Manager: The transaction manager ensures that the database remains in a consistent (correct) state despite system failures, and that concurrent transaction executions proceed without conflicting
• File Manager: This module manages the allocation of space on disk storage and the data structures used to represent information stored on the disk
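The consistency guarantee described for the transaction manager, together with an integrity constraint check, can be sketched with SQLite via Python's sqlite3 module. SQLite is an assumption of this example, and the account table and transfer function are hypothetical. A failed transfer is rolled back as a whole, so no half-finished update remains.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE account (
           acc_id  INTEGER PRIMARY KEY,
           balance INTEGER NOT NULL CHECK (balance >= 0)
       )"""
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move money in one transaction; undo everything if a constraint fails."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE acc_id = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE acc_id = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return conn.execute(
        "SELECT acc_id, balance FROM account ORDER BY acc_id"
    ).fetchall()


failed = transfer(conn, 1, 2, 500)   # would make account 1 negative
after_failure = balances(conn)
succeeded = transfer(conn, 1, 2, 30)
after_success = balances(conn)
print(failed, after_failure, succeeded, after_success)
```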
Storage Manager...

• Buffer Manager: The buffer manager is responsible for fetching data from disk storage into main memory. The buffer manager is a critical part of the database system

The storage manager implements several data structures as part of the physical system implementation:

• Data files: These are files in physical storage used to store the database itself
• Data Dictionary: The data dictionary stores the metadata (data about data) that provide information about the definitions of the data items and their relationships, authorizations, and usage statistics. In addition, any changes made to the physical structure of the database are automatically recorded in the data dictionary
• Indices: Indices are used to provide faster access to data items stored in physical storage
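The effect of an index on data access can be observed in SQLite through Python's sqlite3 module. SQLite is an assumption of this sketch and the table and index names are hypothetical. EXPLAIN QUERY PLAN is SQLite-specific, and the exact wording of its output differs between versions, so the example only prints the plans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [
        (i, f"Student {i}", "Bhubaneswar" if i % 2 else "Cuttack")
        for i in range(1, 1001)
    ],
)


def plan(sql):
    """Return SQLite's description of how it intends to run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[3] for row in rows)  # the fourth column holds the detail text


query = "SELECT roll_no FROM student WHERE name = 'Student 500'"

plan_without_index = plan(query)   # expected to scan the whole table
conn.execute("CREATE INDEX idx_student_name ON student(name)")
plan_with_index = plan(query)      # expected to search via the index

print(plan_without_index)
print(plan_with_index)
```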
Query Processor

The job of the query processor is to execute queries successfully

The major components of the query processor include:

• DDL Interpreter: This module interprets DDL statements and records the definitions in the data dictionary
• DML Compiler: The DML compiler translates the DML statements in a query language into an evaluation plan consisting of low-level instructions that the query evaluation engine understands. When a user wants to perform a DML operation, the data dictionary is checked to validate the request
• Query Evaluation Engine: This module executes the low-level instructions generated by the DML compiler
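One concrete, system-specific view of the "low-level instructions" idea: SQLite compiles each statement into a program for its internal virtual machine, and the EXPLAIN prefix lists that program. The sketch below, using Python's sqlite3 module, only illustrates the concept. Other systems use different plan formats, and the item table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, price REAL)")

# EXPLAIN returns one row per low-level instruction; the second column
# of each row is the instruction (opcode) name.
program = conn.execute("EXPLAIN SELECT id FROM item WHERE price > 10").fetchall()
opcodes = [row[1] for row in program]

print(len(opcodes), "instructions")
print(opcodes)
```

The list of opcodes varies with the SQLite version, so no particular output is shown here.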
Database Architecture

The overall database architecture is:

[Figure: overall database architecture]
Application Architecture

Client machines are those on which the remote database users work. Server machines are those on which the database system runs

• 2-Tier Architecture: Here, the application is partitioned into a component that resides at the client machine and invokes database system functionality at the server machine through a query language. The two tiers are the data server and the client application
• 3-Tier Architecture: Here, the client machine acts as a front end and doesn’t contain any direct database calls. The client end communicates with an application server, usually via a forms interface. The application server in turn communicates with a database system to access data. 3-tier architectures are more appropriate for large applications and for applications that run on the web
Application Architecture...

[Figure: application architecture diagram]
Disadvantages of Database Processing

The major disadvantages are:

• Larger file size: In order to support all the complex functions that it provides to users, a DBMS must be a large program that occupies a great amount of disk space as well as a substantial amount of internal memory
• Increased complexity: The complexity and breadth of the functions provided by a DBMS make it a complex product. Users of the database system must learn a great deal to understand the features of the system in order to take full advantage of it
• Greater impact of failure: If several users are sharing the same database, a failure on the part of any one user that damages the database in some way might affect all the other connected users
• More difficult recovery: The database must first be restored to the condition it was in when it was last known to be correct; then any updates made by users since that time must be redone. The greater the number of users involved in updating the database, the more complicated this task becomes
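The recovery procedure described in the last point (restore the last correct state, then redo later updates) can be sketched in a few lines. A dictionary stands in for the database and a list of (key, new value) pairs for the log of updates; both are simplifications made for this example, and the names are hypothetical.

```python
# Last state known to be correct (for example, from a backup).
backup = {"A": 100, "B": 50}

# Updates made by users since that backup, in the order they happened.
redo_log = [("A", 90), ("B", 60), ("A", 75)]


def recover(backup, redo_log):
    """Restore the backup, then redo every logged update in order."""
    restored = dict(backup)          # leave the backup itself untouched
    for key, new_value in redo_log:
        restored[key] = new_value
    return restored


recovered = recover(backup, redo_log)
print(recovered)
```

With many users the log interleaves their updates, which is why the slide notes that the task becomes more complicated as the number of users grows.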
Database Management System 5
ER Modeling

Chittaranjan Pradhan
School of Computer Engineering,
KIIT University

Outline:
• Overview of the Database Design Process
• Entity-Relationship (ER) Model
• Attribute Types
• Mapping Cardinality Representation
Overview of the Database Design Process

• The initial phase of database design is to characterize fully the data needs of the prospective database users. It usually involves a textual description
• Next, the designer chooses a data model and, by applying the concepts of the chosen data model, translates these requirements into a conceptual schema of the database. The ER model is typically used to represent the conceptual design
• The designer reviews the schema to confirm that all data requirements are satisfied and are not in conflict with one another
• At this stage of conceptual design, the designer can review the schema to ensure it meets all the functional requirements
Overview of the Database Design Process...

• The process of moving from an abstract data model to the implementation of the database proceeds in two final design phases:
  • In the logical design phase, the designer maps the high-level conceptual schema onto the implementation data model of the database system that will be used. The implementation data model is typically the relational data model
  • Finally, the designer uses the resulting system-specific database schema in the subsequent physical design phase, in which the physical features of the database are specified
Entity-Relationship (ER) Model

The ER model was developed to facilitate database design by allowing specification of an enterprise schema that represents the overall logical structure of a database. The ER model is very useful in mapping the meaning and interactions of real-world enterprises onto a conceptual schema

Entities
An entity is a thing or object in the real world that is distinguishable from all other objects, i.e. an entity is an object of interest to the end user. A set of entities of the same type is called an entity set, which is represented by a rectangle containing the entity set’s name. The entity set name, a noun, is usually written in all capital letters
ER Modeling
Entity-Relationship(ER) Model...
Chittaranjan Pradhan
Attributes
Attributes are characteristics of entities. Attributes are Overview of the
Database Design
represented by ovals and are connected to the respective Process

entity set with lines. In the conceptual modeling, the value of an Entity-
Relationship(ER)
attribute comes from a domain of possible values Model

Attribute Types

Relationships Mapping Cardinality


Representation

In modeling, the association between entities are referred to as


relationship. The relationship name is a verb. A relationship set
is a set of relationships of the same type. Relationship sets are
represented by diamonds and are connected to the participant
entity sets

NULL Values
An attribute takes a NULL value when an entity doesn’t have a
value for it. A NULL value may indicate not applicable, i.e.
the value doesn’t exist for the entity. NULL can also designate
that an attribute value is unknown. An unknown value may be
either missing (the value exists but has not been recorded) or
not known (it is not known whether the value exists). NULL is a
member of every domain
Attribute Types

Simple and Composite attributes
An attribute that has a discrete factual value and cannot be
meaningfully subdivided is called an atomic or simple
attribute. On the other hand, a composite attribute can be
meaningfully subdivided into smaller subparts (i.e. simple
attributes) with independent meaning

Single-valued and Multi-valued attributes
Most attributes have a single value for a particular entity and
are referred to as single-valued attributes. However, attributes
that can have more than one value are known as multi-valued
attributes. A multi-valued attribute is represented by a double oval

Stored and Derived attributes
An attribute with independent existence is called a stored
attribute, whereas an attribute whose value depends on other
stored attributes is called a derived attribute. A derived
attribute is represented by a dotted oval
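
The attribute types described so far can be sketched in ordinary code. The following is only an illustration: the Student entity, its field names and the sample values are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Name:
    # Composite attribute: subdivided into simple parts
    first: str
    last: str


@dataclass
class Student:
    roll: int                  # simple, single-valued attribute
    name: Name                 # composite attribute
    date_of_birth: date        # stored attribute
    phones: List[str] = field(default_factory=list)  # multi-valued attribute

    def age_on(self, today: date) -> int:
        # Derived attribute: computed from the stored date_of_birth
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month, self.date_of_birth.day)
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)


s = Student(1, Name("Asha", "Rao"), date(2000, 6, 15), ["555-0100", "555-0101"])
```

Because the age is computed on demand rather than stored, it cannot drift out of step with the date of birth, which is the usual reason for treating it as derived.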

Descriptive attributes
A relationship may also have attributes, called descriptive
attributes, which describe the association itself

Mapping Cardinality Representation

M:N relationship
An entity in A is associated with any number (zero or more) of
entities in B and vice versa

1:M relationship
An entity in A is associated with any number (zero or more) of
entities in B; an entity in B, however, is associated with no
more than 1 entity of A

1:1 relationship
An entity in A is associated with no more than 1 entity of B; and
an entity in B is associated with no more than 1 entity of A
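
As a rough sketch of the three cases, the function below inspects a set of (a, b) relationship instances and reports the tightest mapping cardinality those instances are consistent with. The function name and the sample identifiers are assumptions made for this example. A real mapping cardinality is a design decision about the enterprise, so an instance can only rule a cardinality out, not prove it.

```python
from collections import Counter


def mapping_cardinality(pairs):
    """Classify (a, b) relationship instances as 1:1, 1:M, M:1 or M:N."""
    pairs = set(pairs)
    a_counts = Counter(a for a, _ in pairs)   # B entities linked to each A entity
    b_counts = Counter(b for _, b in pairs)   # A entities linked to each B entity
    a_many = any(n > 1 for n in a_counts.values())
    b_many = any(n > 1 for n in b_counts.values())
    if a_many and b_many:
        return "M:N"
    if a_many:
        return "1:M"   # one A entity, many B entities
    if b_many:
        return "M:1"   # many A entities, one B entity
    return "1:1"
```

For instance, `mapping_cardinality([("a1", "b1"), ("a1", "b2")])` reports 1:M, because a1 is related to two entities of B while each entity of B is related to a single entity of A.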

Database Management System 6
ER Modeling...

Chittaranjan Pradhan
School of Computer Engineering,
KIIT University

Keys
A key allows us to identify a set of attributes that suffice to
distinguish entities from each other

• A key is a property of the entity set, rather than of the
individual entities
• A super key is a set of one or more attributes that allow us
to identify uniquely an entity in an entity set
• The minimal super keys are called candidate keys
• Primary key is a candidate key that is chosen by the
database designer as the principal means of identifying
entities within an entity set. The primary key can be
represented by underlining the attribute name. The primary
key should be chosen such that its attributes are never or
very rarely changed
• The remaining candidate keys, other than the primary key,
are called alternate keys
• If none of the columns is a candidate for the primary key in
a table, database designers sometimes use an extra
column as the primary key instead of using a composite key.
Such a key is known as a surrogate key
• A foreign key is a set of attributes used to refer to another
entity set through that entity set’s primary key. A foreign
key cannot be represented in an ER diagram
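
The super key and candidate key definitions can be checked mechanically against sample data. The sketch below is illustrative only: the helper names and the student rows are hypothetical. Since a key is a property of the entity set rather than of one particular collection of entities, the result shows only which attribute sets are consistent with being keys for this sample.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, attributes):
    """Return the minimal super keys of this sample, smallest first."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # A superset of a key already found cannot be minimal
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys


students = [
    {"roll": 1, "email": "a@x.org", "name": "Asha", "branch": "CSE"},
    {"roll": 2, "email": "b@x.org", "name": "Ravi", "branch": "CSE"},
    {"roll": 3, "email": "c@x.org", "name": "Asha", "branch": "ECE"},
    {"roll": 4, "email": "d@x.org", "name": "Asha", "branch": "CSE"},
]
keys = candidate_keys(students, ["roll", "email", "name", "branch"])
```

Here both roll and email come out as candidate keys; a designer would pick one as the primary key and the other would become an alternate key.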

Keys for Relationship sets
Let R be a relationship set involving entity sets E1, E2, ..., En.
Let PK(Ei) denote the set of attributes that forms the primary
key for entity set Ei

• If the relationship set R has no descriptive attributes
associated with it, then the set of attributes PK(E1) ∪
PK(E2) ∪ ... ∪ PK(En) describes an individual relationship
in set R
• If the relationship set R has attributes a1, a2, ..., am
associated with it, then the set of attributes PK(E1) ∪
PK(E2) ∪ ... ∪ PK(En) ∪ {a1, a2, ..., am} describes an
individual relationship in set R

M:N relationship
The primary key of the relationship set consists of the union of
the primary keys of the entity sets

1:M relationship
The primary key of the relationship set is the primary key of the
many side entity set

1:1 relationship
The primary key of the relationship set is the primary key of
either entity set

Relationship Types
A relationship type is a meaningful association among entity
types. The degree of a relationship type is defined as the
number of entity sets participating in that relationship type

Binary relationship
A relationship type is said to be binary when two entity sets are
involved

Ternary relationship
Relationship types that involve three entity sets are defined as
ternary relationships

Quaternary relationship
A relationship of degree four can be referred to as a quaternary
relationship

Recursive relationship
A recursive relationship is one in which the same entity set
participates more than once, in different roles. The participation
of an entity set in a relationship type can be indicated by its
role name. When used in recursive relationship types, role names
describe the functionality of the participation

Participation Constraints
The participation constraint for an entity set in a binary
relationship type is based on whether an entity of that entity set
needs to be related to an entity of the other entity set through
this relationship type

• Total participation: If, in order to exist, every entity must
participate in the relationship, then participation of the
entity set in that relationship type is total or mandatory.
Total participation is represented by double lines
• Partial participation: If an entity can exist without
participating in the relationship, then participation of the
entity set in that relationship type is partial or optional

l..h Representation
An edge between an entity set and a binary relationship set
can have an associated minimum and maximum cardinality,
shown as l..h, where l is the minimum and h is the maximum
cardinality

• A minimum value of 1 indicates total participation of the
entity set in the relationship set
• A maximum value of 1 indicates that the entity participates
in at most one relationship, while a maximum value *
indicates no limit
• 1..* indicates total participation with no upper limit, which
is equivalent to drawing a double line
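
A minimal sketch of how an l..h constraint could be checked against relationship instances is shown below. The function name and the loan identifiers are assumptions for illustration; `high=None` stands in for the `*` symbol.

```python
from collections import Counter


def satisfies_l_h(entities, participations, low, high=None):
    """Check that each entity takes part in between low and high relationships.

    participations lists the entity found on this edge once per relationship
    instance; high=None stands for '*' (no upper limit).
    """
    counts = Counter(participations)
    for e in entities:
        n = counts[e]
        if n < low or (high is not None and n > high):
            return False
    return True


loans = ["L1", "L2", "L3"]
borrowed = ["L1", "L2", "L2"]   # L2 appears in two relationships, L3 in none
```

With this data, 0..* holds, 1..* fails because L3 does not participate at all, and 0..1 fails because L2 participates twice.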

Alternate Mapping Cardinality Representation

Strong Entity sets and Weak Entity sets
An entity set whose entities have independent existence
(that is, each entity is uniquely identifiable by its own
attributes) is referred to as a strong or base entity set. On the
other hand, an entity set that does not have independent
existence, that is, an entity set that does not have its own
unique identifier, is known as a weak entity set

• For a weak entity set to be meaningful, it must be
associated with another entity set, called the
identifying or owner entity set
• The relationship associating the weak entity set with the
identifying entity set is called the identifying relationship
• The identifying relationship is usually many-to-one from
the weak entity set to the identifying entity set, and the
participation of the weak entity set in the relationship is
total
• An attribute in a weak entity set which, in conjunction with
a unique identifier of the parent entity set in the identifying
relationship type, uniquely identifies weak entities is called
the partial key of the weak entity set and is denoted by a
dotted underline. The partial key of a weak entity set is
sometimes referred to as a discriminator
• The primary key of a weak entity set is formed by the
primary key of the identifying entity set, plus the weak
entity set’s discriminator

Database Management System 7
ER Design Issues

Chittaranjan Pradhan
School of Computer Engineering,
KIIT University

ER Design Issues
The following ER design issues need to be considered in order to
produce a better ER design

1. Use of Entity set vs. Attributes
In real-world situations, it is sometimes difficult to decide
whether a property should be modeled as an attribute or as an
entity set

2. Use of Entity sets vs. Relationship sets
Sometimes, an entity set can be better expressed as a
relationship set. Thus, it is not always clear whether an object
is best expressed by an entity set or a relationship set

3. Binary vs. n-ary relationship sets
Relationships in databases are often binary. Some
relationships that appear to be non-binary could actually be
better represented by several binary relationships
It is always possible to replace a non-binary relationship set by
a number of distinct binary relationship sets. For example,
consider a ternary relationship R associated with three entity
sets A, B and C. We can replace the relationship set R by an
entity set E and create three relationship sets as:

• RA , relating E and A
• RB , relating E and B
• RC , relating E and C
If the relationship set R had any attributes, these are assigned
to entity set E. A special identifying attribute is created for E
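
The replacement just described can be sketched as follows. The function names, the generated identifiers e1, e2, ... and the sample triples are all hypothetical; any attributes of R would be stored alongside each entity of E, which this sketch leaves out.

```python
def replace_ternary(R):
    """Replace ternary relationship instances (a, b, c) by an entity set E
    and three binary relationship sets RA, RB and RC."""
    E, RA, RB, RC = [], set(), set(), set()
    for i, (a, b, c) in enumerate(sorted(R), start=1):
        e = f"e{i}"          # special identifying attribute created for E
        E.append(e)
        RA.add((e, a))
        RB.add((e, b))
        RC.add((e, c))
    return E, RA, RB, RC


def rebuild(E, RA, RB, RC):
    """Recover the original triples by joining the binary sets on E."""
    a_of, b_of, c_of = dict(RA), dict(RB), dict(RC)
    return {(a_of[e], b_of[e], c_of[e]) for e in E}


R = {("emp1", "branch1", "job1"), ("emp1", "branch2", "job2")}
E, RA, RB, RC = replace_ternary(R)
```

Joining RA, RB and RC back together on E returns exactly the original triples, which is why no information is lost by the replacement.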


4. Placement of Relationship Attributes
The cardinality ratio of a relationship can affect the placement
of relationship attributes:
• One-to-Many: Attributes of a 1:M relationship set can be
repositioned only to the entity set on the many side of the
relationship
• One-to-One: The relationship attribute can be associated
with either one of the participating entities
• Many-to-Many: Here, the relationship attributes cannot
be attached to either participating entity set; rather, they
are represented in the entity set created for the
relationship set

ER Design Methodologies
The guidelines that should be followed while designing an ER
diagram are discussed below:
• Recognize entity sets
• Recognize relationship sets and participating entity sets
• Recognize attributes of entity sets and attributes of
relationship sets
• Define binary relationship types and existence
dependencies
• Define general cardinality, constraints, keys, and
discriminators
• Design diagram


Database Management System 8
Enhanced ER-Model

Chittaranjan Pradhan
School of Computer Engineering,
KIIT University

Specialization
The process of designating subgroupings within an entity set is
called Specialization. An entity set may be specialized by more
than one distinguishing feature. In ER design, specialization is
depicted by a triangle component labeled ISA (is a)

We can apply specialization repeatedly to refine a design
scheme

Generalization
The commonality among several entity sets can be expressed by
Generalization, which is a containment relationship that exists
between a higher-level entity set and one or more lower-level
entity sets

• To create a generalization, the common attributes must be
given a common name and represented with the higher-level entity
• Generalization is a simple inversion of specialization
• Specialization adopts top-down approach, while
Generalization adopts bottom-up approach
• A crucial property of the higher-level and lower-level
entities created by specialization and generalization is
attribute inheritance
• A lower-level entity set (or subclass) also inherits
participation in the relationship sets in which its
higher-level entity (or superclass) participates


Constraints on Generalization/Specialization

a. Condition defined or not
• Condition-defined: In condition-defined lower-level entity
sets, membership is evaluated on the basis of whether or
not an entity satisfies an explicit condition or predicate.
Since all the lower-level entities are evaluated on the basis
of the same attribute, this type of generalization is also
said to be attribute-defined
• User-defined: User-defined lower-level entity sets are not
constrained by a membership condition; rather, the
database user assigns entities to a given entity set


b. Disjoint or Overlapping
• Disjoint: A disjointness constraint requires that an entity
belong to only one lower-level entity set
• Overlapping: In overlapping generalizations, the same
entity may belong to more than one lower-level entity set
within a single generalization
• Lower-level entity overlap is the default case. A
disjointness constraint must be placed explicitly on a
generalization. This is done by adding the word disjoint
next to the ISA symbol


c. Completeness Constraint
A completeness constraint on a generalization/specialization
specifies whether or not an entity in the higher-level entity set
must belong to at least one of the lower-level entity sets within
the generalization/specialization
• Total generalization/specialization: Each higher-level
entity must belong to a lower-level entity set
• Partial generalization/specialization: Some higher-level
entities may not belong to any lower-level entity set
• Partial generalization is the default. Total generalization in
an ER diagram can be specified by using a double line to
connect the box representing the higher-level entity set to
the triangle symbols
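
The disjointness and completeness constraints can both be phrased as simple set conditions. In the sketch below, the function names and the person, customer and employee sets are hypothetical examples.

```python
def is_disjoint(lower_sets):
    """True if no entity belongs to more than one lower-level entity set."""
    seen = set()
    for members in lower_sets.values():
        if seen & members:
            return False
        seen |= members
    return True


def is_total(higher, lower_sets):
    """True if every higher-level entity belongs to some lower-level set."""
    covered = set().union(*lower_sets.values())
    return higher <= covered


person = {"p1", "p2", "p3"}
lower = {"customer": {"p1", "p2"}, "employee": {"p2"}}
```

In this sample the generalization is overlapping, because p2 is both a customer and an employee, and partial, because p3 belongs to neither lower-level entity set.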


Aggregation
One limitation of the ER model is that it cannot express
relationships among relationships

Aggregation is an abstraction through which relationships are
treated as higher-level entities. Thus, aggregation allows us to
treat a relationship set as an entity set for the purposes of
participation in (other) relationships

Database Management System 9
Relational Model

Chittaranjan Pradhan
School of Computer Engineering,
KIIT University

Relational Model
• Relational data model is the primary data model for
commercial data-processing applications
• A relational database consists of a collection of tables,
each of which is assigned a unique name
• A row in a table represents a relationship among a set of
values. Thus, a table is an entity set and a row is an entity
• The columns or properties are called attributes
• For each attribute, there is a set of permitted values, called
the domain of that attribute. Same domain can be shared
by more than one attribute
• Degree is the number of attributes in the relation/table,
whereas Cardinality is the number of tuples or rows in the
relation/table
• The attribute values are required to be atomic, i.e.
indivisible

• Let D1, D2, and D3 be the domains. Any row of the table
consists of a 3-tuple (v1, v2, v3) where v1 ∈ D1, v2 ∈ D2
and v3 ∈ D3. In general, the table will contain only a subset of
the set of all possible rows. Therefore, the table is a subset
of D1 × D2 × D3
• Each attribute of a relation has a unique name
• NULL value is a domain value which is a member of every
possible domain
• Database Schema is the logical design of the database. If
a1, a2, ..., an are the attributes, then the relation schema is
R = (a1, a2, ..., an)
• Database Instance is the snapshot of the data in the
database at a given instant of time
• A relation is denoted by a lowercase name, while a relation
schema is denoted by a name beginning with an uppercase letter
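
The view of a table as a subset of a Cartesian product, together with degree and cardinality, can be illustrated in a few lines. The three small domains and the tuples below are made up for the example.

```python
from itertools import product

# Hypothetical domains, chosen only for illustration
D1 = {1, 2}                   # domain of roll
D2 = {"Asha", "Ravi"}         # domain of name
D3 = {"CSE", "ECE"}           # domain of branch

all_rows = set(product(D1, D2, D3))   # D1 x D2 x D3: every possible 3-tuple

attributes = ("roll", "name", "branch")             # relation schema
stud = {(1, "Asha", "CSE"), (2, "Ravi", "ECE")}     # relation instance

is_subset = stud <= all_rows     # a relation is a subset of the product
degree = len(attributes)         # number of attributes
cardinality = len(stud)          # number of tuples
```

The product contains 2 × 2 × 2 = 8 possible rows, of which the relation holds two, so its degree is 3 and its cardinality is 2.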

Relational Database
A relational database is a database consisting of multiple
relations or tables. The information about an enterprise is
broken up into parts, with each relation storing one part of the
information

The normalization process deals with how to design relational
schemas

Relational Data Integrity
A candidate key is an attribute or set of attributes that can
uniquely identify a row or tuple in a table. Let R be a relation
with attributes a1, a2, ..., an. A set of attributes (ai, aj, ..., ak)
of R is said to be a candidate key of R iff the following two
properties hold:
• Uniqueness: At any given time, no two distinct tuples or
rows of R have the same value for ai, the same value for aj,
..., and the same value for ak
• Minimality: No proper subset of the set (ai, aj, ..., ak) has
the uniqueness property

The major types of integrity constraints are:

1. Domain Constraints
• All the values that appear in a column of a relation must be
taken from the same domain
• This constraint can be applied by specifying a particular
data type to a column

2. Entity Integrity
• The entity integrity rule is designed to assure that every
relation has a primary key, and that the data values for that
primary key are all valid
• Usually, the primary key of each relation is the first column
• Entity integrity guarantees that every primary key attribute
is NOT NULL
• Primary key performs the unique identification function in a
relational model
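
Entity integrity can be observed in practice with a small experiment. The sketch below uses SQLite through Python's `sqlite3` module purely as a convenient test bed; the loan table and its values are assumptions for the example. NOT NULL is written explicitly because SQLite does not imply it for every primary key declaration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL is spelled out: SQLite does not imply it for every PRIMARY KEY
conn.execute("CREATE TABLE loan (loan_no TEXT NOT NULL PRIMARY KEY, amount REAL)")
conn.execute("INSERT INTO loan VALUES ('L-11', 900)")


def rejected(sql):
    """Return True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False
```

A call such as `rejected("INSERT INTO loan VALUES (NULL, 500)")` reports a violation, as does an attempt to insert a second row with loan_no 'L-11'.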

3. Referential Integrity
• In relational data model, associations between tables are
defined by using foreign keys
• A referential integrity constraint is a rule that maintains
consistency among the rows of two relations
• The rule states that if there is a foreign key in one relation,
either each foreign key value must match a primary key
value in the other table or else the foreign key value must
be NULL
• A foreign key that references its own relation is known as
recursive foreign key
• The linking between the foreign key and primary key
allows a set of relations to form an integrated database
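
The referential integrity rule can be demonstrated with a short sketch. SQLite (via Python's `sqlite3`) is used here only as an example system, and the department and employee rows are hypothetical. SQLite checks foreign keys only after `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE department (did INTEGER PRIMARY KEY, dname TEXT)")
conn.execute("""
    CREATE TABLE employee (
        eid   INTEGER PRIMARY KEY,
        ename TEXT,
        did   INTEGER REFERENCES department(did)   -- foreign key, may be NULL
    )""")
conn.execute("INSERT INTO department VALUES (10, 'Sales')")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 10)")     # matches a primary key
conn.execute("INSERT INTO employee VALUES (2, 'Ravi', NULL)")   # NULL is allowed

try:
    conn.execute("INSERT INTO employee VALUES (3, 'Meena', 99)")  # no department 99
    violation = False
except sqlite3.IntegrityError:
    violation = True

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

The first two employee rows are accepted because their foreign key values either match an existing primary key or are NULL; the third is refused.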

4. Operational Constraints
• These are the constraints enforced in the database by the
business rules or real world limitations

Database Languages

DDL (Data Definition Language)
• DDL is used to define the conceptual schema. The
definition includes the information of all the entity sets and
their associated attributes as well as the relationships
between the entity sets
• The data values stored in the database must satisfy
certain consistency constraints. The database system
checks these constraints every time the database is
updated
• The output of the DDL is placed in the Data Dictionary
which contains the metadata (data about data)
• The data dictionary is considered to be a special type of
table, which can only be accessed and updated by the
database system itself
• The database system consults the data dictionary, before
querying or modifying the actual data, for the validation
purpose
• CREATE, ALTER, DROP, RENAME & TRUNCATE

DML (Data Manipulation Language)
• DML is used to manipulate data in the database
• A query is a statement in the DML that requests the
retrieval of data from the database
• SELECT, INSERT, UPDATE & DELETE

DCL (Data Control Language)
• DCL is used to change the permissions on database
structures
• GRANT & REVOKE

TCL (Transaction Control Language)
• TCL allows permanently recording the changes made to
the rows stored in a table or undoing such changes
• COMMIT, ROLLBACK & SAVEPOINT
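
The following sketch runs one statement from the DDL, DML and TCL groups against an in-memory SQLite database through Python's `sqlite3` module. The choice of SQLite and the loan table are assumptions for illustration. DCL is not shown because SQLite has no GRANT or REVOKE statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema
conn.execute("CREATE TABLE loan (loan_no TEXT PRIMARY KEY, amount REAL)")

# DML, then TCL: insert a row and make the change permanent
conn.execute("INSERT INTO loan VALUES ('L-11', 900)")
conn.commit()

# DML, then TCL: change the row and undo the uncommitted change
conn.execute("UPDATE loan SET amount = 0 WHERE loan_no = 'L-11'")
conn.rollback()

# DML query: retrieve data
amount = conn.execute(
    "SELECT amount FROM loan WHERE loan_no = 'L-11'").fetchone()[0]
```

Because the UPDATE was rolled back before being committed, the query still sees the amount that was committed earlier.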

CODD’s Rules
Codd’s 12 rules are a set of thirteen rules (numbered 0 to 12)
proposed by E. F. Codd, designed to define what is required from
a database management system in order for it to be considered
relational, i.e. RDBMS. Any database that satisfies even six
rules may be categorized as RDBMS

Rule0
A relational system should be able to manage databases
entirely through its relational capabilities

Rule1: Information representation
All information is explicitly and logically represented by
the data values of the tables in the relational data model

Rule2: Guaranteed access
In the relational model, each cell, i.e. the intersection of a row
and a column, holds one and only one value of data (or a
NULL value). Each value of data must be addressable via the
combination of a table name, primary key value and the
column name

Rule3: Systematic treatment of NULL values
NULL values are supported in a fully relational DBMS for
representing missing information and inapplicable information
in a systematic way, independent of data type
Rule4: Database description rule


The database description is represented at the logical level in
the same way as ordinary data, so that authorized users can
apply the same relational language to its interrogation as they
apply to the regular data. This means, the RDBMS should have
a data dictionary
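
As one concrete illustration of this rule, SQLite exposes its catalog as a table named `sqlite_master` that is read with an ordinary SELECT. The sketch below uses Python's `sqlite3` module; the customer table is a hypothetical example, and other systems name and organise their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (cid INTEGER PRIMARY KEY, cname TEXT, address TEXT)")

# The catalog is itself a table, queried with the same language as the data
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# Column metadata; the second field of each result row is the column name
columns = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]
```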

Rule5: Comprehensive data sub-language


The RDBMS should have its own extension of SQL. The SQL
should support Data Definition, View Definition, Data
Manipulation, Integrity Constraint, and Authorization


Rule6: View updating
All views that are theoretically updatable are also updatable by
the system. Similarly, the views which are theoretically
non-updatable are also non-updatable by the database system

Rule7: High-level insert, update and delete
An RDBMS should not only support retrieval of data as
relational sets, but should also support insertion, updating and
deletion of data as a relational set

Rule8: Physical data independence


Application programs and terminal activities are not disturbed if
any changes are made either in storage representations or
access methods

Rule9: Logical data independence


User programs and the user should not be aware of any
changes to the structure of the tables such as the addition of
extra columns

Rule10: Integrity independence
Integrity rules must be supported by the relational data
sub-language; they can be stored in the catalogue and not in
the application program. Entity integrity: no component of a
primary key may have a NULL value. Referential integrity: for
each distinct non-null foreign key value in the database,
there should be a matching primary key value from the same
domain

Rule11: Distribution independence
A relational DBMS has distribution independence. The RDBMS
may be spread across more than one system and across several
networks. However, to the end-user, the tables should appear
no different from those that are local

Rule12: Data integrity cannot be subverted
If a relational system has a low-level language, that low-level
language cannot be used to subvert or bypass the integrity rules
and constraints expressed in the higher-level relational language

Database Management System 10
Conversion of ER model to Relational Model

Chittaranjan Pradhan
School of Computer Engineering,
KIIT University

Conversion of ER model to Relational Model
A database that conforms to an ER diagram schema can be
represented by a collection of relational schemas. Both the ER
model and Relational data model are abstract, logical
representations of real-world enterprises

1. Representation of Strong Entity sets
A strong entity set reduces to a schema with the same
attributes. The primary key of the entity set serves as the
primary key of the resulting schema

Loan = (loan_no, amount)

2. Representation of Weak Entity sets
A weak entity set becomes a table that includes a column for
the primary key of the identifying strong entity set. The primary
key is formed by combining this foreign key with the partial
key

Loan = (loan_no, amount)


Payment = (loan_no, payment_no, payment_date,
payment_amt)
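
The Loan and Payment schemas above could be declared roughly as follows. This is a sketch using SQLite through Python's `sqlite3` module; the column types, dates and amounts are assumptions, since the slides give only the attribute names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE loan (
        loan_no TEXT NOT NULL PRIMARY KEY,
        amount  REAL
    );
    CREATE TABLE payment (
        loan_no      TEXT NOT NULL REFERENCES loan(loan_no),  -- key of identifying set
        payment_no   INTEGER NOT NULL,                        -- partial key
        payment_date TEXT,
        payment_amt  REAL,
        PRIMARY KEY (loan_no, payment_no)
    );
    INSERT INTO loan VALUES ('L-11', 900), ('L-12', 1500);
    -- payment_no 1 repeats, yet each (loan_no, payment_no) pair is unique
    INSERT INTO payment VALUES ('L-11', 1, '2024-01-05', 100);
    INSERT INTO payment VALUES ('L-12', 1, '2024-01-07', 200);
""")
payments = conn.execute("SELECT COUNT(*) FROM payment").fetchone()[0]
```

The payment number alone does not identify a payment; only together with the loan number does it form a key, which mirrors the role of the discriminator.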


3. Representation of Relationship sets

3.a. Binary M:N
The union of the primary key attributes from the participating
entity sets becomes the primary key of the relationship

Customer = (cid, cname, address)


Loan = (loan_no, amount)
Borrow = (cid, loan_no)
If borrow_date is mentioned as descriptive attribute, then
Borrow = (cid, loan_no, borrow_date)
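
A possible declaration of these three schemas is sketched below with SQLite through Python's `sqlite3` module. The column types and all sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customer (cid INTEGER PRIMARY KEY, cname TEXT, address TEXT);
    CREATE TABLE loan (loan_no TEXT NOT NULL PRIMARY KEY, amount REAL);
    CREATE TABLE borrow (
        cid         INTEGER NOT NULL REFERENCES customer(cid),
        loan_no     TEXT NOT NULL REFERENCES loan(loan_no),
        borrow_date TEXT,                 -- descriptive attribute
        PRIMARY KEY (cid, loan_no)        -- union of both primary keys
    );
    INSERT INTO customer VALUES (1, 'Asha', 'Pune'), (2, 'Ravi', 'Delhi');
    INSERT INTO loan VALUES ('L-11', 900), ('L-12', 1500);
    INSERT INTO borrow VALUES (1, 'L-11', '2024-01-05');
    INSERT INTO borrow VALUES (1, 'L-12', '2024-01-06');   -- one customer, many loans
    INSERT INTO borrow VALUES (2, 'L-11', '2024-01-07');   -- one loan, many customers
""")
loans_of_customer_1 = [r[0] for r in conn.execute(
    "SELECT loan_no FROM borrow WHERE cid = 1 ORDER BY loan_no")]
```

Neither cid nor loan_no is unique in borrow on its own, so only their combination can serve as the primary key.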

3.b. Binary M:1/1:M
Construct two tables, one for the entity set at the 1 side and
another for the entity set at the M side; add the descriptive
attributes and a reference to the primary key of the 1 side to
the entity set at the M side

Stud = (roll, name, branch)


Library = (bid, bname, price, roll)
The foreign key can be represented by specifying the name as:
Library = (bid, bname, price, borrowing_roll)
If borrow_date is the descriptive attribute, then
Library = (bid, bname, price, borrowing_roll, borrow_date)
3.c. Binary 1:1

Construct two tables. In this case, either side can be chosen to act as the many side. That is, the extra attributes can be added to either of the tables corresponding to the two entity sets, but not to both at the same time.

Employee = (eid, ename, address, did)


Department = (did, dname, location)
If it is required to mention the relationship name, then
Employee = (eid, ename, address, manager_did)
If the Department entity set is treated as the many side, then
Employee = (eid, ename, address)
Department = (did, dname, location, manager_eid)
4. Representation of Recursive Relationship sets

Two tables will be constructed: one for the entity set and one for the relationship set.

Employee = (eid, ename, address)


Works_for = (mgrid, workerid)
This ER diagram can also be represented by using a single relation schema. In that case, the schema carries a foreign key attribute that refers back to the primary key of the same relation, so each tuple can point to another tuple of the original entity set
Employee = (eid, ename, address, manager_id)
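
As a rough illustration of the single-schema form, the sketch below stores manager_id as a self-referencing foreign key and pairs workers with managers by joining the table with itself. It assumes Python's sqlite3 module; the sample employees are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Employee (
        eid        INTEGER PRIMARY KEY,
        ename      TEXT,
        address    TEXT,
        manager_id INTEGER REFERENCES Employee(eid)  -- NULL for an employee with no manager
    )
    """
)
# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Pune", None), (2, "Bikash", "Cuttack", 1), (3, "Chitra", "Delhi", 1)],
)
# Pair each worker with the manager by joining the table with itself.
pairs = conn.execute(
    """
    SELECT w.ename, m.ename
    FROM Employee AS w JOIN Employee AS m ON w.manager_id = m.eid
    ORDER BY w.eid
    """
).fetchall()
print(pairs)
```

The employee without a manager does not appear in `pairs`, because the join condition never holds for a NULL manager_id.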

5. Representation of Composite attributes

Composite attributes are flattened out by creating a separate attribute for each of their parts.

Customer = (cid, name, address_street, address_city, address_pin)

6. Representation of Multi-valued attributes

A multi-valued attribute M of an entity set E is represented by a separate schema E_M = (primary key of E, M)

Employee = (eid, name, address)


Employee_phone_no = (eid, phone_no)
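
A minimal sketch of this mapping, assuming entities are held as Python dictionaries with the multi-valued attribute stored as a list. The function name `split_multivalued` and the sample data are hypothetical.

```python
def split_multivalued(entities, key, attr):
    """Split records with a list-valued attribute into the relations E and E_M."""
    # E keeps every attribute except the multi-valued one.
    base = [{k: v for k, v in e.items() if k != attr} for e in entities]
    # E_M gets one (key, value) row per value of the multi-valued attribute.
    multi = [{key: e[key], attr: value} for e in entities for value in e[attr]]
    return base, multi


# Made-up sample entities, for illustration only.
employees = [
    {"eid": 1, "name": "Asha", "address": "Pune", "phone_no": ["111", "222"]},
    {"eid": 2, "name": "Bikash", "address": "Cuttack", "phone_no": ["333"]},
]
employee, employee_phone_no = split_multivalued(employees, "eid", "phone_no")
```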

7. Representation of Generalization/Specialization

For an ER diagram involving generalization/specialization, one schema is constructed for the generalized entity set and one schema for each of the specialized entity sets. Each specialized schema includes the primary key of the generalized entity set.

Person = (person_id, name, address)


Employee = (person_id, salary)
Customer = (person_id, credit_rating)
Representation of Generalization/Specialization...

When the generalization/specialization is disjoint (and total, so that every entity belongs to some specialized entity set), schemas can be constructed only for the specialized entity sets, each carrying the attributes of the generalized entity set as well.

Employee = (employee_id, name, address, salary)


Customer = (customer_id, name, address, credit_rating)

8. Representation of Aggregation

To represent aggregation, create a schema containing the primary key of the aggregated relationship, the primary key of the associated entity set, and the descriptive attributes (if any).

Employee = (eid, name, address)


Branch = (bid, bname, asset)
Job = (jobid, position, responsibility)
Works_on = (eid, bid, jobid)
Manager = (mid, mgrname)
Manages = (eid, bid, jobid, mid)

Database Management System 11

Relational Algebra

Query Language
Relational Algebra
SELECT Operator(σ)
PROJECT Operator(π)
Composition of Relational Operators
RENAME Operator(ρ)
Union Compatibility
UNION Operator(∪)
DIFFERENCE Operator(-)
Cartesian Product Operator(×)
Intersection Operator(∩)
JOIN Operator(⋈)
Division Operator(÷)
Assignment Operator(←)

Chittaranjan Pradhan
School of Computer Engineering,
KIIT University

Query Language

A query language is a language in which the user requests information from the database. Query languages are either:
• Procedural language
• Nonprocedural language

The categories of different languages are:
• SQL
• Relational Algebra
• Relational Calculus
  • Tuple Relational Calculus
  • Domain Relational Calculus

Relational Algebra

Relational algebra is a procedural language for manipulating relations. Its operations use one or two existing relations to create a new relation.

• Fundamental operators
  • Unary: SELECT, PROJECT, RENAME
  • Binary: UNION, SET DIFFERENCE, CARTESIAN PRODUCT
• Secondary operators
  • INTERSECTION, NATURAL JOIN, DIVISION, and ASSIGNMENT

Relational Algebra...

The relational schemas used for different operations are:
• Customer(cust_name, cust_street, cust_city)
  • used to store customer details
• Branch(branch_name, branch_city, assets)
  • used to store branch details
• Account(acc_no, branch_name, balance)
  • stores the account details
• Loan(loan_no, branch_name, amount)
  • stores the loan details
• Depositor(cust_name, acc_no)
  • stores the details about the customers' accounts
• Borrower(cust_name, loan_no)
  • used to store the details about the customers' loans

SELECT Operator(σ)

The SELECT operation creates a relation from another relation by selecting only those tuples (rows) of the original relation that satisfy a specified condition. It is denoted by the sigma (σ) symbol. The predicate appears as a subscript to σ.

The argument relation is given in parentheses after the σ. The result is a relation that has the same attributes as the relation specified in <relation name>. The general syntax of the select operator is:
σ <selection-condition> (<relation name>)

Query: Find the details of the loans taken from the 'Bhubaneswar Main' branch.
σ branch_name='Bhubaneswar Main' (Loan)

• The operators used in the selection predicate may be: =, ≠, <, ≤, >, ≥.
• Different predicates can be combined into a larger predicate by using the connectors AND (∧), OR (∨), NOT (¬)

SELECT Operator(σ)...

Loan
loan_no   branch_name        amount
L201      Bhubaneswar Main   50,000,000.00
L202      Bhubaneswar Main   5,000,000.00
L203      Mumbai Main        100,000,000.00
L204      Juhu               60,000,000.00

σ branch_name='Bhubaneswar Main' (Loan)
loan_no   branch_name        amount
L201      Bhubaneswar Main   50,000,000.00
L202      Bhubaneswar Main   5,000,000.00

σ branch_name='Bhubaneswar Main' AND amount>10,000,000 (Loan)
loan_no   branch_name        amount
L201      Bhubaneswar Main   50,000,000.00
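
The two selections on Loan can be mimicked in a few lines. This is only a sketch: it assumes a relation is a Python list of dictionaries, and the function name `select` is an illustrative choice rather than any library API.

```python
# The Loan relation from the slide, one dictionary per tuple.
LOAN = [
    {"loan_no": "L201", "branch_name": "Bhubaneswar Main", "amount": 50_000_000.00},
    {"loan_no": "L202", "branch_name": "Bhubaneswar Main", "amount": 5_000_000.00},
    {"loan_no": "L203", "branch_name": "Mumbai Main", "amount": 100_000_000.00},
    {"loan_no": "L204", "branch_name": "Juhu", "amount": 60_000_000.00},
]


def select(relation, predicate):
    """σ: keep the tuples for which predicate(tuple) is true; attributes are unchanged."""
    return [t for t in relation if predicate(t)]


main = select(LOAN, lambda t: t["branch_name"] == "Bhubaneswar Main")
big_main = select(
    LOAN,
    lambda t: t["branch_name"] == "Bhubaneswar Main" and t["amount"] > 10_000_000,
)
```

The second call shows two predicates combined with AND, as in the last expression on the slide.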

PROJECT Operator(π)

The PROJECT operation can be thought of as eliminating unwanted columns. It also eliminates duplicate rows from the result. It is denoted by the pi (π) symbol. The attributes that should appear in the resulting relation are written as a subscript to π.

The argument relation follows in parentheses. The general syntax of the project operator is:
π <attribute-list> (<relation name>)

Query: Find the loan numbers and respective loan amounts.
π loan_no,amount (Loan)

PROJECT Operator(π)...

Loan
loan_no   branch_name        amount
L201      Bhubaneswar Main   50,000,000.00
L202      Bhubaneswar Main   5,000,000.00
L203      Mumbai Main        100,000,000.00
L204      Juhu               60,000,000.00

π loan_no,amount (Loan)
loan_no   amount
L201      50,000,000.00
L202      5,000,000.00
L203      100,000,000.00
L204      60,000,000.00
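
Projection with duplicate elimination might be sketched as follows, again assuming relations are lists of dictionaries. The name `project` is hypothetical; the branch_name projection is an extra example chosen because Loan has two rows for the same branch.

```python
def project(relation, attributes):
    """π: keep only the listed attributes and drop duplicate rows."""
    seen, result = set(), []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:  # a relation is a set, so repeated rows are kept once
            seen.add(row)
            result.append(dict(zip(attributes, row)))
    return result


LOAN = [
    {"loan_no": "L201", "branch_name": "Bhubaneswar Main", "amount": 50_000_000.00},
    {"loan_no": "L202", "branch_name": "Bhubaneswar Main", "amount": 5_000_000.00},
    {"loan_no": "L203", "branch_name": "Mumbai Main", "amount": 100_000_000.00},
    {"loan_no": "L204", "branch_name": "Juhu", "amount": 60_000_000.00},
]

loan_amounts = project(LOAN, ["loan_no", "amount"])  # four rows, as on the slide
branches = project(LOAN, ["branch_name"])            # duplicates collapse to three rows
```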

Composition of Relational Operators

Relational algebra operators can be composed into a single relational algebra expression to answer complex queries.

Q: Find the names of the customers who live in Bhubaneswar

Customer
cust_name   cust_street     cust_city
Rishi       India Gate      New Delhi
Sarthak     M. G. Road      Bangalore
Manas       Shastri Nagar   Bhubaneswar
Ramesh      M. G. Road      Bhubaneswar
Mahesh      Juhu            Mumbai

π cust_name (σ cust_city='Bhubaneswar' (Customer))

cust_name
Manas
Ramesh

RENAME Operator(ρ)

The results of relational algebra expressions do not have a name that can be used to refer to them. It is useful to be able to give them names; the rename operator is used for this purpose. It is denoted by the rho (ρ) symbol.

The general syntax of the rename operator is:
ρ X (E)

Assume E is a relational-algebra expression with arity n. The second form of the rename operation is: ρ X(b1,b2,...,bn) (E)

π cust_name (σ cust_city='Bhubaneswar' (Customer)) can be written as:
1. ρ Customer_Bhubaneswar (σ cust_city='Bhubaneswar' (Customer))
2. π cust_name (Customer_Bhubaneswar)

RENAME Operator(ρ)...

The different forms of the rename operation are:

a. ρ S (R), which renames only the relation
b. ρ S(b1,b2,...,bn) (R), which renames both the relation and its attributes
c. ρ (b1,b2,...,bn) (R), which renames only the attributes

For example, the attributes of Customer (cust_name, cust_street, cust_city) can be renamed as:

ρ (name,street,city) (Customer)

Union Compatibility

To perform the set operations UNION, DIFFERENCE and INTERSECTION, the relations need to be union compatible for the result to be a valid relation.

Two relations R1(a1,a2,...,an) and R2(b1,b2,...,bm) are union compatible iff:
• n = m, i.e. both relations have the same arity
• dom(ai) = dom(bi) for 1 ≤ i ≤ n

UNION Operator(∪)

The union operation is used to combine data from two relations. It is denoted by the union (∪) symbol. The union of two relations R1(a1,a2,...,an) and R2(b1,b2,...,bn) is a relation R3(c1,c2,...,cn) such that:
dom(ci) = dom(ai) ∪ dom(bi), 1 ≤ i ≤ n

R1 ∪ R2 is a relation that includes all tuples that are present in R1, in R2, or in both, without duplicate tuples.

Depositor
cust_name   acc_no
Manas       A101
Ramesh      A102
Rishi       A103
Mahesh      A104
Mahesh      A105

Borrower
cust_name   loan_no
Ramesh      L201
Ramesh      L202
Mahesh      L203
Rishi       L204

π cust_name (Depositor) ∪ π cust_name (Borrower)

cust_name
Manas
Ramesh
Rishi
Mahesh

DIFFERENCE Operator(-)

The difference operation is used to identify the rows that are in one relation and not in another. It is denoted by the (-) symbol. The difference of two relations R1(a1,a2,...,an) and R2(b1,b2,...,bn) is a relation R3(c1,c2,...,cn) such that:
dom(ci) = dom(ai) - dom(bi), 1 ≤ i ≤ n

R1 - R2 is a relation that includes all tuples that are in R1, but not in R2.

Depositor
cust_name   acc_no
Manas       A101
Ramesh      A102
Rishi       A103
Mahesh      A104
Mahesh      A105

Borrower
cust_name   loan_no
Ramesh      L201
Ramesh      L202
Mahesh      L203
Rishi       L204

π cust_name (Depositor) - π cust_name (Borrower)

cust_name
Manas

Cartesian Product Operator(×)

The Cartesian product of two relations R1(a1,a2,...,an) with cardinality i and R2(b1,b2,...,bm) with cardinality j is a relation R3 with
• degree k = n + m,
• cardinality i*j and
• attributes (a1,a2,...,an, b1,b2,...,bm)

R1 × R2 is a relation that includes all the possible combinations of tuples from R1 and R2. The Cartesian product is used to combine information from any two relations.

It is not a useful operation by itself, but it is used in conjunction with other operations.

Cartesian Product Operator(×)...

Borrower
cust_name   loan_no
Ramesh      L201
Ramesh      L202
Mahesh      L203
Rishi       L204

Loan
loan_no   branch_name        amount
L201      Bhubaneswar Main   50,000,000.00
L202      Bhubaneswar Main   5,000,000.00
L203      Mumbai Main        100,000,000.00
L204      Juhu               60,000,000.00

Borrower × Loan
cust_name   Borrower.loan_no   Loan.loan_no   branch_name        amount
Ramesh      L201               L201           Bhubaneswar Main   50,000,000.00
Ramesh      L201               L202           Bhubaneswar Main   5,000,000.00
Ramesh      L201               L203           Mumbai Main        100,000,000.00
Ramesh      L201               L204           Juhu               60,000,000.00
Ramesh      L202               L201           Bhubaneswar Main   50,000,000.00
Ramesh      L202               L202           Bhubaneswar Main   5,000,000.00
Ramesh      L202               L203           Mumbai Main        100,000,000.00
Ramesh      L202               L204           Juhu               60,000,000.00
Mahesh      L203               L201           Bhubaneswar Main   50,000,000.00
Mahesh      L203               L202           Bhubaneswar Main   5,000,000.00
Mahesh      L203               L203           Mumbai Main        100,000,000.00
Mahesh      L203               L204           Juhu               60,000,000.00
Rishi       L204               L201           Bhubaneswar Main   50,000,000.00
Rishi       L204               L202           Bhubaneswar Main   5,000,000.00
Rishi       L204               L203           Mumbai Main        100,000,000.00
Rishi       L204               L204           Juhu               60,000,000.00

Cartesian Product Operator(×)...

Query: Find the customers and the details of their loans taken from the Bhubaneswar Main branch.
Ans: σ branch_name='Bhubaneswar Main' AND Borrower.loan_no=Loan.loan_no (Borrower × Loan)

cust_name   Borrower.loan_no   Loan.loan_no   branch_name        amount
Ramesh      L201               L201           Bhubaneswar Main   50,000,000.00
Ramesh      L202               L202           Bhubaneswar Main   5,000,000.00

Intersection Operator(∩)

The intersection operation is used to identify the rows that are common to two relations. It is denoted by the (∩) symbol. The intersection of two relations R1(a1,a2,...,an) and R2(b1,b2,...,bn) is a relation R3(c1,c2,...,cn) such that:
dom(ci) = dom(ai) ∩ dom(bi), 1 ≤ i ≤ n

R1 ∩ R2 is a relation that includes all tuples that are present in both R1 and R2.

The intersection operation can be rewritten as a pair of set difference operations: R ∩ S = R - (R - S)

Depositor
cust_name   acc_no
Manas       A101
Ramesh      A102
Rishi       A103
Mahesh      A104
Mahesh      A105

Borrower
cust_name   loan_no
Ramesh      L201
Ramesh      L202
Mahesh      L203
Rishi       L204

π cust_name (Depositor) ∩ π cust_name (Borrower)

cust_name
Ramesh
Mahesh
Rishi
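
The union, difference and intersection results on the Depositor and Borrower names can be checked with Python's built-in set type. This is a sketch that assumes each tuple is a plain Python tuple; the variable names are illustrative.

```python
# Depositor(cust_name, acc_no) and Borrower(cust_name, loan_no) from the slides.
depositor = {
    ("Manas", "A101"), ("Ramesh", "A102"), ("Rishi", "A103"),
    ("Mahesh", "A104"), ("Mahesh", "A105"),
}
borrower = {
    ("Ramesh", "L201"), ("Ramesh", "L202"), ("Mahesh", "L203"), ("Rishi", "L204"),
}

# π cust_name: a set comprehension drops duplicates automatically.
dep_names = {name for name, _ in depositor}
bor_names = {name for name, _ in borrower}

union = dep_names | bor_names
difference = dep_names - bor_names
intersection = dep_names & bor_names

# R ∩ S rewritten with two set differences: R - (R - S).
via_difference = dep_names - (dep_names - bor_names)
```

Both projections have a single attribute over the same domain, so they are union compatible and all three operations are defined.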

JOIN Operator(⋈)

The join is a binary operation that combines certain selections and a Cartesian product into one operation. It is denoted by the join (⋈) symbol.

The join operation forms a Cartesian product of its two arguments, performs a selection forcing equality on those attributes that appear in both relations, and finally removes the duplicate attributes.

Query: Find the names of customers who have a loan at the bank, along with the loan number and the loan amount.
Ans: This query can be solved by using the PROJECT, SELECT and CARTESIAN PRODUCT operators as:
π cust_name,Loan.loan_no,amount (σ Borrower.loan_no=Loan.loan_no (Borrower × Loan))

This same expression can be simplified by using the JOIN as:
π cust_name,loan_no,amount (Borrower ⋈ Loan)

JOIN Operator(⋈)...

Borrower
cust_name   loan_no
Ramesh      L201
Ramesh      L202
Mahesh      L203
Rishi       L204

Loan
loan_no   branch_name        amount
L201      Bhubaneswar Main   50,000,000.00
L202      Bhubaneswar Main   5,000,000.00
L203      Mumbai Main        100,000,000.00
L204      Juhu               60,000,000.00

π cust_name,loan_no,amount (Borrower ⋈ Loan)
cust_name   loan_no   amount
Ramesh      L201      50,000,000.00
Ramesh      L202      5,000,000.00
Mahesh      L203      100,000,000.00
Rishi       L204      60,000,000.00
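
One way to see the join as "product, then equality selection, then removal of the duplicate attribute" is the sketch below. It assumes relations are non-empty lists of dictionaries whose keys are the attribute names; `natural_join` is a hypothetical helper written for this example.

```python
def natural_join(r, s):
    """⋈: combine tuples of r and s that agree on every attribute name they share."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]
    return [
        {**tr, **ts}  # merging the dictionaries keeps each common attribute once
        for tr in r
        for ts in s   # the nested loops enumerate the Cartesian product
        if all(tr[a] == ts[a] for a in common)
    ]


BORROWER = [
    {"cust_name": "Ramesh", "loan_no": "L201"},
    {"cust_name": "Ramesh", "loan_no": "L202"},
    {"cust_name": "Mahesh", "loan_no": "L203"},
    {"cust_name": "Rishi", "loan_no": "L204"},
]
LOAN = [
    {"loan_no": "L201", "branch_name": "Bhubaneswar Main", "amount": 50_000_000.00},
    {"loan_no": "L202", "branch_name": "Bhubaneswar Main", "amount": 5_000_000.00},
    {"loan_no": "L203", "branch_name": "Mumbai Main", "amount": 100_000_000.00},
    {"loan_no": "L204", "branch_name": "Juhu", "amount": 60_000_000.00},
]

joined = natural_join(BORROWER, LOAN)
# π cust_name,loan_no,amount over the join result.
rows = [(t["cust_name"], t["loan_no"], t["amount"]) for t in joined]
```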

Division Operator(÷)

The division operation creates a new relation by selecting the rows in one relation that match every row in another relation. The division operation requires that we look at an entire relation at once. It is denoted by the division (÷) symbol.

Let A, B and C be three relations such that B ÷ C gives A as the result. This operation is possible iff:
• The columns of C are a subset of the columns of B. The columns of A are all and only those columns of B that are not columns of C
• A row is placed in A if and only if it is associated in B with every row of C

The division operation is the reverse of the Cartesian product operation: B = (B × C) ÷ C

The division operator is suited to queries that include the phrase 'every' or 'all' as part of the condition.

Division Operator(÷)...

Depositor
cust_name   acc_no
Manas       A101
Ramesh      A102
Rishi       A103
Mahesh      A104
Mahesh      A105

Account
acc_no   branch_name        balance
A101     Bhubaneswar Main   100,000.00
A102     Shastri Nagar      50,000.00
A103     India Gate         5,000,000.00
A104     Juhu               600,000.00
A105     Mumbai Main        10,000,000.00

Branch
branch_name        branch_city   assets
Bhubaneswar Main   Bhubaneswar   Gold
Shastri Nagar      Bhubaneswar   Mines
India Gate         New Delhi     Gold
Juhu               Mumbai        Sea Shore
Mumbai Main        Mumbai        Movie

Query: Find all the customers who have an account at all the branches located in Mumbai

π cust_name,branch_name (Depositor ⋈ Account) ÷ π branch_name (σ branch_city='Mumbai' (Branch))

cust_name
Mahesh
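
The division in this query can be sketched directly from the definition: group the rows of B by the columns that are not in C, and keep a group only if it covers every row of C. The function `divide` is a hypothetical helper; it assumes non-empty relations stored as lists of dictionaries, and the two inputs are the intermediate results of the slide's expression written out by hand.

```python
def divide(b, c, c_attrs):
    """B ÷ C: c_attrs names the columns of C, which must also be columns of B."""
    a_attrs = [x for x in b[0] if x not in c_attrs]  # columns of the result A
    required = {tuple(t[x] for x in c_attrs) for t in c}
    groups = {}
    for t in b:
        key = tuple(t[x] for x in a_attrs)
        groups.setdefault(key, set()).add(tuple(t[x] for x in c_attrs))
    # Keep a row of A only if it is paired in B with every row of C.
    return [dict(zip(a_attrs, key)) for key, seen in groups.items() if required <= seen]


# π cust_name,branch_name (Depositor ⋈ Account), taken from the tables on the slide.
cust_branch = [
    {"cust_name": "Manas", "branch_name": "Bhubaneswar Main"},
    {"cust_name": "Ramesh", "branch_name": "Shastri Nagar"},
    {"cust_name": "Rishi", "branch_name": "India Gate"},
    {"cust_name": "Mahesh", "branch_name": "Juhu"},
    {"cust_name": "Mahesh", "branch_name": "Mumbai Main"},
]
# π branch_name (σ branch_city='Mumbai' (Branch))
mumbai_branches = [{"branch_name": "Juhu"}, {"branch_name": "Mumbai Main"}]

result = divide(cust_branch, mumbai_branches, ["branch_name"])
```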

Assignment Operator(←)

It works like assignment in a programming language. In relational algebra, the assignment operator gives a name to a relation. It is denoted by the (←) symbol.

Assignment must always be made to a temporary relation variable. The result of the expression on the right of the ← symbol is assigned to the relation variable on the left of the ← symbol.

With the assignment operator, a query can be written as a sequential program consisting of:
• a series of assignments,
• followed by an expression whose value is displayed as the result of the query

π cust_name,branch_name (Depositor ⋈ Account) ÷ π branch_name (σ branch_city='Mumbai' (Branch))
can be simplified as:
Temp1 ← π cust_name,branch_name (Depositor ⋈ Account)
Temp2 ← π branch_name (σ branch_city='Mumbai' (Branch))
Result = Temp1 ÷ Temp2

Database Management System 12

JOIN

Generalized Projection
Aggregate Functions(g)
Join
Inner Join
  Theta Join
  Equi Join
  Natural Join
Outer Join
  Left Outer Join
  Right Outer Join
  Full Outer Join
Self Join

Chittaranjan Pradhan
School of Computer Engineering,
KIIT University

Generalized Projection

The generalized-projection operation extends the projection operation by allowing arithmetic functions to be used in the projection list. The general form of generalized projection is:

π F1,F2,...,Fn (E)

Ex: Let Emp = (ssn, salary, deduction, years_service) be a relation. A report may be required to show net_salary = salary - deduction, bonus = 2000 * years_service and tax = 0.25 * salary

REPORT ← ρ (ssn,net_salary,bonus,tax) (π ssn, salary-deduction, 2000*years_service, 0.25*salary (Emp))
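
The REPORT expression amounts to computing one derived row per Emp tuple and naming the new columns. A small sketch, assuming Emp is a list of dictionaries; the two employee rows are made up for illustration.

```python
# Made-up Emp tuples, for illustration only.
emp = [
    {"ssn": "S1", "salary": 40000.0, "deduction": 5000.0, "years_service": 3},
    {"ssn": "S2", "salary": 60000.0, "deduction": 8000.0, "years_service": 10},
]

# Generalized projection followed by the rename to (ssn, net_salary, bonus, tax).
report = [
    {
        "ssn": e["ssn"],
        "net_salary": e["salary"] - e["deduction"],
        "bonus": 2000 * e["years_service"],
        "tax": 0.25 * e["salary"],
    }
    for e in emp
]
```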

Aggregate Functions(g)

Aggregate functions take a collection of values and return a single value as a result. NULL values do not participate in the aggregate functions. The general form of the aggregate operation is:
grouping_attribute g aggregate_functions (R)

Let Works = (emp_id, ename, salary, branch_name)

Query: Find the total sum of salaries of all the employees
Ans: g SUM(salary) (Works)

Query: Find the total sum of salaries of all the employees in each branch
Ans: branch_name g SUM(salary) (Works)

Query: Find the maximum salary for the employees at each branch, in addition to the sum of the salaries
Ans: branch_name g SUM(salary),MAX(salary) (Works)

Query: Find the number of employees working
Ans: g COUNT(emp_id) (Works)
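
Grouping and aggregation over Works could be sketched as below. The helper `aggregate`, its argument order, and the sample Works rows are assumptions made for this example; None stands in for NULL. Note one simplification: a group whose values are all NULL is dropped entirely in this sketch.

```python
from collections import defaultdict


def aggregate(relation, group_by, attr, funcs):
    """Group tuples by the group_by attributes and apply each function to attr."""
    groups = defaultdict(list)
    for t in relation:
        if t[attr] is not None:  # NULL values do not participate
            groups[tuple(t[g] for g in group_by)].append(t[attr])
    return {
        key: {name: f(values) for name, f in funcs.items()}
        for key, values in groups.items()
    }


# Made-up Works tuples; the last employee has a NULL salary.
works = [
    {"emp_id": 1, "ename": "Asha", "salary": 30000, "branch_name": "Juhu"},
    {"emp_id": 2, "ename": "Bikash", "salary": 50000, "branch_name": "Juhu"},
    {"emp_id": 3, "ename": "Chitra", "salary": 40000, "branch_name": "Mumbai Main"},
    {"emp_id": 4, "ename": "Dev", "salary": None, "branch_name": "Mumbai Main"},
]

total = aggregate(works, [], "salary", {"SUM": sum})
per_branch = aggregate(works, ["branch_name"], "salary", {"SUM": sum, "MAX": max})
headcount = aggregate(works, [], "emp_id", {"COUNT": len})
```

With an empty grouping list, every tuple falls into a single group keyed by the empty tuple, which matches the queries written without a grouping attribute.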

Join

The join operation is used to connect data across relations. Tables are joined on columns that have the same datatype and data width in the tables.

The join operation joins two relations by merging those tuples from the two relations that satisfy a given condition. The condition is defined on attributes belonging to the relations to be joined.

Different categories of join are:
• Inner Join
• Outer Join
• Self Join

Inner Join

In the inner join, tuples that have no matching tuple in the other relation do not appear in the result. Tuples with NULL values in the join attributes are also eliminated. The different types of inner join are:
• Theta Join
• Equi Join
• Natural Join

Theta Join(⋈θ)

The theta join is a join with a specified condition involving a column from each relation. This condition specifies that the two columns should be compared in some way.

The comparison operator can be any of the six: <, ≤, >, ≥, = and ≠

Theta join is denoted by the (⋈θ) symbol. The general form of theta join is:
R ⋈θ S = π all (σ θ (R × S))
• Degree (Result) = Degree (R) + Degree (S)
• Cardinality (Result) ≤ Cardinality(R) × Cardinality(S)

Theta Join(⋈θ)...

Account
acc_no   branch_name        balance
A101     Bhubaneswar Main   100,000.00
A102     Shastri Nagar      50,000.00
A103     India Gate         5,000,000.00
A104     Juhu               600,000.00
A105     Mumbai Main        10,000,000.00

Loan
loan_no   branch_name        amount
L201      Bhubaneswar Main   50,000,000.00
L202      Bhubaneswar Main   5,000,000.00
L203      Mumbai Main        100,000,000.00
L204      Juhu               60,000,000.00

Q: Find the account details as well as the loan details for the situations where the deposited balance is greater than or equal to the borrowed amount

Account ⋈ balance≥amount Loan
acc_no   branch_name   balance         loan_no   branch_name        amount
A103     India Gate    5,000,000.00    L202      Bhubaneswar Main   5,000,000.00
A105     Mumbai Main   10,000,000.00   L202      Bhubaneswar Main   5,000,000.00
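
The theta join above is a selection over the Cartesian product, which can be sketched with two nested loops. The function `theta_join` is a hypothetical helper; it returns pairs of tuples rather than merged rows, because both relations have a branch_name attribute and merging would hide one of them.

```python
def theta_join(r, s, theta):
    """R ⋈θ S: every pair (tr, ts) from R × S for which theta(tr, ts) holds."""
    return [(tr, ts) for tr in r for ts in s if theta(tr, ts)]


ACCOUNT = [
    {"acc_no": "A101", "branch_name": "Bhubaneswar Main", "balance": 100_000.00},
    {"acc_no": "A102", "branch_name": "Shastri Nagar", "balance": 50_000.00},
    {"acc_no": "A103", "branch_name": "India Gate", "balance": 5_000_000.00},
    {"acc_no": "A104", "branch_name": "Juhu", "balance": 600_000.00},
    {"acc_no": "A105", "branch_name": "Mumbai Main", "balance": 10_000_000.00},
]
LOAN = [
    {"loan_no": "L201", "branch_name": "Bhubaneswar Main", "amount": 50_000_000.00},
    {"loan_no": "L202", "branch_name": "Bhubaneswar Main", "amount": 5_000_000.00},
    {"loan_no": "L203", "branch_name": "Mumbai Main", "amount": 100_000_000.00},
    {"loan_no": "L204", "branch_name": "Juhu", "amount": 60_000_000.00},
]

# θ is the condition balance ≥ amount.
matches = theta_join(ACCOUNT, LOAN, lambda a, l: a["balance"] >= l["amount"])
pairs = [(a["acc_no"], l["loan_no"]) for a, l in matches]
```

The result has 2 pairs out of the 5 × 4 = 20 in the product, consistent with the cardinality bound stated for the theta join.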

Equi Join(⋈=)

The equi join is the theta join based on equality of specified columns. That is, the equi join is the special type of theta join where the comparison operator is =

The general form of equi join is:
R ⋈= S = π all (σ = (R × S))
• Degree (Result) = Degree (R) + Degree (S)
• Cardinality (Result) ≤ Cardinality(R) × Cardinality(S)

Equi Join(⋈=)...

Borrower
cust_name   loan_no
Ramesh      L201
Ramesh      L202
Mahesh      L203
Rishi       L204

Loan
loan_no   branch_name        amount
L201      Bhubaneswar Main   50,000,000.00
L202      Bhubaneswar Main   5,000,000.00
L203      Mumbai Main        100,000,000.00
L204      Juhu               60,000,000.00

Q: Find the customer names and their loan details

Borrower ⋈ Borrower.loan_no=Loan.loan_no Loan
cust_name   Borrower.loan_no   Loan.loan_no   branch_name        amount
Ramesh      L201               L201           Bhubaneswar Main   50,000,000.00
Ramesh      L202               L202           Bhubaneswar Main   5,000,000.00
Mahesh      L203               L203           Mumbai Main        100,000,000.00
Rishi       L204               L204           Juhu               60,000,000.00
JOIN
Natural Join(o
n)
Chittaranjan Pradhan

Generalized Projection
Natural Join(o
n) Aggregate
Functions(g)
To perform natural join on two relations, they should contain at Join

least one common attributes. It is just like the equi join with the Inner Join
Theta Join
elimination of the common attributes. The natural join is Equi Join

denoted by (on) symbol Natural Join

Outer Join
Left Outer Join

The general form of theta join is: Right Outer Join


Full Outer Join
Ron S = π all−common_attributes (σ = (R × S)) Self Join

• Degree (Result) = Degree (R) + Degree (S) - Degree (R ∩


S)
• Cardinality (Result) ≤ Cardinality(R) × Cardinality(S)
The general form of the natural join can also be represented
as:
Ron S = π all (R o
n S)

Natural Join (⋈)...

Borrower
cust_name  loan_no
Ramesh     L201
Ramesh     L202
Mahesh     L203
Rishi      L204

Loan
loan_no  branch_name       amount
L201     Bhubaneswar Main  50,000,000.00
L202     Bhubaneswar Main  5,000,000.00
L203     Mumbai Main       100,000,000.00
L204     Juhu              60,000,000.00

Q: Find the customer names and their loan details

Borrower ⋈ Loan
cust_name  loan_no  branch_name       amount
Ramesh     L201     Bhubaneswar Main  50,000,000.00
Ramesh     L202     Bhubaneswar Main  5,000,000.00
Mahesh     L203     Mumbai Main       100,000,000.00
Rishi      L204     Juhu              60,000,000.00

Outer Join

The outer join is an extension of the natural join operation that deals with missing information. The outer join consists of two steps:
• First, a natural join is executed
• Then, if any record in one relation does not match a record from the other relation in the natural join, that unmatched record is added to the join result, and the additional columns are filled with NULLs

The different types of outer join are:
• Left Outer Join
• Right Outer Join
• Full Outer Join

Left Outer Join (⟕)

The left outer join preserves all tuples in the left relation. The left outer join is denoted by the symbol ⟕

All information from the left relation is present in the result of the left outer join

Left Outer Join...

Customer
cust_name  cust_street    cust_city
Rishi      India Gate     New Delhi
Sarthak    M. G. Road     Bangalore
Manas      Shastri Nagar  Bhubaneswar
Ramesh     M. G. Road     Bhubaneswar
Mahesh     Juhu           Mumbai

Borrower
cust_name  loan_no
Ramesh     L201
Ramesh     L202
Mahesh     L203
Rishi      L204

Q: Find the details of all customers, both those who have taken loans and those who have not

Customer ⟕ Borrower
cust_name  cust_street    cust_city    loan_no
Rishi      India Gate     New Delhi    L204
Ramesh     M. G. Road     Bhubaneswar  L201
Ramesh     M. G. Road     Bhubaneswar  L202
Mahesh     Juhu           Mumbai       L203
Sarthak    M. G. Road     Bangalore    NULL
Manas      Shastri Nagar  Bhubaneswar  NULL
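
The two steps of the outer join (natural join first, then padding of unmatched rows) can be sketched as follows. This is a minimal illustration and not the author's code: relations are lists of dicts, None stands in for NULL, both inputs are assumed to be non-empty so the schema can be read from the first row, and the name left_outer_join is hypothetical.

```python
def left_outer_join(r, s):
    """Natural join of r and s that keeps unmatched rows of r, padded with None."""
    r_attrs = list(r[0])
    s_attrs = list(s[0])
    common = [k for k in r_attrs if k in s_attrs]     # shared attributes
    s_only = [k for k in s_attrs if k not in common]  # columns contributed by s
    result = []
    for a in r:
        matches = [b for b in s if all(a[k] == b[k] for k in common)]
        if matches:
            # Step 1: rows produced by the natural join
            result.extend({**a, **{k: b[k] for k in s_only}} for b in matches)
        else:
            # Step 2: unmatched left row, extra columns filled with NULL (None)
            result.append({**a, **{k: None for k in s_only}})
    return result

customer = [
    {"cust_name": "Rishi", "cust_street": "India Gate", "cust_city": "New Delhi"},
    {"cust_name": "Sarthak", "cust_street": "M. G. Road", "cust_city": "Bangalore"},
    {"cust_name": "Manas", "cust_street": "Shastri Nagar", "cust_city": "Bhubaneswar"},
    {"cust_name": "Ramesh", "cust_street": "M. G. Road", "cust_city": "Bhubaneswar"},
    {"cust_name": "Mahesh", "cust_street": "Juhu", "cust_city": "Mumbai"},
]
borrower = [
    {"cust_name": "Ramesh", "loan_no": "L201"},
    {"cust_name": "Ramesh", "loan_no": "L202"},
    {"cust_name": "Mahesh", "loan_no": "L203"},
    {"cust_name": "Rishi", "loan_no": "L204"},
]

left = left_outer_join(customer, borrower)        # Customer left outer join Borrower
no_loan = [row["cust_name"] for row in left if row["loan_no"] is None]
print(no_loan)
```

A right outer join can be obtained from the same sketch by swapping the arguments, since it preserves the tuples of the other operand.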

Right Outer Join (⟖)

The right outer join preserves all tuples in the right relation. The right outer join is denoted by the symbol ⟖

All information from the right relation is present in the result of the right outer join

Right Outer Join...

Borrower
cust_name  loan_no
Ramesh     L201
Ramesh     L202
Mahesh     L203
Rishi      L204

Customer
cust_name  cust_street    cust_city
Rishi      India Gate     New Delhi
Sarthak    M. G. Road     Bangalore
Manas      Shastri Nagar  Bhubaneswar
Ramesh     M. G. Road     Bhubaneswar
Mahesh     Juhu           Mumbai

Q: Find the details of all customers, both those who have taken loans and those who have not

Borrower ⟖ Customer
cust_name  loan_no  cust_street    cust_city
Rishi      L204     India Gate     New Delhi
Ramesh     L201     M. G. Road     Bhubaneswar
Ramesh     L202     M. G. Road     Bhubaneswar
Mahesh     L203     Juhu           Mumbai
Sarthak    NULL     M. G. Road     Bangalore
Manas      NULL     Shastri Nagar  Bhubaneswar

Full Outer Join (⟗)

The full outer join preserves all tuples in both relations. The full outer join is denoted by the symbol ⟗

All information from both relations is present in the result of the full outer join

Self Join

The self join is similar to the theta join. It joins a relation to itself by a condition. The self join can be viewed as a join of two copies of the same relation

The general form of the self join is:
R ⋈_{θ} R = π_{all} (σ_{θ} (R × R))

Thus, the self join creates two aliases (copies) of the same relation, then performs a theta join on a condition over the attributes of these two copies

Self Join...

Customer
cust_name  cust_street    cust_city
Rishi      India Gate     New Delhi
Sarthak    M. G. Road     Bangalore
Manas      Shastri Nagar  Bhubaneswar
Ramesh     M. G. Road     Bhubaneswar
Mahesh     Juhu           Mumbai

Q: Find the details of each customer together with the other customers staying in the same cust_city

C1 ⋈_{C1.cust_city = C2.cust_city ∧ C1.cust_name ≠ C2.cust_name} C2
C1.cust_name  C1.cust_street  C1.cust_city  C2.cust_name  C2.cust_street  C2.cust_city
Manas         Shastri Nagar   Bhubaneswar   Ramesh        M. G. Road      Bhubaneswar
Ramesh        M. G. Road      Bhubaneswar   Manas         Shastri Nagar   Bhubaneswar

Here C1 and C2 are two aliases of Customer. The condition C1.cust_name ≠ C2.cust_name keeps a customer from being paired with itself; with the city equality alone, every customer would also appear joined to its own row

Database Management System 13
Query Using Relational Algebra

Chittaranjan Pradhan
School of Computer Engineering,
KIIT University

Query Using Relational Algebra

Employee Database
Emp(empNo, name)
Project(projectNo, pName, manager)
Assigned_To(projectNo, empNo)

Query: Find the empNo of employees working on project 'comp01'
π_{empNo} (σ_{projectNo='comp01'} (Assigned_To))

Query: Find the details of employees working on project 'comp01'
π_{empNo,name} (σ_{projectNo='comp01'} (Emp ⋈ Assigned_To))

Query: Obtain the details of employees working on the database project
π_{empNo,name} (σ_{pName='database'} (Emp ⋈ Assigned_To ⋈ Project))

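Query Using Relational Algebra...

Query: Find the details of employees working on both the 'comp01' and 'comp02' projects
π_{empNo,name} (σ_{projectNo='comp01'} (Emp ⋈ Assigned_To)) ∩ π_{empNo,name} (σ_{projectNo='comp02'} (Emp ⋈ Assigned_To))

A single selection with the condition projectNo='comp01' ∧ projectNo='comp02' would always be empty, because no tuple can carry two different projectNo values; the two selections are therefore evaluated separately and intersected

Query: Find the empNo of employees who don't work on project 'comp01'
π_{empNo} (Assigned_To) − π_{empNo} (σ_{projectNo='comp01'} (Assigned_To))

If every employee is assigned to at most one project, this query can also be written as:
π_{empNo} (σ_{projectNo≠'comp01'} (Assigned_To))
Otherwise, the selection form also returns employees who work on 'comp01' as well as on some other project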
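
The set-difference query for employees who don't work on 'comp01' can be sketched with Python sets. The Assigned_To tuples below are hypothetical sample data, not taken from the slides. The sketch also shows that selecting rows with projectNo ≠ 'comp01' gives the same answer only when no employee has more than one assignment.

```python
# Assigned_To(projectNo, empNo) as a set of tuples (hypothetical sample data)
assigned_to = {
    ("comp01", "E1"), ("comp02", "E1"),   # E1 works on both projects
    ("comp02", "E2"),
    ("comp01", "E3"),
}

# pi_empNo(Assigned_To)
all_emp = {emp for _, emp in assigned_to}

# pi_empNo(sigma_{projectNo='comp01'}(Assigned_To))
on_comp01 = {emp for proj, emp in assigned_to if proj == "comp01"}

# Set difference: employees with no 'comp01' assignment at all
not_on_comp01 = all_emp - on_comp01

# Selection with != keeps E1 as well, because of E1's 'comp02' row
selected = {emp for proj, emp in assigned_to if proj != "comp01"}

print(sorted(not_on_comp01), sorted(selected))
```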

Query Using Relational Algebra...

Sailor Database
Sailors(sid, sname, rating, age)
Boats(bid, bname, color)
Reserves(sid, bid, day)

Query: Find the names of sailors who've reserved boat 105
π_{sname} (σ_{bid=105} (Reserves ⋈ Sailors))

This query can also be written as:
π_{sname} (σ_{bid=105} (Reserves) ⋈ Sailors)

Query: Find the names of sailors who've reserved a green boat
π_{sname} (σ_{color='green'} (Boats ⋈ Reserves ⋈ Sailors))

This query can also be written as:
π_{sname} ((σ_{color='green'} (Boats)) ⋈ Reserves ⋈ Sailors)

Query Using Relational Algebra...

Query: Find the sailor ids of the sailors who've reserved all boats
π_{sid,bid} (Reserves) ÷ π_{bid} (Boats)

Query: Find the names of sailors who've reserved all boats
1. ρ_{Temp} (π_{sid,bid} (Reserves) ÷ π_{bid} (Boats))
2. π_{sname} (Temp ⋈ Sailors)

This query can also be written as:
π_{sname,bid} (Sailors ⋈ Reserves) ÷ π_{bid} (Boats)
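
The division operator used in the "all boats" queries can be sketched as below. The helper divide and the sample sid/bid values are assumptions made for illustration; the dividend is modelled as a set of (x, y) pairs and the divisor as a set of y values.

```python
def divide(r, s):
    """Relational division: r is a set of (x, y) pairs, s is a set of y values.

    Returns every x that appears in r together with each y in s.
    """
    candidates = {x for x, _ in r}
    return {x for x in candidates if all((x, y) in r for y in s)}

# pi_{sid,bid}(Reserves) and pi_{bid}(Boats) with hypothetical values
reserves = {(22, 101), (22, 102), (22, 103), (31, 102), (31, 103), (58, 103)}
boats = {101, 102, 103}

reserved_all = divide(reserves, boats)
print(reserved_all)
```

Only sailor 22 is paired with every bid in boats, so only that sid survives the division.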

Query Using Relational Algebra...

Shipment Database
Customer(cust_id, cust_name, annual_revenue)
Truck(truckno, driver_name)
City(city_name, population)
Shipment(shipment_no, cust_id, weight, truckno, destination_city)

Query: Find the list of shipment numbers for shipments weighing over 20 pounds
π_{shipment_no} (σ_{weight>20 pounds} (Shipment))

Query: Find the names of customers with more than $10 million in annual revenue
π_{cust_name} (σ_{annual_revenue>$10 million} (Customer))

Query Using Relational Algebra...

Query: Find the driver of truck 45
π_{driver_name} (σ_{truckno=45} (Truck))

Query: Find the names of cities which have received shipments weighing over 100 pounds
π_{destination_city} (σ_{weight>100 pounds} (Shipment))

Query: Find the name and annual revenue of customers who have sent shipments weighing over 100 pounds
π_{cust_name,annual_revenue} (σ_{weight>100 pounds} (Customer ⋈ Shipment))

Query: Find the truck numbers of trucks which have carried shipments weighing over 100 pounds
π_{truckno} (σ_{weight>100 pounds} (Shipment))

Query Using Relational Algebra...

Query: Find the names of drivers who have delivered shipments weighing over 100 pounds
π_{driver_name} (σ_{weight>100 pounds} (Shipment ⋈ Truck))

Query: List the cities which have received shipments from customers having over $15 million in annual revenue
π_{destination_city} (σ_{annual_revenue>$15 million} (Customer ⋈ Shipment))

Query: List the customers having over $5 million in annual revenue who have sent shipments weighing greater than 1 pound
π_{cust_name} (σ_{annual_revenue>$5 million} (Customer) ⋈ σ_{weight>1 pound} (Shipment))

This query can also be written as:
π_{cust_name} (σ_{annual_revenue>$5 million ∧ weight>1 pound} (Customer ⋈ Shipment))

Query Using Relational Algebra...

Query: List the customers whose shipments have been delivered by truck driver Ramesh
π_{cust_name} (σ_{driver_name='Ramesh'} (Customer ⋈ Shipment ⋈ Truck))

Query: Find the customers having over $5 million in annual revenue who have sent shipments weighing less than 1 pound or have sent a shipment to Bhubaneswar
π_{cust_name} (σ_{annual_revenue>$5 million} (Customer) ⋈ σ_{weight<1 pound ∨ destination_city='Bhubaneswar'} (Shipment))

Query: Find the customers who have sent shipments to every city with population over 500000
π_{cust_name,destination_city} (Customer ⋈ Shipment) ÷ π_{city_name} (σ_{population>500000} (City))

Query Using Relational Algebra...

Query: List the drivers who have delivered shipments for customers with annual revenue over $20 million to cities with population over 1 million
π_{driver_name} (σ_{annual_revenue>$20 million} (Customer) ⋈ Shipment ⋈ Truck ⋈ (σ_{population>1 million} (City)))

This query can also be written as:
π_{driver_name} (σ_{annual_revenue>$20 million ∧ population>1 million} (Customer ⋈ Shipment ⋈ Truck ⋈ City))

Query: Find the cities which have received shipments from every customer
π_{destination_city,cust_id} (Shipment) ÷ π_{cust_id} (Customer)

Query: Find the drivers who have delivered shipments to every city
π_{driver_name,destination_city} (Truck ⋈ Shipment) ÷ π_{city_name} (City)

Database Management System 14
Relational Calculus

Chittaranjan Pradhan
School of Computer Engineering,
KIIT University

Relational Calculus

Relational calculus is non-procedural

In relational calculus, a query is solved by defining a solution relation in a single step

Relational calculus is based on the well-known predicate calculus, which extends propositional calculus (a method of calculating with sentences or declarations) with variables and quantifiers

The types of relational calculus are:
• Tuple Relational Calculus (TRC)
• Domain Relational Calculus (DRC)

Tuple Relational Calculus (TRC)

A tuple variable is a variable that takes on tuples of a particular relation schema as values

A tuple relational calculus query has the form:
{T | P(T)}
The result of this query is the set of all tuples t for which the formula P(T) evaluates to TRUE with T = t

Sailors(sid, sname, rating, age)

Query: Find all the sailors with a rating above 4
{S | S ∈ Sailors ∧ S.rating > 4}
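
A TRC query of the form {T | P(T)} reads much like a set comprehension: the tuple variable ranges over a relation and the formula acts as the filter. The sketch below illustrates this for the rating query; the Sailors rows are hypothetical sample data.

```python
from collections import namedtuple

# Sailors(sid, sname, rating, age); the rows are made up for illustration
Sailor = namedtuple("Sailor", "sid sname rating age")
sailors = {
    Sailor(22, "Dustin", 7, 45.0),
    Sailor(29, "Brutus", 1, 33.0),
    Sailor(31, "Lubber", 8, 55.5),
}

# {S | S in Sailors and S.rating > 4}
result = {s for s in sailors if s.rating > 4}

print(sorted(s.sname for s in result))
```

The comprehension states which tuples belong to the answer without prescribing an evaluation order, which mirrors the non-procedural character of the calculus.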

Tuple Relational Calculus (TRC)...

Let Rel be a relation name, R and S be tuple variables, a an attribute of R, b an attribute of S, and op an operator in the set {<, ≤, >, ≥, =, ≠}. An atomic formula is one of the following:
• R ∈ Rel
• R.a op S.b
• R.a op Constant or Constant op R.a

To represent the join and division of relational algebra in relational calculus, we need quantifiers: the existential quantifier for join and the universal quantifier for division

A quantifier quantifies or indicates the quantity of something

The existential quantifier (∃) states that at least one instance of a particular type of thing exists

Similarly, the universal quantifier (∀) states that some condition applies to all or to every row of some type

Tuple Relational Calculus (TRC)...

A formula is recursively defined by using the following rules:
• Any atomic formula is a formula
• If p and q are formulae, then ¬p, p ∧ q, p ∨ q, and p ⇒ q are also formulae
• If p is a formula that contains T as a variable, then ∃T(p) and ∀T(p) are also formulae

The quantifiers ∃ and ∀ are said to bind the tuple variable T, whereas a variable is said to be free in a formula if the formula does not contain an occurrence of a quantifier that binds it

In most queries, the output is expressed through the free variables

Safe Expressions

Whenever we use universal quantifiers or existential quantifiers in a calculus expression, we must make sure that the resulting expression makes sense

A safe expression in relational calculus is one that is guaranteed to yield a finite number of tuples as its result; otherwise, the expression is called unsafe

That means an expression is said to be safe if all values in its result are from the domain of the expression

Queries

Sailor Database
Sailors(sid, sname, rating, age)
Boats(bid, bname, color)
Reserves(sid, bid, day)

Query: Find the names & ages of sailors with a rating above 4
{T | ∃S ∈ Sailors (S.rating > 4 ∧ T.sname = S.sname ∧ T.age = S.age)}

Query: Find the sailor name, boat id & reservation date for each reservation
{T | ∃R ∈ Reserves ∃S ∈ Sailors (R.sid = S.sid ∧ T.sname = S.sname ∧ T.bid = R.bid ∧ T.day = R.day)}

Query: Find the names of sailors who have reserved boat 111
{T | ∃R ∈ Reserves ∃S ∈ Sailors (R.sid = S.sid ∧ R.bid = 111 ∧ T.sname = S.sname)}

Queries...

Query: Find the names of sailors who have reserved a green boat
{T | ∃S ∈ Sailors ∃R ∈ Reserves (R.sid = S.sid ∧ T.sname = S.sname ∧ ∃B ∈ Boats (B.bid = R.bid ∧ B.color = 'green'))}

This query can also be written as:
{T | ∃S ∈ Sailors ∃R ∈ Reserves ∃B ∈ Boats (R.sid = S.sid ∧ B.bid = R.bid ∧ B.color = 'green' ∧ T.sname = S.sname)}

Query: Find the names of sailors who have reserved at least 2 boats
{T | ∃S ∈ Sailors ∃R1 ∈ Reserves ∃R2 ∈ Reserves (S.sid = R1.sid ∧ R1.sid = R2.sid ∧ R1.bid ≠ R2.bid ∧ T.sname = S.sname)}

Queries...

Query: Find the names of sailors who have reserved all boats
{T | ∃S ∈ Sailors ∀B ∈ Boats (∃R ∈ Reserves (S.sid = R.sid ∧ R.bid = B.bid ∧ T.sname = S.sname))}

Query: Find sailors who have reserved all green boats
{S | S ∈ Sailors ∧ ∀B ∈ Boats (B.color = 'green' ⇒ (∃R ∈ Reserves (S.sid = R.sid ∧ R.bid = B.bid)))}
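
The quantifiers in the "all green boats" query map naturally onto Python's all() and any(), with the implication p ⇒ q written as (not p) or q. The following is a sketch over hypothetical data (sid to sname, bid to color, and a set of (sid, bid) reservations), not an excerpt from the slides.

```python
sailors = {22: "Dustin", 31: "Lubber"}               # sid -> sname
boats = {101: "blue", 102: "green", 103: "green"}    # bid -> color
reserves = {(22, 101), (22, 102), (22, 103), (31, 102)}  # (sid, bid)

# {S | S in Sailors and forall B in Boats
#        (B.color = 'green' => exists R in Reserves (R.sid = S.sid and R.bid = B.bid))}
reserved_all_green = {
    sid
    for sid in sailors
    if all(
        color != "green"                                        # not p ...
        or any(r_sid == sid and r_bid == bid                    # ... or q
               for r_sid, r_bid in reserves)
        for bid, color in boats.items()
    )
}

print({sailors[sid] for sid in reserved_all_green})
```

Sailor 31 has reserved only one of the two green boats, so the universal condition fails for that sailor.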

Domain Relational Calculus (DRC)

In tuple relational calculus, the variables range over the tuples, whereas in domain relational calculus, the variables range over the domains

The domain variables are the ones which range over the underlying domains instead of over the relations

The domain relational calculus query has the form:
{<x1, x2, ..., xn> | P(x1, x2, ..., xn)}
where each xi is either a domain variable or a constant, and P(x1, x2, ..., xn) is a domain relational calculus formula whose only free variables are the variables among the xi, 1 ≤ i ≤ n

The result of this query is the set of all tuples <x1, x2, ..., xn> for which the formula evaluates to TRUE

Domain Relational Calculus (DRC)...

Let Rel be a relation name, X and Y be domain variables, and op an operator in the set {<, ≤, >, ≥, =, ≠}. An atomic formula in domain relational calculus is one of the following:
• <x1, x2, ..., xn> ∈ Rel
• X op Y
• X op Constant or Constant op X

A formula is recursively defined by using the following rules:
• Any atomic formula is a formula
• If p and q are formulae, then ¬p, p ∧ q, p ∨ q, and p ⇒ q are also formulae
• If p is a formula that contains X as a domain variable, then ∃X(p) and ∀X(p) are also formulae

The quantifiers ∃ and ∀ are said to bind the domain variable X, whereas a variable is said to be free in a formula if the formula does not contain an occurrence of a quantifier that binds it

Queries

Sailor Database
Sailors(sid, sname, rating, age)
Boats(bid, bname, color)
Reserves(sid, bid, day)

Query: Find all sailors with a rating above 7
{<I, N, T, A> | <I, N, T, A> ∈ Sailors ∧ T > 7}

Query: Find the names of sailors who reserved boat 111
{<N> | ∃I, T, A (<I, N, T, A> ∈ Sailors ∧ ∃<Ir, Br, D> ∈ Reserves (Ir = I ∧ Br = 111))}

This query can also be written as:
{<N> | ∃I, T, A (<I, N, T, A> ∈ Sailors ∧ ∃D (<I, 111, D> ∈ Reserves))}

Queries...

Query: Find the names of sailors who have reserved a green boat
{<N> | ∃I, T, A (<I, N, T, A> ∈ Sailors ∧ ∃Br, D (<I, Br, D> ∈ Reserves ∧ ∃Bn (<Br, Bn, 'green'> ∈ Boats)))}

Query: Find the names of sailors who have reserved at least 2 boats
{<N> | ∃I, T, A (<I, N, T, A> ∈ Sailors ∧ ∃Br1, Br2, D1, D2 (<I, Br1, D1> ∈ Reserves ∧ <I, Br2, D2> ∈ Reserves ∧ Br1 ≠ Br2))}

Query: Find the names of sailors who have reserved all boats
{<N> | ∃I, T, A (<I, N, T, A> ∈ Sailors ∧ ∀<B, Bn, C> ∈ Boats (∃<Ir, Br, D> ∈ Reserves (I = Ir ∧ Br = B)))}

Query: Find sailors who have reserved all green boats
{<I, N, T, A> | <I, N, T, A> ∈ Sailors ∧ ∀<B, Bn, C> ∈ Boats (C = 'green' ⇒ ∃<Ir, Br, D> ∈ Reserves (I = Ir ∧ Br = B))}

Database Management System 15
Database Design

Chittaranjan Pradhan
School of Computer Engineering,
KIIT University

Database Design

• First, fully characterize the data requirements of the prospective database users, which usually results in textual descriptions
• Next, choose the ER model to translate these requirements into a conceptual schema of the database
• In the logical design phase, map the high-level conceptual schema onto the implementation data model of the database system that will be used. The implementation data model is typically the relational data model
• Finally, use the resulting system-specific database schema in the subsequent physical design phase, in which the physical features of the database are specified
• In designing a database schema, the major pitfalls which should be avoided are:
  • redundancy: repetition of information
  • incompleteness: certain aspects of the enterprise may not be modeled due to difficulty or complexity

Bad Database Design/Concept of Anomalies

Database anomalies are the problems in relations that occur due to redundancy in the relations. These anomalies affect the process of inserting, deleting and updating data in the relations

The intention of relational database theory is to prevent anomalies from occurring in a database

Student database
Name     Course  Phone_no  Major        Prof    Grade
Mahesh   353     1234      Comp sc      Alok    A
Nitish   329     2435      Chemistry    Pratap  B
Mahesh   328     1234      Comp sc      Samuel  B
Harish   456     4665      Physics      James   A
Pranshu  293     4437      Decision sc  Sachin  C
Prateek  491     8788      Math         Saurav  B
Prateek  356     8788      Math         Sunil   In prog
Mahesh   492     1234      Comp sc      Paresh  In prog
Sumit    379     4575      English      Rakesh  C

Bad Database Design/Concept of Anomalies...

Insertion Anomaly
It is the anomaly in which the user cannot insert a fact about an entity until he/she has an additional fact about another entity. In other words, there are circumstances in which certain facts cannot be recorded at all

Ex: We cannot record the details of a new prof without assigning a course to him

Deletion Anomaly
It is the anomaly in which the deletion of facts about one entity automatically deletes facts about another entity

Ex: If we delete the information about course 293, the information about prof Sachin is automatically deleted as well, which is not what we intended

Bad Database Design/Concept of Anomalies...

Update Anomaly
It is the anomaly in which a modification of the value of a specific attribute requires modification of all records in which that value occurs. In other words, the same data can be expressed in multiple rows. Therefore, updates to the table may result in logical inconsistencies

Ex: If the phone_no of Mahesh is updated in a single row only, the update leaves the database in an inconsistent state, so that queries for the phone_no of Mahesh give conflicting answers

Functional Dependency (FD)

Functional dependency is the building block of normalization principles

Attribute(s) A in a relation schema R functionally determine another attribute(s) B in R if, for a given value a1 of A, there is a single, specific value b1 of B in relation r of R

The symbolic expression of this FD is:
A→B
where A (LHS of the FD) is known as the determinant and B (RHS of the FD) is known as the dependent

In other words, if A functionally determines B in R, then it is invalid to have two or more tuples that have the same A value but different B values in R

Functional Dependency (FD)...

From the Student schema, we can infer that Name→Phone_no because all tuples of Student with a given Name value also have the same Phone_no value

Likewise, it can also be inferred that Prof→Grade. At the same time, notice that Grade does not determine Prof

When the determinant or dependent in an FD is a composite attribute, the constituent atomic attributes are enclosed in braces, as shown in the following example:
{Name, Course}→Phone_no

An FD is a constraint between two sets of attributes in a relation from a database
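
The FDs read off the Student relation can be checked mechanically against its rows. The sketch below uses the Student data from the earlier slide; the function name fd_holds is an assumption for illustration. Note that a relation instance can only refute an FD: a check that passes on these nine rows does not prove the FD as a constraint on the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for this lhs value
        if seen.setdefault(key, value) != value:
            return False
    return True

columns = ("Name", "Course", "Phone_no", "Major", "Prof", "Grade")
data = [
    ("Mahesh", 353, 1234, "Comp sc", "Alok", "A"),
    ("Nitish", 329, 2435, "Chemistry", "Pratap", "B"),
    ("Mahesh", 328, 1234, "Comp sc", "Samuel", "B"),
    ("Harish", 456, 4665, "Physics", "James", "A"),
    ("Pranshu", 293, 4437, "Decision sc", "Sachin", "C"),
    ("Prateek", 491, 8788, "Math", "Saurav", "B"),
    ("Prateek", 356, 8788, "Math", "Sunil", "In prog"),
    ("Mahesh", 492, 1234, "Comp sc", "Paresh", "In prog"),
    ("Sumit", 379, 4575, "English", "Rakesh", "C"),
]
student = [dict(zip(columns, t)) for t in data]

print(fd_holds(student, ["Name"], ["Phone_no"]))   # Name -> Phone_no
print(fd_holds(student, ["Prof"], ["Grade"]))      # Prof -> Grade
print(fd_holds(student, ["Grade"], ["Prof"]))      # Grade -> Prof
```

Grade→Prof fails because, for example, grade A appears with both Alok and James.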

Trivial FDs and Non-Trivial FDs

Trivial FDs
A functional dependency X→Y is a trivial functional dependency if Y is a subset of X

For example, {Name, Course}→Course. If two records have the same values on both the Name and Course attributes, then they obviously have the same Course

Trivial dependencies hold for all relation instances

Non-Trivial FDs
A functional dependency X→Y is called non-trivial if Y ∩ X = ∅

For example, Prof→Grade

Non-trivial FDs are given implicitly in the form of constraints when designing a database

Armstrong's Inference Axioms

The inference axioms or rules allow users to infer the FDs that are satisfied by a relation

Let R(X, Y, Z, W) be a relation schema, where X, Y, Z, and W are arbitrary subsets of the set of attributes of a universal relation schema R

The three fundamental inference rules are:
• Reflexivity Rule: If Y is a subset of X, then X→Y (trivial FDs). Ex: {Name, Course}→Course
• Augmentation Rule: If X→Y, then {X, Z}→{Y, Z}. Ex: as Prof→Grade, therefore {Prof, Major}→{Grade, Major}
• Transitivity Rule: If X→Y and Y→Z, then X→Z. Ex: as the functional dependencies Course→Name and Name→Phone_no are present, therefore Course→Phone_no

Armstrong's Inference Axioms...

The four secondary inference rules are:
• Union or Additive Rule: If X→Y and X→Z, then X→{Y, Z}. Ex: as the FDs Prof→Grade and Prof→Course are present, therefore Prof→{Grade, Course}
• Decomposition Rule: If X→{Y, Z}, then X→Y and X→Z. Ex: if Prof→{Grade, Course}, then this FD can be decomposed as Prof→Grade and Prof→Course
• Composition Rule: If X→Y and Z→W, then {X, Z}→{Y, W}. Ex: if Prof→Grade and Name→Phone_no, then the FDs can be composed as {Prof, Name}→{Grade, Phone_no}
• Pseudotransitivity Rule: If X→Y and {Y, W}→Z, then {X, W}→Z. Ex: if Prof→Grade and {Grade, Major}→Course, then the FD {Prof, Major}→Course is valid

Logical Implication

Let R be a relation schema, F a set of functional dependencies on R, and X→Y an FD that is not in F. F is said to logically imply X→Y if every relation r on the relation schema R that satisfies the FDs in F also satisfies X→Y

F logically implies X→Y is written as:
F ⊨ X→Y

Let R = (A, B, C, D) and F = {A→B, A→C, BC→D}
F ⊨ A→D
since A→B and A→C give A→BC by the union rule, and A→BC together with BC→D gives A→D by transitivity

Given F = {A→B, C→D} with C ⊆ B, show that F ⊨ A→D

Closure of a Set of Functional Dependencies

Given a set F of functional dependencies for a relation schema R, we define F+, the closure of F, to be the set of all functional dependencies that are logically implied by F. Mathematically,
F+ = {X→Y | F ⊨ X→Y}

To generate all FDs that can be derived from F, the steps are:
• First, apply the inference axioms to all single attributes and use the FDs of F whenever applicable
• Second, apply the inference axioms to all combinations of two attributes and use the functional dependencies of F whenever applicable
• Next, apply the inference axioms to all combinations of three attributes and use the FDs of F when necessary
• Proceed in this manner for as many different attributes as there are in F
15.12
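The enumeration of F+ can be sketched in Python. This is a minimal sketch rather than the axiom-by-axiom procedure: it relies on the equivalent shortcut that X→Y belongs to F+ exactly when Y is a subset of the attribute closure of X. The helper names (`attribute_closure`, `fd_closure`, `fd`) are illustrative assumptions, and only FDs with non-empty left and right sides are counted.

```python
from itertools import combinations

def attribute_closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) frozenset pairs."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure

def nonempty_subsets(attrs):
    """Yield every non-empty subset of `attrs` as a frozenset."""
    items = sorted(attrs)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def fd_closure(schema, fds):
    """Enumerate F+ (non-empty sides only) for a small schema."""
    result = set()
    for lhs in nonempty_subsets(schema):
        for rhs in nonempty_subsets(attribute_closure(lhs, fds)):
            result.add((lhs, rhs))
    return result

def fd(lhs, rhs):
    """Build an FD from two iterables of single-letter attribute names."""
    return (frozenset(lhs), frozenset(rhs))

# R = (A, B, C) and F = {A→B, A→C}
F = [fd("A", "B"), fd("A", "C")]
F_plus = fd_closure("ABC", F)
print(len(F_plus))
```

The number of candidate left-hand sides grows exponentially with the number of attributes, so this brute-force enumeration is only practical for small schemas.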
Closure of a Set of Functional Dependencies...

Ex: Let R = (A, B, C) and F = {A→B, A→C}. Counting only FDs with non-empty left and right sides, F+ contains 33 FDs:

F+ = {A→A, A→B, A→C, A→AB, A→AC, A→BC, A→ABC,
B→B, C→C,
AB→A, AB→B, AB→C, AB→AB, AB→AC, AB→BC, AB→ABC,
AC→A, AC→B, AC→C, AC→AB, AC→AC, AC→BC, AC→ABC,
BC→B, BC→C, BC→BC,
ABC→A, ABC→B, ABC→C, ABC→AB, ABC→AC, ABC→BC, ABC→ABC}

Exercise: Let R = (W, X, Y) and F = {W→X, X→Y, W→XY}. Find F+

Uses of the closure of a set of functional dependencies:
• Testing whether two sets of functional dependencies F and G are equivalent: F and G are equivalent when F+ = G+
Ex: F = {W→X, X→Y, W→XY} and G = {W→X, W→Y, X→Y} are equivalent

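Comparing F+ and G+ directly is expensive, so a common shortcut is to check that each set implies every FD of the other. The sketch below assumes that approach (it is not spelled out on the slide) and uses attribute closure for the implication test; the function names are illustrative.

```python
def attribute_closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) frozenset pairs."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure

def covers(f, g):
    """True when every FD of g is logically implied by f."""
    return all(rhs <= attribute_closure(lhs, f) for lhs, rhs in g)

def equivalent(f, g):
    """F and G are equivalent when each covers the other."""
    return covers(f, g) and covers(g, f)

def fd(lhs, rhs):
    return (frozenset(lhs), frozenset(rhs))

F = [fd("W", "X"), fd("X", "Y"), fd("W", "XY")]
G = [fd("W", "X"), fd("W", "Y"), fd("X", "Y")]
print(equivalent(F, G))
```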
Closure of a Set of Attributes

Given a set of attributes X and a set of functional dependencies F, the closure of X under F, denoted X+, is the set of attributes A that can be derived from X by applying Armstrong's inference axioms to the functional dependencies of F

The closure of a non-empty X is never empty, because X+ always contains X itself

Ex: R = (A, B, C, D) and F = {A→C, B→D}

{A}+ = {A, C}, {B}+ = {B, D}, {C}+ = {C}, {D}+ = {D},
{A, B}+ = {A, B, C, D}, {A, C}+ = {A, C}, {A, D}+ = {A, C, D},
{B, C}+ = {B, C, D}, {B, D}+ = {B, D}, {C, D}+ = {C, D},
{A, B, C}+ = {A, B, C, D}, {A, B, D}+ = {A, B, C, D},
{A, C, D}+ = {A, C, D}, {B, C, D}+ = {B, C, D},
{A, B, C, D}+ = {A, B, C, D}

Exercise: R = (X, Y, Z) and F = {X→Y, Y→Z}

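The closure computation is a simple fixed-point loop: keep adding the right-hand side of any FD whose left-hand side is already covered, until nothing changes. A minimal sketch, assuming FDs are stored as pairs of frozensets (a representation chosen here for illustration):

```python
def attribute_closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) frozenset pairs."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire the FD when its determinant is already in the closure.
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure

# R = (A, B, C, D) and F = {A→C, B→D}
F = [(frozenset("A"), frozenset("C")), (frozenset("B"), frozenset("D"))]
for attrs in ("A", "AB", "AD", "BC"):
    print(attrs, sorted(attribute_closure(attrs, F)))
```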
Closure of a Set of Attributes...

Uses of Attribute Closure:
• Testing for key: To test whether X is a key, compute X+. X is a key iff X+ contains all the attributes of R. X is a candidate key if, in addition, none of its proper subsets is a key
• Testing functional dependencies: To check whether a functional dependency X→Y holds, check whether Y ⊆ X+

Given R = (A, B, C, D) and F = {AB→C, B→D, D→B}, find the candidate keys of the relation. How many candidate keys are in this relation?

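The key test can be automated by trying attribute sets in order of increasing size and keeping those whose closure is the whole schema. This is a brute-force sketch with illustrative names, suitable only for small schemas; it is run here on the exercise relation.

```python
from itertools import combinations

def attribute_closure(attrs, fds):
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure

def candidate_keys(schema, fds):
    """Return all minimal attribute sets whose closure is the whole schema."""
    attrs = sorted(schema)
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if attribute_closure(candidate, fds) == set(attrs):
                keys.append(candidate)
    return keys

def fd(lhs, rhs):
    return (frozenset(lhs), frozenset(rhs))

# R = (A, B, C, D) and F = {AB→C, B→D, D→B}
F = [fd("AB", "C"), fd("B", "D"), fd("D", "B")]
keys = candidate_keys("ABCD", F)
print([sorted(key) for key in keys])
```

For this relation the sketch reports two candidate keys, {A, B} and {A, D}: A appears on no right-hand side, so it must be part of every key, and B and D determine each other.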
Redundancy of FDs

Given a set of functional dependencies F, a functional dependency A→B of F is said to be redundant with respect to the FDs of F if and only if A→B can be derived from the set of FDs F - {A→B}

Eliminating redundant functional dependencies allows us to minimize the set of FDs

Ex: A→C is redundant in {A→B, B→C, A→C}

Redundant attribute on RHS
In a functional dependency, some attributes in the RHS may be redundant
Ex: F = {A→B, B→C, A→{C, D}} can be simplified to {A→B, B→C, A→D}

Redundant attribute on LHS
In a functional dependency, some attributes in the LHS may be redundant
Ex: F = {A→B, B→C, {A, C}→D} can be simplified to {A→B, B→C, A→D}

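The redundancy definition translates directly into a closure test: remove the FD, then check whether the remaining FDs still imply it. A minimal sketch with illustrative helper names:

```python
def attribute_closure(attrs, fds):
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure

def is_redundant(item, fds):
    """An FD is redundant when F - {item} already implies it."""
    rest = [other for other in fds if other != item]
    lhs, rhs = item
    return rhs <= attribute_closure(lhs, rest)

def fd(lhs, rhs):
    return (frozenset(lhs), frozenset(rhs))

# F = {A→B, B→C, A→C}
F = [fd("A", "B"), fd("B", "C"), fd("A", "C")]
for item in F:
    print(sorted(item[0]), "->", sorted(item[1]), is_redundant(item, F))
```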
Canonical Cover/Minimal Cover

For a given set F of FDs, a canonical cover, denoted by Fc, is a set of FDs where the following conditions are satisfied:
• F and Fc are equivalent
• Every FD of Fc is simple. That is, the RHS of every functional dependency of Fc has only one attribute
• No FD in Fc is redundant
• The determinant or LHS of every FD in Fc is irreducible

Ex: R = (A, B, C) and F = {A→{B, C}, B→C, A→B, {A, B}→C}

Fc = {A→B, B→C}

Because a canonical cover contains the functional dependencies without any redundancy, finding the key of the relation becomes more efficient

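The four conditions suggest a three-step procedure: split right-hand sides, shrink left-hand sides, then drop redundant FDs. The sketch below follows that common textbook ordering, which is an assumption here since the slide only gives the conditions; the result can depend on the order in which FDs are examined.

```python
def attribute_closure(attrs, fds):
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure

def minimal_cover(fds):
    # Step 1: make every FD simple (one attribute on the RHS).
    cover = []
    for lhs, rhs in fds:
        for attr in sorted(rhs):
            item = (frozenset(lhs), frozenset([attr]))
            if item not in cover:
                cover.append(item)
    # Step 2: remove extraneous attributes from each LHS.
    reduced = []
    for lhs, rhs in cover:
        lhs = set(lhs)
        for attr in sorted(lhs):
            smaller = lhs - {attr}
            if smaller and rhs <= attribute_closure(smaller, cover):
                lhs = smaller
        item = (frozenset(lhs), rhs)
        if item not in reduced:
            reduced.append(item)
    # Step 3: remove FDs implied by the ones that remain.
    result = list(reduced)
    for item in reduced:
        rest = [other for other in result if other != item]
        if item[1] <= attribute_closure(item[0], rest):
            result = rest
    return result

def fd(lhs, rhs):
    return (frozenset(lhs), frozenset(rhs))

# F = {A→{B, C}, B→C, A→B, {A, B}→C}
F = [fd("A", "BC"), fd("B", "C"), fd("A", "B"), fd("AB", "C")]
Fc = minimal_cover(F)
print([(sorted(lhs), sorted(rhs)) for lhs, rhs in Fc])
```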
Database Management System 16
Normalization

Chittaranjan Pradhan
School of Computer Engineering,
KIIT University

• Normalization
• Lossless Decomposition
• Dependency Preservation
• Guidelines followed in Designing Good Database

Normalization

Normalization is the process of decomposing or breaking a relation schema R into fragments (i.e. smaller schemas) R1, R2, ... Rn such that the following conditions hold:
• Lossless decomposition: The fragments should contain the same information as the original relation
• Dependency preservation: All the functional dependencies should be preserved within the fragments Ri
• Good form: Each fragment Ri should be free from any type of redundancy

In other words, normalization is the process of refining the relational data model. It is used for the following reasons:
• It improves database design
• It ensures minimum redundancy of data
• It removes anomalies for database activities

Lossless Decomposition

The decomposition of a base relation is said to be lossless if the original relation can be recovered by joining the fragment relations

Let R be the base relation, which is decomposed into R1, R2, ... Rn. This decomposition is lossless iff R = R1 ⋈ R2 ⋈ ... ⋈ Rn. In other words, the consecutive fragments are interrelated by the primary key and foreign key relationships

R:
A  B
a  1
a  2
b  1
b  2

R1:
A
a
b

R2:
B
1
2

R1 ⋈ R2:
A  B
a  1
a  2
b  1
b  2

As R1 ⋈ R2 = R, the decomposition is lossless

Lossless Decomposition...

R:
A  B
a  1
a  2
b  1

R1:
A
a
b

R2:
B
1
2

R1 ⋈ R2:
A  B
a  1
a  2
b  1
b  2

As R1 ⋈ R2 ≠ R, the decomposition is lossy

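The two examples can be checked mechanically by projecting R onto each fragment, joining the projections, and comparing the result with R. A minimal sketch, assuming relations are represented as lists of dictionaries (an illustrative choice):

```python
def project(rows, columns):
    """Projection with duplicate elimination; rows are dicts."""
    seen = []
    for row in rows:
        reduced = {c: row[c] for c in columns}
        if reduced not in seen:
            seen.append(reduced)
    return seen

def natural_join(left, right):
    """Natural join on the columns the two relations share."""
    joined = []
    for a in left:
        for b in right:
            # With no shared columns this degenerates to a cross product.
            if all(a[c] == b[c] for c in a.keys() & b.keys()):
                row = {**a, **b}
                if row not in joined:
                    joined.append(row)
    return joined

def is_lossless(rows, *column_sets):
    """Join the projections back together and compare with the original."""
    result = project(rows, column_sets[0])
    for columns in column_sets[1:]:
        result = natural_join(result, project(rows, columns))
    as_set = lambda rel: {frozenset(row.items()) for row in rel}
    return as_set(result) == as_set(rows)

full = [{"A": "a", "B": 1}, {"A": "a", "B": 2},
        {"A": "b", "B": 1}, {"A": "b", "B": 2}]
partial = full[:3]  # the relation without the tuple (b, 2)

print(is_lossless(full, ["A"], ["B"]))
print(is_lossless(partial, ["A"], ["B"]))
```

This tests one particular instance only; a decomposition is lossless in general when the check succeeds for every legal instance of the schema.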
Lossless Decomposition...

When the base relation schema is decomposed into fragment relation schemas, the consecutive relations should be related by a primary key - foreign key pair on the common column, so that a natural join is possible on the common column. Thus, we have to check whether the common column is the key of one of the fragments:
• If the common column is a key, the decomposition is lossless
• If the common column is not a key, the decomposition is lossy

Ex: R(A, B, C, D) with F = {A→B, B→C, C→D} is decomposed to R1(A, B, C) and R2(C, D) with F1 = {A→B, B→C} and F2 = {C→D} respectively. The common column C is the key of R2, so the decomposition is lossless

Ex: R(A, B, C, D) with F = {A→B, B→C, D→C} is decomposed to R1(A, B, C) and R2(C, D) with F1 = {A→B, B→C} and F2 = {D→C} respectively. The common column C is a key of neither R1 nor R2, so the decomposition is lossy

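For a decomposition into two fragments, the common-column test can be written with attribute closure: the closure of the shared attributes must contain all of R1 or all of R2. A minimal sketch with illustrative names, applied to the two examples:

```python
def attribute_closure(attrs, fds):
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure

def lossless_binary(r1, r2, fds):
    """True when the shared attributes are a key of R1 or of R2."""
    common = set(r1) & set(r2)
    closure = attribute_closure(common, fds)
    return set(r1) <= closure or set(r2) <= closure

def fd(lhs, rhs):
    return (frozenset(lhs), frozenset(rhs))

F_first = [fd("A", "B"), fd("B", "C"), fd("C", "D")]
F_second = [fd("A", "B"), fd("B", "C"), fd("D", "C")]

print(lossless_binary("ABC", "CD", F_first))
print(lossless_binary("ABC", "CD", F_second))
```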
Dependency Preservation

The decomposition of a relation schema R with FDs F is a set of fragment relations Ri with FDs Fi, where Fi is the subset of dependencies in F+ that include only attributes in Ri. The decomposition is dependency preserving if and only if

(∪i Fi)+ = F+

Ex: R = (A, B, C) and F = {A→B, B→C, A→C}

R:
A  B  C
1  2  3
2  2  3
3  2  3
4  3  4

Key = A

Dependency Preservation...

R1 = (A, B) and F1 = {A→B}
A  B
1  2
2  2
3  2
4  3

R2 = (A, C) and F2 = {A→C}
A  C
1  3
2  3
3  3
4  4

The decomposition is lossless because R1 ⋈ R2 = R

But the decomposition is not dependency preserving, because (F1 ∪ F2)+ = {A→B, A→C}+

Thus, (F1 ∪ F2)+ ≠ F+, as we lost the FD B→C

Dependency Preservation...

R1 = (A, B) and F1 = {A→B}
A  B
1  2
2  2
3  2
4  3

R2 = (B, C) and F2 = {B→C}
B  C
2  3
3  4

The decomposition is lossless because R1 ⋈ R2 = R

Similarly, the decomposition is dependency preserving because (F1 ∪ F2)+ contains A→B, B→C and, by transitivity, A→C

Thus, (F1 ∪ F2)+ = F+

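Both decompositions of R = (A, B, C) can be checked without materialising F+. The sketch below uses a standard closure-based test (an approach assumed here, not stated on the slides): for each FD X→Y in F, grow X using only what each fragment can enforce, and see whether Y is reached. Function names are illustrative.

```python
def attribute_closure(attrs, fds):
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure

def preserves(item, fragments, fds):
    """Check one FD against the dependencies enforceable inside the fragments."""
    lhs, rhs = item
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for fragment in fragments:
            # Attributes derivable from result using only this fragment.
            gained = attribute_closure(result & fragment, fds) & fragment
            if not gained <= result:
                result |= gained
                changed = True
    return rhs <= result

def dependency_preserving(fragments, fds):
    return all(preserves(item, fragments, fds) for item in fds)

def fd(lhs, rhs):
    return (frozenset(lhs), frozenset(rhs))

F = [fd("A", "B"), fd("B", "C"), fd("A", "C")]
print(dependency_preserving([set("AB"), set("AC")], F))
print(dependency_preserving([set("AB"), set("BC")], F))
```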
Guidelines followed in Designing Good Database

Guideline 1
Design a relation schema so that it is easy to explain its meaning. Do not combine attributes from multiple entity sets and relationship sets into a single relation

If a relation schema corresponds to one entity set or one relationship set, then the meaning tends to be simple and clear. Otherwise, the relation corresponds to a mixture of multiple entities and relationships and hence becomes complex and unclear

Only foreign keys should be used to refer to other entities. Entity and relationship attributes should be kept apart as much as possible

Bottom Line: design schemas that can be explained easily, relation by relation. In such schemas, the semantics of the attributes are easy to interpret

Guidelines followed in Designing Good Database...

Guideline 2
Design the base relation schemas in such a way that insertion, deletion, and update anomalies are removed from the relations

If any anomalies are present, note them clearly and make sure that the programs that modify (update) the database will operate correctly

Guideline 3
Avoid placing attributes in a base relation whose values may frequently be NULL

If NULLs are unavoidable, make sure that they apply in exceptional cases only and do not apply to a majority of tuples in the relation

Attributes that are NULL frequently could be placed in separate relations (with the primary key)

Guidelines followed in Designing Good Database...

Guideline 4
Design the relation schemas so that they can be joined in such a way that no spurious tuples are generated

Avoid relations that contain matching attributes that are not (foreign key and primary key) combinations, because joining on such attributes may produce spurious tuples

Database Management System 17
Normal Forms

Chittaranjan Pradhan
School of Computer Engineering,
KIIT University

• Normal Forms
• 1NF(First Normal Form)
• Partial FD
• 2NF(Second Normal Form)
• Transitive FD
• 3NF(Third Normal Form)
• BCNF (Boyce-Codd Normal Form)

Normal Forms

• Normal forms provide a stepwise progression towards the construction of normalized relation schemas, which are free from data redundancies
• A relation schema is said to be in a particular normal form if it satisfies certain defined conditions

1NF(First Normal Form)

A relation is in 1NF iff the values in the relation are atomic and single-valued for every attribute in the relation

Course:
Module  Dept  Lecturer  Text
M1      D1    L1        T1, T2
M2      D1    L1        T1, T3
M3      D1    L2        T4
M4      D2    L3        T1, T5
M5      D2    L4        T6

• As the Text column values are not atomic, this relation is not in 1NF
• To convert this non-1NF relation into a 1NF relation, split up the non-atomic values

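Splitting the non-atomic values can be sketched as follows: each row is repeated once per value in the multi-valued column. The function name and the comma-separated representation of the Text column are assumptions made for illustration.

```python
def to_first_normal_form(rows, column, separator=","):
    """Emit one row per atomic value found in `column`."""
    flat = []
    for row in rows:
        for value in row[column].split(separator):
            flat.append({**row, column: value.strip()})
    return flat

course = [
    {"Module": "M1", "Dept": "D1", "Lecturer": "L1", "Text": "T1,T2"},
    {"Module": "M2", "Dept": "D1", "Lecturer": "L1", "Text": "T1,T3"},
    {"Module": "M3", "Dept": "D1", "Lecturer": "L2", "Text": "T4"},
    {"Module": "M4", "Dept": "D2", "Lecturer": "L3", "Text": "T1,T5"},
    {"Module": "M5", "Dept": "D2", "Lecturer": "L4", "Text": "T6"},
]

course_1nf = to_first_normal_form(course, "Text")
for row in course_1nf:
    print(row["Module"], row["Dept"], row["Lecturer"], row["Text"])
```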
1NF(First Normal Form)...

Course1:
Module  Dept  Lecturer  Text
M1      D1    L1        T1
M1      D1    L1        T2
M2      D1    L1        T1
M2      D1    L1        T3
M3      D1    L2        T4
M4      D2    L3        T1
M4      D2    L3        T5
M5      D2    L4        T6

Course2:
Module  Dept  Lecturer  Text1  Text2
M1      D1    L1        T1     T2
M2      D1    L1        T1     T3
M3      D1    L2        T4
M4      D2    L3        T1     T5
M5      D2    L4        T6

Corollary: As a relation schema contains no data values, all relation schemas are in 1NF

Anomalies in 1NF Relations:
• Insertion anomalies
• Update anomalies
• Deletion anomalies

Partial FD

An FD A → B is a partial FD if some attribute of A can be removed and the FD still holds. That means there is some proper subset C of A (C ⊂ A) such that C → B

• Key attributes: the attributes that are part of some candidate key
• Non-key attributes: the attributes that are not part of any candidate key

2NF(Second Normal Form)

A relation is in 2NF iff the following two conditions are met simultaneously:
• It is in 1NF
• No non-key attribute is partially dependent on any key

A non-2NF relation can be decomposed into 2NF relations by the following steps:
1. Create a new relation by using the attributes from the offending FD as the attributes in the new relation. The determinant of the FD becomes the primary key of the new relation
2. The attribute on the RHS of the FD is then eliminated from the original relation
3. If more than one FD prevents the relation from being 2NF, repeat steps 1 and 2 for each offending FD
4. If the same determinant appears in more than one FD, place all the attributes functionally dependent on this determinant as non-key attributes in the relation having the determinant as the primary key

2NF(Second Normal Form)...

Course:
Module  Dept  Lecturer  Text
M1      D1    L1        T1
M1      D1    L1        T2
M2      D1    L1        T1
M2      D1    L1        T3
M3      D1    L2        T4
M4      D2    L3        T1
M4      D2    L3        T5
M5      D2    L4        T6

F = {Module→Dept, Module→Lecturer, Lecturer→Dept, {Module, Text}→{Dept, Lecturer}}

Here, Key: {Module, Text}

2NF(Second Normal Form)...

Course1:
Module  Dept  Lecturer
M1      D1    L1
M2      D1    L1
M3      D1    L2
M4      D2    L3
M5      D2    L4

F1 = {Module→{Dept, Lecturer}, Lecturer→Dept}

Course2:
Module  Text
M1      T1
M1      T2
M2      T1
M2      T3
M3      T4
M4      T1
M4      T5
M5      T6

F2 = {{Module, Text}→{Module, Text}}

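The partial dependencies that force this decomposition can be found by taking each proper subset of the key and checking which non-key attributes it already determines. This is a sketch under a simplifying assumption, stated in the code as well: the relation has a single candidate key, as Course does. Function names are illustrative.

```python
from itertools import combinations

def attribute_closure(attrs, fds):
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure

def partial_dependencies(key, fds):
    """List (subset_of_key, attribute) pairs that break 2NF.

    Assumes `key` is the only candidate key, so every attribute
    outside it is a non-key attribute.
    """
    key_attrs = sorted(key)
    found = []
    for size in range(1, len(key_attrs)):
        for subset in combinations(key_attrs, size):
            determined = attribute_closure(subset, fds) - set(key)
            for attr in sorted(determined):
                found.append((set(subset), attr))
    return found

def fd(lhs, rhs):
    return (frozenset(lhs), frozenset(rhs))

F = [fd({"Module"}, {"Dept"}), fd({"Module"}, {"Lecturer"}),
     fd({"Lecturer"}, {"Dept"}),
     fd({"Module", "Text"}, {"Dept", "Lecturer"})]

course = partial_dependencies({"Module", "Text"}, F)
print(course)
print(partial_dependencies({"Module"}, F))
```

A single-attribute key has no non-empty proper subset, so the second call finds nothing.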
2NF(Second Normal Form)...

Corollary: If the primary key has a single attribute, then a 1NF relation is automatically in 2NF

Anomalies in 2NF Relations:
• Insertion anomalies
• Update anomalies
• Deletion anomalies

Q: R = (A, B, C, D, E) & F = {A → {B, C, D, E}, {A, B} → {C, D, E}, C → E, D → E}

Q: R = (A, B, C, D, E) & F = {{A, B} → {C, D, E}, B → C, A → D}

Transitive FD

An FD A → C is a transitive FD if there is some set of attributes B such that A → B and B → C are non-trivial FDs

A → B non-trivial means B is not a subset of A

3NF(Third Normal Form)

A relation is in 3NF iff the following two conditions are satisfied simultaneously:
• It is in 2NF
• No non-key attribute is transitively dependent on the key

The process of decomposing a non-3NF relation into 3NF relations is similar to the process of decomposing a non-2NF relation into 2NF relations

Course:
Module  Dept  Lecturer
M1      D1    L1
M2      D1    L1
M3      D1    L2
M4      D2    L3
M5      D2    L4

F = {Module→{Dept, Lecturer}, Lecturer→Dept}

This relation is not in 3NF because Module → Lecturer and Lecturer → Dept, so the non-key attribute Dept is transitively dependent on the key Module

3NF(Third Normal Form)...

Course1:
Module  Lecturer
M1      L1
M2      L1
M3      L2
M4      L3
M5      L4

F1 = {Module→Lecturer}

Course2:
Lecturer  Dept
L1        D1
L2        D1
L3        D2
L4        D2

F2 = {Lecturer→Dept}

Corollary: A 2NF relation is in 3NF if no non-key attribute functionally determines any other non-key attribute

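A 3NF check can be sketched with the usual equivalent formulation (assumed here, not taken from the slides): every non-trivial FD must either have a superkey as its determinant or point to a key attribute. The caller supplies the key attributes; the function name is illustrative.

```python
def attribute_closure(attrs, fds):
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure

def third_nf_violations(schema, fds, prime):
    """Return (determinant, attribute) pairs that break 3NF.

    `prime` is the set of attributes belonging to some candidate key;
    it is supplied by the caller in this sketch.
    """
    violations = []
    for lhs, rhs in fds:
        if attribute_closure(lhs, fds) >= set(schema):
            continue  # determinant is a superkey
        for attr in sorted(rhs - lhs):
            if attr not in prime:
                violations.append((set(lhs), attr))
    return violations

def fd(lhs, rhs):
    return (frozenset(lhs), frozenset(rhs))

F = [fd({"Module"}, {"Dept", "Lecturer"}), fd({"Lecturer"}, {"Dept"})]
F1 = [fd({"Module"}, {"Lecturer"})]
F2 = [fd({"Lecturer"}, {"Dept"})]

print(third_nf_violations({"Module", "Dept", "Lecturer"}, F, {"Module"}))
print(third_nf_violations({"Module", "Lecturer"}, F1, {"Module"}))
print(third_nf_violations({"Lecturer", "Dept"}, F2, {"Lecturer"}))
```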
3NF(Third Normal Form)...

3NF helps us get rid of the anomalies caused by dependencies of a non-key attribute on another non-key attribute

However, relations in 3NF are still susceptible to anomalies when the relation has two overlapping candidate keys or when a non-key attribute functionally determines a key attribute. Overlapping candidate keys are composite candidate keys with at least one attribute in common

Note: A database should normally be in 3NF at least

Q: Lecturer = (lectid, lectname, courseid, coursename) & F = {lectid → lectname, lectid → courseid, lectid → coursename, courseid → coursename}

Q: R = (B, C, E), F = {E → B, {B, C} → E}

Q: Store = (order, product, customer, address, qty, unitprice) & F = {order → {customer, address}, customer → address, product → unitprice, {order, product} → qty}

BCNF (Boyce-Codd Normal Form)

A relation is in BCNF iff the following two conditions are satisfied simultaneously:
• It is in 3NF
• For every non-trivial functional dependency, the determinant is a key

Decomposing a non-BCNF relation into BCNF relations is a simple process: for each non-trivial FD whose determinant is not a key, construct a new relation

Student:
Student  Course       Time
Rahul    Database     12:00
Pratik   Database     12:00
Praveen  Database     15:00
Praveen  Programming  10:00
Rajib    Programming  10:00
Shivam   Programming  13:00

F = {{Student, Course} → Time, Time → Course, {Student, Time} → Course}

Keys: {Student, Course} and {Student, Time}

BCNF (Boyce-Codd Normal Form)...

This relation is not in BCNF because, in the FD Time → Course, the determinant {Time} is not a key

To convert this relation to BCNF, create a new relation R1 = (Time, Course) with the set of FDs F1 = {Time → Course}

The original relation is changed to R = (Student, Time), as {Student, Time} is also a key of the relation

Here, we have lost the functional dependency {Student, Course} → Time

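The BCNF test and the single decomposition step used above can be sketched as follows. The names `bcnf_violations` and `decompose_on` are illustrative assumptions; the split keeps the offending FD in its own relation and removes its right-hand side from the original one.

```python
def attribute_closure(attrs, fds):
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure

def bcnf_violations(schema, fds):
    """Non-trivial FDs whose determinant is not a superkey."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs
            and not attribute_closure(lhs, fds) >= set(schema)]

def decompose_on(schema, violation):
    """Split the schema on one violating FD X→Y into (X ∪ Y) and (R - (Y - X))."""
    lhs, rhs = violation
    return set(lhs | rhs), set(schema) - (rhs - lhs)

def fd(lhs, rhs):
    return (frozenset(lhs), frozenset(rhs))

schema = {"Student", "Course", "Time"}
F = [fd({"Student", "Course"}, {"Time"}),
     fd({"Time"}, {"Course"}),
     fd({"Student", "Time"}, {"Course"})]

violations = bcnf_violations(schema, F)
r1, r2 = decompose_on(schema, violations[0])
print(sorted(r1), sorted(r2))
```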
BCNF (Boyce-Codd Normal Form)...

Corollary: If a relation has only one candidate key, then 3NF and BCNF are the same. That is, if a relation in 3NF has only one candidate key, then it is also in BCNF

Note: Normalization to 3NF can always be made lossless and dependency preserving. Normalization to BCNF is lossless, but may not preserve all the functional dependencies

Q: R = (A, B, C, D, E), F = {A → {B, E}, C → D}. Decompose the relation to BCNF

Q: R = (A, B, C, D), F = {{A, B} → {C, D}, C → B}. Decompose the relation into BCNF

Q: R = (A, B, C, D, E, G), F = {{A, B} → {C, D}, {B, C} → {D, A}, C → G, B → E}. Decompose this relation to BCNF

Q: R = (A, B, C, D) & F = {{A, C} → {B, D}, {B, C} → {D, A}, A → B, B → A}. Decompose this relation to BCNF

Database Management System 18
Advanced Normal Forms

Chittaranjan Pradhan
School of Computer Engineering,
KIIT University

• MVD(Multi-Valued Dependency)
• 4NF(Fourth Normal Form)
• JD(Join Dependency)
• 5NF(Fifth Normal Form)
• Denormalization

MVD(Multi-Valued Dependency)

A table involves a multi-valued dependency if it may contain multiple values for an entity

A multi-valued dependency A →→ B exists iff each value of A is associated with a set of values of B, independently of the other attributes of the relation

If A →→ B and A →→ C, then we have a single attribute A which multi-determines two other attributes, B and C

Multi-valued dependencies are also referred to as tuple generating dependencies

MVD(Multi-Valued Dependency)...

Employee:
Name    Project    Hobby
Asis    Microsoft  Reading
Asis    Oracle     Music
Asis    Microsoft  Music
Asis    Oracle     Reading
Bikash  Intel      Movies
Bikash  Sybase     Riding
Bikash  Intel      Riding
Bikash  Sybase     Movies

MVDs are: Name →→ Project and Name →→ Hobby

MVD
An MVD X →→ Y in relation R is called a trivial MVD if:
• Y is a subset of X, or
• X ∪ Y = R

An MVD that satisfies neither the first nor the second condition is called a nontrivial MVD

Normally, MVDs exist in pairs

4NF(Fourth Normal Form)

A relation is in 4NF iff the following two conditions are satisfied simultaneously:
• It is in 3NF
• It contains no multiple MVDs

In other words, a relation is in 4NF iff:
• There are no nontrivial MVDs in the relation, or
• The determinant of any nontrivial MVD in the relation is a key

The previous Employee relation is not in 4NF

Employee_Project (Name →→ Project):
Name    Project
Asis    Microsoft
Asis    Oracle
Bikash  Intel
Bikash  Sybase

Employee_Hobby (Name →→ Hobby):
Name    Hobby
Asis    Reading
Asis    Music
Bikash  Riding
Bikash  Movies

These relations are in 4NF because the MVDs are trivial: in each relation, X ∪ Y = R

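Whether an MVD holds on a given relation instance can be checked by grouping the rows by X and verifying that, inside each group, the Y values and the remaining values occur in every combination. A minimal sketch on the Employee instance, with an illustrative function name and rows stored as dictionaries:

```python
from itertools import product

def holds_mvd(rows, x, y):
    """Check X →→ Y on a relation instance given as a list of dicts."""
    columns = list(rows[0]) if rows else []
    z = [c for c in columns if c not in x and c not in y]
    groups = {}
    for row in rows:
        key = tuple(row[c] for c in x)
        y_val = tuple(row[c] for c in y)
        z_val = tuple(row[c] for c in z)
        groups.setdefault(key, set()).add((y_val, z_val))
    for pairs in groups.values():
        y_vals = {pair[0] for pair in pairs}
        z_vals = {pair[1] for pair in pairs}
        # Every Y value must be paired with every Z value in the group.
        if pairs != set(product(y_vals, z_vals)):
            return False
    return True

employee = [
    {"Name": "Asis", "Project": "Microsoft", "Hobby": "Reading"},
    {"Name": "Asis", "Project": "Oracle", "Hobby": "Music"},
    {"Name": "Asis", "Project": "Microsoft", "Hobby": "Music"},
    {"Name": "Asis", "Project": "Oracle", "Hobby": "Reading"},
    {"Name": "Bikash", "Project": "Intel", "Hobby": "Movies"},
    {"Name": "Bikash", "Project": "Sybase", "Hobby": "Riding"},
    {"Name": "Bikash", "Project": "Intel", "Hobby": "Riding"},
    {"Name": "Bikash", "Project": "Sybase", "Hobby": "Movies"},
]

print(holds_mvd(employee, ["Name"], ["Project"]))
print(holds_mvd(employee[:-1], ["Name"], ["Project"]))
```

Dropping any one row breaks the "every combination" pattern, which is why the second call reports that the MVD no longer holds.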
JD(Join Dependency)

A relation R satisfies the join dependency (R1, R2 ... Rn) iff:
• R is equal to the join of R1, R2 ... Rn on the common attributes, where each Ri is a subset of the attributes of R

That means R satisfies the join dependency iff R = R1 ⋈ R2 ⋈ ... ⋈ Rn

In other words, a join dependency is said to hold over a relation R if R1, R2 ... Rn is a lossless-join decomposition of R

5NF(Fifth Normal Form)

A relation is in 5NF iff the following two conditions are satisfied simultaneously:
• It is in 4NF
• Every join dependency is implied by the candidate keys

In other words, a relation is in 5NF if it is in 4NF and it cannot be decomposed further without losing information

Dealer:
Dealer  Parts  Customer
D1      P1     C1
D1      P1     C2
D1      P2     C1
D2      P1     C1

Dealer →→ Parts, Dealer →→ Customer

Dealer_Parts (Dealer →→ Parts):
Dealer  Parts
D1      P1
D1      P2
D2      P1

Dealer_Customer (Dealer →→ Customer):
Dealer  Customer
D1      C1
D1      C2
D2      C1

This decomposition is not in 5NF: joining Dealer_Parts and Dealer_Customer produces the spurious tuple (D1, P2, C2)



5NF(Fifth Normal Form)...

Dealer_Parts (Dealer →→ Parts):
Dealer  Parts
D1      P1
D1      P2
D2      P1

Dealer_Customer (Dealer →→ Customer):
Dealer  Customer
D1      C1
D1      C2
D2      C1

Parts_Customer (Parts →→ Customer):
Parts  Customer
P1     C1
P1     C2
P2     C1

Dealer_Parts ⋈ Parts_Customer ⋈ Dealer_Customer = Dealer

Thus, the decomposition of Dealer into Dealer_Parts, Parts_Customer and Dealer_Customer is in 4NF as well as in 5NF

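The join dependency on the Dealer instance can be verified by joining the fragments and comparing the result with the original relation. The sketch below, with illustrative helper names and rows stored as dictionaries, compares the two-fragment and three-fragment joins.

```python
def natural_join(left, right):
    """Natural join on the columns the two relations share."""
    joined = []
    for a in left:
        for b in right:
            if all(a[c] == b[c] for c in a.keys() & b.keys()):
                row = {**a, **b}
                if row not in joined:
                    joined.append(row)
    return joined

def as_set(relation):
    return {frozenset(row.items()) for row in relation}

def rows(columns, values):
    """Build a relation from a column list and a list of value tuples."""
    return [dict(zip(columns, value)) for value in values]

dealer = rows(["Dealer", "Parts", "Customer"],
              [("D1", "P1", "C1"), ("D1", "P1", "C2"),
               ("D1", "P2", "C1"), ("D2", "P1", "C1")])
dealer_parts = rows(["Dealer", "Parts"],
                    [("D1", "P1"), ("D1", "P2"), ("D2", "P1")])
dealer_customer = rows(["Dealer", "Customer"],
                       [("D1", "C1"), ("D1", "C2"), ("D2", "C1")])
parts_customer = rows(["Parts", "Customer"],
                      [("P1", "C1"), ("P1", "C2"), ("P2", "C1")])

two_way = natural_join(dealer_parts, dealer_customer)
three_way = natural_join(natural_join(dealer_parts, parts_customer),
                         dealer_customer)

print(len(two_way), as_set(two_way) == as_set(dealer))
print(len(three_way), as_set(three_way) == as_set(dealer))
```

The two-fragment join contains one extra tuple, (D1, P2, C2); the third fragment Parts_Customer filters it out because (P2, C2) does not occur there.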
Denormalization

• Advantages of normalization:
  • It removes data redundancy
  • It solves insertion, update, and deletion anomalies
  • It makes it easier to maintain the database in a consistent state
• Disadvantages of normalization:
  • It leads to more tables in the database
  • For retrieving the records or information, these tables need to be joined back together, which is an expensive task
• Thus, sometimes it is worth denormalizing

Denormalization...

Once a normalized database design has been achieved, adjustments can be made with the potential consequences (anomalies) in mind

Possible denormalization steps include the following:
• Recombining relations that were split to satisfy normalization rules
• Storing redundant data in tables
• Storing summarized data in tables

Denormalization is the opposite of normalization

It is the process of increasing redundancy in the database either for convenience or to improve performance

Database Management System 19
Transactions

Chittaranjan Pradhan
School of Computer Engineering,
KIIT University

• Transaction Concept
• ACID Properties
• Transaction States

Transaction Concept

A transaction is a unit of program execution that accesses and possibly updates various data items

A transaction is a logical unit of work that contains one or more SQL statements. The effects of all the SQL statements in a transaction can be either all committed or all rolled back

A transaction that changes the contents of the database must alter the database from one consistent database state to another

ACID Properties

• Atomicity: Either all operations of the transaction are reflected properly in the database, or none are. Atomicity requires that all operations of a transaction be completed; if not, the transaction is aborted by rolling back all the updates done during the transaction

• Consistency: Consistency means execution of a transaction should preserve the consistency of the database, i.e. a transaction must transform the database from one consistent state to another consistent state

• Isolation: Though multiple transactions may execute concurrently, the system guarantees that, for every pair of transactions Ti and Tj, it appears to Ti that either Tj finished execution before Ti started or Tj started execution after Ti finished. Thus, each transaction is unaware of other transactions executing concurrently in the system

ACID Properties...

• Durability: After a transaction completes successfully, the changes it has made to the database persist, even if there are system failures. Durability ensures that once transaction changes are done or committed, they can't be undone or lost, even in the event of a system failure

Transactions access a data item X using the following two operations:
• Read(X): It transfers the data item X from the database to a local buffer belonging to the transaction that executed the read operation
• Write(X): It transfers the data item X from the local buffer of the transaction that executed the write back to the database

ACID Properties...

Let T1 be a transaction that transfers $100 from account A to account B

T1:
Read(A);
A:=A-100;
Write(A);
Read(B);
B:=B+100;
Write(B);

1. Atomicity: The database system keeps track of the old values of any data on which a transaction performs a write, and if the transaction does not complete its execution, the database system restores the old values to make it appear as though the transaction never executed
• Ensuring atomicity is the responsibility of the database system itself. It is handled by the transaction management component

ACID Properties...

2. Consistency: The sum of A and B must be unchanged by the execution of the transaction
• Ensuring consistency for an individual transaction is the responsibility of the application programmer who codes the transaction

3. Isolation: The database is temporarily inconsistent while the transaction to transfer funds from account A to B is executing, so a concurrent transaction could observe an inconsistent state. The solutions are:
• Execute transactions serially
• However, concurrent execution of transactions provides significant performance benefits such as increased throughput
• Ensuring the isolation property is the responsibility of the concurrency control component of the database system

4. Durability: Once a transaction completes successfully, all the updates that it carried out on the database persist, even if there is a system failure after the transaction completes execution
• Ensuring durability is the responsibility of the recovery management component of the database system

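The transfer transaction T1 and its atomicity and consistency requirements can be illustrated with Python's built-in sqlite3 module. This is a sketch under stated assumptions: the `account` table, its columns, the starting balances, and the `transfer` function are all hypothetical, and SQLite stands in for whatever database system is actually used.

```python
import sqlite3

def transfer(conn, source, target, amount):
    """Move `amount` between accounts; undo both writes if anything fails."""
    try:
        debit = conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?",
            (amount, source))
        credit = conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?",
            (amount, target))
        if debit.rowcount != 1 or credit.rowcount != 1:
            raise ValueError("unknown account")
        row = conn.execute(
            "SELECT balance FROM account WHERE name = ?", (source,)).fetchone()
        if row[0] < 0:
            raise ValueError("insufficient funds")
        conn.commit()      # both updates become permanent together
        return True
    except ValueError:
        conn.rollback()    # neither update survives
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 500), ("B", 200)])
conn.commit()

ok = transfer(conn, "A", "B", 100)
after_success = balances(conn)
failed = transfer(conn, "A", "B", 1000)
after_failure = balances(conn)
print(ok, after_success, failed, after_failure)
```

In both outcomes the sum of the two balances is unchanged, which is the consistency condition stated for T1.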
Transaction States
• Active state: This is the initial state of a transaction.
The transaction stays in this state while it is executing
• Partially Committed state: A transaction is partially
committed after its final statement has been executed. A
transaction enters this state immediately before the
commit statement
• Failed state: A transaction enters the failed state after the
discovery that normal execution can no longer proceed
• Aborted state: A transaction is aborted after it has been
rolled back and the database is restored to its state
prior to the transaction
• Committed state: The committed state occurs after
successful completion of the transaction
• Terminated: The transaction has either committed or aborted
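The states listed above can be read as a small state machine. The sketch below is illustrative only: the names State, TRANSITIONS, can_move and is_terminated are hypothetical, and the transition from partially committed to failed is an assumption taken from the usual textbook state diagram rather than from the text above.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    FAILED = "failed"
    ABORTED = "aborted"
    COMMITTED = "committed"


# Allowed moves between states (assumed, see the note above).
TRANSITIONS = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},
    State.ABORTED: set(),     # terminated
    State.COMMITTED: set(),   # terminated
}


def can_move(src, dst):
    """True if a transaction in state src may enter state dst."""
    return dst in TRANSITIONS[src]


def is_terminated(state):
    """A transaction is terminated once no further move is possible."""
    return not TRANSITIONS[state]
```

For example, can_move(State.FAILED, State.ABORTED) holds, while a committed transaction cannot return to the active state.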
Transaction States...
When a transaction enters the aborted state, the system has
two options:
• Restart the transaction: If the transaction was aborted as
a result of a hardware failure or some software error (other
than a logical error), it can be restarted
• Kill the transaction: This is done if the application program
that initiated the transaction has some logical error
Database Management System 20
Concurrent Execution

Concurrent Execution
Schedules
Serial Schedule
Concurrent Schedule
Serializability
Conflict Serializability
Testing for Conflict Serializability
View Serializability

Chittaranjan Pradhan
School of Computer Engineering,
KIIT University
Concurrent Execution
Concurrent execution of transactions means executing more
than one transaction at the same time

In serial execution, one transaction can start executing only
after the completion of the previous one

The advantages of using concurrent execution of transactions
are:
• Improved throughput and resource utilization
• Reduced waiting time
The database system must control the interaction among the
concurrent transactions to prevent them from destroying the
consistency of the database. It does this through a variety of
mechanisms called concurrency control schemes
Concurrent Execution...
T1 transfers $100 from account A to account B

T1
Read(A);
A:=A-100;
Write(A);
Read(B);
B:=B+100;
Write(B);

T2 transfers 20% of balance from account A to account B

T2
Read(A);
Temp=0.2*A;
A:=A-Temp;
Write(A);
Read(B);
B:=B+Temp;
Write(B);
Schedules
A schedule is a sequence that indicates the chronological order
in which instructions of concurrent transactions are executed

A schedule for a set of transactions must consist of all
instructions of those transactions

We must preserve the order in which the instructions appear in
each individual transaction
Serial Schedule
A serial schedule is a schedule where all the instructions
belonging to each transaction appear together

A serial schedule has no concurrency: the operations of
different transactions are never interleaved

For n transactions, there are exactly n! different serial
schedules possible
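The n! count can be checked by enumerating every ordering of the transactions. This is a minimal sketch; the function name serial_schedules and the three transaction labels are chosen here only for illustration.

```python
from itertools import permutations
from math import factorial


def serial_schedules(transactions):
    """Return every serial order of the given transactions."""
    return [list(order) for order in permutations(transactions)]


orders = serial_schedules(["T1", "T2", "T3"])
count_matches = len(orders) == factorial(3)   # 3! = 6 serial schedules
```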
Serial Schedule...
Schedule1 (T1 followed by T2)
T1              T2
Read(A);
A:=A-100;
Write(A);
Read(B);
B:=B+100;
Write(B);
                Read(A);
                Temp=0.2*A;
                A:=A-Temp;
                Write(A);
                Read(B);
                B:=B+Temp;
                Write(B);

Schedule2 (T2 followed by T1)
T1              T2
                Read(A);
                Temp=0.2*A;
                A:=A-Temp;
                Write(A);
                Read(B);
                B:=B+Temp;
                Write(B);
Read(A);
A:=A-100;
Write(A);
Read(B);
B:=B+100;
Write(B);
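To see that both serial orders keep the sum A + B unchanged, the two transactions can be run one after the other on a toy database. The sketch below assumes starting balances of A = 1000 and B = 2000 (values not given in the slides) and uses integer arithmetic for the 20% step; the names t1, t2 and run_serial are hypothetical.

```python
def t1(db):
    """T1: transfer 100 from A to B."""
    a = db["A"]              # Read(A)
    a = a - 100              # A:=A-100
    db["A"] = a              # Write(A)
    b = db["B"]              # Read(B)
    b = b + 100              # B:=B+100
    db["B"] = b              # Write(B)


def t2(db):
    """T2: transfer 20% of A's balance from A to B."""
    a = db["A"]              # Read(A)
    temp = a * 20 // 100     # Temp=0.2*A, kept exact with integers
    a = a - temp             # A:=A-Temp
    db["A"] = a              # Write(A)
    b = db["B"]              # Read(B)
    b = b + temp             # B:=B+Temp
    db["B"] = b              # Write(B)


def run_serial(order, start):
    """Run the transactions one after another on a copy of start."""
    db = dict(start)
    for txn in order:
        txn(db)
    return db


start = {"A": 1000, "B": 2000}             # assumed balances
schedule1 = run_serial([t1, t2], start)    # T1 followed by T2
schedule2 = run_serial([t2, t1], start)    # T2 followed by T1
```

The two serial orders end in different balances, but in both cases A + B is still 3000, which is the consistency requirement from the ACID discussion.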
Concurrent Schedule
In a concurrent schedule, operations from different concurrent
transactions are interleaved

The number of possible schedules for a set of n transactions is
much larger than n!

Schedule3
T1              T2
Read(A);
A:=A-100;
Write(A);
                Read(A);
                Temp=0.2*A;
                A:=A-Temp;
                Write(A);
Read(B);
B:=B+100;
Write(B);
                Read(B);
                B:=B+Temp;
                Write(B);
Concurrent Schedule...
Schedule4
T1              T2
Read(A);
A:=A-100;
                Read(A);
                Temp=0.2*A;
                A:=A-Temp;
                Write(A);
                Read(B);
Write(A);
Read(B);
B:=B+100;
Write(B);
                B:=B+Temp;
                Write(B);

T1 T2
Read(A);
A:=A-100;
Write(A);
Read(A);
Temp=0.2*A;
A:=A-Temp;
Schedule5 Read(B);
B:=B+100;
Write(A);
Read(B);
B:=B+Temp;
Write(B);
Write(B);
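The effect of interleaving can be checked by replaying a schedule step by step, with each transaction keeping its own local buffer. The sketch below replays Schedule3 and Schedule4 under assumed starting balances A = 1000 and B = 2000; the helper names (read, write, run and the arithmetic steps) are hypothetical.

```python
def read(item):
    def step(local, db):
        local[item] = db[item]       # copy from database to local buffer
    return step


def write(item):
    def step(local, db):
        db[item] = local[item]       # copy from local buffer to database
    return step


def sub100(local, db): local["A"] -= 100                       # A:=A-100
def add100(local, db): local["B"] += 100                       # B:=B+100
def calc_temp(local, db): local["temp"] = local["A"] * 20 // 100
def sub_temp(local, db): local["A"] -= local["temp"]           # A:=A-Temp
def add_temp(local, db): local["B"] += local["temp"]           # B:=B+Temp


def run(schedule, start):
    """Execute (transaction, step) pairs in order on a copy of start."""
    db = dict(start)
    local = {"T1": {}, "T2": {}}
    for txn, step in schedule:
        step(local[txn], db)
    return db


schedule3 = [
    ("T1", read("A")), ("T1", sub100), ("T1", write("A")),
    ("T2", read("A")), ("T2", calc_temp), ("T2", sub_temp), ("T2", write("A")),
    ("T1", read("B")), ("T1", add100), ("T1", write("B")),
    ("T2", read("B")), ("T2", add_temp), ("T2", write("B")),
]
schedule4 = [
    ("T1", read("A")), ("T1", sub100),
    ("T2", read("A")), ("T2", calc_temp), ("T2", sub_temp), ("T2", write("A")),
    ("T2", read("B")),
    ("T1", write("A")), ("T1", read("B")), ("T1", add100), ("T1", write("B")),
    ("T2", add_temp), ("T2", write("B")),
]

start = {"A": 1000, "B": 2000}       # assumed balances
result3 = run(schedule3, start)
result4 = run(schedule4, start)
```

Under these assumptions Schedule3 ends in the same state as the serial order T1 followed by T2, while Schedule4 leaves A + B at 3100 instead of 3000, so it does not preserve consistency.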

Serializability
A concurrent schedule is serializable if it is equivalent to a
serial schedule

Serial schedules preserve consistency, as we assume each
transaction individually preserves consistency

The database system must control concurrent execution of
transactions to ensure that the database state remains
consistent

Since the modifications are done in the local buffer, we can
ignore the operations other than Read and Write instructions
when reasoning about serializability
Serializability...
T1         T2              T1         T2
Read(A);                              Read(A);
Write(A);                             Write(A);
Read(B);                              Read(B);
Write(B);                             Write(B);
           Read(A);        Read(A);
           Write(A);       Write(A);
           Read(B);        Read(B);
           Write(B);       Write(B);
Schedule1                  Schedule2

T1 T2 T1 T2 T1 T2
Read(A); Read(A); Read(A);
Write(A); Read(A); Write(A);
Read(A); Write(A); Read(A);
Write(A); Read(B); Read(B);
Read(B); Write(A); Write(A);
Write(B); Read(B); Read(B);
Read(B); Write(B) Write(B);
Write(B); Write(B); Write(B);
Schedule3 Schedule4 Schedule5

Conflict Serializability
Conflict serializability is defined in terms of conflicting
operations

Let us consider a schedule S in which there are two
consecutive instructions, Ii and Ij, of transactions Ti and Tj
respectively (i ≠ j)

If Ii and Ij access different data items, then we can swap Ii and
Ij without affecting the results of any transaction in the
schedule. However, if Ii and Ij access the same data item Q,
then the order of the two instructions may matter:
• Case-1: Ii = Read(Q) and Ij = Read(Q):
  • Order of Ii and Ij does not matter
• Case-2: Ii = Read(Q) and Ij = Write(Q):
  • Order of Ii and Ij matters in a schedule
• Case-3: Ii = Write(Q) and Ij = Read(Q):
  • Order of Ii and Ij matters in a schedule
• Case-4: Ii = Write(Q) and Ij = Write(Q):
  • Order of Ii and Ij matters in a schedule
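The four cases reduce to a single test: two instructions conflict when they come from different transactions, touch the same data item, and at least one of them is a write. A minimal sketch follows, with the Op record and the conflicts function being names chosen here for illustration.

```python
from typing import NamedTuple


class Op(NamedTuple):
    txn: str    # transaction name, e.g. "T1"
    kind: str   # "R" for Read, "W" for Write
    item: str   # data item, e.g. "Q"


def conflicts(a, b):
    """True if the relative order of a and b matters."""
    return (
        a.txn != b.txn
        and a.item == b.item
        and "W" in (a.kind, b.kind)
    )


case1 = conflicts(Op("Ti", "R", "Q"), Op("Tj", "R", "Q"))   # Read, Read
case2 = conflicts(Op("Ti", "R", "Q"), Op("Tj", "W", "Q"))   # Read, Write
case3 = conflicts(Op("Ti", "W", "Q"), Op("Tj", "R", "Q"))   # Write, Read
case4 = conflicts(Op("Ti", "W", "Q"), Op("Tj", "W", "Q"))   # Write, Write
```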
Conflict Serializability...
Thus, Ii and Ij conflict if they are instructions of different
transactions on the same data item, and at least one of these
instructions is a write operation

Let Ii and Ij be consecutive instructions of a schedule S. If Ii
and Ij are instructions of different transactions and they do not
conflict, then we can swap the order of Ii and Ij to produce a
new schedule S’. Here, we expect S to be equivalent to S’

If a schedule S can be transformed into a schedule S’ by a
series of swaps of non-conflicting instructions, we say that S
and S’ are conflict equivalent

A concurrent schedule S is conflict serializable if it is
conflict equivalent to a serial schedule
Conflict Serializability...
T1 T2 T1 T2 T1 T2
Read(A); Read(A); Read(A);
Write(A); Read(A); Write(A);
Read(B); Write(A); Read(B);
Write(B); Write(A); Read(A);
Read(A); Read(B); Write(A);
Write(A); Read(B); Read(B);
Read(B); Write(B) Write(B);
Write(B); Write(B); Write(B);
Schedule3 Schedule4 Schedule5

Conflict Serializability...
It is possible to have two schedules that produce the same
outcome, but that are not conflict equivalent

Schedule6
T1              T5
Read(A);
A:=A-100;
Write(A);
                Read(B);
                B:=B-200;
                Write(B);
Read(B);
B:=B+100;
Write(B);
                Read(A);
                A:=A+200;
                Write(A);

Schedule6 (Read and Write instructions only)
T1              T5
Read(A);
Write(A);
                Read(B);
                Write(B);
Read(B);
Write(B);
                Read(A);
                Write(A);
Testing for Conflict Serializability
Construct a directed graph, called a precedence graph, from S.
This graph consists of a pair G = (V, E), where V is a set of
vertices and E is a set of edges

The set of vertices consists of all the transactions participating
in the schedule

The set of edges consists of all edges Ti → Tj for which one of
three conditions holds:
• Ti executes write(Q) before Tj executes read(Q)
• Ti executes read(Q) before Tj executes write(Q)
• Ti executes write(Q) before Tj executes write(Q)
If an edge Ti → Tj exists in the precedence graph, then in any
serial schedule S’ equivalent to S, Ti must appear before Tj
Testing for Conflict Serializability...
If the precedence graph for a concurrent schedule S has a
cycle, then that schedule is not conflict serializable. If the
graph contains no cycles, then the schedule S is conflict
serializable
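The precedence-graph test can be sketched in a few lines. Here a schedule is assumed to be a list of (transaction, kind, item) tuples with kind "R" or "W"; the function names are hypothetical, and the cycle check is an ordinary depth-first search rather than anything specific to the slides. The two example schedules are the Read/Write-only forms of Schedule3 and Schedule4.

```python
def precedence_graph(schedule):
    """Return the set of edges (Ti, Tj) of the precedence graph."""
    edges = set()
    for i, (ti, ki, qi) in enumerate(schedule):
        for tj, kj, qj in schedule[i + 1:]:
            # Edge when an earlier operation conflicts with a later one.
            if ti != tj and qi == qj and "W" in (ki, kj):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in a directed graph."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    visiting, done = set(), set()

    def dfs(node):
        if node in done:
            return False
        if node in visiting:
            return True              # back edge, so there is a cycle
        visiting.add(node)
        found = any(dfs(nxt) for nxt in graph[node])
        visiting.discard(node)
        done.add(node)
        return found

    return any(dfs(node) for node in graph)


def conflict_serializable(schedule):
    return not has_cycle(precedence_graph(schedule))


schedule3 = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A"),
             ("T1", "R", "B"), ("T1", "W", "B"), ("T2", "R", "B"), ("T2", "W", "B")]
schedule4 = [("T1", "R", "A"), ("T2", "R", "A"), ("T2", "W", "A"), ("T2", "R", "B"),
             ("T1", "W", "A"), ("T1", "R", "B"), ("T1", "W", "B"), ("T2", "W", "B")]
```

Schedule3 yields the single edge T1 → T2, so it is conflict serializable; Schedule4 yields both T1 → T2 and T2 → T1, which form a cycle.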
View Serializability
The schedules S and S1 are said to be view equivalent if the
following three conditions are met:
• For each data item Q, if transaction Ti reads the initial
value of Q in schedule S, then it must also read the initial
value of Q in schedule S1
• For each data item Q, the transaction that performs the
final Write(Q) operation in schedule S must also perform
the final Write(Q) operation in schedule S1
• For each data item Q, if transaction Ti executes Read(Q)
in schedule S, and if that value was produced by a
Write(Q) operation executed by transaction Tj, then in
schedule S1, the Read(Q) operation of Ti must also read
the value of Q that was produced by the same Write(Q)
operation of transaction Tj
A schedule S is view serializable if it is view equivalent to
a serial schedule
View Serializability...
Schedule7
T1              T2              T3
Read(Q);
                Write(Q);
Write(Q);
                                Write(Q);

• This schedule is view serializable
• This schedule is not conflict serializable
• Every conflict serializable schedule is also view
serializable, but not every view serializable schedule is
conflict serializable
• Every view serializable schedule that is not conflict
serializable has blind writes
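The three view-equivalence conditions can be checked mechanically by recording, for every read, which write produced the value it sees, and, for every item, which write comes last. The sketch below is a brute-force illustration only: it tries every serial order, so it is practical only for a handful of transactions. The function names are hypothetical, and Schedule7 is encoded with the column assignment T1: Read(Q), T2: Write(Q), T1: Write(Q), T3: Write(Q).

```python
from itertools import permutations


def view(schedule):
    """Return (reads-from facts, final writer per item) for a schedule."""
    last_write = {}    # item -> (txn, position of the write inside txn)
    seen = {}          # txn -> number of its operations seen so far
    reads = set()
    for txn, kind, item in schedule:
        pos = seen.get(txn, 0)
        seen[txn] = pos + 1
        if kind == "R":
            # None as the source means the initial value was read.
            reads.add((txn, pos, item, last_write.get(item)))
        else:
            last_write[item] = (txn, pos)
    return reads, last_write


def view_equivalent(s1, s2):
    return view(s1) == view(s2)


def view_serializable(schedule):
    txns = sorted({txn for txn, _, _ in schedule})
    for order in permutations(txns):
        serial = [op for txn in order for op in schedule if op[0] == txn]
        if view_equivalent(schedule, serial):
            return True
    return False


schedule7 = [("T1", "R", "Q"), ("T2", "W", "Q"), ("T1", "W", "Q"), ("T3", "W", "Q")]
schedule4 = [("T1", "R", "A"), ("T2", "R", "A"), ("T2", "W", "A"), ("T2", "R", "B"),
             ("T1", "W", "A"), ("T1", "R", "B"), ("T1", "W", "B"), ("T2", "W", "B")]
```

Schedule7 is view equivalent to the serial order T1, T2, T3: T1 still reads the initial Q and T3 still performs the final write. Schedule4, in contrast, is not view equivalent to either serial order.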
Database Management System 21
Concurrency Control

Need of Concurrency Control
Lock-Based Protocols
Basic Rules for Locking
Working of Locking
Locking Protocol

Chittaranjan Pradhan
School of Computer Engineering,
KIIT University
Need of Concurrency Control
Lost Update Problem

This problem occurs when two transactions that access the
same database items have their operations interleaved in a
way that makes the value of some database item incorrect

T1 T2
Read(A);
A:=A-100;
Read(A);
Temp=0.2*A;
A:=A-Temp;
Write(A);
Write(A);
Read(B);
B:=B+100;
Write(B);
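A lost update is easy to reproduce with local copies of A. The sketch below follows one possible reading of the interleaving above, in which T2 writes A first and T1 then overwrites it; the starting balance of 1000 and the function name lost_update are assumptions.

```python
def lost_update(a_start):
    db = {"A": a_start}
    a1 = db["A"]              # T1: Read(A)
    a1 = a1 - 100             # T1: A:=A-100 (local buffer only)
    a2 = db["A"]              # T2: Read(A), still sees the old value
    temp = a2 * 20 // 100     # T2: Temp=0.2*A
    a2 = a2 - temp            # T2: A:=A-Temp
    db["A"] = a2              # T2: Write(A)
    db["A"] = a1              # T1: Write(A), overwriting T2's update
    return db["A"]


final_a = lost_update(1000)   # assumed starting balance
```

With these numbers the final value of A is 900, which matches neither serial order (720 for T1 then T2, 700 for T2 then T1): T2's withdrawal has been lost.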

Need of Concurrency Control...
Temporary Update (or Dirty Read) Problem

This problem occurs when one transaction updates a database
item and then the transaction fails for some reason. The
updated item is accessed by another transaction before it is
changed back to its original value

T1              T2
Read(A);
A:=A-100;
Write(A);
                Read(A);
                Temp=0.2*A;
                A:=A-Temp;
                Write(A);
Read(B);
B:=B+100;
Need of Concurrency Control...
Incorrect Summary Problem

If one transaction is calculating an aggregate summary
function on a number of records while other transactions are
updating some of these records, the aggregate function may
calculate some values before they are updated and others after
they are updated

T1              T2
                sum=0;
Read(A);
A:=A-100;
Write(A);
                Read(A);
                sum:=sum+A;
                Read(B);
                sum:=sum+B;
Read(B);
B:=B+100;
Write(B);
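The incorrect summary can be reproduced by replaying the interleaving above on a toy database. The starting balances (A = 1000, B = 2000) and the function name incorrect_summary are assumptions made for this sketch.

```python
def incorrect_summary(a, b):
    db = {"A": a, "B": b}
    total = 0                    # T2: sum=0
    db["A"] = db["A"] - 100      # T1: Read(A); A:=A-100; Write(A)
    total += db["A"]             # T2: Read(A); sum:=sum+A (after the debit)
    total += db["B"]             # T2: Read(B); sum:=sum+B (before the credit)
    db["B"] = db["B"] + 100      # T1: Read(B); B:=B+100; Write(B)
    return total, db["A"] + db["B"]


seen, actual = incorrect_summary(1000, 2000)
```

T2 reports 2900 although the true total is 3000 both before and after T1, because it saw A after the debit but B before the credit.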
Lock-Based Protocols
Locking is a procedure used to control concurrent access to
data. Locks enable a multi-user database system to maintain
the integrity of transactions by isolating a transaction from
others executing concurrently

Locking is one of the most widely used mechanisms to ensure
serializability

Data items can be locked in two modes:
• Shared lock or Read lock: If a transaction Ti has
obtained a shared-mode lock (S) on data item Q, then Ti
can only read the data item Q, but cannot write Q
• Exclusive lock or Write lock: If a transaction Ti has
obtained an exclusive-mode lock (X) on data item Q, then
Ti can both read and write Q
A transaction must obtain a lock on a data item before it can
perform a read or write operation
Basic Rules for Locking
• If a transaction has a read lock on a data item, it can only
read the item; it cannot update its value
• If a transaction has a read lock on a data item, other
transactions can obtain read locks on the same data item,
but they cannot obtain a write lock on it
• If a transaction has a write lock on a data item, then it can
both read and update the value of that data item
• If a transaction has a write lock on a data item, then other
transactions cannot obtain either a read lock or a write
lock on that data item
A transaction requests a shared lock on data item Q by
executing the Lock-S(Q) instruction
Similarly, a transaction can request an exclusive lock through
the Lock-X(Q) instruction
A transaction can unlock a data item Q by the Unlock(Q)
instruction
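The four rules above amount to a compatibility table between the mode already held by other transactions and the mode being requested. A minimal sketch, with the names COMPATIBLE and can_grant chosen here for illustration:

```python
# (mode held by another transaction, mode requested) -> compatible?
COMPATIBLE = {
    ("S", "S"): True,    # read locks can be shared
    ("S", "X"): False,   # no write lock while others read
    ("X", "S"): False,   # no read lock while another writes
    ("X", "X"): False,   # only one writer at a time
}


def can_grant(requested, held_by_others):
    """True if the requested mode is compatible with every lock held
    on the same data item by other transactions."""
    return all(COMPATIBLE[(held, requested)] for held in held_by_others)
```

For example, can_grant("S", ["S", "S"]) is true, while can_grant("X", ["S"]) is false; a request on an unlocked item, can_grant("X", []), is always granted.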
Working of Locking
• All transactions that need to access a data item must first
acquire a read lock or write lock on the data item,
depending on whether it is a read-only operation or not
• If the data item for which the lock is requested is not
already locked, then the transaction is granted the
requested lock immediately
• If the item is currently locked, the database system
determines what kind of lock is currently held and
which type of lock is requested:
  • If a read lock is requested on a data item that is already
  under a read lock, then the request will be granted
  • If a write lock is requested on a data item that is already
  under a read lock, then the request will be denied
  • Similarly, if a read lock or a write lock is requested on a data
  item that is already under a write lock, then the request is
  denied and the transaction must wait until the lock is
  released
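The grant and deny decisions described above can be sketched as a small lock table. This is an illustrative toy, not how a real database implements its lock manager: the class name LockManager and its methods are hypothetical, and a denied request simply returns False instead of queueing the transaction.

```python
class LockManager:
    def __init__(self):
        self.held = {}    # item -> {transaction: mode}

    def request(self, txn, item, mode):
        """Return True if the lock is granted, False if txn must wait."""
        holders = self.held.setdefault(item, {})
        others = [m for t, m in holders.items() if t != txn]
        if mode == "S":
            granted = all(m == "S" for m in others)   # read with readers only
        else:
            granted = not others                      # write needs the item alone
        if granted and holders.get(txn) != "X":       # never downgrade an X lock
            holders[txn] = mode
        return granted

    def unlock(self, txn, item):
        self.held.get(item, {}).pop(txn, None)


manager = LockManager()
first = manager.request("T1", "A", "X")     # item unlocked, granted
second = manager.request("T2", "A", "X")    # T1 holds X, denied
manager.unlock("T1", "A")
third = manager.request("T2", "A", "X")     # granted after the unlock
```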
Working of Locking...
• A transaction continues to hold the lock until it explicitly
releases it, either during execution or when it
terminates
• The effects of a write operation will be visible to other
transactions only after the lock is released
A concurrent schedule that is conflict equivalent to a serial
schedule will always get the respective locks from the
concurrency control manager

But, if the concurrent schedule is not conflict serializable, the
requested locks will not be granted by the concurrency control
manager

However, in the case of the Incorrect Summary Problem, all the
requested locks will be granted, resulting in incorrect values
Working of Locking...
Schedule3
T1              T2
Read(A);
Write(A);
                Read(A);
                Write(A);
Read(B);
Write(B);
                Read(B);
                Write(B);

T1              T2              Concurrency-Control Manager
Lock-X(A)
                                Grant-X(A, T1)
Read(A);
Write(A);
Unlock(A)
                Lock-X(A)
                                Grant-X(A, T2)
                Read(A);
                Write(A);
                Unlock(A)
Lock-X(B)
                                Grant-X(B, T1)
Read(B);
Write(B);
Unlock(B)
                Lock-X(B)
                                Grant-X(B, T2)
                Read(B);
                Write(B);
                Unlock(B)
Working of Locking...
Schedule4
T1              T2
Read(A);
                Read(A);
                Write(A);
                Read(B);
Write(A);
Read(B);
Write(B);
                Write(B);

T1              T2              Concurrency-Control Manager
Lock-X(A)
                                Grant-X(A, T1)
Read(A);
                Lock-X(A)
Working of Locking...
T1 T2
Read(A);
Write(A);
Read(A);
Read(B);
Write(A);
Read(B);
Write(B);
Write(B);
Schedule5

T1              T2              Concurrency-Control Manager
Lock-X(A)
                                Grant-X(A, T1)
Read(A);
Write(A);
Unlock(A)
                Lock-X(A)
                                Grant-X(A, T2)
                Read(A);
Lock-X(B)
                                Grant-X(B, T1)
Read(B);
                Write(A);
                Unlock(A)
                Lock-X(B)
Working of Locking...
T1              T2
Read(A);        Read(A);
A:=A-100;       Read(B);
Write(A);       Display(A+B);
Read(B);
B:=B+100;
Write(B);

T1              T2
Lock-X(A);      Lock-S(A);
Read(A);        Read(A);
A:=A-100;       Unlock(A);
Write(A);       Lock-S(B);
Unlock(A);      Read(B);
Lock-X(B);      Unlock(B);
Read(B);        Display(A+B);
B:=B+100;
Write(B);
Unlock(B);
Working of Locking...
T1              T2              Concurrency-Control Manager
Lock-X(A)
                                Grant-X(A, T1)
Read(A);
A:=A-100;
Write(A);
Unlock(A)
                Lock-S(A)
                                Grant-S(A, T2)
                Read(A);
                Unlock(A)
                Lock-S(B)
                                Grant-S(B, T2)
                Read(B);
                Unlock(B)
                Display(A+B);
Lock-X(B)
                                Grant-X(B, T1)
Read(B);
B:=B+100;
Write(B);
Unlock(B)

Though the concurrency control manager will not face any
problem in granting the locks, the above schedule gives an
incorrect result for transaction T2
Working of Locking...
To solve the problem discussed previously, different alternative
solutions are possible. One solution is to delay the
unlocking process. That means the unlocking is delayed to the
end of the transaction

Unfortunately, this type of locking can lead to an undesirable
situation

T1              T2
Lock-X(A);      Lock-S(A);
Read(A);        Read(A);
A:=A-100;       Lock-S(B);
Write(A);       Read(B);
Lock-X(B);      Display(A+B);
Read(B);        Unlock(A);
B:=B+100;       Unlock(B);
Write(B);
Unlock(A);
Unlock(B);
Working of Locking...
T1              T2              Concurrency-Control Manager
Lock-X(A)
                                Grant-X(A, T1)
Read(A);
A:=A-100;
Write(A);
                Lock-S(A)

Since T1 is holding an exclusive lock on A and T2 is requesting
a shared lock on A, the concurrency control manager will not
grant the lock to T2. Thus, T2 is waiting for T1 to
unlock A

T3
Lock-X(B);
Read(B);
B:=B-100;
Write(B);
Lock-X(A);
Read(A);
A:=A+100;
Write(A);
Unlock(B);
Unlock(A);
Working of Locking...
T3              T2              Concurrency-Control Manager
Lock-X(B)
                                Grant-X(B, T3)
Read(B);
B:=B-100;
Write(B);
                Lock-S(A)
                                Grant-S(A, T2)
                Read(A);
                Lock-S(B);
Lock-X(A)

T2 is waiting for T3 to unlock B. Similarly, T3 is waiting for T2 to
unlock A. Thus, this is a situation where neither of these
transactions can ever proceed with its normal execution. This
type of situation is called deadlock

If we do not use locking, or if we unlock data items as soon as
possible after reading or writing them, we may get inconsistent
states

On the other hand, if we do not unlock a data item before
requesting a lock on another data item, deadlocks may occur
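A deadlock such as the one between T2 and T3 shows up as a cycle when we record who is waiting for whom. The sketch below is a simplified illustration: it assumes each blocked transaction waits for exactly one other transaction, and the names waits_for and find_deadlock are hypothetical. The slides themselves do not describe a detection algorithm.

```python
def find_deadlock(waits_for):
    """waits_for maps a blocked transaction to the transaction that
    holds the lock it needs. Return the transactions on a cycle,
    or None if no transaction is deadlocked."""
    for start in waits_for:
        path = []
        node = start
        # Follow the chain of waits until it ends or repeats.
        while node in waits_for and node not in path:
            path.append(node)
            node = waits_for[node]
        if node in path:
            return path[path.index(node):]
    return None


# T2 waits for T3 to unlock B; T3 waits for T2 to unlock A.
deadlocked = find_deadlock({"T2": "T3", "T3": "T2"})
# T2 waits for T1, which is not blocked: no deadlock.
no_deadlock = find_deadlock({"T2": "T1"})
```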
Locking Protocol
When a transaction requests a lock on a data item in a
particular mode, and no other transaction has put a lock on the
same data item in a conflicting mode, then the lock can be
granted by the concurrency control manager

However, we must take some precautionary measures to avoid
the following scenarios:
• Suppose a transaction T1 has a shared-mode lock on a
data item, and another transaction T2 requests an
exclusive-mode lock on that same data item. In this
situation, T2 has to wait for T1 to release the shared-mode
lock
• Suppose another transaction T3 requests a shared-mode
lock on the same data item while T1 is holding a shared
lock on it. As the lock request is compatible with the lock
granted to T1, T3 may be granted the shared-mode
lock. But T2 has to wait for the release of the lock from
that data item
Locking Protocol...
• At this point, T1 may release the lock, but T2 still has to
wait for T3 to finish. There may be a new transaction T4
that requests a shared-mode lock on the same data item,
and is granted the lock before T3 releases it
• In such a situation, T2 never gets the exclusive-mode lock
on the data item. Thus, T2 cannot progress at all and is
said to be starved. This problem is called the starvation
problem
We can avoid starvation of transactions by granting locks in the
following manner: when a transaction Ti requests a lock on a
data item Q in a particular mode M, the concurrency-control
manager grants the lock provided that:
• There is no other transaction holding a lock on Q in a
mode that conflicts with M
• There is no other transaction that is waiting for a lock on Q
and that made its lock request before Ti
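The two granting conditions can be sketched with a first-come, first-served wait queue for a single data item Q. The class FairLock and its methods are hypothetical names for this illustration; a real lock manager would also handle lock upgrades, timeouts and deadlocks.

```python
from collections import deque


class FairLock:
    """Lock state for one data item, with starvation-free granting."""

    def __init__(self):
        self.holders = {}         # transaction -> mode ("S" or "X")
        self.waiting = deque()    # (transaction, mode) in arrival order

    def _compatible(self, mode):
        # Condition 1: no current holder conflicts with the requested mode.
        return all(mode == "S" and held == "S" for held in self.holders.values())

    def request(self, txn, mode):
        # Condition 2: nobody who asked earlier is still waiting.
        if self._compatible(mode) and not self.waiting:
            self.holders[txn] = mode
            return True
        self.waiting.append((txn, mode))
        return False

    def release(self, txn):
        """Release txn's lock and grant waiting requests in arrival order."""
        self.holders.pop(txn, None)
        granted = []
        while self.waiting and self._compatible(self.waiting[0][1]):
            waiter, mode = self.waiting.popleft()
            self.holders[waiter] = mode
            granted.append(waiter)
        return granted


lock = FairLock()
t1_granted = lock.request("T1", "S")    # granted: Q is unlocked
t2_granted = lock.request("T2", "X")    # waits: conflicts with T1
t3_granted = lock.request("T3", "S")    # waits: T2 asked first
after_t1 = lock.release("T1")           # T2 is served next
after_t2 = lock.release("T2")           # then T3
```

In the scenario from the slides, T3's shared request is compatible with T1's lock but is still queued behind T2, so T2 gets its exclusive lock as soon as T1 releases and cannot be starved by later readers.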
