Chapter 5 Relational Data Model

The document discusses the relational data model, which uses tables to store relations between entities. Some key points: - Relations are stored in tables with rows representing tuples (records) and columns representing attributes. - Each tuple is uniquely identified by a candidate key (or relation key). - Attributes have defined domains or value scopes. - Integrity constraints like entity constraints and referential integrity help ensure valid data. - The relational model is mapped from ER diagrams, with entities as tables and relationships as linking tables. - Functional dependencies define relationships between attributes like a primary key determining other attributes. - Normalization removes anomalies through techniques like 1NF, 2NF and 3NF.

Uploaded by

Hildana Tamrat
Copyright
© © All Rights Reserved

Relational data model

The relational data model is the primary data model and is used widely
around the world for data storage and processing. The model is simple, and
it has all the properties and capabilities required to process data with
storage efficiency.

Concepts of Relational Data Model


Tables − In the relational data model, relations are stored in the form of
tables. This format stores the relation among entities. A table has rows and
columns, where rows represent records and columns represent
attributes.

Tuple − A single row of a table, which contains a single record for that
relation, is called a tuple.

Relation instance − A finite set of tuples in the relational database system
represents a relation instance. Relation instances do not have duplicate
tuples.

Relation schema − A relation schema describes the relation name (table
name), its attributes, and their names.

Relation key − Each row has one or more attributes, known as the relation
key, which can identify the row in the relation (table) uniquely.

Attribute domain − Every attribute has some pre-defined value scope,
known as the attribute domain.

Constraints
Every relation has some conditions that must hold for it to be a valid
relation. These conditions are called Relational Integrity Constraints.
There are three main integrity constraints −

• Key constraints

• Domain constraints

• Referential integrity constraints


Key Constraints

There must be at least one minimal subset of attributes in the relation
that can identify a tuple uniquely. This minimal subset of attributes is
called the key for that relation. If there is more than one such minimal
subset, these are called candidate keys.

Key constraints enforce that −

• In a relation with a key attribute, no two tuples can have identical values for the key
attributes.

• A key attribute cannot have NULL values.

Key constraints are also referred to as Entity Constraints.
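The two rules above can be checked mechanically. The following is a minimal sketch, assuming a relation is held as a list of dictionaries and NULL is represented by None; the function name and the sample rows are illustrative and not part of the original text.

```python
def satisfies_key_constraint(rows, key):
    """Return True if no key value is NULL (None) and no two rows share a key."""
    seen = set()
    for row in rows:
        key_value = tuple(row[attr] for attr in key)
        if any(value is None for value in key_value):
            return False  # a key attribute holds NULL
        if key_value in seen:
            return False  # two tuples have identical key values
        seen.add(key_value)
    return True


# Hypothetical sample relation with key attribute "sid".
students = [
    {"sid": 101, "name": "Alex"},
    {"sid": 102, "name": "Maria"},
]
valid = satisfies_key_constraint(students, ["sid"])
duplicated = satisfies_key_constraint(students + [{"sid": 101, "name": "Mira"}], ["sid"])
with_null = satisfies_key_constraint(students + [{"sid": None, "name": "Maya"}], ["sid"])
```

Here only the first check passes; the other two relations break the uniqueness rule and the NULL rule respectively.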

Domain Constraints

Attributes have specific values in real-world scenarios. For example, age can
only be a positive integer. The same constraints are applied to the
attributes of a relation. Every attribute is bound to have a specific
range of values. For example, age cannot be less than zero and telephone
numbers cannot contain a digit outside 0-9.

Referential integrity Constraints

Referential integrity constraints work on the concept of foreign keys. A
foreign key is an attribute of one relation that refers to a key attribute
of another relation (or of the same relation).

The referential integrity constraint states that if a relation refers to a key
attribute of a different or the same relation, then that key value must exist.
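A database engine can enforce this rule for us. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the table and column names (department, course, dept, cid) are assumptions chosen for illustration. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3


def insert_course(conn, cid, dept):
    """Try to insert a course; return False if the foreign key is violated."""
    try:
        conn.execute("INSERT INTO course (cid, dept) VALUES (?, ?)", (cid, dept))
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE department (dept TEXT PRIMARY KEY NOT NULL, head TEXT)")
conn.execute(
    "CREATE TABLE course ("
    " cid TEXT PRIMARY KEY NOT NULL,"
    " dept TEXT REFERENCES department(dept))"  # foreign key into department
)
conn.execute("INSERT INTO department VALUES ('CS', 'Alex')")

accepted = insert_course(conn, "CS01", "CS")  # 'CS' exists in department
rejected = insert_course(conn, "ME01", "ME")  # 'ME' does not exist
```

The first insert refers to an existing key value and succeeds; the second refers to a department that does not exist and is refused.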
Mapping ER Model to Relational Model
The ER model, when conceptualized into diagrams, gives a good overview of
entity relationships that is easy to understand. ER diagrams can be
mapped to a relational schema; that is, it is possible to create a relational
schema from an ER diagram.

ER diagrams mainly comprise −

• Entities and their attributes

• Relationships, which are associations among entities.

Mapping Entity

An entity is a real-world object with some attributes.

Mapping Process (Algorithm)

• Create a table for each entity.

• The entity's attributes become fields of the table, with their respective data types.

• Declare the primary key.

Mapping Relationship

A relationship is an association among entities.


Mapping Process

• Create a table for the relationship.

• Add the primary keys of all participating entities as fields of the table, with their
respective data types.

• If the relationship has any attributes, add each attribute as a field of the table.

• Declare a primary key composed of the primary keys of all participating entities.

• Declare all foreign key constraints.
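The entity and relationship mapping steps can be sketched with Python's built-in sqlite3 module. Everything named here is an assumption for illustration: two entities (student, project), a relationship works_on between them, and a relationship attribute hours.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    -- One table per entity, with its primary key declared.
    CREATE TABLE student (stu_id INTEGER PRIMARY KEY, stu_name TEXT);
    CREATE TABLE project (proj_id INTEGER PRIMARY KEY, proj_name TEXT);

    -- One table for the relationship: the keys of both participating
    -- entities, the relationship's own attribute, a composite primary
    -- key, and the foreign key constraints.
    CREATE TABLE works_on (
        stu_id  INTEGER NOT NULL REFERENCES student(stu_id),
        proj_id INTEGER NOT NULL REFERENCES project(proj_id),
        hours   INTEGER,
        PRIMARY KEY (stu_id, proj_id)
    );
    """
)
conn.execute("INSERT INTO student VALUES (1, 'Alex')")
conn.execute("INSERT INTO project VALUES (10, 'Compiler')")
conn.execute("INSERT INTO works_on VALUES (1, 10, 5)")

try:
    # The same (student, project) pair again violates the composite key.
    conn.execute("INSERT INTO works_on VALUES (1, 10, 8)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The composite primary key is what guarantees that each pairing of participating entities appears at most once in the relationship table.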

Mapping Weak Entity Sets


A weak entity set is one which does not have any primary key associated
with it.

Mapping Process

• Create a table for the weak entity set.

• Add all its attributes to the table as fields.

• Add the primary key of the identifying entity set.

• Declare all foreign key constraints.


Mapping Hierarchical Entities

ER specialization or generalization comes in the form of hierarchical entity
sets.

Mapping Process

• Create tables for all higher-level entities.

• Create tables for lower-level entities.

• Add the primary keys of higher-level entities to the tables of lower-level entities.

• In lower-level tables, add all other attributes of the lower-level entities.

• Declare the primary key of the higher-level table and the primary key of the lower-level
table.

• Declare foreign key constraints.


Functional Dependency
A functional dependency (FD) is a constraint between two sets of attributes in
a relation, typically between the PK and the non-key attributes within a
table. For any relation R, attribute Y is functionally dependent on attribute
X (usually the PK) if, in every valid instance of R, each value of X uniquely
determines the value of Y.

A functional dependency is represented by an arrow sign (→), that is, X→Y,
where X functionally determines Y. The left-hand side attributes determine
the values of attributes on the right-hand side.

Example

This case represents an example where multiple functional dependencies are
embedded in a single representation of data. Note that because an employee
can only be a member of one department, the unique ID of that employee
determines the department.

• Employee ID → Employee Name

• Employee ID → Department ID

In addition to this relationship, the table also has a functional dependency
through a non-key attribute:

• Department ID → Department Name
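Whether a given table instance is consistent with a dependency X → Y can be tested directly: no two rows may agree on X and differ on Y. The following is a minimal sketch; the function name, column names, and sample rows are assumptions made for illustration. A passing check only shows that the instance does not contradict the dependency, since an FD is a statement about every valid instance.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the rows do not contradict the dependency lhs -> rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[attr] for attr in lhs)
        y = tuple(row[attr] for attr in rhs)
        # setdefault stores y for a new x, otherwise returns the earlier y.
        if seen.setdefault(x, y) != y:
            return False
    return True


# Hypothetical employee table.
employees = [
    {"emp_id": 1, "emp_name": "Alex", "dept_id": "D1", "dept_name": "Sales"},
    {"emp_id": 2, "emp_name": "Maria", "dept_id": "D1", "dept_name": "Sales"},
    {"emp_id": 3, "emp_name": "Alex", "dept_id": "D2", "dept_name": "Research"},
]
id_to_name = fd_holds(employees, ["emp_id"], ["emp_name"])
dept_to_name = fd_holds(employees, ["dept_id"], ["dept_name"])
name_to_dept = fd_holds(employees, ["emp_name"], ["dept_id"])
```

In this sample the first two dependencies hold, while emp_name → dept_id fails because two employees named Alex work in different departments.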

Types of Functional Dependencies


Trivial functional dependency

The dependency of an attribute on a set of attributes is known as a trivial
functional dependency if the set of attributes includes that attribute.
Symbolically: A -> B is a trivial functional dependency if B is a subset of A.

The following dependencies are also trivial: A -> A and B -> B.

For example: Consider a table with two columns, Student_Id and
Student_Name.

{Student_Id, Student_Name} -> Student_Id

is a trivial functional dependency, as Student_Id is a subset of {Student_Id,
Student_Name}. That makes sense because if we know the values of
Student_Id and Student_Name, then the value of Student_Id can be uniquely
determined. Also, Student_Id -> Student_Id and Student_Name ->
Student_Name are trivial dependencies.
Non-trivial functional dependency
If a functional dependency X -> Y holds where Y is not a subset of X, then
this dependency is called a non-trivial functional dependency.
For example:
Consider an employee table with three attributes: emp_id, emp_name, emp_address.
The following functional dependencies are non-trivial:
emp_id -> emp_name (emp_name is not a subset of emp_id)
emp_id -> emp_address (emp_address is not a subset of emp_id)

Multivalued dependency
A multivalued dependency occurs when a table has two or more
independent multivalued attributes. For example, consider
a bike manufacturing company that produces two colors (black and white)
of each model every year.

Here the columns manuf_year and color are independent of each other and
dependent on bike_model. In this case these two columns are said to be
multivalued dependent on bike_model. These dependencies can be
represented like this:

bike_model ->> manuf_year

bike_model ->> color
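One way to test a multivalued dependency on a table instance: for each bike_model, every manufacturing year of that model must appear together with every color of that model. The sketch below assumes a three-column table and single-attribute sides; the function name and the sample rows (model numbers, years) are made up for illustration.

```python
from itertools import product


def mvd_holds(rows, x, y, z):
    """Check x ->> y in a relation whose only attributes are x, y and z."""
    present = {(row[x], row[y], row[z]) for row in rows}
    y_values, z_values = {}, {}
    for xv, yv, zv in present:
        y_values.setdefault(xv, set()).add(yv)
        z_values.setdefault(xv, set()).add(zv)
    # For each x value, every (y, z) combination must be present.
    return all(
        (xv, yv, zv) in present
        for xv in y_values
        for yv, zv in product(y_values[xv], z_values[xv])
    )


# Hypothetical data: each model comes in both colors every year.
bikes = [
    {"bike_model": m, "manuf_year": year, "color": c}
    for m, year, c in [
        ("M1001", 2007, "Black"), ("M1001", 2007, "White"),
        ("M1001", 2008, "Black"), ("M1001", 2008, "White"),
        ("M2012", 2008, "Black"), ("M2012", 2008, "White"),
    ]
]
complete = mvd_holds(bikes, "bike_model", "manuf_year", "color")
incomplete = mvd_holds(bikes[:3], "bike_model", "manuf_year", "color")
```

With all six rows the dependency holds; dropping the row for the white 2008 M1001 breaks it, because year and color are then no longer independent for that model.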


Transitive dependency
A functional dependency is said to be transitive if it is indirectly formed by
two functional dependencies.
For example, X -> Z is a transitive dependency if the following three
conditions hold:

• X -> Y
• Y does not -> X
• Y -> Z

Note: A transitive dependency can only occur in a relation of three or more
attributes.

For example:

{Book} -> {Author} (if we know the book, we know the author's name)

{Author} does not -> {Book}

{Author} -> {Author_age}

Therefore {Book} -> {Author_age} is a transitive dependency.

Normalization
If a database design is not sound, it may contain anomalies, which make the
database very hard to manage and to keep consistent.

• Update anomalies − If data items are scattered and are not linked to each
other properly, an update can leave the data inconsistent. For example, when we
try to update a data item whose copies are scattered over several places, a
few instances get updated properly while a few others are left with old values.
Such instances leave the database in an inconsistent state.

• Deletion anomalies − We try to delete a record, but parts of it are left
undeleted because, unknown to us, the data is also saved somewhere else.

• Insert anomalies − We try to insert data into a record that does not exist at
all.

Normalization is a method to remove all these anomalies and bring the
database to a consistent state.

First Normal Form


First Normal Form is defined in the definition of relations (tables) itself. This
rule defines that all the attributes in a relation must have atomic domains.
The values in an atomic domain are indivisible units.

We re-arrange the relation (table) as below, to convert it to First Normal
Form.

Each attribute must contain only a single value from its pre-defined domain.
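As an illustration of this re-arrangement, the sketch below splits a multi-valued cell into one row per atomic value. The table, its column names, and the comma-separated encoding of multiple values are all assumptions made for the example.

```python
def to_first_normal_form(rows, attr, sep=","):
    """Split a multi-valued attribute into one row per atomic value."""
    result = []
    for row in rows:
        for value in row[attr].split(sep):
            atomic = dict(row)            # copy the other attributes unchanged
            atomic[attr] = value.strip()  # keep exactly one value per cell
            result.append(atomic)
    return result


# Hypothetical un-normalized table: "content" holds several values per row.
courses = [
    {"course": "Programming", "content": "Java, C++"},
    {"course": "Web", "content": "HTML, PHP, ASP"},
]
flat = to_first_normal_form(courses, "content")
```

After the conversion every row holds a single value in each column, at the cost of repeating the course name.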

Second Normal Form


Before we learn about the second normal form, we need to understand the
following −

• Prime (key) attribute − An attribute that is part of a candidate key is
known as a prime attribute.

• Non-prime (non-key) attribute − An attribute that is not part of any
candidate key is said to be a non-prime attribute.

In second normal form, every non-prime attribute must be fully functionally
dependent on the whole candidate key, not on just a part of it.

We see here in the Student_Project relation that the prime key attributes are
Stu_ID and Proj_ID. According to the rule, the non-key attributes, i.e.
Stu_Name and Proj_Name, must depend on both and not on either
prime key attribute individually. But we find that Stu_Name can be
identified by Stu_ID alone and Proj_Name can be identified by Proj_ID
alone. This is called partial dependency, which is not allowed in
Second Normal Form.

We broke the relation in two, as depicted in the above picture, so there
exists no partial dependency.
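A sketch of such a decomposition is given below, using projection with duplicate removal. The sample rows are invented, and keeping a separate table of (Stu_ID, Proj_ID) pairs is an assumption added here so that the information about who works on which project is not lost.

```python
def project_onto(rows, attrs):
    """Project rows onto attrs, dropping duplicates while preserving order."""
    seen, out = set(), []
    for row in rows:
        values = tuple(row[attr] for attr in attrs)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(attrs, values)))
    return out


# Hypothetical Student_Project instance with partial dependencies.
student_project = [
    {"Stu_ID": 1, "Proj_ID": 10, "Stu_Name": "Alex", "Proj_Name": "Compiler"},
    {"Stu_ID": 1, "Proj_ID": 20, "Stu_Name": "Alex", "Proj_Name": "Database"},
    {"Stu_ID": 2, "Proj_ID": 10, "Stu_Name": "Maria", "Proj_Name": "Compiler"},
]
students = project_onto(student_project, ["Stu_ID", "Stu_Name"])    # Stu_ID -> Stu_Name
projects = project_onto(student_project, ["Proj_ID", "Proj_Name"])  # Proj_ID -> Proj_Name
assignments = project_onto(student_project, ["Stu_ID", "Proj_ID"])  # assumed link table
```

Each name is now stored once per student or project, so it depends on the whole key of its own table.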

Third Normal Form


For a relation to be in Third Normal Form, it must be in Second Normal Form
and must satisfy the following −

• No non-prime attribute is transitively dependent on a prime key attribute.

• For any non-trivial functional dependency X → A, either −

o X is a super-key, or
o A is a prime attribute.

We find that in the above Student_detail relation, Stu_ID is the key and
the only prime key attribute. We find that City can be identified by Stu_ID as
well as by Zip itself. Zip is not a superkey, nor is City a prime attribute.
Additionally, Stu_ID → Zip → City, so there exists a transitive dependency.

For example, if we know the zip code is 20001, we can determine that the
city is Washington DC.

To bring this relation into third normal form, we break the relation into two
relations: one that keeps Stu_ID together with Zip and the other attributes
that depend directly on Stu_ID, and one that maps each Zip to its City.
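The split can be sketched in a few lines. The sample rows are assumptions, apart from the zip code 20001 for Washington DC mentioned above; the last step re-joins the two tables to show that no information is lost.

```python
# Hypothetical Student_detail instance: Stu_ID -> Zip -> City.
student_detail = [
    {"Stu_ID": 1, "Stu_Name": "Alex", "Zip": "20001", "City": "Washington DC"},
    {"Stu_ID": 2, "Stu_Name": "Maria", "Zip": "20001", "City": "Washington DC"},
    {"Stu_ID": 3, "Stu_Name": "Mira", "Zip": "10001", "City": "New York"},
]

# Relation 1: attributes that depend directly on the key Stu_ID.
student = [
    {attr: row[attr] for attr in ("Stu_ID", "Stu_Name", "Zip")}
    for row in student_detail
]

# Relation 2: Zip -> City, stored once per zip code.
zip_codes = {row["Zip"]: row["City"] for row in student_detail}

# Joining on Zip reproduces the original relation.
rejoined = [dict(row, City=zip_codes[row["Zip"]]) for row in student]
```

The city for zip code 20001 is now stored once instead of once per student, which removes the update anomaly caused by the transitive dependency.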

Relational Algebra
Relational database systems are expected to be equipped with a query
language that helps users query the database instances. There
are two kinds of query languages − relational algebra and relational
calculus.

Relational Algebra
Relational algebra is a procedural query language, which takes instances of
relations as input and yields instances of relations as output. It uses
operators to perform queries. An operator can be either unary or binary.
They accept relations as their input and yield relations as their output.
Relational algebra is performed recursively on a relation and intermediate
results are also considered relations.

The fundamental operations of relational algebra are as follows −

• Select

• Project

• Union

• Set difference

• Cartesian product

• Rename

We will discuss all these operations in the following sections.

Select Operation (σ)

It selects tuples that satisfy the given predicate from a relation.

Notation − σ_p(r)

Where σ denotes the selection operator, r stands for the relation, and p is a
propositional logic formula that may use connectors like and, or, and not.
The terms of p may use relational operators like =, ≠, ≥, <, >, ≤.

For example −

σ_{subject = "database"}(Books)

Output − Selects tuples from Books where subject is 'database'.

σ_{subject = "database" and price = "450"}(Books)

Output − Selects tuples from Books where subject is 'database' and price is
450.

σ_{subject = "database" and price = "450" or year > "2010"}(Books)

Output − Selects tuples from Books where subject is 'database' and price is
450, or those books published after 2010.
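The three selections can be mirrored in a short sketch, assuming a relation is a list of dictionaries and the predicate is an ordinary Python function. The Books rows are invented, and price and year are stored as numbers rather than quoted strings.

```python
def select(relation, predicate):
    """Relational selection: keep the tuples for which the predicate is true."""
    return [t for t in relation if predicate(t)]


# Hypothetical Books relation.
books = [
    {"title": "Book A", "subject": "database", "price": 450, "year": 2008},
    {"title": "Book B", "subject": "database", "price": 300, "year": 2012},
    {"title": "Book C", "subject": "networks", "price": 450, "year": 2015},
]

by_subject = select(books, lambda t: t["subject"] == "database")
by_subject_and_price = select(
    books, lambda t: t["subject"] == "database" and t["price"] == 450
)
# "and" binds tighter than "or", matching the reading given in the text.
mixed = select(
    books,
    lambda t: t["subject"] == "database" and t["price"] == 450 or t["year"] > 2010,
)
```

Selection never changes the columns of a relation; it only filters rows.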
Project Operation (∏)
It keeps only the listed column(s) of a relation and discards the rest.

Notation − ∏_{A1, A2, ..., An}(r)

Where A1, A2, ..., An are attribute names of relation r.

Duplicate rows are automatically eliminated, as a relation is a set.

For example −

∏_{subject, author}(Books)

Projects the columns named subject and author from the
relation Books.
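A sketch of projection follows, again assuming a list of dictionaries as input. Returning a Python set of tuples gives the duplicate elimination for free; the Books rows and author names are sample data for illustration.

```python
def project(relation, attrs):
    """Relational projection: keep only attrs; the set removes duplicate rows."""
    return {tuple(t[attr] for attr in attrs) for t in relation}


# Hypothetical Books relation; two books share the same subject and author.
books = [
    {"title": "Book A", "subject": "database", "author": "Elmasri"},
    {"title": "Book B", "subject": "database", "author": "Elmasri"},
    {"title": "Book C", "subject": "networks", "author": "Maya"},
]
subject_author = project(books, ["subject", "author"])
```

Three input rows yield two output tuples, because the first two rows agree on both projected attributes.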

Union Operation (∪)

It performs binary union between two given relations and is defined as −

r ∪ s = { t | t ∈ r or t ∈ s }

Notation − r ∪ s

Where r and s are either database relations or relation result sets
(temporary relations).

For a union operation to be valid, the following conditions must hold −

• r and s must have the same number of attributes.

• Attribute domains must be compatible.

Duplicate tuples are automatically eliminated from the result.

∏_{author}(Books) ∪ ∏_{author}(Articles)

Output − Projects the names of the authors who have written either a book
or an article or both.

Set Difference (−)

The result of the set difference operation is the set of tuples that are
present in one relation but not in the second relation.

Notation − r − s

Finds all the tuples that are present in r but not in s.

∏_{author}(Books) − ∏_{author}(Articles)

Output − Provides the names of authors who have written books but not articles.
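Because relations are sets of tuples, union and set difference map directly onto Python's set operators. In this sketch the two sets stand for the already projected author columns; the author names are sample data chosen for illustration.

```python
# Hypothetical results of projecting author from Books and from Articles.
# Both have one attribute of the same domain, so they are union-compatible.
book_authors = {("Elmasri",), ("Maya",)}
article_authors = {("Maya",), ("Mira",)}

either = book_authors | article_authors      # r ∪ s: duplicates collapse
books_only = book_authors - article_authors  # r − s: in r but not in s
```

Maya appears in both inputs but only once in the union, and only Elmasri remains in the difference.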

Cartesian Product (×)

Combines information of two different relations into one.

Notation − r × s

Where r and s are relations and their output will be defined as −

r × s = { q t | q ∈ r and t ∈ s }

σ_{author = 'Elmasri'}(Books × Articles)

Output − Yields a relation that shows all the books and articles written
by Elmasri.
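A sketch of the product followed by a selection is shown below. Because both relations are assumed to have an attribute called author, the sketch prefixes each attribute with its relation name, and the selection is read as a condition on Books.author; that reading, the helper name, and the sample rows are assumptions.

```python
from itertools import product


def cartesian_product(r, r_name, s, s_name):
    """Pair every tuple of r with every tuple of s, qualifying attribute names."""
    return [
        {
            **{f"{r_name}.{k}": v for k, v in q.items()},
            **{f"{s_name}.{k}": v for k, v in t.items()},
        }
        for q, t in product(r, s)
    ]


# Hypothetical relations.
books = [
    {"title": "Book 1", "author": "Elmasri"},
    {"title": "Book 2", "author": "Maya"},
]
articles = [
    {"title": "Article 1", "author": "Elmasri"},
    {"title": "Article 2", "author": "Mira"},
]
pairs = cartesian_product(books, "Books", articles, "Articles")
by_elmasri = [row for row in pairs if row["Books.author"] == "Elmasri"]
```

The product has 2 × 2 = 4 rows, and the selection keeps the two rows that pair Elmasri's book with each article.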

Rename Operation (ρ)

The results of relational algebra are also relations but without any name.
The rename operation allows us to rename the output relation. The rename
operation is denoted by the small Greek letter rho (ρ).

Notation − ρ_x(E)

Where the result of expression E is saved with the name x.

We understand the benefits of taking a Cartesian product of two relations,
which gives us all the possible tuples that are paired together. But it might
not be feasible in certain cases to take a Cartesian product, for example when
we encounter huge relations with thousands of tuples and a considerably
large number of attributes.

DBMS - Joins

Join is a combination of a Cartesian product followed by a selection
process. A Join operation pairs two tuples from different relations if and
only if a given join condition is satisfied.

We will briefly describe various join types in the following sections.


Theta (θ) Join
Theta join combines tuples from different relations provided they satisfy the
theta condition. The join condition is denoted by the symbol θ.

Notation

R1 ⋈_θ R2

R1 and R2 are relations having attributes (A1, A2, ..., An) and (B1, B2,
..., Bm) such that the attributes don't have anything in common, that is, R1 ∩
R2 = Φ.

Theta join can use all kinds of comparison operators.

Student

SID   Name    Std
101   Alex    10
102   Maria   11

Subjects

Class   Subject
10      Math
10      English
11      Music
11      Sports

Student_Detail −

Student ⋈_{Student.Std = Subjects.Class} Subjects

Student_detail

SID   Name    Std   Class   Subject
101   Alex    10    10      Math
101   Alex    10    10      English
102   Maria   11    11      Music
102   Maria   11    11      Sports
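The join above can be reproduced with a nested loop over both relations, which is a direct (if not efficient) reading of "Cartesian product followed by selection". The function name is an assumption; the data is taken from the Student and Subjects tables.

```python
def theta_join(r, s, theta):
    """Pair tuples of r and s and keep the pairs that satisfy theta(q, t)."""
    return [{**q, **t} for q in r for t in s if theta(q, t)]


students = [
    {"SID": 101, "Name": "Alex", "Std": 10},
    {"SID": 102, "Name": "Maria", "Std": 11},
]
subjects = [
    {"Class": 10, "Subject": "Math"},
    {"Class": 10, "Subject": "English"},
    {"Class": 11, "Subject": "Music"},
    {"Class": 11, "Subject": "Sports"},
]
# Join condition: Student.Std = Subjects.Class
student_detail = theta_join(students, subjects, lambda q, t: q["Std"] == t["Class"])
```

Any comparison can be used in the condition; swapping `==` for `<`, for instance, gives a theta join that is not an equijoin.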

Equijoin

When a theta join uses only the equality comparison operator, it is said to be
an equijoin. The above example corresponds to an equijoin.

Natural Join ( ⋈ )

Natural join does not use any comparison operator. It does not concatenate
the way a Cartesian product does. We can perform a natural join only if
there is at least one common attribute that exists between the two relations.
In addition, the attributes must have the same name and domain.

Natural join acts on those matching attributes where the values of the
attributes in both relations are the same.
Courses

CID    Course        Dept
CS01   Database      CS
ME01   Mechanics     ME
EE01   Electronics   EE

HoD

Dept   Head
CS     Alex
ME     Maya
EE     Mira

Courses ⋈ HoD

Dept   CID    Course        Head
CS     CS01   Database      Alex
ME     ME01   Mechanics     Maya
EE     EE01   Electronics   Mira
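A sketch of the natural join for these two tables follows. It finds the attribute names the relations share (here only Dept), matches tuples on them, and keeps each shared attribute once. Reading the schema from the first tuple is a simplification that assumes both relations are non-empty.

```python
def natural_join(r, s):
    """Join r and s on all attribute names they have in common."""
    common = [attr for attr in r[0] if attr in s[0]]  # assumes non-empty inputs
    return [
        {**q, **t}  # shared attributes appear once in the merged tuple
        for q in r
        for t in s
        if all(q[attr] == t[attr] for attr in common)
    ]


courses = [
    {"CID": "CS01", "Course": "Database", "Dept": "CS"},
    {"CID": "ME01", "Course": "Mechanics", "Dept": "ME"},
    {"CID": "EE01", "Course": "Electronics", "Dept": "EE"},
]
hod = [
    {"Dept": "CS", "Head": "Alex"},
    {"Dept": "ME", "Head": "Maya"},
    {"Dept": "EE", "Head": "Mira"},
]
courses_hod = natural_join(courses, hod)
```

Unlike the theta join result, the output has a single Dept column rather than one from each input.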

Outer Joins

Theta Join, Equijoin, and Natural Join are called inner joins. An inner join
includes only those tuples with matching attributes and the rest are
discarded in the resulting relation. Therefore, we need to use outer joins to
include all the tuples from the participating relations in the resulting
relation. There are three kinds of outer joins − left outer join, right outer
join, and full outer join.

Left Outer Join (R ⟕ S)

All the tuples from the Left relation, R, are included in the resulting relation.
If there are tuples in R without any matching tuple in the Right relation S,
then the S-attributes of the resulting relation are made NULL.

Left (Courses)

A     B
100   Database
101   Mechanics
102   Electronics

Right (HoD)

A     B
100   Alex
102   Maya
104   Mira

Courses ⟕ HoD

A     B             C     D
100   Database      100   Alex
101   Mechanics     ---   ---
102   Electronics   102   Maya

Right Outer Join (R ⟖ S)

All the tuples from the Right relation, S, are included in the resulting
relation. If there are tuples in S without any matching tuple in R, then the
R-attributes of the resulting relation are made NULL.

Courses ⟖ HoD

A     B             C     D
100   Database      100   Alex
102   Electronics   102   Maya
---   ---           104   Mira

Full Outer Join (R ⟗ S)

All the tuples from both participating relations are included in the resulting
relation. If there are no matching tuples for both relations, their respective
unmatched attributes are made NULL.

Courses ⟗ HoD

A     B             C     D
100   Database      100   Alex
101   Mechanics     ---   ---
102   Electronics   102   Maya
---   ---           104   Mira
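All three outer joins can be sketched with one function that pads the unmatched side with None, which plays the role of NULL here. The function name and the `how` parameter are assumptions for illustration; tuples are matched on their first column, as in the tables above.

```python
def outer_join(left, right, match, how="full"):
    """Outer join of two non-empty lists of tuples; None pads unmatched sides."""
    left_width, right_width = len(left[0]), len(right[0])
    rows, matched_right = [], set()
    for l in left:
        hits = [j for j, r in enumerate(right) if match(l, r)]
        matched_right.update(hits)
        if hits:
            rows.extend(l + right[j] for j in hits)
        elif how in ("left", "full"):
            rows.append(l + (None,) * right_width)  # no partner on the right
    if how in ("right", "full"):
        rows.extend(
            (None,) * left_width + r  # no partner on the left
            for j, r in enumerate(right)
            if j not in matched_right
        )
    return rows


courses = [(100, "Database"), (101, "Mechanics"), (102, "Electronics")]
hod = [(100, "Alex"), (102, "Maya"), (104, "Mira")]


def same_a(l, r):
    return l[0] == r[0]


left_result = outer_join(courses, hod, same_a, how="left")
right_result = outer_join(courses, hod, same_a, how="right")
full_result = outer_join(courses, hod, same_a, how="full")
```

The three results correspond row for row to the three result tables above, with None in place of the dashes.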
