Lesson 03 Database Normalization and Entity Relationship (ER) Model
ER Model
Entity set and relationship set are the two primary components of the E-R
model.
For example, suppose you are designing an HR database. The employee will be an entity (Employee) with
the attributes E_Id, E_Name, E_Address, and E_Department.
The address can be another entity with attributes, such as country, city, landmark, street
name, and pin code, and there will be a relationship between them.
Components of ER Diagram
[Figure: components of an ER diagram: entity, attribute, and relationship]
Entity
Any object, class, or component of data can be considered an entity. In an ER diagram, entities are
represented by a rectangle.
Weak Entity: A weak entity is one that is reliant on another entity. A weak entity has no key attribute
of its own. A weak entity is represented by a double rectangle.
Attribute
The properties of entities are known as attributes. An attribute is represented by an ellipse.
Key Attributes
A key attribute can be used to identify one entity from a group of entities.
[Figure: entity with the attributes E_ID, Department, Age, and Address]
Composite Attributes
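A composite attribute is an attribute that is made up of other attributes.
[Figure: composite attribute with components such as Country and Zip Code]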
Multivalued Attributes
A multivalued attribute is a kind of attribute that has multiple values and is represented by a
double oval.
[Figure: multivalued attribute example; labels in the figure: Order, Quantity, Payment, C_ID]
Derived Attributes
Derived attributes are attributes whose values can be computed from other attributes. A derived attribute
is represented by a dashed ellipse.
[Figure: Employee entity with the attributes Emp_No and Experience]
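The attribute kinds described above can be sketched in code. This is a minimal illustration, not part of the lesson material: the `joining_date` and `phone_numbers` fields, and the idea that Experience is derived from a joining date, are assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Address:
    # Composite attribute: one logical attribute made of several parts.
    country: str
    zip_code: str


@dataclass
class Employee:
    emp_no: int           # key attribute: identifies one employee
    address: Address      # composite attribute
    joining_date: date    # stored attribute (hypothetical)
    # Multivalued attribute (hypothetical): zero or more values per employee.
    phone_numbers: List[str] = field(default_factory=list)

    def experience(self, today: date) -> int:
        """Derived attribute: whole years of experience, computed from joining_date."""
        years = today.year - self.joining_date.year
        # Subtract one year if this year's anniversary has not been reached yet.
        if (today.month, today.day) < (self.joining_date.month, self.joining_date.day):
            years -= 1
        return years


emp = Employee(
    emp_no=1,
    address=Address(country="India", zip_code="560001"),
    joining_date=date(2018, 6, 15),
    phone_numbers=["555-0100", "555-0101"],
)
print(emp.experience(date(2024, 6, 14)))  # 5
print(emp.experience(date(2024, 6, 15)))  # 6
```

The derived value is never stored, so it cannot drift out of step with the attribute it is computed from.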
Relationship
Relationship: A relationship is a term used to describe the connection between two or more entities.
The relationship is represented as a diamond or rhombus.
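[Figure: relationship set linking employees E1, E2, E3 to departments D1, D2, D3 through the
relationship instances W1, W2, W3]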
• The above relationship set depicts that E1 works in D2, E2 works in D3, and E3 works in D1.
• W1, W2, and W3 represent the relationship between employees and departments, indicating
that each employee is associated with a department.
Relationship Degree
Relationship Degree: The degree of a relationship is determined by the number of entities that
participate in it.
The three most common degrees of relationships in ER models are unary, binary, and ternary.
Binary Relationship
The most common type of relationship is a binary relationship, which involves two entities.
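Unary Relationship
When both participants in a relationship are the same entity, the relationship is said to be unary.
[Figure: unary relationship in which Subject is a prerequisite of Subject]
Ternary Relationship
A ternary relationship involves three entities.
[Figure: ternary relationship example that includes an Organization entity]
One-to-One (1:1)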
A one-to-one relationship exists when a single instance of one entity is connected with a single instance
of another entity.
[Figure: Employee (1) has (1) Emp_ID]
One-to-Many (1:M)
When a single instance of one entity is connected to several instances of another entity, this is referred
to as a one-to-many relationship.
[Figure: Customer (1) places (M) Orders]
Many-to-One (M:1)
A many-to-one relationship exists when several instances of one entity are connected to a single
instance of another entity.
[Figure: Employees (M) works in (1) Department]
Many-to-Many (M:M)
A many-to-many relationship occurs when more than one instance of an entity is connected to several
instances of another entity.
[Figure: Employees (M) assigned (M) Project]
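The four relationship types can be told apart by counting how many partners each instance has. The sketch below uses a hypothetical helper, `classify`, that labels a list of (left, right) pairs. It only describes the sample data it is given; in a real design the cardinality comes from the business rules, not from whatever rows happen to exist.

```python
from collections import defaultdict


def classify(pairs):
    """Return '1:1', '1:M', 'M:1', or 'M:M' for a list of (left, right) pairs."""
    rights_of = defaultdict(set)  # left instance  -> right instances it is linked to
    lefts_of = defaultdict(set)   # right instance -> left instances it is linked to
    for left, right in pairs:
        rights_of[left].add(right)
        lefts_of[right].add(left)

    # "Many" on the left side: some right instance is linked to several lefts.
    left_many = any(len(lefts) > 1 for lefts in lefts_of.values())
    # "Many" on the right side: some left instance is linked to several rights.
    right_many = any(len(rights) > 1 for rights in rights_of.values())

    return ("M" if left_many else "1") + ":" + ("M" if right_many else "1")


# Sample instances (invented) for the examples in this section.
print(classify([("E1", "ID1"), ("E2", "ID2")]))              # 1:1  Employee has Emp_ID
print(classify([("C1", "O1"), ("C1", "O2"), ("C2", "O3")]))  # 1:M  Customer places Orders
print(classify([("E1", "D1"), ("E2", "D1")]))                # M:1  Employees works in Department
print(classify([("E1", "P1"), ("E1", "P2"), ("E2", "P1")]))  # M:M  Employees assigned Project
```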
Assisted Practice: ER Diagram
Duration: 20 mins
Problem statement:
E-commerce is one of the emerging fields and is widely accepted across the globe. Design an ER model
for a start-up named "Sell in the Sale" based on the below business rules so that it is easy for the
management to understand the design of their company's database. A salesperson can manage many
customers. A customer, however, is managed by only one salesperson. One customer can place
multiple orders, but each order will always belong to a single customer. An order lists many products, and
a single product can be present in multiple orders. An order will have the columns orderID, orderDate,
noOfProducts, and productName. Make sure to follow proper notations to represent the entities,
attributes, and relationships with their types (1:1, 1:M, M:1, and M:M).
Objective:
Identify the entities, attributes, and relationships in order to solve this problem
Steps to be performed:
Step 01: Identify the entities: Salesperson, Customer, Product, and Order
Step 02: Identify the attributes of each entity. For example, Order has the attributes orderID,
orderDate, noOfProducts, and productName.
Step 03: Identify the relationships
Relationships are used to document the interaction between the entities. A relationship is represented
as a diamond.
A salesperson can manage many customers. A customer is managed by only one salesperson.
Hence, the relationship here is Manages and one-to-many (1:M).
Multiple orders can be placed by a customer, but an order will always belong to a single customer.
Hence the relationship here is Places and one-to-many (1:M).
An order lists many products, and a single product can be present in multiple orders.
Hence the relationship here is Lists and many-to-many (M:M).
[Figure: relationships between the entities: Salesperson (1) Manages (M) Customer; Customer (1) Places
(M) Order; Order (M) Lists (M) Product]
Step 04: Using the information gathered in steps 1, 2, and 3, create an ER diagram
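[Figure: complete ER diagram with the entities Salesperson, Customer, Order, and Product, the
relationships Manages (1:M), Places (1:M), and Lists (M:M), and the Order attributes orderID, orderDate,
noOfProducts, and productName]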
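One possible way to turn this ER model into tables is sketched below with Python's built-in sqlite3 module. It is an illustration under stated assumptions, not the solution the exercise prescribes: the ID columns other than orderID are invented, productName is kept on a hypothetical product table, and noOfProducts is computed by counting rows instead of being stored on the order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE salesperson (salespersonID INTEGER PRIMARY KEY);
CREATE TABLE customer (
    customerID    INTEGER PRIMARY KEY,
    -- Manages (1:M): each customer points at exactly one salesperson
    salespersonID INTEGER NOT NULL REFERENCES salesperson(salespersonID)
);
CREATE TABLE orders (
    orderID    INTEGER PRIMARY KEY,
    orderDate  TEXT NOT NULL,
    -- Places (1:M): each order points at exactly one customer
    customerID INTEGER NOT NULL REFERENCES customer(customerID)
);
CREATE TABLE product (
    productID   INTEGER PRIMARY KEY,
    productName TEXT NOT NULL
);
-- Lists (M:M): a junction table with one row per (order, product) pair
CREATE TABLE order_product (
    orderID   INTEGER NOT NULL REFERENCES orders(orderID),
    productID INTEGER NOT NULL REFERENCES product(productID),
    PRIMARY KEY (orderID, productID)
);
""")

# Invented sample rows.
conn.execute("INSERT INTO salesperson VALUES (1)")
conn.executemany("INSERT INTO customer VALUES (?, ?)", [(10, 1), (11, 1)])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(100, "2024-01-05", 10), (101, "2024-01-06", 10)])
conn.executemany("INSERT INTO product VALUES (?, ?)", [(1000, "Lamp"), (1001, "Desk")])
conn.executemany("INSERT INTO order_product VALUES (?, ?)",
                 [(100, 1000), (100, 1001), (101, 1000)])

# noOfProducts is derived by counting the junction rows of each order.
rows = conn.execute("""
    SELECT o.orderID, COUNT(*) AS noOfProducts
    FROM orders o JOIN order_product op ON op.orderID = o.orderID
    GROUP BY o.orderID ORDER BY o.orderID
""").fetchall()
print(rows)  # [(100, 2), (101, 1)]
```

The two 1:M relationships become foreign keys on the "many" side, while the M:M relationship needs its own table.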
Mapping Cardinalities
Cardinalities
Each developer in an organization can work on an indefinite number of projects as long as his or her
weekly hours do not exceed 40.
[Figure: Developer develops Project, with the cardinality limits (0,N) on the Developer side and (0,1)
on the Project side]
Developers may develop 0 projects if they are involved in nontechnical projects. Therefore, the
cardinality limits for the developer are (0,N).
According to the organization's regulations, each project is developed by a single developer, but it is
possible to have projects that have not yet been assigned to a developer.
Therefore, the cardinality limits for the project are (0,1).
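The (min,max) limits can be read as a rule about how many times each instance may take part in the relationship. The sketch below checks that rule on invented sample data; `within_limits` is a hypothetical helper, and `hi=None` is used here to stand for N (no upper bound).

```python
from collections import Counter


def within_limits(participants, links, lo, hi=None):
    """Check that every participant appears in `links` between lo and hi times.

    hi=None stands for N, meaning there is no upper bound.
    """
    counts = Counter(links)
    for participant in participants:
        n = counts[participant]
        if n < lo or (hi is not None and n > hi):
            return False
    return True


# Hypothetical instances of the "develops" relationship: (developer, project).
develops = [("Dev1", "P1"), ("Dev1", "P2"), ("Dev2", "P3")]
developers = ["Dev1", "Dev2", "Dev3"]  # Dev3 develops no project
projects = ["P1", "P2", "P3", "P4"]    # P4 is not assigned yet

# Developer side, limits (0,N): each developer takes part in 0 or more links.
print(within_limits(developers, [d for d, _ in develops], 0))            # True
# Project side, limits (0,1): each project takes part in at most one link.
print(within_limits(projects, [p for _, p in develops], 0, 1))           # True
# Giving P1 a second developer breaks the (0,1) limit.
print(within_limits(projects, [p for _, p in develops] + ["P1"], 0, 1))  # False
```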
Cardinality is represented by the styling of a line and its endpoint, according to the chosen
notation style.
Database Normalization
• Normalization helps divide large database tables into smaller ones and
establishes relationships between them.
Types of Anomalies
Data redundancy occurs when two or more rows or columns have the same or repeated
value, causing storage to be used inefficiently.
The records of the two employees, John and Ritchie, are repeated in the above table, which
results in data redundancy.
Insert Anomaly
Insert anomaly occurs when some attributes or data items are to be inserted into the
database without the existence of other attributes.
If we want to enter a new EmpID into the employee table, we must wait until the employee
joins the organization. This is called an insertion anomaly.
Update Anomaly
Update anomaly occurs when duplicate data is updated only in one location and not in
other instances. As a result, the data becomes inconsistent.
In the above table, there is an employee named John. If we change the department in the employee
database, we must also change it in the department database; otherwise, the data will be inconsistent.
Delete Anomaly
Delete anomaly occurs when certain entries are lost or deleted from a database table as a
result of the deletion of other records.
If we delete Bolt from the above table, we also remove his address and other data
from the table. In other words, removing one piece of information can unintentionally
remove other information that is stored only in the same row.
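The update and delete anomalies can be reproduced on a small unnormalized table. The rows below are invented for illustration (the employee names come from the lesson; the `Dept` and `DeptPhone` columns and their values are assumptions), and `dept_phones` is a hypothetical helper.

```python
# One wide table that stores employee facts and department facts together.
employees = [
    {"EmpID": 1, "Name": "John",    "Dept": "Sales",   "DeptPhone": "100"},
    {"EmpID": 2, "Name": "Ritchie", "Dept": "Sales",   "DeptPhone": "100"},
    {"EmpID": 3, "Name": "Bolt",    "Dept": "Finance", "DeptPhone": "300"},
]


def dept_phones(rows):
    """Collect every phone number recorded for each department."""
    phones = {}
    for row in rows:
        phones.setdefault(row["Dept"], set()).add(row["DeptPhone"])
    return phones


# Update anomaly: the Sales phone number is changed in one row only,
# so the table now holds two different numbers for the same department.
employees[0]["DeptPhone"] = "111"
print(sorted(dept_phones(employees)["Sales"]))  # ['100', '111']

# Delete anomaly: removing Bolt also removes the only record of Finance.
remaining = [row for row in employees if row["Name"] != "Bolt"]
print("Finance" in dept_phones(remaining))  # False
```

Keeping department facts in their own table, with one row per department, removes both problems.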
Types of Normalization
First Normal Form (1NF)
• The 1NF states that all the attributes in a relation must have atomic domains.
• The 1NF specifies that a table attribute cannot have multiple values.
• In 1NF, multivalued attributes, composite attributes, and their combinations are not allowed.
The table on the left lists employees, some of whom belong to more than one department (for example, John).
1NF is achieved by splitting such a record into separate rows, one for each department.
[Tables: left, table without first normal form; right, table with first normal form]
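The row-splitting step can be sketched as follows. The table contents and the comma-separated storage of departments are assumptions made for the example, and `to_1nf` is a hypothetical helper.

```python
def to_1nf(rows, column, sep=","):
    """Return new rows in which `column` holds exactly one value per row."""
    result = []
    for row in rows:
        for value in row[column].split(sep):
            new_row = dict(row)  # copy the other columns unchanged
            new_row[column] = value.strip()
            result.append(new_row)
    return result


# Invented data: John's two departments are stored in a single cell.
table = [
    {"EmpID": 1, "Name": "John", "Department": "Sales, Marketing"},
    {"EmpID": 2, "Name": "Ritchie", "Department": "Finance"},
]

for row in to_1nf(table, "Department"):
    print(row["EmpID"], row["Name"], row["Department"])
# 1 John Sales
# 1 John Marketing
# 2 Ritchie Finance
```

After the split, EmpID alone no longer identifies a row; the pair (EmpID, Department) does.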
Second Normal Form (2NF)
• The entity should already be in 1NF.
• All non-key attributes should depend on the entity's entire unique identifier, not on just a part of it.
• To remove the dependency, we can divide the table, remove the attribute that
causes the dependency, and add it to the other table where it fits.
The first table is a course table with the details of course name, teacher age, and teacher ID.
[Tables: course table without the second normal form; teacher detail table with the second normal form;
course table with the second normal form]
In the given table, brand is dependent on product ID, which is a proper subset of the candidate key.
This partial dependency violates the rules of 2NF.
To convert the given table into 2NF, let’s decompose it into two tables:
Third Normal Form (3NF)
• No non-key column should depend on anything other than the table's key.
• 3NF is used to reduce redundancy in the data and to ensure data consistency.
The given tables have employee details along with their ratings:
To determine the hike percentage, HR needs to create a new column named hike in the ratings table.
[Table: ratings table]
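The idea behind this split can be sketched as follows: the hike depends on the rating, not directly on the employee, so it is stored once per rating and looked up through it. All column names except hike, and all values, are assumptions made for the example.

```python
# Employee table: each employee has a rating (invented rows).
employees = [
    {"E_ID": 1, "Name": "John", "Rating": "A"},
    {"E_ID": 2, "Name": "Ritchie", "Rating": "B"},
    {"E_ID": 3, "Name": "Bolt", "Rating": "A"},
]

# Ratings table: Rating -> hike (percent), stored once per rating.
ratings = {"A": 10, "B": 5}


def hike_for(e_id):
    """Look up an employee's hike through the rating instead of a copied column."""
    employee = next(e for e in employees if e["E_ID"] == e_id)
    return ratings[employee["Rating"]]


print(hike_for(1), hike_for(2), hike_for(3))  # 10 5 10

# A change to the hike for rating A is made in one place
# and is seen by every employee with that rating.
ratings["A"] = 12
print(hike_for(1), hike_for(3))  # 12 12
```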
Boyce and Codd Normal Form (BCNF)
• All the tables in the database should have just one primary key.
• For every functional dependency A->B, A should be a superkey of the table.
In this example you have an employee table with the following details:
The table is not in BCNF because neither E_ID nor E_Dept alone is a key.
To make the table comply with BCNF, divide it into three different tables:
This achieves BCNF because the left-hand side of every functional dependency is now a key of its table.
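The BCNF rule can be checked mechanically with an attribute closure: a set of columns is a superkey if its closure contains every column of the table. The sketch below is a simplified check under stated assumptions. The columns E_Nationality and Dept_Type and both functional dependencies are hypothetical, and dependencies that mention columns outside a table are skipped rather than projected onto it.

```python
def closure(attrs, fds):
    """Return every attribute determined by `attrs` under the dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Return the non-trivial dependencies whose left side is not a superkey."""
    relation = set(relation)
    violations = []
    for lhs, rhs in fds:
        if not (set(lhs) | set(rhs)) <= relation:
            continue  # dependency mentions columns outside this table (simplification)
        if set(rhs) <= set(lhs):
            continue  # trivial dependency
        if not closure(lhs, fds) >= relation:
            violations.append((lhs, rhs))
    return violations


# Hypothetical employee table and dependencies.
columns = {"E_ID", "E_Nationality", "E_Dept", "Dept_Type"}
fds = [
    (("E_ID",), ("E_Nationality",)),
    (("E_Dept",), ("Dept_Type",)),
]

print(len(bcnf_violations(columns, fds)))  # 2

# After dividing the table into three, no table has a violating dependency.
for table in ({"E_ID", "E_Nationality"}, {"E_Dept", "Dept_Type"}, {"E_ID", "E_Dept"}):
    print(bcnf_violations(table, fds))  # [] for each table
```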
Assisted Practice: Normal Forms
Duration: 20 mins
Problem statement: You have just joined an organization as a database administrator. You have been
trained on database basics, which includes normalization, and to test your knowledge, you have been
asked to solve two questions: A and B.
Question A:
Data - This data qualifies as 1NF. Is this statement true or false? If false, transform it to 1NF.
4 Metallica Metal
Steps to be performed:
The answer to question A is false.
Note:
The rule of 1NF states that all attributes in a relation must have atomic domains. A table attribute
cannot have multiple values. Multivalued attributes, composite attributes, and their combination are
not allowed.
The table structure violates the rule of atomicity, i.e., a single cell holds multiple values in the
SongCategory column for SongID=3.
Output:
3 Eminem Rap
4 Metallica Metal
Question B:
Data - This data qualifies to be in 2NF. Is this statement true or false? If false, transform it to 2NF.
Assume (SongID, ArtistName) is the primary key for the table.
3 MNO Eminem
4 XYZ Metallica
Steps to be performed:
The answer to question B is false.
Note:
The 2NF rule states that the entity should already be in 1NF. All attributes inside it should be based
entirely on the entity's unique identifier. To remove the dependency, divide the table, remove the
attribute that causes the dependency, and add it to the artist table, where it fits.
The table structure violates 2NF because it contains a partial dependency. The SongName column is
determined by SongID alone, which is only part of the key.
Partial dependency is when a nonprime attribute (SongName) is functionally dependent on only a part
(SongID) of a key (SongID, ArtistName).
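One way to remove the partial dependency is to split the table by projection. The sketch below is an illustration, not the official output of the exercise: the third row is invented to show a song with two artists, and `project` is a hypothetical helper.

```python
rows = [
    {"SongID": 3, "SongName": "MNO", "ArtistName": "Eminem"},
    {"SongID": 4, "SongName": "XYZ", "ArtistName": "Metallica"},
    {"SongID": 4, "SongName": "XYZ", "ArtistName": "Guest Artist"},  # invented row
]


def project(rows, columns):
    """Project rows onto `columns`, dropping duplicate rows but keeping order."""
    seen, result = set(), []
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result


# SongName depends on SongID only, so it moves to a table keyed by SongID.
songs = project(rows, ["SongID", "SongName"])
# The remaining table holds only the key columns (SongID, ArtistName).
song_artists = project(rows, ["SongID", "ArtistName"])

print(len(songs), len(song_artists))  # 2 3
```

In the split tables the name XYZ is stored once, no matter how many artists are linked to song 4.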
Output:
Knowledge Check 1
A ___________ is one that is reliant on another entity.
A. Weak entity
B. Strong entity
C. Attribute
D. Relationship
Knowledge Check 2
The properties of entities are known as _________.
A. Strong entity
B. Attributes
C. Weak entity
D. Relationship
Knowledge Check 3
A relationship set is a group of similar types of ___________.
A. Entity
B. Attributes
C. Department
D. Relationships
Lesson-End Project: Cricket Team Model
Problem statement:
You are working as a junior DBA for a gaming company that is looking for a
DB design to model cricket teams, the games they play, and the players in
each team.
Objective:
Design a detailed ER diagram to capture the scenarios for the player, team,
match, and umpire.
Tasks to be performed:
1. Design an ER diagram that depicts teams, where each team’s ID (unique
identifier), name, main stadium, and city are listed
2. Design an ER diagram that depicts players (each player can only play for
one team), where each has an ID (unique identifier), name, date of birth, shirt
number, and year of start
3. Design an ER diagram that depicts matches, where each will have a date,
final result, and city (where the match was played)
4. Design an ER diagram that depicts umpires (each match has one umpire),
where each has an ID (unique identifier), name, date of birth, and years of
experience
Note: You can state any assumptions that affect your design
Key Takeaways