DBMS - Keys & Normalization
What are keys?
• Keys allow you to identify the rows of a table and to establish relationships between tables.
Keys
• Super Key
• Candidate Key
• Primary Key
• Alternate Key
• Composite Key
• Foreign Key
Super Key
• A super key is a set of one or more attributes that, taken collectively, uniquely identify a tuple in the relation.
• A super key may have additional attributes that are not needed for unique identification.
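The definition above can be sketched in a few lines of Python: a set of attributes behaves as a super key for a given set of rows if no two rows agree on all of those attributes. The function name `is_super_key` and the sample rows (including the `email` column) are assumptions made for illustration. A real key is declared on the schema, so sample data can only disprove a key, never prove one.

```python
def is_super_key(rows, attrs):
    """Return True if no two rows share the same values for attrs."""
    seen = set()
    for row in rows:
        projection = tuple(row[a] for a in attrs)
        if projection in seen:
            return False  # two rows agree on attrs, so attrs is not unique
        seen.add(projection)
    return True

# Hypothetical employee rows, used only for illustration.
employees = [
    {"emp_no": 101, "email": "a@example.com", "job": "Programmer"},
    {"emp_no": 102, "email": "b@example.com", "job": "Programmer"},
    {"emp_no": 103, "email": "c@example.com", "job": "Analyst"},
]

print(is_super_key(employees, ["emp_no"]))         # unique on its own
print(is_super_key(employees, ["emp_no", "job"]))  # still unique; "job" is redundant
print(is_super_key(employees, ["job"]))            # two rows share "Programmer"
```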
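Candidate Key
• A candidate key is a minimal super key: removing any attribute from it would lose the ability to identify rows uniquely.
• Several distinct sets of attributes could each serve as a candidate key.
• Every table must have at least one candidate key, and a table can have multiple candidate keys.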
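As a rough illustration of "minimal", the sketch below enumerates attribute combinations from smallest to largest and keeps those that are unique and do not contain a smaller key already found. The helper names and sample rows are assumptions for illustration, and the brute-force search is only practical for a handful of attributes.

```python
from itertools import combinations

def is_super_key(rows, attrs):
    """True if the projection of rows onto attrs has no duplicates."""
    projections = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projections)) == len(projections)

def candidate_keys(rows, attributes):
    """Return the minimal attribute sets that are unique in rows."""
    found = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            # A superset of a key already found is a super key, but not minimal.
            if any(set(key) <= set(combo) for key in found):
                continue
            if is_super_key(rows, combo):
                found.append(combo)
    return found

# Hypothetical employee rows, used only for illustration.
employees = [
    {"emp_no": 101, "email": "a@example.com", "job": "Programmer"},
    {"emp_no": 102, "email": "b@example.com", "job": "Programmer"},
    {"emp_no": 103, "email": "c@example.com", "job": "Analyst"},
]

keys = candidate_keys(employees, ["emp_no", "email", "job"])
print(keys)
```

In this sample both `emp_no` and `email` qualify, which matches the point that a table can have more than one candidate key.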
Primary Key
• A primary key is a column or group of columns that uniquely identifies every row in a table.
• Primary key values cannot be duplicated: the same value can't appear more than once in the table.
Alternate Key
• An alternate key is a candidate key that was not chosen as the primary key. Like the primary key, it uniquely identifies every row in the table.
• A table can have multiple choices for a primary key, but only one can be set as the primary key.
• All candidate keys that are not the primary key are called alternate keys.
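One common way to express this in SQL is to declare the chosen candidate key as `PRIMARY KEY` and the remaining ones as `UNIQUE`. The sketch below uses Python's built-in `sqlite3` module. The table layout and the `email` column are assumptions for illustration and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        emp_no   INTEGER PRIMARY KEY,   -- candidate key chosen as the primary key
        email    TEXT NOT NULL UNIQUE,  -- alternate key: unique, but not primary
        emp_name TEXT NOT NULL
    )
    """
)
conn.execute("INSERT INTO employee VALUES (101, 'shubha@example.com', 'Shubha Sinha')")

def try_insert(values):
    """Return True if the row was inserted, False if a key constraint rejected it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", values)
        return True
    except sqlite3.IntegrityError:
        return False

dup_primary = try_insert((101, "other@example.com", "Someone Else"))     # repeats emp_no
dup_alternate = try_insert((102, "shubha@example.com", "Someone Else"))  # repeats email
fresh = try_insert((103, "amit@example.com", "Amit Verma"))              # both keys are new
print(dup_primary, dup_alternate, fresh)
```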
Composite Key
• A composite key is a combination of two or more columns that together uniquely identify rows in a table.
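A small sketch of a composite key, again with `sqlite3`: neither column is unique by itself, but the pair is. The table and column names are assumptions, and the values are borrowed from the project example used in the normalization part of these slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE assignment (
        project_no   INTEGER NOT NULL,
        emp_no       INTEGER NOT NULL,
        hours_billed REAL    NOT NULL,
        PRIMARY KEY (project_no, emp_no)  -- composite key: the pair must be unique
    )
    """
)
conn.execute("INSERT INTO assignment VALUES (15, 103, 23.8)")
conn.execute("INSERT INTO assignment VALUES (15, 101, 19.4)")  # same project, other employee
conn.execute("INSERT INTO assignment VALUES (25, 101, 56.3)")  # same employee, other project

try:
    conn.execute("INSERT INTO assignment VALUES (15, 103, 5.0)")  # same pair again
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM assignment").fetchone()[0]
print(duplicate_rejected, row_count)
```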
Foreign Key
• The purpose of Foreign keys is to maintain data integrity and allow navigation between two different
instances of an entity.
• It acts as a cross-reference between two tables as it references the primary key of another table.
[Figure: a foreign key in one table referencing the primary key of another, illustrated with the applicant and employee tables]
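The cross-reference can be sketched with `sqlite3` as follows. The `department` and `employee` tables here are hypothetical and are not the tables in the figure. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`. Other database systems behave differently in that respect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute(
    "CREATE TABLE department (dept_no INTEGER PRIMARY KEY, dept_name TEXT NOT NULL)"
)
conn.execute(
    """
    CREATE TABLE employee (
        emp_no   INTEGER PRIMARY KEY,
        emp_name TEXT NOT NULL,
        dept_no  INTEGER REFERENCES department (dept_no)  -- foreign key, nullable
    )
    """
)
conn.execute("INSERT INTO department VALUES (1, 'Engineering')")
conn.execute("INSERT INTO employee VALUES (103, 'Amit Verma', 1)")      # parent row exists
conn.execute("INSERT INTO employee VALUES (107, 'Maria Jones', NULL)")  # no department yet

try:
    conn.execute("INSERT INTO employee VALUES (108, 'Ralph Washington', 99)")  # no dept 99
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Navigate from employee to department through the foreign key.
joined = conn.execute(
    "SELECT e.emp_name, d.dept_name FROM employee e "
    "JOIN department d ON d.dept_no = e.dept_no"
).fetchall()
print(orphan_rejected, joined)
```

The row with a NULL `dept_no` is accepted, while the row pointing at a missing department is rejected.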
Quick Review
Keys
• A DBMS key is an attribute or set of attributes that helps you identify a row (tuple) in a relation (table).
• A super key is a set of one or more attributes that uniquely identifies rows in a table.
• A super key with no redundant attributes, that is, a minimal super key, is called a candidate key. A table can have multiple candidate keys.
• All candidate keys that are not chosen as the primary key are called alternate keys.
• A key that needs multiple attributes to uniquely identify rows in a table is called a composite key.
• A primary key never accepts null values, while a foreign key may accept multiple null values.
• Keys allow you to identify rows and to establish relationships between tables.
Normalization
Database Tables and Normalization
• Ideally, the database design process explored in Entity Relationship (ER) modeling yields good table structures.
• Yet, it is possible to create poor table structures even in a good database design.
• How do you recognize a poor table structure, and how do you correct it?
• Normalization is a process for evaluating and correcting table structures to minimize data redundancies, thereby reducing the likelihood of data anomalies.
• Consider the simplified database activities of a construction company that manages several building
projects.
• Each project has its own project number, name, assigned employees, and so on.
• Each employee has an employee number, name, and job classification, such as engineer or computer
technician.
• The company charges its clients by billing the hours spent on each contract.
• For example, one hour of computer technician time is billed at a different rate than one hour of
engineer time.
Employee Project Details: each project includes only a single occurrence of any one employee.
Sample Project Layout
Project No  Project Name  Emp No  Emp Name          Job class             Charge per hr  Hours Billed  Total Charge
15          Evergreen     103     Amit Verma        Elect. Engineer       84.5           23.8          2011.1
                          101     Shubha Sinha      Database Designer     105            19.4          2037
                          105     Rupa Mahajan      Programmer            50             12.6          630
                          102     David             Database Designer     105            35.7          3748.5
                          106     Arav Patil        System Analyst        100            23.8          2380
18          AmberWave     114     Shaila Phatak     Application Designer  48.1           24.6          1183.26
                          118     Ameya Chavan      General Support       18.36          45.3          831.708
                          104     Reshma Singh      System Analyst        100            32.4          3240
                          112     Amrit Shet        DSS Analyst           45.95          44            2021.8
22          Rolling Tide  105     Rupa Mahajan      Programmer            50             47.5          2375
                          104     Reshma Singh      System Analyst        100            238.2         23820
                          113     Anna John         Application Designer  48.1           85.4          4107.74
                          111     Delbert           Clerical Support      26.87          34.3          921.641
                          106     Arav Patil        System Analyst        100            94.6          9460
25          Star Flight   107     Maria Jones       Programmer            50             24.6          1230
                          115     Travis Bawangi    System Analyst        100            45.8          4580
                          101     Shubha Sinha      Database Designer     105            56.3          5911.5
                          114     Shaila Phatak     Application Designer  48.1           33            1587.3
                          108     Ralph Washington  General Support       18.36          23.6          433.296
                          118     Ameya Chavan      General Support       18.36          30.5          559.98
                          112     Amrit Shet        DSS Analyst           45.95          41.2          1893.14
Impact of Data Redundancies
Employee Project Details - Sample Project Layout
• Update anomalies
• Modifying the JOB_CLASS for employee number 105 requires many potential alterations, one for each row in which EMP_NUM = 105.
• Insertion anomalies
• Just to complete a row definition, a new employee must be assigned to a project.
• If the employee is not yet assigned, a phantom project must be created to complete the employee data entry.
• Deletion anomalies
• Suppose that only one employee is associated with a given project.
• If that employee leaves the company and the employee data is deleted, the project information will also be deleted.
• To prevent the loss of the project information, a fictitious employee must be created.
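The update and deletion anomalies can be reproduced on a reduced, three-row subset of the flat layout. In this subset employee 107 happens to be the only person on project 25, which is an assumption made to keep the example small. The new job class value is also hypothetical.

```python
# Three rows of the flat layout: employee 105 appears once per project.
flat = [
    {"proj_no": 15, "emp_no": 105, "emp_name": "Rupa Mahajan", "job_class": "Programmer"},
    {"proj_no": 22, "emp_no": 105, "emp_name": "Rupa Mahajan", "job_class": "Programmer"},
    {"proj_no": 25, "emp_no": 107, "emp_name": "Maria Jones", "job_class": "Programmer"},
]

def job_classes(rows, emp_no):
    """Collect every job class recorded for one employee."""
    return {r["job_class"] for r in rows if r["emp_no"] == emp_no}

# Update anomaly: an update that reaches only the first copy of employee 105.
for row in flat:
    if row["emp_no"] == 105:
        row["job_class"] = "System Analyst"
        break  # the second copy is missed
inconsistent = job_classes(flat, 105)

# Deletion anomaly: removing employee 107 also removes all trace of project 25.
remaining = [r for r in flat if r["emp_no"] != 107]
projects_left = {r["proj_no"] for r in remaining}

print(inconsistent, projects_left)
```

After the partial update the same employee has two job classes on record, and after the delete no row mentions project 25.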
The Normalization Process
• The objective of normalization is to ensure that each table conforms to the concept of well-formed
relations—in other words, tables that have the following characteristics:
• Each table represents a single subject.
• Each table is void of insertion, update, or deletion anomalies, which ensures the integrity and consistency of
the data.
Normal Forms
Functional Dependence
• The attribute B is functionally dependent on the attribute A if each value of A determines one and only one value of B.
• For example, PROJ_NUM → PROJ_NAME (read as PROJ_NUM functionally determines PROJ_NAME). In this case, the attribute PROJ_NUM is known as the determinant attribute, and the attribute PROJ_NAME is known as the dependent attribute.
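A minimal sketch of this test on sample data: a dependency holds in a set of rows if each determinant value maps to exactly one dependent value. The function name `fd_holds` is an assumption, and the rows are a small subset of the project data. As with keys, sample rows can only show that a dependency fails, not that it holds in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if each combination of lhs values maps to one rhs combination."""
    mapping = {}
    for row in rows:
        determinant = tuple(row[a] for a in lhs)
        dependent = tuple(row[a] for a in rhs)
        # setdefault stores the first dependent value seen for this determinant.
        if mapping.setdefault(determinant, dependent) != dependent:
            return False
    return True

rows = [
    {"PROJ_NUM": 15, "PROJ_NAME": "Evergreen", "EMP_NUM": 103},
    {"PROJ_NUM": 15, "PROJ_NAME": "Evergreen", "EMP_NUM": 101},
    {"PROJ_NUM": 18, "PROJ_NAME": "AmberWave", "EMP_NUM": 114},
    {"PROJ_NUM": 25, "PROJ_NAME": "Star Flight", "EMP_NUM": 101},
]

print(fd_holds(rows, ["PROJ_NUM"], ["PROJ_NAME"]))  # each project number has one name
print(fd_holds(rows, ["PROJ_NUM"], ["EMP_NUM"]))    # project 15 has two employees
```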
Fully functional dependence (composite key)
• If attribute B is functionally dependent on a composite key A but not on any subset of that composite
key, the attribute B is fully functionally dependent on A.
Functional dependencies
• A partial dependency exists when there is a functional dependence in which the determinant is only part of the primary key (remember the assumption that there is only one candidate key).
• For example, if (A, B) is the primary key and B → C, then the functional dependence B → C is a partial dependency because only part of the primary key (B) is needed to determine the value of C.
• A transitive dependency exists when there are functional dependencies such that X → Y, Y → Z, and X is the primary key.
• In that case, the dependency X → Z is a transitive dependency because X determines the value of Z via Y.
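Under the same single-candidate-key assumption, a dependency can be labeled by comparing its determinant with the primary key. The sketch below is illustrative only: `classify` is a hypothetical helper, and the dependencies on D and E are invented to complete the (A, B), B → C example.

```python
def classify(determinant, primary_key):
    """Label a dependency by comparing its determinant with the primary key."""
    det, key = frozenset(determinant), frozenset(primary_key)
    if det == key:
        return "full"        # the whole key is needed
    if det < key:
        return "partial"     # a proper subset of the key is enough
    if det.isdisjoint(key):
        return "transitive"  # a nonkey attribute determines another attribute
    return "other"

PRIMARY_KEY = ("A", "B")
dependencies = [
    (("A", "B"), "D"),  # hypothetical: D needs the whole key
    (("B",), "C"),      # from the example: only part of the key is needed
    (("D",), "E"),      # hypothetical: (A, B) determines E via D
]

labels = {
    f"{','.join(lhs)} -> {rhs}": classify(lhs, PRIMARY_KEY)
    for lhs, rhs in dependencies
}
for dependency, label in labels.items():
    print(dependency, ":", label)
```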
Conversion To First Normal Form
• 1NF (Project No, Emp No, Project Name, Emp Name, Job class, Charge per hr, Hours Billed, Total Charge), with the composite primary key (Project No, Emp No)
• Partial Dependencies
• (Project No → Project Name)
• (Emp No → Emp Name, Job class, Charge per hr)
• Transitive Dependency
• (Job class → Charge per hr)
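One way to check that (Project No, Emp No) identifies a whole row is to compute its attribute closure under these dependencies. The sketch assumes two dependencies that the slide does not list: Hours Billed depends on the full key, and Total Charge is derived from Charge per hr and Hours Billed.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

ALL_ATTRIBUTES = {
    "Project No", "Emp No", "Project Name", "Emp Name",
    "Job class", "Charge per hr", "Hours Billed", "Total Charge",
}
FDS = [
    (("Project No",), ("Project Name",)),
    (("Emp No",), ("Emp Name", "Job class", "Charge per hr")),
    (("Job class",), ("Charge per hr",)),
    (("Project No", "Emp No"), ("Hours Billed",)),             # assumption: full key
    (("Charge per hr", "Hours Billed"), ("Total Charge",)),    # assumption: derived value
]

key_closure = closure(("Project No", "Emp No"), FDS)
emp_closure = closure(("Emp No",), FDS)
print(key_closure == ALL_ATTRIBUTES)
print(sorted(emp_closure))
```

The closure of Emp No alone stops short of the full attribute set, which is what makes its dependencies partial.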
Employee Project Details
A table in first normal form
Project No Project Name Emp No Emp Name Job class Charge per hr Hours Billed Total Charge
Conversion To Second Normal Form
• Conversion to 2NF is needed only when the 1NF table has a composite primary key.
• A table is in 2NF when it is in 1NF and includes no partial dependencies; that is, no attribute is dependent on only a portion of the primary key.
Employee Project Details
Tables in second normal form
Project No  Project Name
15          Evergreen
18          AmberWave
22          Rolling Tide
25          Star Flight

Emp No  Emp Name          Job class             Charge per hr
103     Amit Verma        Elect. Engineer       84.5
101     Shubha Sinha      Database Designer     105
105     Rupa Mahajan      Programmer            50
102     David             Database Designer     105
106     Arav Patil        System Analyst        100
114     Shaila Phatak     Application Designer  48.1
118     Ameya Chavan      General Support       18.36
104     Reshma Singh      System Analyst        100
112     Amrit Shet        DSS Analyst           45.95
113     Anna John         Application Designer  48.1
111     Delbert           Clerical Support      26.87
107     Maria Jones       Programmer            50
115     Travis Bawangi    System Analyst        100
108     Ralph Washington  General Support       18.36

Project No  Emp No  Hours Billed
15          103     23.8
15          101     19.4
15          105     12.6
15          102     35.7
15          106     23.8
18          114     24.6
18          118     45.3
18          104     32.4
18          112     44
22          105     47.5
22          104     238.2
22          113     85.4
22          111     34.3
22          106     94.6
25          107     24.6
25          115     45.8
25          101     56.3
25          114     33
25          108     23.6
25          118     30.5
25          112     41.2
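The split into three tables can be sketched as three projections of the flat rows, each keyed by its own determinant. The variable names are assumptions and only four of the sample rows are used. Rejoining the pieces on their keys gives back the original rows, so no information is lost by the decomposition.

```python
# (Project No, Project Name, Emp No, Emp Name, Job class, Charge per hr, Hours Billed)
flat = [
    (15, "Evergreen", 103, "Amit Verma", "Elect. Engineer", 84.5, 23.8),
    (15, "Evergreen", 101, "Shubha Sinha", "Database Designer", 105.0, 19.4),
    (25, "Star Flight", 101, "Shubha Sinha", "Database Designer", 105.0, 56.3),
    (25, "Star Flight", 107, "Maria Jones", "Programmer", 50.0, 24.6),
]

# One dictionary per 2NF table, keyed by that table's primary key.
projects = {r[0]: r[1] for r in flat}                 # Project No -> Project Name
employees = {r[2]: (r[3], r[4], r[5]) for r in flat}  # Emp No -> name, job class, rate
assignments = {(r[0], r[2]): r[6] for r in flat}      # (Project No, Emp No) -> hours

# Rejoin the three tables on their keys to rebuild the flat rows.
rebuilt = [
    (proj, projects[proj], emp, *employees[emp], hours)
    for (proj, emp), hours in assignments.items()
]

print(len(projects), len(employees), len(assignments))
print(sorted(rebuilt) == sorted(flat))
```

Shubha Sinha's name, job class, and rate are now stored once instead of once per project.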
Conversion To Third Normal Form
• A table is in 3NF when it is in 2NF and contains no transitive dependencies.

Project No  Project Name
15          Evergreen
18          AmberWave
22          Rolling Tide
25          Star Flight

Emp No  Emp Name          Job class
103     Amit Verma        Elect. Engineer
101     Shubha Sinha      Database Designer
105     Rupa Mahajan      Programmer
102     David             Database Designer
106     Arav Patil        System Analyst
114     Shaila Phatak     Application Designer
118     Ameya Chavan      General Support
104     Reshma Singh      System Analyst
112     Amrit Shet        DSS Analyst
113     Anna John         Application Designer
111     Delbert           Clerical Support
107     Maria Jones       Programmer
115     Travis Bawangi    System Analyst
108     Ralph Washington  General Support

Job class             Charge per hr
Elect. Engineer       84.5
Database Designer     105
Programmer            50
System Analyst        100
Application Designer  48.1
General Support       18.36
DSS Analyst           45.95
Clerical Support      26.87

Project No  Emp No  Hours Billed
15          103     23.8
15          101     19.4
15          105     12.6
15          102     35.7
15          106     23.8
18          114     24.6
18          118     45.3
18          104     32.4
18          112     44
22          105     47.5
22          104     238.2
22          113     85.4
22          111     34.3
22          106     94.6
25          107     24.6
25          115     45.8
25          101     56.3
25          114     33
25          108     23.6
25          118     30.5
25          112     41.2
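A short sketch of what the separate job table buys: the hourly rate lives in one row, so a rate change is a single update and every derived charge follows it. The dictionaries hold a subset of the tables above. The helper `total_charge` and the new rate of 55.0 are assumptions for illustration.

```python
# Subset of the 3NF tables, held as dictionaries keyed by each table's primary key.
jobs = {"Programmer": 50.0, "System Analyst": 100.0}  # Job class -> Charge per hr
employees = {
    105: ("Rupa Mahajan", "Programmer"),
    106: ("Arav Patil", "System Analyst"),
}  # Emp No -> (Emp Name, Job class)
assignments = {(15, 105): 12.6, (22, 105): 47.5, (22, 106): 94.6}  # key -> Hours Billed

def total_charge(project_no, emp_no):
    """Derive Total Charge by following Emp No -> Job class -> Charge per hr."""
    _, job_class = employees[emp_no]
    return round(jobs[job_class] * assignments[(project_no, emp_no)], 2)

before = total_charge(22, 105)
jobs["Programmer"] = 55.0  # hypothetical rate change, stored in exactly one place
after = [total_charge(15, 105), total_charge(22, 105)]
unaffected = total_charge(22, 106)
print(before, after, unaffected)
```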
Higher-Level Normal Forms
• A table is in Boyce-Codd normal form (BCNF) when every determinant in the table is a candidate key.
• Clearly, when a table contains only one candidate key, the 3NF and the BCNF are equivalent. In other words, BCNF can be violated only when the table contains more than one candidate key.
• A table is in fourth normal form (4NF) when it is in 3NF and has no multivalued dependencies.
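The BCNF condition can be approximated with an attribute-closure test: a determinant passes if it determines every attribute of the table. The sketch below applies it to the employee table before and after the job class split. The function names are assumptions, and the check tests the super key condition rather than minimality.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Return the dependencies whose determinant does not determine every attribute."""
    return [(lhs, rhs) for lhs, rhs in fds if closure(lhs, fds) != set(attributes)]

# Employee table before the job class split (2NF version).
employee_2nf = ("Emp No", "Emp Name", "Job class", "Charge per hr")
fds_2nf = [
    (("Emp No",), ("Emp Name", "Job class", "Charge per hr")),
    (("Job class",), ("Charge per hr",)),
]

# Employee table after the split (3NF version).
employee_3nf = ("Emp No", "Emp Name", "Job class")
fds_3nf = [(("Emp No",), ("Emp Name", "Job class"))]

violations_before = bcnf_violations(employee_2nf, fds_2nf)
violations_after = bcnf_violations(employee_3nf, fds_3nf)
print(violations_before)
print(violations_after)
```

Only Job class → Charge per hr is reported for the earlier table, and nothing is reported once the rate has moved to its own table.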