Physical Database Design Lecture Slides

Uploaded by

Muskan Zahra
Copyright
© © All Rights Reserved
Physical Database Design

Introduction

● Logical database design is the process of transforming the conceptual data
model into a logical data model.
● Conceptual data modeling is about understanding the organization—getting
the requirements right.
● Logical database design is about creating stable database structures—
correctly expressing the requirements in a technical language.
Relational Data Model

● The relational data model was first introduced in 1970 by E. F. Codd, then
of IBM (Codd, 1970).
● The relational data model represents data in the form of tables.
● The relational model is based on mathematical theory and therefore has a
solid theoretical foundation.
Relational Data Model

● The relational data model consists of the following three components:


1. Data structure: Data are organized in the form of tables, with rows and
columns.
2. Data manipulation: Powerful operations (typically implemented using
the SQL language) are used to manipulate data stored in the relations.
3. Data integrity: The model includes mechanisms to specify business
rules that maintain the integrity of data when they are manipulated.
Relation

● A relation is a named, two-dimensional table of data.
● Each relation (or table) consists of a set of named columns and an arbitrary
number of unnamed rows.
Relation

● We can express the structure of a relation by using a shorthand notation in
which the name of the relation is followed (in parentheses) by the names of
the attributes in that relation.
Relational Keys

● Primary key: An attribute or a combination of attributes that uniquely
identifies each row in a relation.
Relational Keys

● The attribute or a collection of attributes indicated as an entity’s identifier in
an E-R diagram may be the same attributes that comprise the primary key
for the relation representing that entity.
● There are exceptions: For example, associative entities do not have to have
an identifier, and the (partial) identifier of a weak entity forms only part of a
corresponding relation’s primary key.
Relational Keys

● Composite key: A primary key that consists of more than one attribute.
● For example, the primary key for a relation DEPENDENT would likely
consist of the combination EmpID and DependentName.
Relational Keys

● Foreign key: An attribute in a relation that serves as the primary key of
another relation in the same database.
● For example, consider the relations EMPLOYEE1 and DEPARTMENT:
Relational Keys

● The attribute DeptName is a foreign key in EMPLOYEE1. It allows a user
to associate any employee with the department to which he or she is
assigned.
● The foreign key can be represented using a dashed underline.
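As a rough sketch of how these keys might be declared in practice, the following Python snippet uses the standard-library sqlite3 module. Only EmpID and DeptName come from the slides; the other column names and all sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE DEPARTMENT (
    DeptName TEXT NOT NULL PRIMARY KEY,
    Location TEXT                        -- assumed column
);
CREATE TABLE EMPLOYEE1 (
    EmpID    INTEGER PRIMARY KEY,
    Name     TEXT,                       -- assumed column
    DeptName TEXT REFERENCES DEPARTMENT (DeptName)   -- foreign key
);
INSERT INTO DEPARTMENT VALUES ('Marketing', 'Building A');
INSERT INTO EMPLOYEE1 VALUES (100, 'Example Employee', 'Marketing');
""")

# The foreign key lets us associate an employee with a department.
row = conn.execute("""
    SELECT e.Name, d.Location
    FROM EMPLOYEE1 AS e
    JOIN DEPARTMENT AS d ON e.DeptName = d.DeptName
    WHERE e.EmpID = 100
""").fetchone()
print(row)
```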
Properties of Relations

● Relations have several properties that distinguish them from non-relational
tables.
1. Each relation (or table) in a database has a unique name.
2. An entry at the intersection of each row and column is atomic (or single
valued). There can be only one value associated with each attribute on a specific
row of a table; no multivalued attributes are allowed in a relation.
3. Each row is unique; no two rows in a relation can be identical.
Properties of Relations

4. Each attribute (or column) within a table has a unique name.
5. The sequence of columns (left to right) is insignificant. The order of the
columns in a relation can be changed without changing the meaning or use of
the relation.
6. The sequence of rows (top to bottom) is insignificant. As with columns, the
rows of a relation may be reordered or stored in any sequence.
Removing Multivalued Attributes from Tables

● The second property of relations states that no multivalued attributes are
allowed in a relation.
● Thus, a table that contains one or more multivalued attributes is not a
relation.
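One way to picture the removal of a multivalued attribute is to emit one row per value, so that every cell holds a single value. The sketch below does this in plain Python; the table contents and the helper name `flatten` are hypothetical.

```python
# Hypothetical non-relational table: the Courses cell holds several values.
employees = [
    {"EmpID": 100, "Name": "A. Example", "Courses": ["SPSS", "Surveys"]},
    {"EmpID": 140, "Name": "B. Example", "Courses": ["Tax Acc"]},
]


def flatten(rows, multivalued):
    """Return one row per value of the multivalued attribute."""
    flat = []
    for row in rows:
        for value in row[multivalued]:
            new_row = {k: v for k, v in row.items() if k != multivalued}
            new_row[multivalued] = value
            flat.append(new_row)
    return flat


relation = flatten(employees, "Courses")
for r in relation:
    print(r)
```

Note that a row whose list is empty would disappear in this sketch, and that the repeated employee data in the result is exactly the redundancy that normalization later removes.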
Relational Schema Representation

● There are two common methods for expressing a schema:
1. Short text statements, in which each relation is named and the names of its
attributes follow in parentheses.
2. A graphical representation, in which each relation is represented by a
rectangle containing the attributes for the relation.
Sample Database

A textual schema for four relations at Pine Valley Furniture Company is shown.
Sample Database

A graphical schema for four relations at Pine Valley Furniture Company is
shown.
Integrity Constraints

● The relational data model includes several types of constraints, or rules
limiting acceptable values and actions, whose purpose is to facilitate
maintaining the accuracy and integrity of data in the database.
● The major types of integrity constraints are domain constraints, entity
integrity, and referential integrity.
Domain Constraints

● All of the values that appear in a column of a relation must be from the
same domain.
● A domain is the set of values that may be assigned to an attribute.
● A domain definition usually consists of the following components: domain
name, meaning, data type, size (or length), and allowable values or
allowable range (if applicable).
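A domain definition can be approximated in SQL with a data type plus CHECK constraints. The following is a minimal sketch using Python's sqlite3 module; the PRODUCT table, its columns, and the allowable values are assumptions chosen for illustration, not the definitions from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE PRODUCT (
    ProductID     INTEGER PRIMARY KEY,
    -- domain: a finish drawn from an assumed list of allowable values
    ProductFinish TEXT NOT NULL
        CHECK (ProductFinish IN ('Natural Ash', 'Cherry', 'White Ash')),
    -- domain: a non-negative price (allowable range)
    StandardPrice REAL NOT NULL CHECK (StandardPrice >= 0)
)
""")
conn.execute("INSERT INTO PRODUCT VALUES (1, 'Cherry', 175.0)")

try:
    # 'Plastic' is outside the declared domain, so the insert should fail.
    conn.execute("INSERT INTO PRODUCT VALUES (2, 'Plastic', 10.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

count = conn.execute("SELECT COUNT(*) FROM PRODUCT").fetchone()[0]
print(rejected, count)
```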
Entity Integrity

● The entity integrity rule is designed to ensure that every relation has a
primary key and that the data values for that primary key are all valid.
● In particular, it guarantees that every primary key attribute is non-null.
● The entity integrity rule states the following:

No primary key attribute (or component of a primary key attribute) may be
null.
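The entity integrity rule can be tried out with a small sketch in Python's sqlite3 module. The DEPENDENT relation and its two key attributes follow the earlier composite-key example. NOT NULL is spelled out on each key column because SQLite, unlike most SQL systems, does not always reject nulls in primary key columns by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE DEPENDENT (
    EmpID         INTEGER NOT NULL,
    DependentName TEXT    NOT NULL,
    PRIMARY KEY (EmpID, DependentName)   -- composite primary key
)
""")
conn.execute("INSERT INTO DEPENDENT VALUES (100, 'Example Dependent')")

try:
    # One component of the primary key is null, so the row is refused.
    conn.execute("INSERT INTO DEPENDENT VALUES (100, NULL)")
    null_key_rejected = False
except sqlite3.IntegrityError:
    null_key_rejected = True

print(null_key_rejected)
```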
Referential Integrity

● A referential integrity constraint is a rule that maintains consistency among
the rows of two relations.
● The referential integrity rule states the following:

Either each foreign key value must match a primary key value in another
relation or the foreign key value must be null.
Referential Integrity

● The graphical version of the relational schema provides a simple technique for
identifying associations where referential integrity must be enforced.
● An arrow has to be drawn from each foreign key to the associated primary key.
● A referential integrity constraint must be defined for each of these arrows in the
schema using SQL.
Referential Integrity

How do you know whether a foreign key is allowed to be null?

● If each order must have a customer (a mandatory relationship), then the foreign
key CustomerID cannot be null in the ORDER relation.
● If the relationship is optional, then the foreign key could be null. Whether a
foreign key can be null must be specified as a property of the foreign key
attribute when the database is defined.
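The mandatory case above can be sketched with Python's sqlite3 module. The table name ORDER_T is an assumption (ORDER is a reserved word in SQL), as are the sample rows and the helper `try_insert`. Dropping NOT NULL from the foreign key column would model the optional case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for enforcement in SQLite
conn.executescript("""
CREATE TABLE CUSTOMER (
    CustomerID   INTEGER PRIMARY KEY,
    CustomerName TEXT
);
CREATE TABLE ORDER_T (
    OrderID    INTEGER PRIMARY KEY,
    -- NOT NULL makes the relationship mandatory: every order needs a customer
    CustomerID INTEGER NOT NULL REFERENCES CUSTOMER (CustomerID)
);
INSERT INTO CUSTOMER VALUES (1, 'Example Customer');
""")


def try_insert(sql):
    """Run an INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


matched_ok = try_insert("INSERT INTO ORDER_T VALUES (1001, 1)")     # matches a customer
unmatched_ok = try_insert("INSERT INTO ORDER_T VALUES (1002, 99)")  # no such customer
null_ok = try_insert("INSERT INTO ORDER_T VALUES (1003, NULL)")     # null not allowed here
print(matched_ok, unmatched_ok, null_ok)
```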
Well-Structured Relation

● A relation that contains minimal redundancy and allows users to insert,
modify, and delete the rows in a table without errors or inconsistencies.
● From slides 30 and 31, identify which relation is well-structured.
Well-Structured Relation
Well-Structured Relation
Anomaly

● An error or inconsistency that may result when a user attempts to update a table
that contains redundant data.
● The three types of anomalies are insertion, deletion, and modification
anomalies.
Insertion Anomaly

● Suppose that we need to add a new employee to EMPLOYEE2.
● The primary key for this relation is the combination of EmpID and
CourseTitle.
● Therefore, to insert a new row, the user must supply values for both EmpID
and CourseTitle (because primary key values cannot be null or nonexistent).
● This is an anomaly because the user should be able to enter employee data
without supplying course data.
Deletion Anomaly

● Suppose that the data for employee number 140 are deleted from the table.
● This will result in losing the information that this employee completed a course
(Tax Acc) on 12/8/2015.
● In fact, it results in losing the information that this course had an offering that
completed on that date.
Modification Anomaly

● Suppose that employee number 100 gets a salary increase.
● We must record the increase in each of the rows for that employee (two
occurrences); otherwise, the data will be inconsistent.
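The modification anomaly can be reproduced in a few lines of Python. The rows below are illustrative stand-ins for EMPLOYEE2 (the salary figures are assumptions); the point is only that employee 100 appears in two rows, so a partial update leaves the data inconsistent.

```python
# Illustrative EMPLOYEE2 rows: employee 100 is stored twice, once per course.
employee2 = [
    {"EmpID": 100, "Salary": 48000, "CourseTitle": "SPSS"},
    {"EmpID": 100, "Salary": 48000, "CourseTitle": "Surveys"},
    {"EmpID": 140, "Salary": 52000, "CourseTitle": "Tax Acc"},
]

# A careless update changes only the first matching row.
for row in employee2:
    if row["EmpID"] == 100:
        row["Salary"] = 50000
        break


def salaries_by_employee(rows):
    """Collect the distinct salary values recorded for each employee."""
    seen = {}
    for row in rows:
        seen.setdefault(row["EmpID"], set()).add(row["Salary"])
    return seen


# Any employee with more than one recorded salary is inconsistent.
inconsistent = {
    emp for emp, salaries in salaries_by_employee(employee2).items()
    if len(salaries) > 1
}
print(inconsistent)
```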
Well-Structured Relation

● These anomalies indicate that EMPLOYEE2 is not a well-structured relation.
● The problem with this relation is that it contains data about two entities:
EMPLOYEE and COURSE.
● Divide EMPLOYEE2 into two relations. One of the resulting relations is
EMPLOYEE1. The other we will call EMP COURSE.
Well-Structured Relation

Table: EMP COURSE


Transforming ER/EER Diagrams to Relations
Well-Structured Relation

● During logical design, you transform the E-R (and EER) diagrams that were
developed during conceptual design into relational database schemas.
● The inputs to this process are the entity-relationship (and enhanced E-R)
diagrams.
● The outputs are the relational schemas.
● In the upcoming slides, we discuss the steps in the transformation of E-R (and
EER) diagram into relational schema.
Step 1: Mapping Regular Entities

● Each regular entity type in an E-R diagram is transformed into a relation.
● The name given to the relation is generally the same as the entity type.
● Each simple attribute of the entity type becomes an attribute of the relation.
● The identifier of the entity type becomes the primary key of the corresponding
relation.
Step 1: Mapping Regular Entities

● Composite Attribute: When a regular entity type has a composite attribute,
only the simple components of the composite attribute are included in the new
relation as its attributes.
Step 1: Mapping Regular Entities

● Multivalued Attribute: When the regular entity type contains a multivalued
attribute, two new relations (rather than one) are created.
● The first relation contains all of the attributes of the entity type except the
multivalued attribute. The second relation contains two attributes that form
the primary key of the second relation.
Step 1: Mapping Regular Entities

● The first of these attributes is the primary key from the first relation, which
becomes a foreign key in the second relation. The second is the multivalued
attribute.
● The name of the second relation should capture the meaning of the
multivalued attribute.
● If an entity type contains multiple multivalued attributes, each of them will be
converted to a separate relation.
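The two-relation mapping for a multivalued attribute might look like the sketch below, again using Python's sqlite3 module. The entity EMPLOYEE, the multivalued attribute Skill, and the sample values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- First relation: every attribute except the multivalued one.
CREATE TABLE EMPLOYEE (
    EmployeeID   INTEGER PRIMARY KEY,
    EmployeeName TEXT
);
-- Second relation: owner's key (as a foreign key) plus the multivalued
-- attribute; together they form the primary key.
CREATE TABLE EMPLOYEE_SKILL (
    EmployeeID INTEGER NOT NULL REFERENCES EMPLOYEE (EmployeeID),
    Skill      TEXT    NOT NULL,
    PRIMARY KEY (EmployeeID, Skill)
);
INSERT INTO EMPLOYEE VALUES (1, 'Example Employee');
INSERT INTO EMPLOYEE_SKILL VALUES (1, 'Welding'), (1, 'Painting');
""")

skills = [r[0] for r in conn.execute(
    "SELECT Skill FROM EMPLOYEE_SKILL WHERE EmployeeID = 1 ORDER BY Skill"
)]
print(skills)
```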
Step 2: Mapping Weak Entities

● Recall that a weak entity type does not have an independent existence but
exists only through an identifying relationship with another entity type called
the owner.
● A weak entity type does not have a complete identifier but must have an
attribute called a partial identifier that permits distinguishing the various
occurrences of the weak entity for each owner entity instance.
Step 2: Mapping Weak Entities

● For each weak entity type, create a new relation and include all of the simple
attributes (or simple components of composite attributes) as attributes of this
relation.
● Then include the primary key of the identifying relation as a foreign key
attribute in this new relation.
● The primary key of the new relation is the combination of this primary key of
the identifying relation and the partial identifier of the weak entity type.
Step 2: Mapping Weak Entities

● In practice, an alternative approach is often used to simplify the primary key
of the DEPENDENT relation.
● Create a new attribute (called DependentID), which will be used as a
surrogate primary key.
● DependentID is simply a serial number that is assigned to each dependent of
an employee.
When to Create Surrogate Key

1. There is a composite primary key, as in the case of the DEPENDENT
relation shown previously with the four-component primary key.
2. The natural primary key (i.e., the key used in the organization and
recognized in conceptual data modeling as the identifier) is inefficient. For
example, it may be very long and hence costly for database software to
handle if it is used as a foreign key that references other tables.
3. The natural primary key is recycled (i.e., the key is reused or repeated
periodically, so it may not actually be unique over time); a more general
statement of this condition is when the natural primary key cannot, in fact, be
guaranteed to be unique over time (e.g., there could be duplicates, such as
with names or titles).
Mapping Binary Relations

● The procedure for representing relationships depends on both the degree of
the relationships (unary, binary, or ternary) and the cardinalities of the
relationships.
Map Binary One-to-Many Relations

● Include the primary key attribute (or attributes) of the entity on the one-side
of the relationship as a foreign key in the relation that is on the many-side of
the relationship.
Map Binary One-to-One Relations

● In a 1:1 relationship, the association in one direction is nearly always an
optional one, whereas the association in the other direction is a mandatory
one.
● You should include in the relation on the optional side of the relationship a
foreign key referencing the primary key of the entity type that has the
mandatory participation in the 1:1 relationship.
● Any nonkey attributes that are associated with the 1:1 relationship are
included with the relation containing the foreign key.
Map Binary Many-to-Many Relations

● Suppose that there is a binary many-to-many (M:N) relationship between two
entity types, A and B.
● For such a relationship, create a new relation, C. Include as foreign key
attributes in C the primary key for each of the two participating entity types.
These attributes together become the primary key of C.
● Any nonkey attributes that are associated with the M:N relationship are
included with the relation C.
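The M:N mapping can be sketched as follows with Python's sqlite3 module. The relation names A, B, and C follow the text; the key column names A_ID and B_ID and the nonkey attribute RelAttr are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE A (A_ID INTEGER PRIMARY KEY);
CREATE TABLE B (B_ID INTEGER PRIMARY KEY);
-- C carries one foreign key per participating entity type; the pair
-- of foreign keys is the primary key of C.
CREATE TABLE C (
    A_ID    INTEGER NOT NULL REFERENCES A (A_ID),
    B_ID    INTEGER NOT NULL REFERENCES B (B_ID),
    RelAttr TEXT,                      -- nonkey attribute of the relationship
    PRIMARY KEY (A_ID, B_ID)
);
INSERT INTO A VALUES (1), (2);
INSERT INTO B VALUES (10), (20);
INSERT INTO C VALUES (1, 10, 'x'), (1, 20, 'y'), (2, 10, 'z');
""")

try:
    # The same (A, B) pair cannot be recorded twice.
    conn.execute("INSERT INTO C VALUES (1, 10, 'again')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

pairs = conn.execute("SELECT COUNT(*) FROM C").fetchone()[0]
print(duplicate_rejected, pairs)
```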
Mapping Associative Entities

● The first step is to create three relations: one for each of the two participating
entity types and a third for the associative entity. We refer to the relation
formed from the associative entity as the associative relation.
● The second step then depends on whether an identifier was assigned to the
associative entity on the E-R diagram.
Mapping Associative Entities

● Identifier not Assigned: If an identifier was not assigned, the default primary
key for the associative relation is a composite key that consists of the two
primary key attributes from the other two relations. These attributes are then
foreign keys that reference the other two relations.
Mapping Associative Entities

● Identifier Assigned: Sometimes a data modeler will assign a single-attribute
identifier to the associative entity type on the E-R diagram. Two reasons may
have motivated the data modeler to assign a single-attribute key during
conceptual data modeling:
1. The associative entity type has a natural single-attribute identifier that is
familiar to end users.
2. The default identifier (consisting of the identifiers for each of the
participating entity types) may not uniquely identify instances of the
associative entity.
Step 5: Mapping Unary Relationships

● A unary relationship is a relationship between the instances of a single entity
type. Unary relationships are also called recursive relationships.
● The two most important cases of unary relationships are one-to-many and
many-to-many relationships.
Map Unary One-to-Many Relations

● First convert the entity into a relational schema.
● A foreign key attribute is added to the same relation; this attribute references
the primary key values in the same relation. This type of a foreign key is
called a recursive foreign key.
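A recursive foreign key might be declared and queried as in the sketch below (Python's sqlite3 module). The EMPLOYEE relation with a ManagerID column is an assumed example of a unary one-to-many "manages" relationship; the names and rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE EMPLOYEE (
    EmployeeID   INTEGER PRIMARY KEY,
    EmployeeName TEXT,
    -- recursive foreign key: references the primary key of the same relation
    ManagerID    INTEGER REFERENCES EMPLOYEE (EmployeeID)
);
INSERT INTO EMPLOYEE VALUES (1, 'Manager', NULL);  -- top of the hierarchy
INSERT INTO EMPLOYEE VALUES (2, 'Report', 1);
""")

# A self-join pairs each employee with his or her manager.
pairs = conn.execute("""
    SELECT e.EmployeeName, m.EmployeeName
    FROM EMPLOYEE AS e
    JOIN EMPLOYEE AS m ON e.ManagerID = m.EmployeeID
""").fetchall()
print(pairs)
```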
Map Unary Many-to-Many Relations

● With this type of relationship, two relations are created: one to represent the
entity type in the relationship and an associative relation to represent the M:N
relationship itself.
● The primary key of the associative relation consists of two attributes. These
attributes (which need not have the same name) both take their values from
the primary key of the other relation.
● Any non-key attribute of the relationship is included in the associative
relation.
Step 6: Map Ternary Relationships

● A ternary relationship is a relationship among three entity types. Convert a
ternary relationship to an associative entity to represent participation
constraints more accurately.
● To map an associative entity type that links three regular entity types, we
create a new associative relation. The default primary key of this relation
consists of the three primary key attributes for the participating entity types.
● In some cases, additional attributes are required to form a unique primary
key.
● Any attributes of the associative entity type become attributes of the new
relation.
Step 7: Map Supertype/Subtype Relationships

● Create a separate relation for the supertype and for each of its subtypes.
● Assign to the relation created for the supertype the attributes that are common
to all members of the supertype, including the primary key.
● Assign to the relation for each subtype the primary key of the supertype and
only those attributes that are unique to that subtype.
● Assign one (or more) attributes of the supertype to function as the subtype
discriminator.
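The supertype/subtype rules above could be sketched like this with Python's sqlite3 module. The EMPLOYEE supertype with hourly and salaried subtypes, the discriminator values 'H' and 'S', and all column names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Supertype: common attributes, primary key, and subtype discriminator.
CREATE TABLE EMPLOYEE (
    EmployeeNumber INTEGER PRIMARY KEY,
    EmployeeName   TEXT,
    EmployeeType   TEXT NOT NULL CHECK (EmployeeType IN ('H', 'S'))
);
-- Subtypes: the supertype's primary key plus subtype-specific attributes.
CREATE TABLE HOURLY_EMPLOYEE (
    EmployeeNumber INTEGER PRIMARY KEY REFERENCES EMPLOYEE (EmployeeNumber),
    HourlyRate     REAL
);
CREATE TABLE SALARIED_EMPLOYEE (
    EmployeeNumber INTEGER PRIMARY KEY REFERENCES EMPLOYEE (EmployeeNumber),
    AnnualSalary   REAL
);
INSERT INTO EMPLOYEE VALUES (1, 'Hourly Example', 'H');
INSERT INTO HOURLY_EMPLOYEE VALUES (1, 15.5);
""")

row = conn.execute("""
    SELECT e.EmployeeName, e.EmployeeType, h.HourlyRate
    FROM EMPLOYEE AS e
    JOIN HOURLY_EMPLOYEE AS h USING (EmployeeNumber)
""").fetchone()
print(row)
```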
DIY: Convert ERD to Relational Schema

Click Here to access ERD


Normalization

● Following the steps outlined previously for transforming EER diagrams into
relations typically results in well-structured relations.
● However, there is no guarantee that all anomalies are removed by following
these steps.
● Normalization is a formal process for deciding which attributes should be
grouped together in a relation so that all anomalies are removed.
Keys in Database

● Super Key
● Candidate Key
● Primary Key
● Composite Key

The types of keys present in each table can help identify dependencies that are
either beneficial or detrimental to the normalization of a database.
Dependency

● A dependency is a constraint that governs or defines the relationship
between two or more attributes.
● In a database, it happens when information recorded in the same table
uniquely determines other information stored in the same table.
● This may also be described as a relationship in which knowing the value of
one attribute (or collection of attributes) in the same table tells you the value
of another attribute (or set of attributes).
Functional Dependency

● If the information stored in a table can uniquely determine other
information in the same table, the relationship is called a functional dependency.
● If P functionally determines Q, we write
P → Q
Functional Dependency

Table: Employee

EmpID  EmpName  EmpAge
E01    Ali      28
E02    Asif     31
E03    Ali      24

In this table the following functional dependencies exist:
EmpID → EmpName, EmpAge
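A small Python check makes the definition concrete: a dependency is violated when two rows agree on the determinant but disagree on the dependent attributes. The helper name `holds` is hypothetical, and sample rows can only refute a dependency, never prove that it holds for all possible data.

```python
def holds(rows, lhs, rhs):
    """Return True if the sample rows are consistent with lhs -> rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


employee = [
    {"EmpID": "E01", "EmpName": "Ali", "EmpAge": 28},
    {"EmpID": "E02", "EmpName": "Asif", "EmpAge": 31},
    {"EmpID": "E03", "EmpName": "Ali", "EmpAge": 24},
]

print(holds(employee, ["EmpID"], ["EmpName", "EmpAge"]))  # True
print(holds(employee, ["EmpName"], ["EmpAge"]))           # False: two Alis, two ages
```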
Partial Dependency

● If the value of a nonkey attribute can be determined from only part of the
primary key, the dependency is called a partial dependency.
● A partial dependency can occur only when the primary key is composite,
that is, formed from more than one attribute.
Partial Dependency

Table: Student_Result

roll_no  sub_id  sub_name  sub_mark
1        121     Science   80
1        131     Math      65
2        131     Math      95
2        141     English   75
Partial Dependency

● roll_no and sub_id together form the primary key.
● The following dependencies exist:
roll_no, sub_id → sub_mark
sub_id → sub_name
● Here sub_name can be determined by sub_id alone, which is only part of the
primary key. Hence this is a partial dependency.
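To see what removing this partial dependency looks like, the sketch below splits the Student_Result rows into two smaller relations in plain Python. The relation names SUBJECT and RESULT are assumptions; the data are the rows from the table above.

```python
# (roll_no, sub_id, sub_name, sub_mark)
student_result = [
    (1, 121, "Science", 80),
    (1, 131, "Math", 65),
    (2, 131, "Math", 95),
    (2, 141, "English", 75),
]

# SUBJECT(sub_id, sub_name): sub_name depends only on sub_id.
subject = sorted({(sub_id, sub_name) for _, sub_id, sub_name, _ in student_result})

# RESULT(roll_no, sub_id, sub_mark): sub_mark depends on the whole key.
result = [(roll_no, sub_id, mark) for roll_no, sub_id, _, mark in student_result]

# Joining the two relations on sub_id reconstructs the original rows,
# so no information was lost by the split.
names = dict(subject)
rejoined = [(r, s, names[s], m) for r, s, m in result]

print(subject)
print(rejoined == student_result)
```

Each subject name is now stored once, so renaming a subject touches a single row.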
Transitive Dependency

● If the value of a nonkey attribute can be determined from another nonkey
attribute, rather than directly from the primary key, the dependency is called
a transitive dependency.
● In other words, the attribute depends on the primary key only indirectly,
through another nonkey attribute.
● If A → B and B → C, then C is transitively dependent on A.
Transitive Dependency

Table: Student

roll_no  name     city             zip_code
1        Ali      Pak Pattan       411044
2        Jawad    Karachi          400001
3        Uzair    Pak Pattan       411044
4        Zeeshan  Dera Ghazi Khan  110001


Transitive Dependency

● Here the primary key is roll_no.
● The following dependencies exist:
roll_no → name, city, zip_code
zip_code → city
● Because roll_no → zip_code and zip_code → city, it follows that
roll_no → city. The nonkey attribute city can therefore be determined from
another nonkey attribute, zip_code, which makes this a transitive dependency.
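The effect of removing this transitive dependency can be sketched in plain Python by moving city into its own relation keyed by zip_code. The variable names are hypothetical, and the spelling change applied at the end is an invented update used only to show that a single row now needs editing.

```python
# (roll_no, name, city, zip_code)
student = [
    (1, "Ali", "Pak Pattan", "411044"),
    (2, "Jawad", "Karachi", "400001"),
    (3, "Uzair", "Pak Pattan", "411044"),
    (4, "Zeeshan", "Dera Ghazi Khan", "110001"),
]

# New relation keyed by the determinant zip_code.
zip_city = {zip_code: city for _, _, city, zip_code in student}

# The old relation keeps zip_code as a foreign key and drops city.
student_3nf = [(roll_no, name, zip_code) for roll_no, name, _, zip_code in student]

# Hypothetical update: one change in zip_city reaches every student
# who shares that zip code.
zip_city["411044"] = "Pakpattan"
cities = {roll_no: zip_city[z] for roll_no, _, z in student_3nf}
print(cities)
```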
Normalization

● Normalization is the process of successively reducing relations with
anomalies to produce smaller, well-structured relations.
● Following are some of the main goals of normalization:
1. Minimize data redundancy, thereby avoiding anomalies and conserving
storage space.
2. Simplify the enforcement of referential integrity constraints.
3. Make it easier to maintain data (insert, update, and delete).
4. Provide a better design that is an improved representation of the real world
and a stronger basis for future growth.
Steps in Normalization

● Normalization can be accomplished and understood in stages, each of which
corresponds to a normal form.
● A normal form is a state of a relation that requires that certain rules
regarding relationships between attributes (or functional dependencies) are
satisfied.
Steps in Normalization

● First normal form: Any multivalued attributes (also called repeating
groups) have been removed, so there is a single value (possibly null) at the
intersection of each row and column of the table.
● Second normal form: Any partial functional dependencies have been
removed (i.e., nonkey attributes are identified by the whole primary key).
● Third normal form: Any transitive dependencies have been removed (i.e.,
nonkey attributes are identified by only the primary key).
Sample View
Step 0: Represent the View in Tabular Form

● The first step (preliminary to normalization) is to represent the user view as
a single table, or relation, with the attributes recorded as column headings.
Sample data should be recorded in the rows of the table, including any
repeating groups that are present in the data.
Step 1: Convert to First Normal Form

A relation is in first normal form (1NF) if the following two constraints both
apply:
1. There are no repeating groups in the relation (thus, there is a single fact at
the intersection of each row and column of the table).
2. A primary key has been defined, which uniquely identifies each row in the
relation.
Step 1: Convert to First Normal Form

Remove Repeating Groups:
As you can see, the invoice data contain a repeating group for each product that
appears on a particular order. Thus, OrderID 1006 contains three repeating
groups, corresponding to the three products on that order.
Step 1: Convert to First Normal Form

Select the Primary Key:
There are four determinants in INVOICE, and their functional dependencies are
the following:
Step 1: Convert to First Normal Form

Select the Primary Key:
As you can see, the only candidate key for INVOICE is the composite key
consisting of the attributes OrderID and ProductID (because there is only one
row in the table for any combination of values for these attributes). Therefore,
OrderID and ProductID are underlined.
Anomalies in 1NF

● Although repeating groups have been removed, the data still contain
considerable redundancy.
● For example, CustomerID, CustomerName, and CustomerAddress for Value
Furniture are recorded in three rows (at least) in the table.
● As a result of these redundancies, manipulating the data in the table can lead
to anomalies such as the following:
Anomalies in 1NF

● Insertion anomaly:
1. With this table structure, the company is not able to introduce a new product
(say, Breakfast Table with ProductID 8) and add it to the database before it is
ordered the first time: No entries can be added to the table without both
ProductID and OrderID.
2. As another example, if a customer calls and requests another product be
added to his OrderID 1007, a new row must be inserted in which the order
date and all of the customer information must be repeated. This leads to data
replication and potential data entry errors (e.g., the customer name may be
entered as “Valley Furniture”).
Anomalies in 1NF

● Deletion anomaly:
If a customer calls and requests that the Dining Table be deleted from her
OrderID 1006, this row must be deleted from the relation, and we lose the
information concerning this item’s finish (Natural Ash) and price ($800.00).
Anomalies in 1NF

● Update Anomaly:
If Pine Valley Furniture (as part of a price adjustment) increases the price of
the Entertainment Center (ProductID 4) to $750.00, this change must be
recorded in all rows containing that item.
Functional Dependency diagram for INVOICE
Step 2: Convert to Second Normal Form

● A relation is in second normal form (2NF) if it is in first normal form and
contains no partial functional dependencies.
● As you can see, the following partial dependencies exist:
Step 2: Convert to Second Normal Form

● To convert a relation with partial dependencies to second normal form, the
following steps are required:

1. Create a new relation for each primary key attribute (or combination of
attributes) that is a determinant in a partial dependency. That attribute is the
primary key in the new relation.
2. Move the nonkey attributes that are only dependent on this primary key
attribute (or attributes) from the old relation to the new relation.
Step 3: Convert to Third Normal Form

● A relation is in third normal form (3NF) if it is in second normal form and no
transitive dependencies exist.
● There are two transitive dependencies in the CUSTOMER ORDER relation:
Step 3: Convert to Third Normal Form

● Removing Transitive Dependencies: You can easily remove transitive
dependencies from a relation by means of a three-step procedure:
1. For each nonkey attribute (or set of attributes) that is a determinant in a
relation, create a new relation. That attribute (or set of attributes) becomes the
primary key of the new relation.
2. Move all of the attributes that are functionally dependent only on the primary
key of the new relation from the old to the new relation.
3. Leave the attribute that serves as a primary key in the new relation in the old
relation to serve as a foreign key that allows you to associate the two
relations.
DIY-1

The figure shows the performance ranking for Motion Bank. Convert this user view to
a set of 3NF relations using step-by-step normalization. Assume the following:
1. A department can be located at multiple branches. A branch can have
multiple departments.
2. An employee works at one branch in one department.
DIY-2

The figure shows an invoice for an order. Your assignment is as follows:
1. Convert it into a relation and diagram the functional dependencies in the
relation.
2. In what normal form is this relation?
3. Decompose the invoice into a set of 3NF relations using a step-by-step process.
4. Draw a relational schema for your 3NF relations and show the referential
integrity constraints.
5. Draw your answer to part 4 in ERD form.
