Lecture 4

The document discusses the relational data model and relational database constraints. It covers topics like entity-relationship diagrams, subclasses and superclasses, specialization and generalization, constraints on specialization and generalization, and categories or union types.

CSE221: Database Systems

Lecture 4: The Relational Data Model and Relational Database Constraints
Professor Shaker El-Sappagh
Spring 2023
Outline
• EER diagram at a glance: generalization/specialization and union types
• Relational Model Concepts
• Relational Model Constraints and Relational Database Schemas
• Update Operations and Dealing with Constraint Violations
EER diagram
Subclasses and Superclasses (1)


• An entity type may have additional meaningful subgroupings of its entities
• Example: EMPLOYEE may be further grouped into:
• SECRETARY, ENGINEER, TECHNICIAN, …
• Based on the EMPLOYEE’s Job
• MANAGER
• EMPLOYEEs who are managers (the role they play)
• SALARIED_EMPLOYEE, HOURLY_EMPLOYEE
• Based on the EMPLOYEE’s method of pay
• EER diagrams extend ER diagrams to represent these additional subgroupings,
called subclasses or subtypes
Subclasses and Superclasses
Subclasses and Superclasses (2)


• Each of these subgroupings is a subset of EMPLOYEE entities
• Each is called a subclass of EMPLOYEE
• EMPLOYEE is the superclass for each of these subclasses
• These are called superclass/subclass relationships:
• EMPLOYEE/SECRETARY
• EMPLOYEE/TECHNICIAN
• EMPLOYEE/MANAGER
• …
Subclasses and Superclasses (3)


• These are also called IS-A relationships
• SECRETARY IS-A EMPLOYEE, TECHNICIAN IS-A EMPLOYEE, ….
• Note: An entity that is a member of a subclass represents the same real-world
entity as some member of the superclass:
• The subclass member is the same entity in a distinct specific role
• An entity cannot exist in the database merely by being a member of a subclass; it
must also be a member of the superclass
• A member of the superclass can be optionally included as a member of any number
of its subclasses
Subclasses and Superclasses (4)


• Examples:
• A salaried employee who is also an engineer belongs to the two subclasses:
• ENGINEER, and
• SALARIED_EMPLOYEE
• A salaried employee who is also an engineering manager belongs to the three subclasses:
• MANAGER,
• ENGINEER, and
• SALARIED_EMPLOYEE
• It is not necessary that every entity in a superclass be a member of some subclass
Representing Specialization in EER Diagrams


Attribute Inheritance in Superclass / Subclass Relationships

• An entity that is a member of a subclass inherits
• All attributes of the entity as a member of the superclass
• All relationships of the entity as a member of the superclass
• Example:
• In the previous slide, SECRETARY (as well as TECHNICIAN and ENGINEER)
inherits the attributes Name, SSN, …, from EMPLOYEE
• Every SECRETARY entity will have values for the inherited attributes
Specialization (1)
• Specialization is the process of defining a set of subclasses of a
superclass
• The set of subclasses is based upon some distinguishing
characteristics of the entities in the superclass
• Example: {SECRETARY, ENGINEER, TECHNICIAN} is a specialization of
EMPLOYEE based upon job type.
• Example: MANAGER is a specialization of EMPLOYEE based on the role
the employee plays
• May have several specializations of the same superclass
Specialization (2)
• Example: Another specialization of EMPLOYEE based on method of pay is
{SALARIED_EMPLOYEE, HOURLY_EMPLOYEE}.
• Attributes of a subclass are called specific or local attributes.
• For example, the attribute TypingSpeed of SECRETARY
• The subclass can also participate in specific relationship types.
• For example, a relationship BELONGS_TO of HOURLY_EMPLOYEE
Specialization (3)
Generalization

• Generalization is the reverse of the specialization process


• Several classes with common features are generalized into a superclass;
• original classes become its subclasses
• Example: CAR, TRUCK generalized into VEHICLE;
• both CAR, TRUCK become subclasses of the superclass VEHICLE.
• We can view {CAR, TRUCK} as a specialization of VEHICLE
• Alternatively, we can view VEHICLE as a generalization of CAR and TRUCK
Generalization (2)
Constraints on Specialization and Generalization (3)

• Two basic constraints can apply to a specialization/generalization:


• Disjointness Constraint:
• Completeness Constraint:
Constraints on Specialization and Generalization (4)

• Disjointness Constraint:
• Specifies that the subclasses of the specialization must be disjoint:
• an entity can be a member of at most one of the subclasses of the specialization
• Specified by d in EER diagram
• If not disjoint, specialization is overlapping:
• that is, the same entity may be a member of more than one subclass of the specialization
• Specified by o in EER diagram
Constraints on Specialization and Generalization (5)

• Completeness (Exhaustiveness) Constraint:


• Total specifies that every entity in the superclass must be a member of some
subclass in the specialization/generalization
• Shown in EER diagrams by a double line
• Partial allows an entity not to belong to any of the subclasses
• Shown in EER diagrams by a single line
Constraints on Specialization and Generalization (6)

• Hence, we have four types of specialization/generalization:


• Disjoint, total
• Disjoint, partial
• Overlapping, total
• Overlapping, partial
• Note: Generalization usually is total because the superclass is derived
from the subclasses.
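The two constraints can be checked against a concrete set of entities. The following is a minimal Python sketch, not part of the EER notation itself: entities are represented by made-up identifiers (e1, e2, ...), and the helper names satisfies_disjoint and satisfies_total are assumptions introduced here for illustration.

```python
def satisfies_disjoint(subclasses):
    """True if no entity belongs to more than one subclass (the 'd' constraint)."""
    seen = set()
    for members in subclasses.values():
        if seen & members:  # some entity already appeared in another subclass
            return False
        seen |= members
    return True


def satisfies_total(superclass, subclasses):
    """True if every superclass entity belongs to at least one subclass."""
    covered = set().union(*subclasses.values())
    return superclass <= covered


# Hypothetical EMPLOYEE entities and two specializations of them.
employees = {"e1", "e2", "e3", "e4"}
by_job = {"SECRETARY": {"e1"}, "ENGINEER": {"e2"}, "TECHNICIAN": {"e3"}}
by_pay = {"SALARIED_EMPLOYEE": {"e1", "e2"}, "HOURLY_EMPLOYEE": {"e3", "e4"}}

for name, spec in (("job type", by_job), ("method of pay", by_pay)):
    kind = "disjoint" if satisfies_disjoint(spec) else "overlapping"
    cover = "total" if satisfies_total(employees, spec) else "partial"
    print(f"{name}: {kind}, {cover}")
```

In this sample data e4 has no job-type subclass, so the job-type specialization is partial, while every employee has a method of pay, so that specialization is total. A check like this only tests one population; the constraint itself is declared on the schema.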
Example of disjoint partial Specialization


Example of overlapping total Specialization


Specialization/Generalization Hierarchies, Lattices & Shared Subclasses (1)
• A subclass may itself have further subclasses specified on it
• forms a hierarchy or a lattice
• Hierarchy has a constraint that every subclass has only one superclass
(called single inheritance); this is basically a tree structure
• In a lattice, a subclass can be subclass of more than one superclass
(called multiple inheritance)
Shared Subclass “Engineering_Manager”
Specialization/Generalization Hierarchies, Lattices & Shared Subclasses (2)
• In a lattice or hierarchy, a subclass inherits attributes not only of its direct
superclass, but also of all its predecessor superclasses
• A subclass with more than one superclass is called a shared subclass (multiple
inheritance)
• Can have:
• specialization hierarchies or lattices, or
• generalization hierarchies or lattices,
• depending on how they were derived
Specialization/Generalization Hierarchies, Lattices & Shared Subclasses (3)
• In specialization, start with an entity type and then define subclasses
of the entity type by successive specialization
• called a top down conceptual refinement process
• In generalization, start with many entity types and generalize those
that have common properties
• Called a bottom up conceptual synthesis process
• In practice, a combination of both processes is usually employed
Specialization / Generalization Lattice Example (UNIVERSITY)
Categories (UNION TYPES) (1)


• Most superclass/subclass relationships we have seen thus far have a single superclass
• A shared subclass is a subclass in:
• more than one distinct superclass/subclass relationship
• each relationship has a single superclass
• shared subclass leads to multiple inheritance
• In some cases, we need to model a single superclass/subclass relationship with more
than one superclass
• Superclasses can represent different entity types
• Such a subclass is called a category or UNION TYPE
Categories (UNION TYPES) (2)


• Example: In a database for vehicle registration, a vehicle owner can be a PERSON,
a BANK (holding a lien on a vehicle) or a COMPANY.
• A category (UNION type) called OWNER is created to represent a subset of the union
of the three superclasses COMPANY, BANK, and PERSON
• A category member must exist in at least one (typically just one) of its superclasses
• Difference from shared subclass, which is a:
• subset of the intersection of its superclasses
• shared subclass member must exist in all of its superclasses
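The union/intersection distinction can be stated directly with sets. This is a small Python sketch under stated assumptions: the entity identifiers are invented, and the helper names is_valid_category and is_valid_shared_subclass are hypothetical, introduced only to illustrate the two membership rules.

```python
def is_valid_category(members, superclasses):
    """A category is a subset of the UNION of its superclasses."""
    return members <= set().union(*superclasses)


def is_valid_shared_subclass(members, superclasses):
    """A shared subclass is a subset of the INTERSECTION of its superclasses."""
    return all(members <= superclass for superclass in superclasses)


# Category OWNER over PERSON, BANK, COMPANY (made-up identifiers).
person, bank, company = {"p1", "p2"}, {"b1"}, {"c1", "c2"}
owner = {"p1", "b1", "c2"}

# Shared subclass ENGINEERING_MANAGER over ENGINEER, MANAGER, SALARIED_EMPLOYEE.
engineer, manager, salaried = {"e1", "e2"}, {"e2", "e3"}, {"e2", "e4"}
engineering_manager = {"e2"}

print(is_valid_category(owner, [person, bank, company]))                       # True
print(is_valid_shared_subclass(engineering_manager, [engineer, manager, salaried]))  # True
print(is_valid_shared_subclass(owner, [person, bank, company]))                # False
```

The last line shows why OWNER cannot be modeled as a shared subclass: no owner is simultaneously a person, a bank, and a company.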
Two categories (UNION types): OWNER, REGISTERED_VEHICLE
Relational model
Relational Model Concepts
• The relational Model of Data is based on the concept of a
Relation
• The strength of the relational approach to data management comes
from the formal foundation provided by the theory of relations
• A Relation is a mathematical concept based on the ideas of sets
• The model was first proposed by Dr. E.F. Codd of IBM Research in 1970 in
the following paper:
• "A Relational Model of Data for Large Shared Data Banks," Communications of the ACM,
June 1970
• The above paper caused a major revolution in the field of database
management and earned Dr. Codd the coveted ACM Turing Award
Informal Definitions

• Informally, a relation looks like a table of values.

• A relation typically contains a set of rows.

• The data elements in each row represent certain facts that correspond to a real-world
entity or relationship
• In the formal model, rows are called tuples

• Each column has a column header that gives an indication of the meaning of the data
items in that column
• In the formal model, the column header is called an attribute name (or just attribute)
Example of a Relation
Informal Definitions
• Key of a Relation:
• Each row has a value of a data item (or set of items) that uniquely
identifies that row in the table
• Called the key
• In the STUDENT table, SSN is the key

• Sometimes row-ids or sequential numbers are assigned as keys to identify the rows in a table
• Called artificial key or surrogate key
Formal Definitions - Schema
• The Schema (or description) of a Relation:
• Denoted by R(A1, A2, ..., An)
• R is the name of the relation
• The attributes of the relation are A1, A2, ..., An
• Example:
CUSTOMER (Cust-id, Cust-name, Address, Phone#)
• CUSTOMER is the relation name
• Defined over the four attributes: Cust-id, Cust-name, Address, Phone#
• Each attribute has a domain or a set of valid values.
• For example, the domain of Cust-id is 6 digit numbers.
Formal Definitions - Tuple
• A tuple is an ordered set of values (enclosed in angle brackets ‘< … >’)
• Each value is derived from an appropriate domain.
• A row in the CUSTOMER relation is a 4-tuple and would consist of four values, for
example:
• <632895, "John Smith", "101 Main St. Atlanta, GA 30332", "(404) 894-2000">
• This tuple (row) of the CUSTOMER relation is called a 4-tuple because it has four values
• A relation is a set of such tuples (rows)
Formal Definitions - Domain
• A domain has a logical definition:
• Example: “USA_phone_numbers” is the set of 10-digit phone numbers valid in the U.S.
• A domain also has a datatype or a format defined for it.
• The USA_phone_numbers may have a format: (ddd)ddd-dddd where each d is a decimal digit.
• Dates have various formats, such as year, month, and day written as yyyy-mm-dd or as
dd mm, yyyy
• The attribute name designates the role played by a domain in a relation:
• Used to interpret the meaning of the data elements corresponding to that attribute
• Example: The domain Date may be used to define two attributes named “Invoice-date” and
“Payment-date” with different meanings
Formal Definitions - State
• The relation state is a subset of the Cartesian product of the domains
of its attributes
• each domain contains the set of all possible values the attribute can take.
• Example: attribute Cust-name is defined over the domain of character
strings of maximum length 25
• dom(Cust-name) is varchar(25)
• The role these strings play in the CUSTOMER relation is that of the
name of a customer.
Formal Definitions - Summary
• Formally,
• Given R(A1, A2, .........., An)
• r(R) ⊆ dom(A1) × dom(A2) × ... × dom(An)
• R(A1, A2, …, An) is the schema of the relation
• R is the name of the relation
• A1, A2, …, An are the attributes of the relation
• r(R): a specific state (or "value" or “population”) of relation R – this is a set of
tuples (rows)
• r(R) = {t1, t2, …, tm} where each ti is an n-tuple
• ti = <v1, v2, …, vn> where each vj ∈ dom(Aj)
Formal Definitions - Example
• Let R(A1, A2) be a relation schema:
• Let dom(A1) = {0,1}
• Let dom(A2) = {a,b,c}
• Then: dom(A1) × dom(A2) is all possible combinations:
{<0,a> , <0,b> , <0,c>, <1,a>, <1,b>, <1,c> }
• The relation state r(R) ⊆ dom(A1) × dom(A2)
• For example: r(R) could be {<0,a> , <0,b> , <1,c> }
• this is one possible state (or “population” or “extension”) r of the relation R, defined
over A1 and A2.
• It has three 2-tuples: <0,a> , <0,b> , <1,c>
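The same example can be reproduced in a few lines of Python. This is only an illustrative sketch: domains are modeled as Python sets, tuples as Python tuples, and the helper name is_valid_state is an assumption made here.

```python
from itertools import product

dom_a1 = {0, 1}
dom_a2 = {"a", "b", "c"}

# dom(A1) x dom(A2): every possible 2-tuple over the two domains.
cartesian = set(product(dom_a1, dom_a2))


def is_valid_state(state, *domains):
    """A relation state must be a subset of the Cartesian product of its domains."""
    return set(state) <= set(product(*domains))


state = {(0, "a"), (0, "b"), (1, "c")}

print(len(cartesian))                              # 6
print(is_valid_state(state, dom_a1, dom_a2))       # True
print(is_valid_state({(2, "a")}, dom_a1, dom_a2))  # False: 2 is not in dom(A1)
```

Because any subset of the six combinations is a possible state, this schema has 2^6 = 64 possible relation states, including the empty one.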
Definition Summary
Informal Terms               Formal Terms
Table                        Relation
Column Header                Attribute
All possible Column Values   Domain
Row                          Tuple
Table Definition             Schema of a Relation
Populated Table              State of the Relation
Example – A relation STUDENT
Characteristics Of Relations
• Ordering of tuples in a relation r(R):
• The tuples are not considered to be ordered, even though they appear to be
in the tabular form.
• Ordering of attributes in a relation schema R (and of values within each tuple):
• We will consider the attributes in R(A1, A2, ..., An) and the values in t = <v1, v2,
..., vn> to be ordered.
• (However, a more general alternative definition of relation does not require this
ordering. It includes both the name and the value for each of the attributes.)
• Example: t = { <name, “John”>, <SSN, 123456789> }
• This representation may be called “self-describing”.
Same state as previous Figure (but with
different order of tuples)
Characteristics Of Relations
• Values in a tuple:
• All values are considered atomic (indivisible).
• Each value in a tuple must be from the domain of the attribute for that
column
• If tuple t = <v1, v2, …, vn> is a tuple (row) in the relation state r of R(A1, A2, …, An)
• Then each vi must be a value from dom(Ai)

• A special null value is used to represent values that are unknown or not
available or inapplicable in certain tuples.
Characteristics Of Relations
• Notation:
• We refer to component values of a tuple t by:
• t[Ai] or t.Ai
• This is the value vi of attribute Ai for tuple t
• Similarly, t[Au, Av, ..., Aw] refers to the subtuple of t containing the values of
attributes Au, Av, ..., Aw, respectively in t
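As a rough illustration of this notation, the sketch below models the CUSTOMER tuple from the earlier slide as an ordered Python tuple. The helper names component and subtuple are assumptions introduced here; they stand in for t[Ai] and t[Au, Av, ..., Aw].

```python
SCHEMA = ("Cust-id", "Cust-name", "Address", "Phone#")
t = (632895, "John Smith", "101 Main St. Atlanta, GA 30332", "(404) 894-2000")


def component(t, attribute, schema=SCHEMA):
    """t[Ai]: the value of one attribute in tuple t."""
    return t[schema.index(attribute)]


def subtuple(t, attributes, schema=SCHEMA):
    """t[Au, Av, ..., Aw]: the values of several attributes, in the order requested."""
    return tuple(component(t, a, schema) for a in attributes)


print(component(t, "Cust-name"))             # John Smith
print(subtuple(t, ["Cust-id", "Phone#"]))    # (632895, '(404) 894-2000')
```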
CONSTRAINTS
Constraints determine which values are permissible and which are not in the
database.
They are of three main types:
1. Inherent or Implicit Constraints: These are based on the data model itself. (E.g.,
relational model does not allow a list as a value for any attribute)
2. Schema-based or Explicit Constraints: They are expressed in the schema by using
the facilities provided by the model. (E.g., max. cardinality ratio constraint in the ER
model)
3. Application based or semantic constraints: These are beyond the expressive power
of the model and must be specified and enforced by the application programs.
Relational Integrity Constraints
• Constraints are conditions that must hold on all valid relation states.
• There are three main types of (explicit schema-based) constraints that can be
expressed in the relational model:
• Key constraints
• Entity integrity constraints
• Referential integrity constraints
• Another schema-based constraint is the domain constraint
• Every value in a tuple must be from the domain of its attribute (or it could be null, if
allowed for that attribute)
Key Constraints
• Superkey of R:
• Is a set of attributes SK of R with the following condition:
• No two tuples in any valid relation state r(R) will have the same value for SK
• That is, for any distinct tuples t1 and t2 in r(R), t1[SK] ≠ t2[SK]
• This condition must hold in any valid state r(R)
• Primary key attributes cannot be NULL (see Entity Integrity)
• Key of R:
• A "minimal" superkey
• That is, a key is a superkey K such that removal of any attribute from K results in a set
of attributes that is not a superkey (does not possess the superkey uniqueness
property)

• A Key is a Superkey but not vice versa


Key Constraints (continued)
• Example: Consider the CAR relation schema:
• CAR(State, Reg#, SerialNo, Make, Model, Year)
• CAR has two keys:
• Key1 = {State, Reg#}
• Key2 = {SerialNo}
• Both are also superkeys of CAR
• {SerialNo, Make} is a superkey but not a key.
• In general:
• Any key is a superkey (but not vice versa)
• Any set of attributes that includes a key is a superkey
• A minimal superkey is also a key
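The superkey and minimality conditions can be tested against one relation state. The sketch below is hedged in two ways: the three CAR rows are invented sample data, and the function names are assumptions. Note also that a key is a property of the schema (it must hold in every valid state), so a single state can refute a candidate but never prove it.

```python
def is_superkey_in_state(rows, attrs):
    """True if no two rows agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def is_key_in_state(rows, attrs):
    """A key is a superkey from which no single attribute can be removed."""
    if not is_superkey_in_state(rows, attrs):
        return False
    return all(
        not is_superkey_in_state(rows, [a for a in attrs if a != dropped])
        for dropped in attrs
    )


# Invented sample state for CAR(State, Reg#, SerialNo, Make, Model, Year).
cars = [
    {"State": "TX", "Reg#": "ABC-739", "SerialNo": "A69352", "Make": "Ford", "Model": "Mustang", "Year": 2002},
    {"State": "FL", "Reg#": "ABC-739", "SerialNo": "B43696", "Make": "Oldsmobile", "Model": "Cutlass", "Year": 2005},
    {"State": "TX", "Reg#": "RSK-629", "SerialNo": "X83554", "Make": "Toyota", "Model": "Camry", "Year": 2004},
]

print(is_key_in_state(cars, ["State", "Reg#"]))          # True
print(is_key_in_state(cars, ["SerialNo"]))               # True
print(is_superkey_in_state(cars, ["SerialNo", "Make"]))  # True
print(is_key_in_state(cars, ["SerialNo", "Make"]))       # False: Make is redundant
```

Checking the removal of one attribute at a time is sufficient, because any superset of a superkey is itself a superkey.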
Key Constraints (continued)
• If a relation has several candidate keys, one is chosen arbitrarily to be the primary
key.
• The primary key attributes are underlined.
• Example: Consider the CAR relation schema:
• CAR(State, Reg#, SerialNo, Make, Model, Year)
• We chose SerialNo as the primary key
• The primary key value is used to uniquely identify each tuple in a relation
• Provides the tuple identity
• Also, used to reference the tuple from another tuple
• General rule: Choose as primary key the smallest of the candidate keys (in terms of
size)
CAR table with two candidate keys – LicenseNumber chosen as
Primary Key
Relational Database Schema
• Relational Database Schema:
• A set S of relation schemas that belong to the same database.
• S is the name of the whole database schema
• S = {R1, R2, ..., Rn} and a set IC of integrity constraints.
• R1, R2, …, Rn are the names of the individual relation schemas within the
database S
• Following slide shows a COMPANY database schema with 6 relation
schemas
COMPANY Database Schema
Relational Database State
• A relational database state DB of S is a set of relation states DB = {r1, r2, ..., rn}
such that each ri is a state of Ri and such that the ri relation states satisfy the
integrity constraints specified in IC.
• A relational database state is sometimes called a relational database snapshot
or instance.
• We will not use the term instance since it also applies to single tuples.
• A database state that does not meet the constraints is an invalid state
Populated database state
• Each relation will have many tuples in its current relation state
• The relational database state is a union of all the individual relation states
• Whenever the database is changed, a new state arises
• Basic operations for changing the database:
• INSERT a new tuple in a relation
• DELETE an existing tuple from a relation
• MODIFY an attribute of an existing tuple
• Next slide (Fig. 5.6) shows an example state for the COMPANY database schema
shown in Fig. 5.5.
Populated database state for COMPANY
Entity Integrity
• Entity Integrity:
• The primary key attributes PK of each relation schema R in S cannot have null
values in any tuple of r(R).
• This is because primary key values are used to identify the individual tuples.
• t[PK] ≠ null for any tuple t in r(R)
• If PK has several attributes, null is not allowed in any of these attributes
• Note: Other attributes of R may be constrained to disallow null values, even
though they are not members of the primary key.
Referential Integrity
• A constraint involving two relations
• The previous constraints involve a single relation.
• Used to specify a relationship among tuples in two relations:
• The referencing relation and the referenced relation.
Referential Integrity
• Tuples in the referencing relation R1 have attributes FK (called
foreign key attributes) that reference the primary key attributes PK of
the referenced relation R2.
• A tuple t1 in R1 is said to reference a tuple t2 in R2 if t1[FK] = t2[PK].
• A referential integrity constraint can be displayed in a relational
database schema as a directed arc from R1.FK to R2.
Referential Integrity (or foreign key) Constraint
• Statement of the constraint
• The value in the foreign key column (or columns) FK of the referencing
relation R1 can be either:
(1) the value of an existing primary key PK in some tuple of the
referenced relation R2, or
(2) a null.
• In case (2), the FK in R1 should not be a part of its own primary key.
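The two permitted cases, and the rejected one, can be tried with Python's built-in sqlite3 module. This is a sketch under assumptions: the table and column names (department, employee, dnumber, dno) and the sample rows are chosen here for illustration, and SQLite only enforces foreign keys after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.execute("CREATE TABLE department (dnumber INTEGER PRIMARY KEY, dname TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE employee (
        ssn  TEXT NOT NULL PRIMARY KEY,
        name TEXT NOT NULL,
        dno  INTEGER REFERENCES department(dnumber)  -- FK, nullable
    )
""")
conn.execute("INSERT INTO department VALUES (5, 'Research')")

# Case (1): the FK value matches an existing primary key value.
conn.execute("INSERT INTO employee VALUES ('123456789', 'John Smith', 5)")
# Case (2): the FK value is null.
conn.execute("INSERT INTO employee VALUES ('333445555', 'Franklin Wong', NULL)")

# Neither case: department 9 does not exist, so the insert is rejected.
try:
    conn.execute("INSERT INTO employee VALUES ('999887777', 'Alicia Zelaya', 9)")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True

stored = conn.execute("SELECT ssn, dno FROM employee ORDER BY ssn").fetchall()
print(dangling_rejected, stored)
```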
Referential Integrity Constraints for COMPANY database
Other Types of Constraints
• Semantic Integrity Constraints:
• based on application semantics and cannot be expressed by
the model per se
• Example: “the max. no. of hours per employee for all
projects he or she works on is 56 hrs per week”
• A constraint specification language may have to be used to
express these
• SQL-99 allows CREATE TRIGGER and CREATE ASSERTION to
express some of these semantic constraints
• Keys, Permissibility of Null values, Candidate Keys (Unique in
SQL), Foreign Keys, Referential Integrity etc. are expressed by
the CREATE TABLE statement in SQL.
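As one possible illustration of the 56-hour rule, the sketch below uses a trigger in SQLite through Python's sqlite3 module. The works_on table, its columns, and the trigger name are assumptions made for this example; SQLite does not support CREATE ASSERTION, and this trigger only guards INSERT (a complete solution would also cover UPDATE).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE works_on (
        essn  TEXT    NOT NULL,
        pno   INTEGER NOT NULL,
        hours REAL    NOT NULL,
        PRIMARY KEY (essn, pno)
    )
""")
# Reject an insert if the employee's total weekly hours would exceed 56.
conn.execute("""
    CREATE TRIGGER max_weekly_hours
    BEFORE INSERT ON works_on
    BEGIN
        SELECT CASE
            WHEN (SELECT COALESCE(SUM(hours), 0) FROM works_on WHERE essn = NEW.essn)
                 + NEW.hours > 56
            THEN RAISE(ABORT, 'employee would exceed 56 hours per week')
        END;
    END
""")


def assign(essn, pno, hours):
    """Try to record an assignment; return False if the trigger rejects it."""
    try:
        conn.execute("INSERT INTO works_on VALUES (?, ?, ?)", (essn, pno, hours))
        return True
    except sqlite3.DatabaseError:
        return False


outcomes = [assign("123456789", 1, 32.5), assign("123456789", 2, 20.0), assign("123456789", 3, 10.0)]
total = conn.execute("SELECT SUM(hours) FROM works_on WHERE essn = '123456789'").fetchone()[0]
print(outcomes, total)
```

The third assignment would bring the total to 62.5 hours, so it is aborted and the stored total stays at 52.5.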
Update Operations on Relations
• INSERT a tuple.
• DELETE a tuple.
• MODIFY a tuple.
• Integrity constraints should not be violated by the update operations.
• Several update operations may have to be grouped together.
• Updates may propagate to cause other updates automatically. This
may be necessary to maintain integrity constraints.
Update Operations on Relations
• In case of integrity violation, several actions can be taken:
• Cancel the operation that causes the violation (RESTRICT or REJECT option)
• Perform the operation but inform the user of the violation
• Trigger additional updates so the violation is corrected (CASCADE option, SET
NULL option)
• Execute a user-specified error-correction routine

Possible violations for each operation
• INSERT may violate any of the constraints:
• Domain constraint:
• if one of the attribute values provided for the new tuple is not of the specified attribute
domain
• Key constraint:
• if the value of a key attribute in the new tuple already exists in another tuple in the
relation
• Referential integrity:
• if a foreign key value in the new tuple references a primary key value that does not exist
in the referenced relation
• Entity integrity:
• if the primary key value is null in the new tuple
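Three of these violations can be reproduced on a single table with Python's sqlite3 module (the referential case needs a second table). This is a sketch with assumptions: the customer table, the CHECK expressions used to approximate the domains, and the helper insert_accepted are all chosen here for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        cust_id   TEXT NOT NULL PRIMARY KEY
                  CHECK (cust_id GLOB '[0-9][0-9][0-9][0-9][0-9][0-9]'),  -- six digits
        cust_name TEXT NOT NULL CHECK (length(cust_name) <= 25)
    )
""")


def insert_accepted(cust_id, cust_name):
    """Return True if the DBMS accepts the new tuple, False if it is rejected."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute("INSERT INTO customer VALUES (?, ?)", (cust_id, cust_name))
        return True
    except sqlite3.IntegrityError:
        return False


results = {
    "valid tuple": insert_accepted("632895", "John Smith"),
    "domain violation": insert_accepted("63289", "Jane Doe"),      # only five digits
    "key violation": insert_accepted("632895", "Another Smith"),   # duplicate key value
    "entity integrity violation": insert_accepted(None, "No Key"),  # null primary key
}
print(results)
```

Only the first insert succeeds, so the table ends up holding exactly one tuple.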
Possible violations for each operation
• DELETE may violate only referential integrity:
• If the primary key value of the tuple being deleted is referenced from other tuples in
the database
• Can be remedied by several actions: RESTRICT, CASCADE, SET NULL (see Chapter 6 for
more details)
• RESTRICT option: reject the deletion
• CASCADE option: propagate the deletion by also deleting the referencing tuples
• SET NULL option: set the foreign keys of the referencing tuples to NULL
• One of the above options must be specified during database design for each foreign
key constraint
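The three options can be compared side by side. The sketch below uses SQLite's ON DELETE clause through Python's sqlite3 module; the table names, column names, and the helper delete_department are assumptions made for this example.

```python
import sqlite3


def delete_department(on_delete):
    """Delete a referenced department under the given ON DELETE option.

    Returns "rejected" if the DBMS refuses the deletion, otherwise the
    employee rows that remain afterwards.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("PRAGMA foreign_keys = ON")
        conn.execute("CREATE TABLE department (dnumber INTEGER PRIMARY KEY)")
        conn.execute(f"""
            CREATE TABLE employee (
                ssn TEXT NOT NULL PRIMARY KEY,
                dno INTEGER REFERENCES department(dnumber) ON DELETE {on_delete}
            )
        """)
        conn.execute("INSERT INTO department VALUES (5)")
        conn.execute("INSERT INTO employee VALUES ('123456789', 5)")
        try:
            conn.execute("DELETE FROM department WHERE dnumber = 5")
        except sqlite3.IntegrityError:
            return "rejected"
        return conn.execute("SELECT ssn, dno FROM employee").fetchall()
    finally:
        conn.close()


for option in ("RESTRICT", "CASCADE", "SET NULL"):
    print(option, "->", delete_department(option))
```

RESTRICT refuses the deletion, CASCADE removes the referencing employee as well, and SET NULL keeps the employee with a null dno.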
Possible violations for each operation
• UPDATE may violate domain constraint and NOT NULL constraint on an attribute
being modified
• Any of the other constraints may also be violated, depending on the attribute
being updated:
• Updating the primary key (PK):
• Similar to a DELETE followed by an INSERT
• Need to specify similar options to DELETE
• Updating a foreign key (FK):
• May violate referential integrity
• Updating an ordinary attribute (neither PK nor FK):
• Can only violate domain and NOT NULL constraints
How to model inheritance in relational database
• There are three possible strategies for converting the entities connected by inheritance
in the logical model into the tables of the physical model:
• One table per inheritance hierarchy.
• One table per entity.
• One table per entity with all attributes.
How to model inheritance in relational database
1. One table per inheritance hierarchy:
• Only one table represents the whole inheritance hierarchy.
• This table has one column for each attribute in the hierarchy.
• All the attributes from any entity in the inheritance hierarchy are in the generated table.
How to model inheritance in relational database
1. One table per inheritance hierarchy:

• Table vehicle has columns taken from four different entities in the inheritance model:
vehicle, car, bike, and electrical_bike
• The generated inheritance table has an additional column called discriminator, which is
used to identify which type of entity in the hierarchy the record represents
• Many columns are nullable
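A single-table layout might look like the sketch below, written with Python's sqlite3 module. The slide's figure is not reproduced here, so the specific columns (brand, doors, gears, battery_wh) and the sample rows are hypothetical; only the table name vehicle, the entity names, and the discriminator column come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE vehicle (
        vehicle_id    INTEGER PRIMARY KEY,
        discriminator TEXT NOT NULL
                      CHECK (discriminator IN ('vehicle', 'car', 'bike', 'electrical_bike')),
        brand         TEXT NOT NULL,  -- hypothetical attribute of vehicle
        doors         INTEGER,        -- hypothetical attribute of car, NULL for other types
        gears         INTEGER,        -- hypothetical attribute of bike
        battery_wh    INTEGER         -- hypothetical attribute of electrical_bike
    )
""")
conn.executemany(
    "INSERT INTO vehicle VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "car", "Toyota", 4, None, None),
        (2, "bike", "Trek", None, 21, None),
        (3, "electrical_bike", "Giant", None, 9, 500),
    ],
)

# The discriminator selects the rows that represent one entity type.
cars = conn.execute(
    "SELECT vehicle_id, brand, doors FROM vehicle WHERE discriminator = 'car'"
).fetchall()
print(cars)
```

Every row leaves the columns of the other entity types null, which is the nullability cost mentioned above.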
How to model inheritance in relational database
2. One table per entity

• We have one table per entity in the inheritance hierarchy.
• We have foreign keys between every parent and child entity connected by inheritance.
• The discriminator column is not needed; different objects (cars, bikes) are stored in
different tables.
• The relationship is only between the car_owner and car tables.
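The one-table-per-entity layout can be sketched the same way. As before, the column names and sample rows are hypothetical; the point is that each child table's primary key is also a foreign key to its parent, so a car cannot exist without its vehicle row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE vehicle (vehicle_id INTEGER PRIMARY KEY, brand TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE car (
        vehicle_id INTEGER PRIMARY KEY REFERENCES vehicle(vehicle_id),
        doors      INTEGER NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE bike (
        vehicle_id INTEGER PRIMARY KEY REFERENCES vehicle(vehicle_id),
        gears      INTEGER NOT NULL
    )
""")
conn.execute("INSERT INTO vehicle VALUES (1, 'Toyota')")
conn.execute("INSERT INTO car VALUES (1, 4)")
conn.execute("INSERT INTO vehicle VALUES (2, 'Trek')")
conn.execute("INSERT INTO bike VALUES (2, 21)")

# A full car record is rebuilt by joining the child table with its parent.
cars = conn.execute("""
    SELECT v.vehicle_id, v.brand, c.doors
    FROM car AS c JOIN vehicle AS v ON v.vehicle_id = c.vehicle_id
""").fetchall()

# A car row without a matching vehicle row is rejected.
try:
    conn.execute("INSERT INTO car VALUES (99, 2)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(cars, orphan_rejected)
```

Compared with the single-table strategy there are no nullable type-specific columns, at the cost of a join whenever inherited attributes are needed.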
How to model inheritance in relational database
3. One Table Per Entity with All Attributes
• Create one table per entity in the inheritance hierarchy, but each table will have all the
attributes of its parent entities.
• Note the composition of the primary keys in the tables car and bike. For example, the
bike primary key is formed by the columns bike_id and vehicle_id.
How to model inheritance in relational database
3. One Table Per Entity with All Attributes
• Note the duplication of data.
Thank you
