Lecture2-ER_Design

The document outlines the principles of relational database design, focusing on the Entity-Relationship (ER) model, which helps in sketching database schemas through diagrams. It covers key concepts such as entities, attributes, relationships, and keys, along with their representations and constraints. Additionally, it discusses the importance of avoiding redundancy and the use of weak entity sets in database design.

Uploaded by

dogiathuyasd18
Copyright
© © All Rights Reserved

▪ We start @ 14:15

▪ Remember to join MS Teams – code: 6esizxc


Lecture 2: Relational Database Design
Entity-Relationship
Instructor: Krystian Wojtkiewicz
School of Computer Science and Engineering
International University, VNU-HCMC
ACKNOWLEDGEMENT
The following slides are referenced from Dr. Sudeepa Roy, Duke University.
▪ A database is a collection of relations (or tables)
▪ Each relation has a set of attributes (or columns)
▪ Each attribute has a name and a domain (or type)
▪ Each relation contains a set of tuples (or rows)

How do we know which relations and attributes to have?
Design ERD

o Notation
• Entity/ Weak entity
• Attributes
• Relationship

Reading material

• [GUW] Chapter 3
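FREE AND OPEN-SOURCE TOOLS
There are several free or open-source tools available for
drawing Entity-Relationship Diagrams (ERDs). Not all of them are open source:
Lucidchart and dbdiagram.io are commercial services with free tiers.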
▪ draw.io

▪ Lucidchart

▪ dbdiagram.io

▪ MySQL Workbench

▪ Dia

▪ Pencil Project
A set of attributes K is a key for a relation R if
▪ In no instance of R will two different tuples agree on all attributes of K
▪ That is, K can serve as a “tuple identifier”
▪ No proper subset of K satisfies the above condition
▪ That is, K is minimal
▪ Example: Students (Sid, name, age, pop)
▪ Sid is a key
▪ age is not a key (not an identifier)
▪ {Sid, name} is not a key (not minimal)
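The two conditions above (identification and minimality) can be tested against a single instance. The following is a minimal sketch; the function names and the sample rows are hypothetical. An instance can only refute a key, never prove one, because a key is a statement about every instance of R.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows of this instance agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def is_key(rows, attrs):
    """True if attrs identifies every row and no proper subset of attrs does."""
    if not is_superkey(rows, attrs):
        return False
    for size in range(len(attrs)):
        for subset in combinations(attrs, size):
            if is_superkey(rows, subset):
                return False  # a smaller set already identifies rows: not minimal
    return True

# Hypothetical instance of Students (Sid, name, age, pop)
students = [
    {"Sid": 1, "name": "Ann", "age": 20, "pop": 0.9},
    {"Sid": 2, "name": "Bob", "age": 20, "pop": 0.5},
    {"Sid": 3, "name": "Cal", "age": 21, "pop": 0.5},
]
```

On this instance `is_key(students, ["Sid"])` holds, `["age"]` fails the identifier test, and `["Sid", "name"]` is a superkey but not minimal.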
name   address
Amy    100 W. Main Street
Ben    101 W. Main Street
Dan    300 N. Duke Street

▪ Is name a key of Drinker?
▪ Yes? Seems reasonable for this instance
▪ No? Drinkers' names are not unique in general
▪ Key declarations are part of the schema
▪ Enrolled (Sid, Cid)
▪ {Sid, Cid}: a key
▪ A key can contain multiple attributes

▪ Address(street_address, city, state, zip)


▪ {street_address, city, state}
▪ {street_address, zip}

A relation can have multiple keys!


▪ We typically pick one as the “primary” key, and underline
all its attributes, e.g., Address(street_address, city, state,
zip)
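As a rough sketch of how two keys on one relation might be declared, the snippet below uses Python's built-in sqlite3 module. Which key is chosen as primary is an assumption here (the underlining did not survive in the slide text), and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Address (
        street_address TEXT NOT NULL,
        city           TEXT NOT NULL,
        state          TEXT NOT NULL,
        zip            TEXT NOT NULL,
        PRIMARY KEY (street_address, city, state),  -- assumed primary key
        UNIQUE (street_address, zip)                -- the other key
    )
""")
conn.execute(
    "INSERT INTO Address VALUES ('100 W. Main Street', 'Durham', 'NC', '27701')"
)

def try_insert(row):
    """Return True if the row is accepted, False if a key constraint rejects it."""
    try:
        conn.execute("INSERT INTO Address VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False
```

A row that repeats either key combination is rejected, while a row that differs on both keys is accepted.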
▪ More constraints on data, fewer mistakes

▪ Look up a row by its key value


▪ Many selection conditions are “key = value”

“Pointers” to other rows (often across tables)


▪ Example: Enrolled(Sid, Cid)
▪ Sid: is a key of Students
▪ Cid: is a key of Courses
▪ An Enrolled row “links” a Students row with a Courses
row.
▪ Many join conditions are “key = key value stored in another
table”
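The “pointer” role of keys can be sketched with sqlite3 as follows. The column names other than Sid and Cid, and all sample data, are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (Sid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Courses  (Cid TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE Enrolled (
        Sid INTEGER REFERENCES Students(Sid),
        Cid TEXT    REFERENCES Courses(Cid),
        PRIMARY KEY (Sid, Cid)
    );
    INSERT INTO Students VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Courses  VALUES ('DB101', 'Databases');
    INSERT INTO Enrolled VALUES (1, 'DB101');
""")

# Selection of the form "key = value"
ann = conn.execute("SELECT name FROM Students WHERE Sid = ?", (1,)).fetchone()

# Join of the form "key = key value stored in another table"
rows = conn.execute("""
    SELECT s.name, c.title
    FROM Enrolled e
    JOIN Students s ON e.Sid = s.Sid
    JOIN Courses  c ON e.Cid = c.Cid
""").fetchall()
```

Each Enrolled row carries only the two key values; the join follows them to the full Students and Courses rows.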
▪ Historically and still very popular
▪ Concepts applicable to other design models as well
▪ Can be thought of as a “watered-down” object-oriented design model
▪ Primarily a design model, i.e., not directly implemented by DBMS
▪ Designs represented by E/R diagrams
▪ We use the style of E/R diagram covered by the GMUW book; there are other styles/extensions
▪ Very similar to UML diagrams
E/R BASICS

Entity
• a “thing”, like an object

Entity set
• a collection of things of the same type, like a relation of tuples or a class of objects
• Represented as a rectangle

Relationship
• an association among entities

Relationship set
• a set of relationships of the same type (among the same entity sets)
• Represented as a diamond

Attributes
• properties of entities or relationships, like attributes of tuples or objects
• Represented as ovals
Users
Each has uid (unique id), name, age, pop (popularity)

Groups
Each has gid (unique id), name

Member
Records fromDate (when a user joined a group)
PURPOSE OF E/R
MODEL
▪ The E/R model allows us to sketch
database schema designs.
▪ Includes some constraints, but not
operations.
▪ Designs are pictures called entity-
relationship diagrams.
▪ Later: convert E/R designs to relational DB
designs.
FRAMEWORK FOR E/R
▪ Design is a serious business.
▪ The “boss” knows they want a database, but they don’t know what they want in it.
▪ Sketching the key components is an efficient way to develop a working database.
1. Entity identification.
2. Relationship identification (attributes or entities).
3. Identifying and mapping attributes to entities, relationships.
4. Determine the value range for each attribute.
5. Determine key attribute for each entity.
6. Cardinality identification; constraints.
7. Hierarchical design (generalized/specialized).
Mapping cardinalities express the number of entities to which another entity can be associated via a relationship set.

Most useful in describing binary relationship sets.

For a binary relationship set the mapping cardinality must be one of the following types:
▪ One to one
▪ One to many
▪ Many to one
▪ Many to many

[Figures: one to one, one to many, many to one, and many to many mappings between sets A and B]
Note: Some elements in A and B may not be mapped to any elements in the other set
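The four types can be told apart by counting partners on each side. Below is a minimal sketch; the function name and the sample pairs are hypothetical. It reports the tightest type consistent with one instance, whereas the declared cardinality is a constraint on all instances.

```python
from collections import defaultdict

def mapping_cardinality(pairs):
    """Classify a binary relationship set, given as (a, b) pairs from A to B."""
    b_per_a = defaultdict(set)  # partners in B for each entity of A
    a_per_b = defaultdict(set)  # partners in A for each entity of B
    for a, b in pairs:
        b_per_a[a].add(b)
        a_per_b[b].add(a)
    a_has_many = any(len(bs) > 1 for bs in b_per_a.values())
    b_has_many = any(len(as_) > 1 for as_ in a_per_b.values())
    if a_has_many and b_has_many:
        return "many to many"
    if a_has_many:
        return "one to many"   # one A entity maps to many B entities
    if b_has_many:
        return "many to one"   # many A entities map to one B entity
    return "one to one"

sells = [("Joe's Bar", "Bud"), ("Joe's Bar", "Miller"), ("Sue's Bar", "Bud")]
favorite = [("Ann", "Bud"), ("Bob", "Bud")]
```

Entities that appear in no pair are simply absent from the counts, which matches the note that some elements of A and B may be unmapped.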
Entity = “thing” or object.

Entity set = collection of similar entities.
▪ Like a class in object-oriented languages.

Attribute = property of (the entities of) an entity set.
▪ Attributes are simple values, e.g., integers or character strings, not structs, sets, etc.
In an entity-relationship diagram:
▪ Entity set = rectangle.
▪ Attribute = oval, with a line to the rectangle
representing its entity set.
▪ Entity set Beers has two attributes, name and manf (manufacturer).
▪ Each Beers entity has values for these two attributes, e.g. (Bud, Anheuser-Busch)
[Diagram: rectangle Beers with ovals name and manf]
A relationship connects two or more entity sets.
It is represented by a diamond, with lines to each of the entity sets involved.

[Diagram: entity sets Bars (name, addr, license), Beers (name, manf), and Drinkers (name, addr); relationship Sells between Bars and Beers, Frequents between Drinkers and Bars, and Likes between Drinkers and Beers]
▪ Bars sell some beers.
▪ Drinkers frequent some bars.
▪ Drinkers like some beers.
Note: license = beer, full, none
▪ The current “value” of an entity set is the set of entities
that belong to it.
▪ Example: the set of all bars in our database.

▪ The “value” of a relationship is a relationship set, a set of tuples with one component for each related entity set.

For the relationship Sells, we might have a relationship set like:

Bar          Beer
Joe’s Bar    Bud
Joe’s Bar    Miller
Sue’s Bar    Bud
Sue’s Bar    Pete’s Ale
Sue’s Bar    Bud Lite
▪ Sometimes, we need a relationship that connects more
than two entity sets.
▪ Suppose that drinkers will only drink certain beers at
certain bars.
▪ Our three binary relationships Likes, Sells, and Frequents
do not allow us to make this distinction.
▪ But a 3-way relationship would.
[Diagram: 3-way relationship Preferences connecting Bars (name, addr, license), Beers (name, manf), and Drinkers (name, addr)]
Bar Drinker Beer
Joe’s Bar Ann Miller
Sue’s Bar Ann Bud
Sue’s Bar Ann Pete’s Ale
Joe’s Bar Bob Bud
Joe’s Bar Bob Miller
Joe’s Bar Cal Miller
Sue’s Bar Cal Bud Lite
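To see why three binary relationships cannot stand in for the 3-way one, the sketch below takes the Preferences tuples above, derives the binary sets from them (an assumption made only for this illustration), and recombines them. The recombination contains triples that were never in Preferences.

```python
# The (Bar, Drinker, Beer) tuples of Preferences from the table above
preferences = {
    ("Joe's Bar", "Ann", "Miller"),
    ("Sue's Bar", "Ann", "Bud"),
    ("Sue's Bar", "Ann", "Pete's Ale"),
    ("Joe's Bar", "Bob", "Bud"),
    ("Joe's Bar", "Bob", "Miller"),
    ("Joe's Bar", "Cal", "Miller"),
    ("Sue's Bar", "Cal", "Bud Lite"),
}

# Binary relationship sets implied by those tuples (assumed for illustration)
frequents = {(drinker, bar) for bar, drinker, beer in preferences}
likes = {(drinker, beer) for bar, drinker, beer in preferences}
sells = {(bar, beer) for bar, drinker, beer in preferences}

# Recombine: the drinker frequents the bar, likes the beer, and the bar sells it
rejoined = {
    (bar, drinker, beer)
    for drinker, bar in frequents
    for liker, beer in likes
    if liker == drinker and (bar, beer) in sells
}

# Triples the binary relationships cannot rule out
spurious = rejoined - preferences
```

For example, Ann frequents Joe's Bar, likes Bud, and Joe's Bar sells Bud, yet Ann only drinks Bud at Sue's Bar; that triple lands in `spurious`.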
Focus: binary relationships, such as Sells between Bars and Beers.

In a many-many relationship, an entity of either set can be connected to many entities of the other set.

E.g., a bar sells many beers; a beer is sold by many bars.
[Figure: many-many]
Some binary relationships are many-one from one entity set to another.

Each entity of the first set is connected to at most one entity of the second set.

But an entity of the second set can be connected to zero, one, or many entities of the first set.
[Figure: many-one]
▪ Favorite, from Drinkers to Beers is many-one.
▪ A drinker has at most one favorite beer.
▪ But a beer can be the favorite of any number of drinkers, including zero.
▪ In a one-one relationship, each entity of either
entity set is related to at most one entity of the
other set.
▪ Example: Relationship Best-seller between
entity sets Manfs (manufacturer) and Beers.
▪ A beer cannot be made by more than one
manufacturer, and no manufacturer can have
more than one best-seller (assume no ties).
[Figure: one-one]
REPRESENTING “MULTIPLICITY”
▪ Show a many-one relationship by an arrow entering the “one” side.
▪ Remember: Like a functional dependency.
▪ Show a one-one relationship by arrows entering both entity sets.
▪ Rounded arrow = “exactly one,” i.e., each entity of the first set is related to exactly one entity of the target set.
[Diagram: Drinkers and Beers connected by two relationships, Likes and Favorite]

Notice: two relationships connect the same entity sets, but are different.
1. Consider Best-seller between Manfs and Beers.
2. Some beers are not the best-seller of any manufacturer, so a rounded arrow to Manfs would be inappropriate.
3. But a beer manufacturer has to have a best-seller.

[Diagram: Manfs and Beers connected by Best-seller]
▪ A beer is the best-seller for 0 or 1 manufacturer.
▪ A manufacturer has exactly one best seller.
SOMETIMES IT IS USEFUL TO ATTACH AN ATTRIBUTE TO A RELATIONSHIP.
THINK OF THIS ATTRIBUTE AS A PROPERTY OF TUPLES IN THE RELATIONSHIP SET.

EXAMPLE: ATTRIBUTE ON RELATIONSHIP
[Diagram: Bars and Beers connected by Sells; attribute price attached to Sells]
Price is a function of both the bar and the beer, not of one alone.
▪ Create an entity set representing values of the attribute.
▪ Make that entity set participate in the relationship.

[Diagram: Bars, Beers, and Prices (price) all connected to Sells, with an arrow into Prices]

Note convention: arrow from multiway relationship = “all other entity sets together determine a unique one of these.”
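A small sketch of the same idea in code: the price belongs to the (bar, beer) pair, so the pair is the lookup key. The prices below are made up, and `price` is a hypothetical helper.

```python
# price is a property of a Sells tuple, so it is keyed by (bar, beer) together
sells_price = {
    ("Joe's Bar", "Bud"): 2.50,
    ("Joe's Bar", "Miller"): 2.75,
    ("Sue's Bar", "Bud"): 3.00,
}

def price(bar, beer):
    """Return the price of beer at bar, or None if the bar does not sell it."""
    return sells_price.get((bar, beer))
```

The same beer can cost different amounts at different bars, and a bar charges different amounts for different beers, so neither Bars nor Beers alone could hold the attribute.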
▪ Sometimes an entity set appears more
than once in a relationship.
▪ Label the edges between the
relationship and the entity set with
names called roles.
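[Diagram: relationship Married connected to Drinkers by two edges, labeled with the roles husband and wife]

Relationship Set
Husband   Wife
Bob       Ann
Joe       Sue
…         …

[Diagram: relationship Buddies connected to Drinkers by two edges, labeled with the roles 1 and 2]

Relationship Set
Buddy1   Buddy2
Bob      Ann
Joe      Sue
Ann      Bob
Joe      Moe
…        …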
Subclass = special case = fewer entities = more properties.

Example: Ales are a kind of beer.
▪ Not every beer is an ale, but some are.
▪ Let us suppose that in addition to all the properties (attributes and relationships) of beers, ales also have the attribute color.
SUBCLASSES IN E/R
DIAGRAMS
▪ Assume subclasses form a tree.
▪ i.e., no multiple inheritance.
▪ Isa triangles indicate the subclass relationship.
▪ Point to the superclass.

EXAMPLE: SUBCLASSES
[Diagram: Beers (name, manf) with subclass Ales (color), connected by an isa triangle]
Only a subset of entities within a type have certain attributes or participate in certain relationships
[Diagram: STUDENT (ID) with subtype GRAD_STUDENT (Degrees)]

(o)verlap: may be more than one
(d)isjoint: entities may only be one subtype

A person can be an employee, an alumnus, and/or a student
[Diagram: PERSON (SSN) with overlapping (o) subtypes EMPLOYEE, ALUMNUS, STUDENT]

A person can be either an employee, an alumnus, or a student
[Diagram: PERSON (SSN) with disjoint (d) subtypes EMPLOYEE, ALUMNUS, STUDENT]

Similar to relationships; can be total (must belong to subtypes) or partial (can belong)

A person must be exactly one: an employee, an alumnus, or a student
[Diagram: PERSON (SSN) with total, disjoint (d) subtypes EMPLOYEE, ALUMNUS, STUDENT]
In OO, objects are in one class only.
• Subclasses inherit from superclasses.

In contrast, E/R entities have representatives in all subclasses to which they belong.
• Rule: if entity e is represented in a subclass, then e is represented in the superclass (and recursively up the tree).

[Diagram: Beers (name, manf) with subclass Ales (color) via isa; the entity Pete’s Ale is represented in both Beers and Ales]
Underline the key attribute(s).

In an isa hierarchy, only the root entity set has a key, and it
must serve as the key for all entities in the hierarchy.
[Diagram: Beers (name, manf) with subclass Ales (color) via isa; the key attribute of Beers is underlined]
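One way to picture “representatives in all subclasses” together with a single root key is the sqlite3 sketch below. It assumes name is the key of Beers, and the manufacturer and color values for Pete's Ale are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Root entity set: holds the key (assumed to be name) for the whole hierarchy
    CREATE TABLE Beers (name TEXT PRIMARY KEY, manf TEXT);
    -- Subclass: reuses the root key and adds only its own attribute
    CREATE TABLE Ales (
        name  TEXT PRIMARY KEY REFERENCES Beers(name),
        color TEXT
    );
    INSERT INTO Beers VALUES ('Bud', 'Anheuser-Busch'), ('Pete''s Ale', 'Pete''s');
    INSERT INTO Ales  VALUES ('Pete''s Ale', 'dark');
""")

# An ale is represented in both Ales and Beers; a join collects all its properties
ale = conn.execute("""
    SELECT b.name, b.manf, a.color
    FROM Beers b JOIN Ales a ON a.name = b.name
""").fetchall()

# Every beer is in Beers; only ales have a color
all_beers = conn.execute("""
    SELECT name, color FROM Beers LEFT JOIN Ales USING (name) ORDER BY name
""").fetchall()
```

Pete's Ale appears once in each table under the same key value, while Bud appears only in Beers.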
[Diagram: Courses with attributes dept, number, hours, room]

• Note that hours and room could also serve as a key, but we must select only one key.
Occasionally, entities of an entity set need “help” to identify them uniquely.

Entity set E is said to be weak if in order to identify entities of E uniquely, we need to follow one or more many-one relationships from E and include the key of the related entities from the connected entity sets.

1. name is almost a key for football players, but there might be two with the same name.
2. number is certainly not a key, since players on two teams could have the same number.
3. But number, together with the team name related to the player by Plays-on should be unique.
IN E/R DIAGRAMS
[Diagram: weak entity set Players (name, number) connected by supporting relationship Plays-on, with a rounded arrow, to Teams (name)]

Note: must be rounded because each player needs a team to help with the key.

• Double diamond for supporting many-one relationship.
• Double rectangle for the weak entity set.
▪ A weak entity set has one or more many-one relationships to
other (supporting) entity sets.
▪ Not every many-one relationship from a weak entity set
needs to be supporting.
▪ But supporting relationships must have a rounded arrow
(entity at the “one” end is guaranteed).
▪ The key for a weak entity set is its own underlined attributes
and the keys for the supporting entity sets.
▪ E.g., (player) number and (team) name is a key for Players
in the previous example.
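The key rule for weak entity sets can be sketched with sqlite3: Players borrows the key of Teams through a mandatory reference, and the pair (number, team name) is the key. Team and player names below are made up, and `can_insert` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces references only when enabled
conn.executescript("""
    CREATE TABLE Teams (name TEXT PRIMARY KEY);
    CREATE TABLE Players (
        number    INTEGER NOT NULL,
        team_name TEXT    NOT NULL REFERENCES Teams(name),  -- "exactly one" team
        name      TEXT,
        PRIMARY KEY (number, team_name)  -- own attribute + supporting key
    );
    INSERT INTO Teams VALUES ('Lions'), ('Tigers');
    INSERT INTO Players VALUES (10, 'Lions', 'Sam');
""")

def can_insert(number, team_name, name):
    """Return True if the player row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Players VALUES (?, ?, ?)", (number, team_name, name))
        return True
    except sqlite3.IntegrityError:
        return False
```

The same number (and even the same player name) is fine on another team, a repeated number within one team is rejected, and a player without an existing team cannot be stored at all.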
1. Avoid redundancy.
2. Limit the use of weak entity sets.
3. Don’t use an entity set when an attribute will do.
▪ Redundancy = saying the same thing in two (or more)
different ways.
▪ Wastes space and (more importantly) encourages
inconsistency.
▪ Two representations of the same fact become inconsistent
if we change one and forget to change the other.
▪ Recall anomalies due to FD’s.
EXAMPLE
[Diagram: Beers (name) connected by ManfBy to Manfs (name, addr)]
This design gives the address of each manufacturer exactly once.

EXAMPLE
[Diagram: Beers (name, manf) connected by ManfBy to Manfs (name, addr)]
This design states the manufacturer of a beer twice: as an attribute and as a related entity.

EXAMPLE
[Diagram: Beers (name, manf, manfAddr)]
This design repeats the manufacturer’s address once for each beer and
loses the address if there are temporarily no beers for a manufacturer.
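The inconsistency risk of the single-entity-set design can be shown in a few lines. This is a sketch with made-up addresses; `violates_fd` is a hypothetical helper that checks whether manf still determines manfAddr.

```python
def violates_fd(rows, lhs, rhs):
    """True if two rows agree on attribute lhs but differ on attribute rhs."""
    seen = {}
    for row in rows:
        if seen.setdefault(row[lhs], row[rhs]) != row[rhs]:
            return True
    return False

# Beers (name, manf, manfAddr): the address is stored once per beer
beers = [
    {"name": "Bud", "manf": "Anheuser-Busch", "manfAddr": "1 Brewery Rd"},
    {"name": "Bud Lite", "manf": "Anheuser-Busch", "manfAddr": "1 Brewery Rd"},
]

consistent_before = not violates_fd(beers, "manf", "manfAddr")

# Change one copy of the address and forget the other
beers[0]["manfAddr"] = "2 New Plant Ave"
consistent_after = not violates_fd(beers, "manf", "manfAddr")
```

With the address stored once per manufacturer, as in the first design, there is no second copy to forget.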
ENTITY SETS VERSUS
ATTRIBUTES
An entity set should satisfy at least one of the
following conditions:
▪ It is more than the name of
something; it has at least one non-
key attribute.
or
▪ It is the “many” in a many-one or
many-many relationship.
EXAMPLE: GOOD
[Diagram: Beers (name) connected by ManfBy to Manfs (name, addr)]
• Manfs deserves to be an entity set because of the non-key attribute addr.
• Beers deserves to be an entity set because it is the “many” of the many-one relationship ManfBy.

[Diagram: Beers (name, manf)]
There is no need to make the manufacturer an entity set, because we record nothing about manufacturers besides their name.

[Diagram: Beers (name) connected by ManfBy to Manfs (name)]
Since the manufacturer is nothing but a name, and is not at the “many” end of any relationship, it should not be an entity set.
Beginning database designers often doubt that anything could be a key by itself.
▪ They make all entity sets weak, supported by all other entity sets to which they are linked.

In reality, we usually create unique IDs for entity sets.
▪ Examples include social-security numbers, automobile licenses, etc.
WHEN DO WE NEED
WEAK ENTITY SETS?
▪ The usual reason is that there is no global
authority capable of creating unique ID’s.
▪ Example: it is unlikely that there could be
an agreement to assign unique player
numbers across all football teams in the
world.
Draw an ERD for the following description:
You are in charge of managing the program committee for an
important conference and journal. The database stores
information about papers submitted to the conference and
journal (Papers). Each paper has a unique ID (PID), a title,
and a type of source. Each reviewer on the program committee
(Reviewers) has a unique ID (rid), a full name consisting of
first name, middle name, and last name, an affiliation, and
several emails. Each reviewer on the program committee has to
check a set of papers (zero or more). Each paper is assigned
to zero or more reviewers; the amount attribute records the
number of papers that have been checked.
Design a database representing cities, counties, and states
▪ For states, record name and capital (city)
▪ For counties, record name, area, and location (state)
▪ For cities, record name, population, and location (county and state)

Assume the following:
▪ Names of states are unique
▪ Names of counties are only unique within a state
▪ Names of cities are only unique within a county
▪ A city is always located in a single county
▪ A county is always located in a single state
[Diagram: Cities (name, population, county_name, county_area) connected by In to States (name, capital)]

▪ County area information is repeated for every city in the county
▪ Redundancy is bad (why?)
▪ State capital should really be a city
▪ Should “reference” entities through explicit relationships
[Diagram: Cities (name, population) connected by In to Counties (name, area); Counties connected by In to States (name); Cities and States also connected by IsCapitalOf]

Technically, nothing in this design prevents a city in state 𝑋 from being the capital of another state 𝑌 …
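A rough relational rendering of the second design, using sqlite3, makes both the chained keys and the remaining loophole concrete. The table layout is only one possible translation, and all names and numbers in the sample data are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE States (name TEXT PRIMARY KEY);
    -- County names are unique only within a state
    CREATE TABLE Counties (
        name  TEXT NOT NULL,
        state TEXT NOT NULL REFERENCES States(name),
        area  REAL,
        PRIMARY KEY (name, state)
    );
    -- City names are unique only within a county
    CREATE TABLE Cities (
        name       TEXT NOT NULL,
        county     TEXT NOT NULL,
        state      TEXT NOT NULL,
        population INTEGER,
        PRIMARY KEY (name, county, state),
        FOREIGN KEY (county, state) REFERENCES Counties(name, state)
    );
    -- Each state has at most one capital, which must be an existing city
    CREATE TABLE IsCapitalOf (
        state      TEXT PRIMARY KEY REFERENCES States(name),
        city       TEXT NOT NULL,
        county     TEXT NOT NULL,
        city_state TEXT NOT NULL,
        FOREIGN KEY (city, county, city_state)
            REFERENCES Cities(name, county, state)
    );
    INSERT INTO States   VALUES ('X'), ('Y');
    INSERT INTO Counties VALUES ('C1', 'X', 100.0);
    INSERT INTO Cities   VALUES ('Springfield', 'C1', 'X', 50000);
""")

# Accepted, although Springfield lies in state X and is made capital of state Y
conn.execute("INSERT INTO IsCapitalOf VALUES ('Y', 'Springfield', 'C1', 'X')")
mismatched = conn.execute(
    "SELECT state, city, city_state FROM IsCapitalOf WHERE state <> city_state"
).fetchall()
```

Every reference is valid, so the database accepts the row; ruling it out would need an additional constraint that this design does not express.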
Design a database consistent with the following:
▪ A station has a unique name and an address, and is either an
express station or a local station
▪ A train has a unique number and an engineer, and is either an
express train or a local train
▪ A local train can stop at any station
▪ An express train only stops at express stations
▪ A train can stop at a station for any number of times during a day
▪ Train schedules are the same every day
[Diagram: Trains (number, engineer, E/L?) connected by StopsAt (time) to Stations (name, address, E/L?)]

▪ Nothing in this design prevents express trains from stopping at local stations
▪ => We should capture as many constraints as possible
▪ A train can stop at a station only once during a day
▪ => We should not introduce unintended constraints
[Diagram: Trains (number, engineer) with ISA subclasses LocalTrains and ExpressTrains; Stations (name, address) with ISA subclasses LocalStations and ExpressStations; LocalTrainStops (time) and ExpressTrainStops (time) record the stops]

No double-diamonds here because train number + time uniquely determine a stop

Is the extra complexity worth it?


Draw an ERD for the following description:

Each department has a unique name, a unique number, and a particular employee who manages the department. We keep track of the start date when that employee began managing the department. A department may have several locations.

Or

An entity type DEPARTMENT with attributes Name, Number, Locations, Manager, and Manager_start_date. Locations is the only multivalued attribute. We can specify that both Name and Number are (separate) key attributes because each was specified to be unique.
Draw an ERD for the following description:

A department controls a number of projects, each of which has a unique name, a unique number, and a single location.

Or

An entity type PROJECT with attributes Name, Number, Location, and Controlling_department. Both Name and Number are (separate) key attributes.
Draw an ERD for the following description:

We store each employee’s name (first, last, MI), Social Security number (SSN), street address, salary, sex (gender), and birth date. An employee is assigned to one department, but may work on several projects, which are not necessarily controlled by the same department. We keep track of the current number of hours per week that an employee works on each project. We also keep track of the direct supervisor of each employee (who is another employee).

Or

An entity type EMPLOYEE with attributes Name, Ssn, Sex, Address, Salary, Birth_date, Department, and Supervisor. Both Name and Address may be composite attributes; however, this was not specified in the requirements. We must go back to the users to see if any of them will refer to the individual components of Name (First_name, Middle_initial, Last_name) or of Address. In this example Name is modeled as a composite attribute, whereas Address is not, presumably after consultation with the users.
Draw an ERD for the following description:

We want to keep track of the dependents of each employee for insurance purposes. We keep each dependent’s first name, sex, birth date, and relationship to the employee.

EXERCISE
Or

An entity type DEPENDENT with attributes Employee, Dependent_name, Sex, Birth_date, and Relationship (to the employee).
[Diagram: EMPLOYEE with attributes SSN, Salary, Sex, Department, Birthdate, Supervisor, Address, and composite Name (FName, MI, LName); relationship Works_On (Hours) connects EMPLOYEE to Project]

[Diagram: the same EMPLOYEE, with Supervisor replaced by a relationship Supervision from EMPLOYEE to itself, roles Supervisor (1) and Supervisee (N)]
REVISE!
[Diagram: DEPENDENT with attributes Employee, DName, Sex, DBirthdate, Relationship]

[Diagram: EMPLOYEE as above (SSN, Salary, Sex, Department, Birthdate, Address, Name (FName, MI, LName), Supervision, Works_On (Hours) to Project), now connected by DEPENDENT_OF (1 on the EMPLOYEE side, N on the DEPENDENT side) to DEPENDENT (DName, Sex, DBirthdate, Relationship)]
▪ A department controls a number of projects, each of which has a unique name, a unique number, and a single location.

[Diagram: DEPARTMENT with attributes Name, Number, Location, Manager, Manager_Start_Date; PROJECT with attributes Name, Number, Location, Controlling_Department]

[Diagram: Controlling_Department replaced by a relationship CONTROLS between DEPARTMENT (1) and PROJECT (N); PROJECT keeps Name, Number, Location; DEPARTMENT keeps Name, Number, Location, Manager, Manager_Start_Date]

Each department has a particular employee who manages the department.
An employee is assigned to one department, but may work on several projects, which are not necessarily controlled by the same department. We keep track of the current number of hours per week that an employee works on each project.

[Diagram: EMPLOYEE (SSN, Salary, Sex, Department, Birthdate, Address, Name (FName, MI, LName)) with Supervision (Supervisor 1, Supervisee N), Works_On (Hours) to Project, and DEPENDENT_OF (1 to N) to DEPENDENT (DName, Sex, DBirthdate, Relationship)]
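ALL TOGETHER NOW!
[Diagram: EMPLOYEE (SSN, Name), PROJECT (Name, Number, Location), and DEPARTMENT (Name, Number, Location), connected by the relationships Supervision (EMPLOYEE to itself; Supervisor 1, Supervisee N), WORKS_ON (EMPLOYEE M, PROJECT N; attribute Hours), Manages (EMPLOYEE 1, DEPARTMENT 1; attribute Start_Date), Works_For (EMPLOYEE N, DEPARTMENT 1), and CONTROLS (DEPARTMENT 1, PROJECT N)]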
