CS 405G: Introduction To Database Systems: Instructor: Jinze Liu
Database Systems
Instructor: Jinze Liu
Fall 2009
Review
A data model is a group of concepts for describing data.
05/17/21 2
Topics Next
More case study
Conversion of ER models to Schemas
ER Case study
Design a database representing cities, counties, and states
For states, record name and capital (city)
For counties, record name, area, and location (state)
For cities, record name, population, and location (county and
state)
Assume the following:
Names of states are unique
Names of counties are only unique within a state
Names of cities are only unique within a county
A city is always located in a single county
A county is always located in a single state
Case study : first design
County area information is repeated for every city in the
county
Redundancy is bad.
What else?
State capital should really be a city
Should “reference” entities through explicit relationships
[ER diagram: entity set Cities (name, population, county_name, county_area) connected by relationship In to entity set States (name, capital)]
Case study : second design
Technically, nothing in this design could prevent a city in
state X from being the capital of another state Y, but oh
well…
[ER diagram: Cities (name, population) In Counties (name, area) In States (name); Cities IsCapitalOf States]
Database Design
A Relation is a Table
Beers
name        manf
Winterbrew  Pete’s
Bud Lite    Anheuser-Busch
Attributes are the column headers (name, manf); tuples are the rows.
Schemas
Relation schema = relation name + attributes, in order
(+ types of attributes).
Example: Beers(name, manf) or Beers(name: string, manf:
string)
Why Relations?
Very simple model.
Often matches how we think about data.
Abstract model that underlies SQL, the most important
database language today.
But SQL uses bags, while the relational model is a set-based
model.
From E/R Diagrams to Relations
Entity sets become relations with the same set of
attributes.
Relationships become relations whose attributes are only:
The keys of the connected entity sets.
Attributes of the relationship itself.
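The two rules above can be sketched in a few lines of Python. This is only an illustration: the helper names `entity_to_relation`, `relationship_to_relation`, and `show` are made up for this example, and the attribute names chosen for Likes (`drinker`, `beer`) are an assumption.

```python
def entity_to_relation(name, attributes):
    """An entity set becomes a relation with the same set of attributes."""
    return (name, tuple(attributes))


def relationship_to_relation(name, connected_keys, own_attributes=()):
    """A relationship becomes a relation holding only the keys of the
    connected entity sets plus the relationship's own attributes.

    connected_keys are assumed to be already renamed so they do not clash.
    """
    return (name, tuple(connected_keys) + tuple(own_attributes))


def show(relation):
    """Format a (name, attributes) pair as a relation schema string."""
    name, attrs = relation
    return f"{name}({', '.join(attrs)})"


beers = entity_to_relation("Beers", ["name", "manf"])
drinkers = entity_to_relation("Drinkers", ["name", "addr"])
# Likes connects Drinkers and Beers. Both keys are called "name",
# so they are renamed to "drinker" and "beer" here (assumed names).
likes = relationship_to_relation("Likes", ["drinker", "beer"])

print(show(beers))
print(show(drinkers))
print(show(likes))
```

Note that Likes carries no `addr` or `manf`: non-key attributes stay in the relations for their own entity sets.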
Entity Set -> Relation
[ER diagram: entity set Beers with attributes name and manf]
Relation: Beers(name, manf)
Relationship -> Relation
Combining Relations
It is OK to combine the relation for an entity-set E with
the relation R for a many-one relationship from E to
another entity set.
Example: Drinkers(name, addr) and Favorite(drinker,
beer) combine to make Drinker1(name, addr, favBeer).
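A small sketch of this combination, using plain Python tuples. Only the schemas come from the slide; the sample rows are hypothetical, and `None` is used here to stand in for a drinker with no favorite beer.

```python
# Hypothetical sample rows; only the schemas come from the slide.
drinkers = [("Sally", "123 Maple"), ("Joe", "45 Oak")]   # Drinkers(name, addr)
favorite = [("Sally", "Bud Lite")]                        # Favorite(drinker, beer)

# Favorite is many-one from Drinkers to Beers, so each drinker has
# at most one favorite and a dictionary lookup is enough.
fav_by_drinker = dict(favorite)

# Drinker1(name, addr, favBeer): one tuple per drinker, nothing repeated.
drinker1 = [(name, addr, fav_by_drinker.get(name)) for name, addr in drinkers]

for row in drinker1:
    print(row)
```

Because the relationship is many-one, Drinker1 has exactly as many tuples as Drinkers, which is why the combination is safe.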
Risk with Many-Many Relationships
Combining Drinkers with Likes would be a mistake. It
leads to redundancy: a drinker's address is repeated in
every tuple that records a beer the drinker likes.
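To see the problem concretely, here is a minimal sketch with hypothetical rows. The schema Likes(drinker, beer) is assumed; the variable names are made up for this example.

```python
from collections import Counter

# Hypothetical sample rows.
drinkers = [("Sally", "123 Maple")]                        # Drinkers(name, addr)
likes = [("Sally", "Bud Lite"), ("Sally", "Winterbrew")]   # Likes(drinker, beer)

addr_of = dict(drinkers)

# Folding Likes into Drinkers yields one tuple per (drinker, beer) pair,
# and every such tuple carries its own copy of the address.
combined = [(drinker, addr_of[drinker], beer) for drinker, beer in likes]

# Count how many times each (drinker, address) fact is stored.
addr_copies = Counter((drinker, addr) for drinker, addr, _ in combined)

for row in combined:
    print(row)
print(addr_copies)
```

Sally's address is stored once per beer she likes, so changing it means updating several tuples consistently.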
Handling Weak Entity Sets
Relation for a weak entity set must include attributes for
its complete key (including those belonging to other entity
sets), as well as its own, nonkey attributes.
A supporting (double-diamond) relationship is redundant
and yields no relation.
Example
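[ER diagram: weak entity set Logins (name, time) with supporting relationship At to entity set Hosts (name)]
Hosts(hostName)
Logins(loginName, hostName, time)
At(loginName, hostName, hostName2): hostName and hostName2 must be equal, so At is redundant and becomes part of Logins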
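The Hosts and Logins relations can be tried out with Python's built-in `sqlite3` module. This is a sketch under assumptions: the column types, the sample rows, and the use of a foreign key to tie Logins to Hosts are choices made for this example, not part of the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE Hosts (
    hostName TEXT NOT NULL PRIMARY KEY
);
-- Weak entity set: the key of Logins includes the key of Hosts.
CREATE TABLE Logins (
    loginName TEXT NOT NULL,
    hostName  TEXT NOT NULL REFERENCES Hosts(hostName),
    time      TEXT,
    PRIMARY KEY (loginName, hostName)
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO Hosts VALUES ('hostA')")
conn.execute("INSERT INTO Hosts VALUES ('hostB')")
conn.execute("INSERT INTO Logins VALUES ('alice', 'hostA', '09:00')")
# Same login name on a different host is a different login.
conn.execute("INSERT INTO Logins VALUES ('alice', 'hostB', '09:05')")

try:
    # Same login name on the same host violates the complete key.
    conn.execute("INSERT INTO Logins VALUES ('alice', 'hostA', '10:00')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

login_count = conn.execute("SELECT COUNT(*) FROM Logins").fetchone()[0]
print(duplicate_rejected, login_count)
```

No separate At table is created: the hostName column of Logins already records which host each login is at.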
A (Slightly) Formal Definition
A database is a collection of relations (or tables)
Each relation is identified by a name and a list of
attributes (or columns)
Each attribute has a name and a domain (or type)
Set-valued attributes not allowed
Schema versus instance
Schema (metadata)
Specification of how data is to be structured logically
Defined at set-up
Rarely changes
Instance
Content
Changes rapidly, but always conforms to the schema
Compare to type and objects of type in a programming
language
Example
Schema
Student (SID integer, name string, age integer, GPA float)
Course (CID string, title string)
Enroll (SID integer, CID string)
Instance
Student: { ⟨142, Bart, 10, 2.3⟩, ⟨123, Milhouse, 10, 3.1⟩, ... }
Course: { ⟨CPS116, Intro. to Database Systems⟩, ... }
Enroll: { ⟨142, CPS116⟩, ⟨142, CPS114⟩, ... }
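The comparison between schema and instance on one hand, and a type and its objects on the other, can be made concrete with the Student relation. The following is a loose analogy sketched with a Python dataclass; the third student row is hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Student:
    """Plays the role of the schema: fixed names and types, rarely changed."""
    sid: int
    name: str
    age: int
    gpa: float


# The instance is the current content: a set of tuples of that type.
students = {
    Student(142, "Bart", 10, 2.3),
    Student(123, "Milhouse", 10, 3.1),
}

# The instance changes over time (hypothetical new row),
# but every tuple still conforms to the schema.
students.add(Student(456, "Ralph", 8, 2.0))

schema_attributes = [f.name for f in fields(Student)]
print(schema_attributes)
print(len(students))
```

Using a set mirrors the relational model: the same tuple cannot appear twice, and there is no ordering among tuples.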
Relational Integrity Constraints
Constraints are conditions that must hold on all valid
relation instances. There are four main types of
constraints:
1. Domain constraints: the value of an attribute must come from its domain
2. Key constraints
3. Entity integrity constraints
4. Referential integrity constraints
Primary Key Constraints
A set of fields is a candidate key for a relation if:
1. No two distinct tuples can have the same values in all key fields,
and
2. This is not true for any proper subset of the key.
A set that satisfies part 1 but not necessarily part 2 is a superkey.
If a relation has more than one key, one of the keys is chosen (by
the DBA) to be the primary key.
E.g., given a schema Students(sid: string, name: string,
gpa: float) we have:
sid is a key for Students. (What about name?) The set {sid,
gpa} is a superkey.
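The two conditions can be checked against a sample instance with a short Python sketch. One caveat: a key is a constraint on all valid instances, so a check like this can only show that a set of fields is not a key; it can never prove that it is one. The function names are made up for this example, and the rows are assumed to contain no duplicate tuples.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """Condition 1: no two distinct tuples agree on all of attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


def is_candidate_key(rows, attrs):
    """Conditions 1 and 2: a superkey none of whose non-empty
    proper subsets is itself a superkey."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )


# Sample instance of Students(sid, name, gpa).
students = [
    {"sid": "53666", "name": "Jones", "gpa": 3.4},
    {"sid": "53688", "name": "Smith", "gpa": 3.2},
    {"sid": "53650", "name": "Smith", "gpa": 3.8},
]

print(is_candidate_key(students, ("sid",)))        # sid alone
print(is_superkey(students, ("name",)))            # two students named Smith
print(is_superkey(students, ("sid", "gpa")))       # contains the key sid
print(is_candidate_key(students, ("sid", "gpa")))  # not minimal
```

On this instance, name fails condition 1 because two tuples share the name Smith, while {sid, gpa} passes condition 1 but fails condition 2.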
Entity Integrity
Entity integrity: the primary key attributes PK of each
relation schema R in a database schema S cannot have null
values in any tuple of r(R).
Other attributes of R may be similarly constrained to disallow
null values, even though they are not members of the primary
key.
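A minimal sketch of entity integrity using Python's built-in `sqlite3` module. The table layout and the choice to also forbid NULL in name are assumptions for illustration. NOT NULL is written explicitly on the key because SQLite, unlike most systems, does not imply it for a non-integer PRIMARY KEY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Students (
        sid  TEXT NOT NULL PRIMARY KEY,  -- entity integrity: key is never NULL
        name TEXT NOT NULL,              -- non-key attribute, also constrained
        gpa  REAL                        -- NULL allowed here
    )
""")


def try_insert(row):
    """Return True if the tuple is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Students VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


valid_row = try_insert(("53666", "Jones", 3.4))
null_key = try_insert((None, "Smith", 3.2))       # NULL primary key
null_name = try_insert(("53688", None, 3.2))      # NULL in a NOT NULL column
null_gpa = try_insert(("53650", "Smith", None))   # NULL where it is allowed

print(valid_row, null_key, null_name, null_gpa)
```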
Foreign Keys, Referential Integrity
Foreign key: a set of fields in one relation that is used to
“refer” to a tuple in another relation. It must correspond
to the primary key of the second relation, and acts like a
“logical pointer”.
E.g., sid in Enrolled is a foreign key referring to Students:
Students(sid: string, name: string, gpa: float)
Enrolled(sid: string, cid: string, grade: string)
If all foreign key constraints are enforced, referential
integrity is achieved, i.e., no dangling references.
Can you name a data model w/o referential integrity?
Links in HTML!
Enrolled
sid    cid          grade
53666  Carnatic101  C
53666  Reggae203    B
53650  Topology112  A
53666  History105   B

Students
sid    name   login       age  gpa
53666  Jones  jones@cs    18   3.4
53688  Smith  smith@eecs  18   3.2
53650  Smith  smith@math  19   3.8
Alternatively, use NULL as the value of the foreign key in the
referencing tuple when the referenced tuple does not exist.
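Referential integrity can be observed with Python's built-in `sqlite3` module. This is a sketch under assumptions: the column types are guesses, the sid value 99999 is a made-up student who does not exist, and SQLite only checks foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE Students (
    sid   TEXT NOT NULL PRIMARY KEY,
    name  TEXT,
    login TEXT,
    age   INTEGER,
    gpa   REAL
);
CREATE TABLE Enrolled (
    sid   TEXT REFERENCES Students(sid),  -- foreign key; NULL is permitted
    cid   TEXT,
    grade TEXT
);
INSERT INTO Students VALUES ('53666', 'Jones', 'jones@cs', 18, 3.4);
""")


def try_enroll(sid, cid, grade):
    """Return True if the Enrolled tuple is accepted, False if rejected."""
    try:
        conn.execute("INSERT INTO Enrolled VALUES (?, ?, ?)", (sid, cid, grade))
        return True
    except sqlite3.IntegrityError:
        return False


existing_student = try_enroll("53666", "Carnatic101", "C")
dangling_reference = try_enroll("99999", "Reggae203", "B")  # no such student
null_reference = try_enroll(None, "History105", "B")        # NULL foreign key

print(existing_student, dangling_reference, null_reference)
```

The dangling reference is rejected, while the NULL foreign key is accepted because it refers to no tuple at all.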
In-Class Exercise
(Taken from Exercise 5.16)
Consider the following relations for a database that keeps track of
student enrollment in courses and the books adopted for each
course:
STUDENT(SSN, Name, Major, Bdate)
COURSE(Course#, Cname, Dept)
ENROLL(SSN, Course#, Quarter, Grade)
BOOK_ADOPTION(Course#, Quarter, Book_ISBN)
TEXT(Book_ISBN, Book_Title, Publisher, Author)
Draw a relational schema diagram specifying the foreign keys
for this schema.
Jinze Liu @ University of Kentucky
Other Types of Constraints
Semantic Integrity Constraints:
based on application semantics and cannot be expressed by
the model per se
e.g., “the max. no. of hours per employee for all projects he
or she works on is 56 hrs per week”
A constraint specification language may have to be used to
express these
SQL-99 provides triggers and assertions to express
some of these
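As an illustration, the 56-hour rule can be approximated with a trigger in SQLite via Python's built-in `sqlite3` module. This is a sketch under assumptions: the WorksOn table, its columns, and the sample rows are hypothetical, and the trigger only guards INSERT (a complete solution would also cover changes to existing rows). SQLite does not support SQL assertions, so a trigger is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table: weekly hours an employee spends on each project.
CREATE TABLE WorksOn (
    emp     TEXT NOT NULL,
    project TEXT NOT NULL,
    hours   REAL NOT NULL,
    PRIMARY KEY (emp, project)
);
-- Reject an insert that would push the employee's total above 56 hours.
CREATE TRIGGER max_weekly_hours
BEFORE INSERT ON WorksOn
WHEN (SELECT COALESCE(SUM(hours), 0) FROM WorksOn WHERE emp = NEW.emp)
     + NEW.hours > 56
BEGIN
    SELECT RAISE(ABORT, 'employee would exceed 56 hours per week');
END;
""")


def try_assign(emp, project, hours):
    """Return True if the assignment is accepted, False if the trigger aborts it."""
    try:
        conn.execute("INSERT INTO WorksOn VALUES (?, ?, ?)", (emp, project, hours))
        return True
    except sqlite3.DatabaseError:
        return False


first = try_assign("e1", "p1", 40)   # total 40
second = try_assign("e1", "p2", 16)  # total 56, still allowed
third = try_assign("e1", "p3", 1)    # would be 57, rejected

total = conn.execute(
    "SELECT SUM(hours) FROM WorksOn WHERE emp = 'e1'"
).fetchone()[0]
print(first, second, third, total)
```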
Update Operations on Relations
Update operations
INSERT a tuple.
DELETE a tuple.
MODIFY a tuple.
Constraints should not be violated in updates
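The sketch below tries each kind of update against a small database using Python's built-in `sqlite3` module and records whether the system accepts it. The column types, the sample rows, and the student name Lee are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE Students (sid TEXT NOT NULL PRIMARY KEY, name TEXT, gpa REAL);
CREATE TABLE Enrolled (
    sid   TEXT NOT NULL REFERENCES Students(sid),
    cid   TEXT NOT NULL,
    grade TEXT,
    PRIMARY KEY (sid, cid)
);
INSERT INTO Students VALUES ('53666', 'Jones', 3.4);
INSERT INTO Students VALUES ('53688', 'Smith', 3.2);
INSERT INTO Enrolled VALUES ('53666', 'Carnatic101', 'C');
""")


def attempt(sql):
    """Run one update; return True if accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


results = {
    # INSERT: violates the key constraint (sid 53666 already exists).
    "insert duplicate key": attempt(
        "INSERT INTO Students VALUES ('53666', 'Lee', 3.0)"),
    # DELETE: would leave a dangling reference from Enrolled.
    "delete referenced tuple": attempt(
        "DELETE FROM Students WHERE sid = '53666'"),
    # MODIFY: changing a key to a value that is already taken.
    "modify key to existing value": attempt(
        "UPDATE Students SET sid = '53666' WHERE sid = '53688'"),
    # MODIFY: changing a non-key attribute violates nothing.
    "modify non-key attribute": attempt(
        "UPDATE Students SET gpa = 3.5 WHERE sid = '53688'"),
}

for name, accepted in results.items():
    print(name, "->", "accepted" if accepted else "rejected")
```

Here the violating updates are simply rejected, which is SQLite's default behavior.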