Lecture 02 Relational Model


CS3200 – Database Design · Spring 2018 · Derbinsky

The Relational Data Model


(ALL the Vocabulary)

Lecture 2

January 9, 2018

A Quick Reminder
• One of the key features of a DBMS is the use of data models to support “data independence”
– The conceptual representation is independent of the underlying storage and/or operation implementation

[Figure: ER Diagrams; Relations]


Outline
1. Model Concepts
2. Model Constraints
3. Data Modification and Constraint Violation
4. Transactions


The Relational Model


Codd, Edgar F. "A relational model of data for large shared data banks." Communications of the ACM 13.6 (1970): 377-387.

“Future users of large data banks must be protected from having to know how the data is organized in the machine (the internal representation)… Activities of users at terminals and most application programs should remain unaffected when the internal representation of data is changed and even when some aspects of the external representation are changed…”


Motivation
• A formal mathematical basis for databases
– Set theory and first-order predicate logic
– Allows scientists to advance theoretically

• A foundation for efficient and usable database management systems
– Allows companies/developers to advance end-user products

• Note: some aspects of the model are not adhered to by modern RDBMSs


Relational Database
A database consists of…
i. a set of relations (tables)
ii. a set of integrity constraints

Pop Quiz: What is a set?

A database is in a valid state if it satisfies all integrity constraints (else invalid state)


A Relation
A relation consists of…
i. its schema, describing structure
ii. its state, or current populated data

STUDENT
Schema:
Name            SSN          Phone     Address      Age  GPA
State:
Ben Bayer       305-61-2435  555-1234  1 Foo Lane   19   3.21
Chung-cha Kim   422-11-2320  555-9876  2 Bar Court  25   3.53
Barbara Benson  533-69-1238  555-6758  3 Baz Blvd   19   3.25


Relational Schema
• Relation name
STUDENT

• Ordered list of n attributes (columns; degree n or n-ary)
Each with a corresponding domain (set of valid atomic values)
– dom(SSN) = “###-##-####”
– dom(GPA) = [0, 4]

• Notation: NAME(A1, A2, … An)
STUDENT(Name, SSN, Phone, Address, Age, GPA)

What is the degree of STUDENT?

STUDENT
Name  SSN  Phone  Address  Age  GPA


Relation State
• A set of n-tuples (rows)
– Each has a value in the domain of every corresponding attribute (or NULL)
– Notation: r(NAME)

• Mathematically, a subset of the Cartesian product of the attribute domains; related to the closed-world assumption

$r(\mathrm{STUDENT}) \subseteq \mathrm{dom}(\mathrm{Name}) \times \mathrm{dom}(\mathrm{SSN}) \times \cdots \times \mathrm{dom}(\mathrm{GPA})$

Ben Bayer       305-61-2435  555-1234  1 Foo Lane   19   3.21
Chung-cha Kim   422-11-2320  555-9876  2 Bar Court  25   3.53
Barbara Benson  533-69-1238  555-6758  3 Baz Blvd   19   3.25


Exercise
Diagrammatically produce a relation HAT according
to the following schema; the relation state should
have at least three tuples

HAT(Team, Size, Color)


• dom(Team) = { RedSox, Bruins, Celtics, Patriots,
Revolution }
• dom(Size) = { S, M, L, XL }
• dom(Color) = { Black, Blue, White, Red, Green,
Yellow }

How many tuples are possible in this relation?


Answer
HAT
Team Size Color
RedSox M Red
Revolution S White
Bruins XL Yellow

$|\mathrm{dom}(\mathrm{Team})| \times |\mathrm{dom}(\mathrm{Size})| \times |\mathrm{dom}(\mathrm{Color})| = 5 \times 4 \times 6 = 120$
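
The count can be checked mechanically. The following is a minimal sketch in Python (the variable names are illustrative, not part of the lecture): it enumerates the Cartesian product of the three HAT domains and confirms that the example state is a subset of it.

```python
from itertools import product

# Domains copied from the HAT exercise.
dom_team = {"RedSox", "Bruins", "Celtics", "Patriots", "Revolution"}
dom_size = {"S", "M", "L", "XL"}
dom_color = {"Black", "Blue", "White", "Red", "Green", "Yellow"}

# Every tuple that could ever appear in a state of HAT.
all_possible = set(product(dom_team, dom_size, dom_color))

# One relation state: the three tuples from the answer.
hat_state = {
    ("RedSox", "M", "Red"),
    ("Revolution", "S", "White"),
    ("Bruins", "XL", "Yellow"),
}

print(len(all_possible))          # number of possible tuples
print(hat_state <= all_possible)  # is the state a subset of the product?
```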

Tuples: Theory vs. Implementation


• Relation state is formally defined as a set of tuples, implying…
– No inherent order
– No duplicates

• In real database systems, the rows on disk will have an ordering, but the relation definition sets no preference as to this ordering
– We will discuss later in physical design how to establish an ordering to improve query efficiency

• Additionally, real database systems implement a bag of tuples, allowing duplicate rows
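
A small sketch of the set-versus-bag difference, using Python's built-in sqlite3 module as a stand-in for a "real database system" (the table name hat and its columns are assumptions borrowed from the earlier exercise). With no key declared, the same row can be stored twice; DISTINCT recovers the set view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No key is declared, so nothing prevents duplicate rows.
conn.execute("CREATE TABLE hat (team TEXT, size TEXT, color TEXT)")

row = ("RedSox", "M", "Red")
conn.execute("INSERT INTO hat VALUES (?, ?, ?)", row)
conn.execute("INSERT INTO hat VALUES (?, ?, ?)", row)  # exact duplicate

# Bag semantics: both copies are stored.
bag_count = conn.execute("SELECT COUNT(*) FROM hat").fetchone()[0]

# Set semantics must be requested explicitly with DISTINCT.
set_rows = conn.execute("SELECT DISTINCT team, size, color FROM hat").fetchall()

print(bag_count, len(set_rows))
```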

NULL
• NULL is a special value that may be in the attribute domain

• Several possible meanings
– E.g. unknown, not available, does not apply, undefined, …

• Best to avoid
– Else deal with caution
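
One reason for caution, sketched with Python's built-in sqlite3 module (the student table and its two rows are illustrative assumptions): in SQL, comparing anything to NULL with = yields NULL rather than true, so such a filter matches no rows; IS NULL is the test that works.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("Ben Bayer", "555-1234"), ("Chung-cha Kim", None)],  # None is stored as NULL
)

# NULL = NULL is neither true nor false; sqlite3 returns it as None.
eq_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

# A WHERE clause keeps only rows whose condition is true, so "= NULL" matches nothing.
matches_eq = conn.execute(
    "SELECT COUNT(*) FROM student WHERE phone = NULL"
).fetchone()[0]

# IS NULL is the intended way to find missing values.
matches_is = conn.execute(
    "SELECT COUNT(*) FROM student WHERE phone IS NULL"
).fetchone()[0]

print(eq_null, matches_eq, matches_is)
```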


Value Structure in Tuples


• Each value should be atomic – no composite or multi-valued attributes
– Composite: “one column, many parts”
– Multi-valued: “one column, multiple values”

• Convention called 1NF (first normal form)
– More on this later in the course


Violation of 1NF: Composite

DORM
SSN          Dorm Info (Dorm, Room)
305-61-2435  555 Huntington, 1
422-11-2320  Baker, 2
533-69-1238  555 Huntington, 3

vs.

DORM
SSN          Dorm            Room
305-61-2435  555 Huntington  1
422-11-2320  Baker           2
533-69-1238  555 Huntington  3


Violation of 1NF: Multi-Valued

CLASS
SSN          Class
305-61-2435  COMP355, MATH650
422-11-2320  COMP355, BIOL110
533-69-1238  MATH650

vs.

CLASS
SSN          Class
305-61-2435  COMP355
422-11-2320  COMP355
533-69-1238  MATH650
305-61-2435  MATH650
422-11-2320  BIOL110
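
The conversion from the multi-valued form to the 1NF form can be sketched in a few lines of Python (the dictionary layout is an assumption made for illustration): each (SSN, Class) pair becomes its own tuple.

```python
# Multi-valued form: one SSN maps to several classes.
multi_valued = {
    "305-61-2435": ["COMP355", "MATH650"],
    "422-11-2320": ["COMP355", "BIOL110"],
    "533-69-1238": ["MATH650"],
}

# 1NF form: one atomic Class value per tuple.
class_1nf = {
    (ssn, cls)
    for ssn, classes in multi_valued.items()
    for cls in classes
}

for row in sorted(class_1nf):
    print(row)
```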


Model Constraints
Categories of restrictions on data in a
relational database

1. Inherent in the data model (implicit)
2. Schema-based (explicit)
3. Application-based (or triggers/assertions)
4. Data dependencies
Relates to “goodness” of database design; we will revisit in normalization


Schema-Based Constraints
Can be directly expressed in schemas of
the data model, typically by specifying them
in the DDL (Data Definition Language)

• Domain
• Key
• Entity integrity
• Referential integrity


Domain Constraints
Within each tuple, the value of each attribute A must be an atomic value from the domain dom(A)

Schema must dictate whether or not a NULL value is allowed for each attribute

$\mathrm{NULL} \overset{?}{\in} \mathrm{dom}(A)$

More later on standard data types in SQL
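
A minimal sketch of a domain check in Python, using the two domains given earlier for STUDENT: dom(SSN) = “###-##-####” and dom(GPA) = [0, 4]. The function names and the nullable flag are assumptions for illustration; a real DBMS expresses this through data types and constraints in the DDL.

```python
import re

# dom(SSN) = "###-##-####": three digits, two digits, four digits.
SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")


def in_dom_ssn(value, nullable=False):
    """Return True if value belongs to dom(SSN); NULL only if the schema allows it."""
    if value is None:
        return nullable
    return isinstance(value, str) and SSN_PATTERN.fullmatch(value) is not None


def in_dom_gpa(value, nullable=False):
    """Return True if value belongs to dom(GPA) = [0, 4]; NULL only if allowed."""
    if value is None:
        return nullable
    return isinstance(value, (int, float)) and 0 <= value <= 4


print(in_dom_ssn("305-61-2435"), in_dom_ssn("305612435"))
print(in_dom_gpa(3.21), in_dom_gpa(4.5), in_dom_gpa(None, nullable=True))
```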



Key Constraints
A key is a set of attribute(s) satisfying two properties:

1. Two distinct tuples in any state of the relation cannot have identical values for all the attributes of the key (superkey)

2. No attribute can be removed from the key and still have #1 hold (minimal superkey)

A relation may have multiple keys (each is a candidate key). Relations commonly have a primary key (underlined, PK; typically small number of attributes, used to identify tuples), and may also have some number of additional unique key(s).


Exercise
Is the following a valid state of DOCTOR?
DOCTOR
Number  First        Last
1       William      Hartnell
2       Patrick      Troughton
3       Jon          Pertwee
4       Tom          Baker
5       Peter        Davison
6       Colin        Baker
7       Sylvester    McCoy
8       Paul         McGann
9       Christopher  Eccleston
10      David        Tennant
11      Matt         Smith
12      Peter        Capaldi
13      Jodie        Whittaker


Answer
Is the following a valid state of DOCTOR?
DOCTOR
Number  First        Last
1       William      Hartnell
2       Patrick      Troughton
3       Jon          Pertwee
4       Tom          Baker
5       Peter        Davison
6       Colin        Baker
7       Sylvester    McCoy
8       Paul         McGann
9       Christopher  Eccleston
10      David        Tennant
11      Matt         Smith
12      Peter        Capaldi
13      Jodie        Whittaker

Underline = primary key
Req #1: Two distinct tuples cannot have identical values for all the attributes of the key – NOT TRUE!

Exercise
List all candidate key(s) for the current state of
DOCTOR.
DOCTOR
Number  First        Last
1       William      Hartnell
2       Patrick      Troughton
3       Jon          Pertwee
4       Tom          Baker
5       Peter        Davison
6       Colin        Baker
7       Sylvester    McCoy
8       Paul         McGann
9       Christopher  Eccleston
10      David        Tennant
11      Matt         Smith
12      Peter        Capaldi
13      Jodie        Whittaker


Answer
List all candidate key(s) for the current state of
DOCTOR.
DOCTOR
Number  First        Last
1       William      Hartnell
2       Patrick      Troughton
3       Jon          Pertwee
4       Tom          Baker
5       Peter        Davison
6       Colin        Baker
7       Sylvester    McCoy
8       Paul         McGann
9       Christopher  Eccleston
10      David        Tennant
11      Matt         Smith
12      Peter        Capaldi
13      Jodie        Whittaker

Candidate Key #1: { Number }
Candidate Key #2: { First, Last }

Why not { Last }, { Number, Last }?
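
The two key properties can be tested by brute force on a given state. The sketch below (function names are illustrative) checks uniqueness for every attribute subset of DOCTOR and keeps only the minimal ones. Note that it inspects a single state only; a key in the schema sense must hold in every valid state.

```python
from itertools import combinations

ATTRS = ("Number", "First", "Last")
DOCTOR = [
    (1, "William", "Hartnell"), (2, "Patrick", "Troughton"),
    (3, "Jon", "Pertwee"), (4, "Tom", "Baker"),
    (5, "Peter", "Davison"), (6, "Colin", "Baker"),
    (7, "Sylvester", "McCoy"), (8, "Paul", "McGann"),
    (9, "Christopher", "Eccleston"), (10, "David", "Tennant"),
    (11, "Matt", "Smith"), (12, "Peter", "Capaldi"),
    (13, "Jodie", "Whittaker"),
]


def is_superkey(rows, attrs, key):
    """Property 1: no two tuples agree on all attributes of key."""
    idx = [attrs.index(a) for a in key]
    projected = [tuple(row[i] for i in idx) for row in rows]
    return len(set(projected)) == len(projected)


def candidate_keys(rows, attrs):
    """Property 2: keep only superkeys that contain no smaller key."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(rows, attrs, combo):
                keys.append(combo)
    return keys


print(candidate_keys(DOCTOR, ATTRS))
print(is_superkey(DOCTOR, ATTRS, ("Last",)))           # Baker appears twice
print(is_superkey(DOCTOR, ATTRS, ("Number", "Last")))  # unique, but not minimal
```

This suggests the answer to the closing question: { Last } fails property 1 because two tuples share the last name Baker, and { Number, Last } fails property 2 because Last can be removed and { Number } is still unique.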


Entity Integrity
In a tuple, no attribute that is part of the PK can be NULL

Basic justification: if PK is used to identify a tuple, then none of its component parts can be left unknown


Exercise
List all candidate key(s) for the current state of
DOCTOR.
DOCTOR
Number  First        Last
1       William      Hartnell
2       Patrick      Troughton
3       Jon          Pertwee
4       Tom          Baker
5       Peter        Davison
6       Colin        Baker
7       Sylvester    McCoy
8       Paul         McGann
9       Christopher  Eccleston
10      David        Tennant
11      Matt         Smith
12      Peter        Capaldi
13      Jodie        Whittaker
14      NULL         NULL


Answer
List all candidate key(s) for the current state of
DOCTOR.
DOCTOR
Number  First        Last
1       William      Hartnell
2       Patrick      Troughton
3       Jon          Pertwee
4       Tom          Baker
5       Peter        Davison
6       Colin        Baker
7       Sylvester    McCoy
8       Paul         McGann
9       Christopher  Eccleston
10      David        Tennant
11      Matt         Smith
12      Peter        Capaldi
13      Jodie        Whittaker
14      NULL         NULL

PK = { Number }


Referential Integrity
All tuples in relation R1 must reference an existing
tuple in relation R2 (R1 may be the same as R2)

A foreign key (FK) in R1 references R2 iff…
• The attribute(s) in FK have the same domain(s) as the primary key attribute(s) PK of R2
• A value of FK in a tuple t1 either is NULL or occurs as a value of PK for some tuple t2 (t1 refers to t2)

R1(PK1, Stuffs, FK)    R2(PK2, Things)
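
The second condition can be sketched as a check over two relation states. The function name and the sample rows below are hypothetical; the check returns the tuples of R1 whose FK value is neither NULL nor present as a PK value in R2.

```python
def dangling_references(r1_rows, fk_index, r2_rows, pk_index):
    """Return the R1 tuples whose FK is non-NULL and matches no PK value in R2."""
    pk_values = {row[pk_index] for row in r2_rows}
    return [
        row for row in r1_rows
        if row[fk_index] is not None and row[fk_index] not in pk_values
    ]


# Hypothetical states: R2(PK2, Things) and R1(PK1, FK).
r2 = [("305-61-2435", "Ben Bayer"), ("422-11-2320", "Chung-cha Kim")]
r1 = [
    (1, "305-61-2435"),  # refers to an existing tuple
    (2, None),           # NULL is permitted by the definition
    (3, "999-99-9999"),  # refers to nothing: violation
]

print(dangling_references(r1, 1, r2, 0))
```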


Example
STUDENT(Name, SSN, Phone, Address, Age, GPA, BFF)

DORM(SSN, Dorm)

STUDENTOFTHEYEAR(Year, SSN)

CLASS(SSN, Class)


Exercise

Given the above relational schema, for which attribute(s) that refer to STUDENT(SSN), if any, is it permissible to have a value of NULL?


Answer

Given the above relational schema, for which attribute(s) that refer to STUDENT(SSN), if any, is it permissible to have a value of NULL?


Chinook


Data Modification Operations


The DML (Data Manipulation Language) affords us the following methods of modifying database state:

• Insert. Add a new tuple to a relation

• Delete. Remove a tuple from a relation

• Update. Change one or more attribute value(s) for a tuple within a relation

We now examine how these operations can violate various types of constraints and the resulting actions that can be taken


Insert
Domain
• An attribute value does not appear in the corresponding domain
(including NULL)

Key
• A key value already exists in another tuple

Entity Integrity
• Any part of the primary key is NULL

Referential Integrity
• Any value of any foreign key refers to a tuple that does not exist in
the referenced relation

Typical action: reject insertion
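
The four cases can be observed with Python's built-in sqlite3 module. This is a sketch under assumptions: SQLite stands in for a generic RDBMS, the simplified student and class tables and the try_insert helper are illustrative, and SQLite only enforces foreign keys after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE student (
    ssn  TEXT NOT NULL PRIMARY KEY,          -- key + entity integrity
    name TEXT NOT NULL,
    gpa  REAL CHECK (gpa BETWEEN 0 AND 4)    -- domain
);
CREATE TABLE class (
    ssn   TEXT NOT NULL REFERENCES student(ssn),  -- referential integrity
    class TEXT NOT NULL,
    PRIMARY KEY (ssn, class)
);
INSERT INTO student VALUES ('305-61-2435', 'Ben Bayer', 3.21);
""")


def try_insert(sql, params):
    """Attempt one insert and report whether the DBMS accepted or rejected it."""
    try:
        conn.execute(sql, params)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


results = {
    "valid": try_insert("INSERT INTO class VALUES (?, ?)", ("305-61-2435", "COMP355")),
    "domain": try_insert("INSERT INTO student VALUES (?, ?, ?)", ("422-11-2320", "Chung-cha Kim", 5.0)),
    "key": try_insert("INSERT INTO student VALUES (?, ?, ?)", ("305-61-2435", "Someone Else", 3.0)),
    "entity": try_insert("INSERT INTO student VALUES (?, ?, ?)", (None, "No SSN", 3.0)),
    "referential": try_insert("INSERT INTO class VALUES (?, ?)", ("999-99-9999", "COMP355")),
}

for case, outcome in results.items():
    print(case, "->", outcome)
```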


Delete
Referential Integrity
• Tuple being deleted is referenced by
foreign keys from other tuples

Possible actions
• Reject deletion
• Cascade (propagate deletion)
• Set default/NULL referencing attribute
values (careful with primary key)
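
The three actions can be sketched with Python's built-in sqlite3 module. The tables loosely follow the earlier example schema, but the choice of action per table, the sample rows, and the year value are assumptions for illustration. CLASS uses cascade because its SSN is part of the primary key and so could not be set to NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce FKs
conn.executescript("""
CREATE TABLE student (ssn TEXT NOT NULL PRIMARY KEY, name TEXT);
CREATE TABLE class (
    ssn   TEXT NOT NULL REFERENCES student(ssn) ON DELETE CASCADE,
    class TEXT NOT NULL,
    PRIMARY KEY (ssn, class)
);
CREATE TABLE studentoftheyear (
    year INTEGER NOT NULL PRIMARY KEY,
    ssn  TEXT REFERENCES student(ssn) ON DELETE SET NULL
);
CREATE TABLE dorm (
    ssn  TEXT NOT NULL PRIMARY KEY REFERENCES student(ssn) ON DELETE RESTRICT,
    dorm TEXT
);
INSERT INTO student VALUES ('305-61-2435', 'Ben Bayer'), ('422-11-2320', 'Chung-cha Kim');
INSERT INTO class VALUES ('305-61-2435', 'COMP355'), ('305-61-2435', 'MATH650');
INSERT INTO studentoftheyear VALUES (2018, '305-61-2435');
INSERT INTO dorm VALUES ('422-11-2320', 'Baker');
""")

# Cascade and set NULL: deleting Ben removes his CLASS rows and clears the award reference.
conn.execute("DELETE FROM student WHERE ssn = ?", ("305-61-2435",))
class_rows = conn.execute("SELECT COUNT(*) FROM class").fetchone()[0]
award_ssn = conn.execute("SELECT ssn FROM studentoftheyear WHERE year = 2018").fetchone()[0]

# Reject: Chung-cha is still referenced from DORM, which restricts deletion.
try:
    conn.execute("DELETE FROM student WHERE ssn = ?", ("422-11-2320",))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(class_rows, award_ssn, rejected)
```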

Update
• If modifying neither part of primary key nor foreign key, need only check…
– Domain

• Modifying primary key…
– Like Delete then Insert

• Modifying foreign key…
– Like Insert

Actions typically similar to Delete with separate options.


Transactions
A transaction is a sequence of database
operations, including retrieval and update(s)

START
Read or write
Read or write
Read or write

COMMIT or ROLLBACK
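
The START / read-or-write / COMMIT-or-ROLLBACK pattern can be sketched with Python's built-in sqlite3 module. The account table and the transfer function are hypothetical and not part of the lecture; the point is that a failure partway through leaves no partial changes behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True if committed."""
    try:
        # sqlite3 implicitly starts a transaction before the first UPDATE.
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        conn.commit()    # COMMIT: both writes become permanent together
        return True
    except sqlite3.IntegrityError:
        conn.rollback()  # ROLLBACK: the first write is undone as well
        return False


ok = transfer(conn, 1, 2, 30)          # succeeds
overdraft = transfer(conn, 1, 2, 500)  # second UPDATE violates the CHECK constraint
balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
print(ok, overdraft, balances)
```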

Desirable Properties of Transactions

Atomicity. A transaction is an atomic unit of processing; it should either be performed in its entirety or not performed at all.

Consistency. A transaction should be consistency preserving, meaning that if it is completely executed from beginning to end without interference from other transactions, it should take the database from one consistent state to another.

Isolation. A transaction should appear as though it is being executed in isolation from other transactions, even though many transactions are executing concurrently. That is, the execution of a transaction should not be interfered with by any other transactions executing concurrently.

Durability. The changes applied to the database by a committed transaction must persist in the database. These changes must not be lost because of any failure.


Exercise
1. For a balanced budget, incoming funds must always equal outgoing payments at the end of the year
Consistency

2. With a RAID 5 setup, a server can survive the loss of any single hard drive by combining data on the remaining disks
Durability

3. If there is an error in printing a picture at the photo booth, the customer should be refunded
Atomicity

4. Do not publish results while the jury is out
Isolation


Summary
• The relational model dictates that a relational database consists of (i) a set of relations and (ii) a set of integrity constraints
– All constraints met => database in a valid state

• A relation is composed of its schema (name; list of n attributes, each with its domain) and its state/data (set of n-tuples)

• Schema (or explicit) constraints, specified via DDL, include domain, key, entity integrity, and referential integrity
– Data manipulation operations (insert, update, delete; via DML) can run afoul of these constraints

• A transaction is a sequence of operations, and ACID-compliant RDBMSs implement “proper” transaction processing
– Atomicity, Consistency, Isolation, Durability
