
Lecture 04 The Relational Data Model

The document summarizes Ted Codd's relational data model, which forms the basis for modern relational databases. It proposes representing all data in relations (tables), with attributes (columns) and tuples (rows). Relations have a schema defining attributes and their domains (types), while the current database state is defined by the tuples. The model uses constraints like keys to ensure relationships between attributes and tuples. Relations provide a simple yet powerful way to represent complex relational data.

Uploaded by

Sharmainne Pale

IM 101

Fund. of Database Systems

Lecture 4 – Chapter 5
The Relational Data Model
Edgar F. Codd

Ted Codd proposed the relational data model in 1970.

He received the ACM Turing Award in 1981.

Communications of the ACM 13(6), June 1970


Relational Data Model
● Core of the majority of modern databases

● Virtually all businesses rely on some form of relational database

● Solid theoretical/mathematical foundation

● Simple but robust implementation


Models, Schemas and States
● A data model defines the constructs available for defining a schema
○ defines possible schemas

● A schema defines the constructs available for storing the data
○ defines database structure
○ limits the possible database states

● A database state (or instance) is all the data at some point in time
○ the database content
Models, Schemas and States
● data model
○ fixed by the DBMS

● schema
○ defined by the DB designer
○ generally fixed once defined *

● database state
○ changes over time due to user updates

* schema modifications are possible once the database is populated, but this generally causes difficulties
The Relational Data Model

● All data is stored in relations
○ relations are sets, but generally viewed as 2D tables

● DB schema = a set of relation specifications
○ the specification of a particular relation is called a relation schema

● DB state = the data stored in the relations
○ the data in a particular relation is called a relation state (or relation instance or simply relation)

Principle of Uniform Representation:
The entire content of a relational database is represented in one and only one way: namely, as attribute values within tuples within relations.
RDM Schemas

[Diagram: schema levels]
External Views (several)
Conceptual Schema: relation specifications
Internal Schema: mapping from relations to storage layout (files)
Relational Data Definition

[Diagram: relational data definition]
● users of the data work through application program(s)
● DBMS components: query processor, security manager, concurrency manager, index manager
● the database designer enters the definition of relation schemas
● a definition processor stores the relation schemas; the data is stored in relations
● SQL DDL = relation definition language (CREATE TABLE)
Relation Schemas and Relation Instances
Relation Schemas
● A relation is defined by a name and a set of attributes

● Each attribute has a name and a domain
○ a domain is a set of possible values
○ types are domain names
○ all domains are sets of atomic values: RDM does not allow complex data types
○ domains may contain a special null value
Example Relation Schema

relation name: StockItem

set of attributes:
Attribute name    Attribute domain
ItemID            string(4)
Description       string(50)
Price             currency/dollars
Taxable           boolean
Definition: Relation Schema
● Relation Schema
R(A1, A2, … , An)
○ R is the relation name

○ A1 … An are the attribute names

● Domains are denoted by dom(Ai)
● degree = the number of attributes
Example Relation Schema

STOCKITEM(ItemId, Description, Price, Taxable)

dom(ItemId) = string(4)
dom(Description) = string(50)
dom(Price) = currency/dollars
dom(Taxable) = boolean

degree of STOCKITEM = 4
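
The schema notation above can be sketched in Python. This is only an illustration: the class name RelationSchema, the helper functions, and the reading of string(n) as "at most n characters" are assumptions of this sketch, not part of the lecture.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable, Tuple

# A domain is modelled as a membership test: value -> bool.
Domain = Callable[[object], bool]

def string(max_len: int) -> Domain:
    """Strings of at most max_len characters (assumed reading of string(n))."""
    return lambda v: isinstance(v, str) and len(v) <= max_len

def currency(v: object) -> bool:
    """Dollar amounts, represented here as Decimal values (an assumption)."""
    return isinstance(v, Decimal)

def boolean(v: object) -> bool:
    return isinstance(v, bool)

@dataclass(frozen=True)
class RelationSchema:
    name: str
    attributes: Tuple[Tuple[str, Domain], ...]  # ordered (attribute name, domain) pairs

    @property
    def degree(self) -> int:
        """The number of attributes."""
        return len(self.attributes)

    def dom(self, attribute: str) -> Domain:
        """dom(Ai): the domain of the named attribute."""
        return dict(self.attributes)[attribute]

STOCKITEM = RelationSchema("STOCKITEM", (
    ("ItemId", string(4)),
    ("Description", string(50)),
    ("Price", currency),
    ("Taxable", boolean),
))

print(STOCKITEM.degree)                # 4
print(STOCKITEM.dom("ItemId")("I119"))  # True
```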
Definition: Relation
● A relation is denoted by r(R)
○ R is the name of the relation schema for the relation

● A relation is a set of tuples
r(R) = { t1, t2, … , tm }


Definition: Relation
● Each tuple is an ordered list of n values
t = < v1, v2, … , vn >
n is the degree of R

● Each value in the tuple must be in the domain of the corresponding attribute
vi ∈ dom(Ai)

● Alternate notations: the ith value of tuple t is also referred to as
vi = t[Ai] or vi = t.Ai
Example Relation

r(STOCKITEM) =
{ < I119, "Monopoly", $29.95, true >,
< I007, "Risk", $25.45, true >,
< I801, "Bazooka Gum", $0.25, false > }

t2 = < I007, "Risk", $25.45, true >

t2[Price] = t2.Price = $25.45
t2[Price] ∈ dom(Price) = currency/dollars
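
A minimal Python sketch of the same relation state, assuming prices are encoded as Decimal values and string(n) means "at most n characters". The names ATTRIBUTES, DOMAINS, value and satisfies_domains are hypothetical helpers for this illustration.

```python
from decimal import Decimal

# Attribute names in schema order, with one membership test per domain (assumed encodings).
ATTRIBUTES = ("ItemId", "Description", "Price", "Taxable")
DOMAINS = {
    "ItemId": lambda v: isinstance(v, str) and len(v) <= 4,
    "Description": lambda v: isinstance(v, str) and len(v) <= 50,
    "Price": lambda v: isinstance(v, Decimal),
    "Taxable": lambda v: isinstance(v, bool),
}

# r(STOCKITEM): a Python set of tuples, so it has no order and no duplicates.
r_stockitem = {
    ("I119", "Monopoly", Decimal("29.95"), True),
    ("I007", "Risk", Decimal("25.45"), True),
    ("I801", "Bazooka Gum", Decimal("0.25"), False),
}

def value(t, attribute):
    """t[Ai]: look up a value by attribute name via its position in the schema."""
    return t[ATTRIBUTES.index(attribute)]

def satisfies_domains(t):
    """True if the tuple has the right degree and every vi is in dom(Ai)."""
    return len(t) == len(ATTRIBUTES) and all(
        DOMAINS[a](v) for a, v in zip(ATTRIBUTES, t)
    )

t2 = ("I007", "Risk", Decimal("25.45"), True)
print(value(t2, "Price"))                              # 25.45
print(all(satisfies_domains(t) for t in r_stockitem))  # True
print(len(r_stockitem | {t2}))                         # 3: adding an existing tuple changes nothing
```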
Characteristics of Relations
● A relation is a set
○ tuples are unordered
○ no duplicate tuples

● Attribute values within tuples are ordered
○ values are matched to attributes by position
○ an alternate definition defines a tuple as a set of (name, value) pairs, which makes the ordering of values within a tuple unnecessary (we won't use this definition)
Characteristics of Relations
● Values in tuples are atomic
○ atomic = non-structured (similar to primitive types in C++)
○ implication: no nested relations or other complex data structures

● If a domain includes null values, null may have many interpretations
○ "does not exist"
○ "not applicable"
○ "unknown"
Theory vs. Reality
● The theoretical data model is mathematical:
○ a relation is a set of tuples
○ this is Codd's definition

● In the real world, the model is practical:
○ efficiency concerns
○ accepted standard: SQL
○ a relation is a table, not a set
○ a relation may have order and duplicates
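
The gap between the set-based definition and a SQL table can be seen with Python's built-in sqlite3 module. This sketch assumes an in-memory SQLite database and a StockItem table declared without any key; the column types are choices made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No key is declared, so the table behaves as a bag (multiset) of rows, not a set.
conn.execute(
    "CREATE TABLE StockItem (ItemId TEXT, Description TEXT, Price NUMERIC, Taxable INTEGER)"
)
row = ("I007", "Risk", 25.45, 1)
conn.execute("INSERT INTO StockItem VALUES (?, ?, ?, ?)", row)
conn.execute("INSERT INTO StockItem VALUES (?, ?, ?, ?)", row)  # exact duplicate is accepted

table_rows = conn.execute("SELECT * FROM StockItem").fetchall()
set_rows = conn.execute("SELECT DISTINCT * FROM StockItem").fetchall()
print(len(table_rows))  # 2
print(len(set_rows))    # 1
```

SELECT DISTINCT recovers the set view on request; declaring a key prevents the duplicate from being stored in the first place.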
SQL: Relation States
● A relation is viewed as a table

● The attributes define the columns of the table

● Each row in the table holds related values for each attribute
○ a row often represents a conceptual entity (object)

● Values in each column must come from the domain of the attribute
○ the values are instances of the attribute type
Relation: Table Representation

Each row collects related attribute values

StockItem
ItemId    Description    Price     Taxable
I119      Monopoly       $29.95    True
I007      Risk           $25.45    True
I801      Bazooka Gum    $0.25     False

Column values all come from the same domain


Example Relation
Example Schema
Example State
Constraints
Constraints
● Constraints are restrictions on legal relation states
○ they add further semantics to the schema

● Domain constraints
vi ∈ dom(Ai)
○ values for an attribute must be from
the domain associated with the attribute

● Non-null constraints
○ the domain of some attributes may not include null,
implying that a value for that attribute
is required for all tuples
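
As a rough illustration, domain and non-null constraints can be approximated in SQLite with CHECK and NOT NULL clauses. The sketch assumes that every StockItem attribute is required and that string(n) means "at most n characters"; try_insert is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE StockItem (
        ItemId      TEXT    NOT NULL CHECK (length(ItemId) <= 4),
        Description TEXT    NOT NULL CHECK (length(Description) <= 50),
        Price       NUMERIC NOT NULL,
        Taxable     INTEGER NOT NULL CHECK (Taxable IN (0, 1))
    )
""")

def try_insert(row):
    """Return 'ok' if the row is accepted, otherwise 'rejected'."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO StockItem VALUES (?, ?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

results = [
    try_insert(("I119", "Monopoly", 29.95, 1)),  # legal tuple
    try_insert(("I0070", "Risk", 25.45, 1)),     # ItemId outside string(4)
    try_insert(("I801", None, 0.25, 0)),         # null where a value is required
]
print(results)  # ['ok', 'rejected', 'rejected']
```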
Key Constraints
● By definition, all tuples in a relation are unique

● Often, we want to restrict tuples further such that some subset of the attributes is unique for all tuples

● Example: in the StockItem relation, no ItemID should appear in more than one tuple
○ ItemID is called a key attribute


Keys and Superkeys
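● Any set of attributes whose combined values must be unique across all tuples is called a superkey
● If no proper subset of the attributes of a superkey must also be unique, then that superkey is called a key

● Example:

VEHICLE(LicenseNumber, SerialNumber, Model, Year)

key: (LicenseNumber)
key: (SerialNumber)
superkey: any set of attributes that contains a key, e.g. (LicenseNumber, SerialNumber)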
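
The difference between superkeys and keys can be explored on a sample state. In this sketch the VEHICLE tuples are invented for illustration and the function names are hypothetical. Note that a key is a constraint on every legal state, so a single state can refute a proposed key but never prove one.

```python
from itertools import combinations

ATTRIBUTES = ("LicenseNumber", "SerialNumber", "Model", "Year")

# Hypothetical relation state for VEHICLE; the values are invented for illustration.
r_vehicle = {
    ("ABC-123", "SN001", "Civic", 2019),
    ("XYZ-789", "SN002", "Civic", 2019),
    ("JKL-456", "SN003", "Focus", 2020),
}

def unique_in_state(attrs, relation):
    """True if no two tuples of this state agree on all attributes in attrs."""
    positions = [ATTRIBUTES.index(a) for a in attrs]
    projected = [tuple(t[i] for i in positions) for t in relation]
    return len(set(projected)) == len(projected)

def minimal_unique_sets(relation):
    """Attribute sets that are unique in this state and have no unique proper subset."""
    found = []
    for size in range(1, len(ATTRIBUTES) + 1):
        for attrs in combinations(ATTRIBUTES, size):
            has_smaller = any(set(k) < set(attrs) for k in found)
            if unique_in_state(attrs, relation) and not has_smaller:
                found.append(attrs)
    return found

print(unique_in_state(("LicenseNumber", "Model"), r_vehicle))  # True: contains a key
print(unique_in_state(("Model", "Year"), r_vehicle))           # False
print(minimal_unique_sets(r_vehicle))  # [('LicenseNumber',), ('SerialNumber',)]
```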
Candidate and Primary Keys
● If a relation has more than one key, each key is called a candidate key

● One candidate key must be chosen to be the primary key

● The primary key is the one that will be used to identify tuples

● If there is only one key, it is the primary key


Candidate and Primary Keys
● Primary keys are indicated by underlining the attributes that make up that key

candidate key candidate key

VEHICLE(LicenseNumber, VIN, Model, Year)


primary key
Example Keys
STOCKITEM( ItemId, Description, Price, Taxable )

superkeys: (ItemId), (Description), (ItemId, Description)
keys: (ItemId), (Description)
candidate keys: (ItemId), (Description)
(assumes that Description is unique for all items)
primary key: (ItemId)
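
One way these key choices might be declared in SQL, sketched with Python's sqlite3 module. It assumes, as the slide does, that Description is unique for all items; the rejected helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE StockItem (
        ItemId      TEXT PRIMARY KEY NOT NULL,  -- the chosen primary key
        Description TEXT UNIQUE NOT NULL,       -- the other candidate key
        Price       NUMERIC,
        Taxable     INTEGER
    )
""")
conn.execute("INSERT INTO StockItem VALUES ('I119', 'Monopoly', 29.95, 1)")
conn.commit()

def rejected(sql):
    """True if the statement violates a constraint and is rolled back."""
    try:
        with conn:
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

print(rejected("INSERT INTO StockItem VALUES ('I119', 'Risk', 25.45, 1)"))      # True: duplicate ItemId
print(rejected("INSERT INTO StockItem VALUES ('I007', 'Monopoly', 25.45, 1)"))  # True: duplicate Description
print(rejected("INSERT INTO StockItem VALUES ('I007', 'Risk', 25.45, 1)"))      # False: both keys stay unique
```

The explicit NOT NULL on ItemId matters in SQLite, which does not imply it for a TEXT primary key.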
Integrity Constraints
● Entity integrity constraint
○ no primary key value can be null
○ the primary key is the tuple identifier

● Referential integrity constraint
○ references between relations must be valid
○ every foreign key value in a referencing relation must exist as a primary key value in the referenced relation
Example: Referential Integrity

STOCKITEM( ItemId, Description, Price, Taxable )
STORESTOCK( StoreId, Item, Quantity )

STORESTOCK[Item] refers to STOCKITEM[ItemID]

STORESTOCK[Item] is a foreign key referencing the primary key STOCKITEM[ItemID]

Any value appearing in STORESTOCK[Item] must appear in STOCKITEM[ItemID]

It must be true that dom(STORESTOCK[Item]) = dom(STOCKITEM[ItemID])
Referential Integrity
● PK = primary key in R2
● FK = foreign key in R1
● dom(R1[FK]) = dom(R2[PK])
● constraint:
if v ∈ R1[FK] then v ∈ R2[PK]

● note: FK is not necessarily a key of R1


Example: Referential Integrity

STOCKITEM( ItemId, Description, Price, Taxable )
STORESTOCK( StoreId, Item, Quantity )
STORE( StoreID, Manager, Address, Phone )

● (StoreId, Item) is the primary key of STORESTOCK

● STORESTOCK[StoreId] is a foreign key referencing STORE

● STORESTOCK[Item] is a foreign key referencing STOCKITEM
Referential Integrity:
Diagrammatic Representation

STOCKITEM( ItemId, Description, Price, Taxable )
PK: ItemId

STORESTOCK( StoreId, Item, Quantity )
FK: Item → PK of STOCKITEM
FK: StoreId → PK of STORE

STORE( StoreID, Manager, Address, Phone )
PK: StoreID
Referential Integrity:
Textual Representation

STOCKITEM( ItemId, Description, Price, Taxable )


STORESTOCK( StoreId, Item, Quantity )
STORE( StoreID, Manager, Address, Phone )

constraints:
STORESTOCK[StoreId] refers to STORE[StoreID]
STORESTOCK[Item] refers to STOCKITEM[ItemId]
Referential Integrity:
Example State

r(STORESTOCK) =
{ < "S002", "I065", 120 >,
< "S047", "I954", 300 >,
< "S002", "I954", 198 > }

StoreId is a foreign key but not a key
all values in FK exist in PK

r(STORE) =
{ < "S002", "Tom", "112 Main", "999-8888" >,
< "S047", "Sasha", "13 Pine", "777-6543" > }
Referential Integrity:
Constraint Violation

r(STORESTOCK) =
{ < "S002", "I065", 120 >,
< "S047", "I954", 300 >,
< "S333", "I954", 198 > }

StoreId "S333" does not exist in PK: this is an illegal database state

r(STORE) =
{ < "S002", "Tom", "112 Main", "999-8888" >,
< "S047", "Sasha", "13 Pine", "777-6543" > }

Both relation states are legal, but the database state is illegal.
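
The check "if v ∈ R1[FK] then v ∈ R2[PK]" can be sketched as a set difference over the two example states. The function dangling_references and its positional parameters are assumptions of this sketch.

```python
r_store = {
    ("S002", "Tom", "112 Main", "999-8888"),
    ("S047", "Sasha", "13 Pine", "777-6543"),
}
legal_storestock = {
    ("S002", "I065", 120), ("S047", "I954", 300), ("S002", "I954", 198),
}
illegal_storestock = {
    ("S002", "I065", 120), ("S047", "I954", 300), ("S333", "I954", 198),
}

def dangling_references(referencing, fk_position, referenced, pk_position):
    """FK values in the referencing relation that match no PK value in the referenced one."""
    pk_values = {t[pk_position] for t in referenced}
    fk_values = {t[fk_position] for t in referencing}
    return fk_values - pk_values

# StoreId is attribute 0 of STORESTOCK and StoreID is attribute 0 of STORE.
print(dangling_references(legal_storestock, 0, r_store, 0))    # set()
print(dangling_references(illegal_storestock, 0, r_store, 0))  # {'S333'}
```

An empty result means the pair of relation states satisfies the constraint; the check needs both relations, which is why each state can be legal on its own while the database state is not.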
Schema with FKs
State Change and Constraint Enforcement
Causes of Constraint Violations
● What can cause a referential integrity constraint violation?
○ inserting a tuple in R1 with an illegal FK
○ modifying a tuple in R1 to have an illegal FK
○ deleting a tuple in R2 that had the PK referenced by some FK in R1

● How can a referential integrity constraint be enforced?
○ reject the operation that attempts to violate it (may cause other operations to be rejected … transactions)
or
○ repair the violation by cascading inserts or deletes
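
Both enforcement strategies can be tried out in SQLite through Python's sqlite3 module. This is a sketch under simplifying assumptions: STORE is cut down to two columns, the Item foreign key to STOCKITEM is omitted, and build, attempt and stock_count are hypothetical helpers.

```python
import sqlite3

def build(on_delete):
    """Create Store and StoreStock with the given ON DELETE action on the StoreId FK."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when this is on
    conn.execute("CREATE TABLE Store (StoreID TEXT PRIMARY KEY NOT NULL, Manager TEXT)")
    conn.execute(f"""
        CREATE TABLE StoreStock (
            StoreId  TEXT NOT NULL REFERENCES Store(StoreID) ON DELETE {on_delete},
            Item     TEXT NOT NULL,
            Quantity INTEGER,
            PRIMARY KEY (StoreId, Item)
        )
    """)
    conn.execute("INSERT INTO Store VALUES ('S002', 'Tom')")
    conn.execute("INSERT INTO StoreStock VALUES ('S002', 'I065', 120)")
    conn.commit()
    return conn

def attempt(conn, sql):
    """Run one statement; report whether it was accepted or rejected."""
    try:
        with conn:
            conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

def stock_count(conn):
    return conn.execute("SELECT COUNT(*) FROM StoreStock").fetchone()[0]

strict = build("RESTRICT")
print(attempt(strict, "INSERT INTO StoreStock VALUES ('S333', 'I954', 198)"))  # rejected: illegal FK
print(attempt(strict, "DELETE FROM Store WHERE StoreID = 'S002'"))             # rejected: PK still referenced

cascading = build("CASCADE")
print(attempt(cascading, "DELETE FROM Store WHERE StoreID = 'S002'"))  # ok
print(stock_count(cascading))  # 0: the delete was propagated to the FK side
```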
Data Manipulation Operations

There are three ways to modify the value of a relation:

● Insert: add a new tuple to R
● Delete: remove an existing tuple from R
● Update: change the value of an existing tuple in R

Delete and Update both require some way to identify an existing tuple (a selection)
Inserting Tuples

r1(STORESTOCK) =
{ < "S002", "I065", 120 >,
< "S047", "I954", 300 >,
< "S333", "I954", 198 > }

insert < "S047", "I099", 267 >

r2(STORESTOCK) =
{ < "S002", "I065", 120 >,
< "S333", "I954", 198 >,
< "S047", "I099", 267 >,
< "S047", "I954", 300 > }

any constraint violations?
Deleting Tuples

r2(STORESTOCK) =
{ < "S002", "I065", 120 >,
< "S333", "I954", 198 >,
< "S047", "I099", 267 >,
< "S047", "I954", 300 > }

delete tuples with Item = "I954"

r3(STORESTOCK) =
{ < "S002", "I065", 120 >,
< "S047", "I099", 267 > }
Updating Tuples

r3(STORESTOCK) =
{ < "S002", "I065", 120 >,
< "S047", "I099", 267 > }

change the Quantity of tuples with StoreID = "S002" and Item = "I065" to 250

r4(STORESTOCK) =
{ < "S002", "I065", 250 >,
< "S047", "I099", 267 > }
Analyzing State Changes
● Any update can be viewed as a delete followed by an insert
update: < "S002", "I065", 120 > to < "S002", "I065", 250 >
is equivalent to
delete: < "S002", "I065", 120 >
insert: < "S002", "I065", 250 >

● Any database state change can be viewed as a set of deletes and inserts on individual relations

● This makes the analysis of potential constraint violations a well-defined problem
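
A small sketch of the three operations over Python sets, with update written as a delete followed by an insert. The function names and the labels r1 to r4 belong to this sketch; the tuples are the STORESTOCK examples from the preceding slides.

```python
def insert(relation, t):
    """Return the relation with tuple t added (no effect if t is already present)."""
    return relation | {t}

def delete(relation, predicate):
    """Return the relation without the tuples selected by predicate."""
    return {t for t in relation if not predicate(t)}

def update(relation, old, new):
    """Update expressed as: delete the old tuple, then insert the new one."""
    if old not in relation:
        return relation  # nothing selected, so nothing changes
    return insert(delete(relation, lambda t: t == old), new)

r1 = {("S002", "I065", 120), ("S047", "I954", 300), ("S333", "I954", 198)}
r2 = insert(r1, ("S047", "I099", 267))
r3 = delete(r2, lambda t: t[1] == "I954")  # delete tuples with Item = "I954"
r4 = update(r3, ("S002", "I065", 120), ("S002", "I065", 250))

print(sorted(r4))  # [('S002', 'I065', 250), ('S047', 'I099', 267)]
```

Because every change reduces to deletes and inserts, a constraint checker only has to reason about those two operations.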
Enforcing Constraints
● constraint enforcement: ensuring that no invalid database states can exist

● invalid state: a state in which a constraint is violated

● Possible ways to enforce constraints:
○ reject any operation that causes a violation, or
○ allow the violating operation and then attempt to correct the database
Constraint Violating Operations
● To automate constraint enforcement, the operations that can cause violations need to be identified

constraint               insert    delete    update
domain, non-null         yes       no        yes
key                      yes       no        yes
entity integrity         yes       no        yes
referential integrity    yes/FK    yes/PK    yes/FK/PK


Correcting Constraint Violations

violation                               correction
domain, non-null                        ask user to enter a valid value or use a default value
key                                     ask user to enter a unique key or generate a unique key
entity integrity                        ask user to enter a unique key or generate a unique key
referential integrity (FK insertion)    force an insert in the PK (may cascade)
referential integrity (PK deletion)     propagate delete to FK (may cascade)
Summary: Relational Schemas
● A relational schema consists of a set of relation schemas and a set of constraints

● Relation schema
○ list of attributes: name and domain constraint
○ superkeys: key constraints
○ primary key: entity integrity constraint

● Foreign keys: referential integrity constraints
○ defined between relation schemas
Schema for Airline Database
NEXT UP
● skip ahead to Chapter 7: Translating ER Schemas to Relational Schemas

● then back to Chapter 6: The Relational Algebra: operations on relations


PREVIEW: ER to Relational
