CSC 310 DATABASE DESIGN - Relational - Model - April - 2021-1


4/13/2021

CSC 310 DATABASE DESIGN:
Relational model and algebra

Dr. NYAMSI 12/04/2021 1

Structure: content
I. Introduction
II. Relational Model Concepts
III. Relational Model Constraints
IV. Operations on the Relational Model
V. Relational Algebra
VI. Relational Database Management Systems


Structure: objectives
At the end of these notes, the reader will be able to:
• Describe relational model concepts
• Describe the properties of a relation and relational keys
• Describe relational model/integrity constraints
• Describe the relational algebra


Introduction
The relational model
• was first proposed by Edgar Codd in 1969
• is centered on its primary element: the relation
• treats everything as a relation
The most popular models in use at that time were the hierarchical and network models, or even
simple flat-file data structures.
Relational database management systems (RDBMS) were then designed and soon became
very popular, especially for their ease of use and flexibility in structure.

Introduction
Earlier we saw
• how to convert an unorganized text description of information requirements
into a conceptual design,
• by the use of ER diagrams (covered in the design part of this course).


Introduction
The advantage of ER diagrams is that
• They force you to identify data requirements that are implicitly known, but
not explicitly written down in the original description.
• We first design the database in what we call the Entity/Relationship (ER) model
• We then convert this ER model into a logical design of a relational database.
• The logical model is also called a Relational Model.


Relational model concept


We represent a relation as
• A table with columns and rows.
• Each column of the table has a name, or attribute.
• Each row is called a tuple.
• Domain: a set of atomic values that an attribute can take
• Attribute: name of a column in a particular table (all data is stored in tables).


Relational model concept


• Each attribute Ai must have a domain, dom(Ai).
• Relational Schema: the design of one table, containing the name of the
table (i.e. the name of the relation), and the names of all the columns, or
attributes.
Example: STUDENT(SID, SName, SAge, SGPA)


Relational model concept


• Degree of a Relation : the number of attributes in the relation's schema.
• Tuple, t, of R(A1, A2, A3, …, An): an ordered list of values, <v1, v2,
v3, …, vn>, where each vi is a value from dom(Ai).
• Relation Instance, r(R): a set of tuples; thus, r(R) = {t1, t2, t3, …, tm}
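
The definitions above can be sketched in Python. This is only an illustration under stated assumptions: the sample rows and the mapping of domains to Python types are made up for the example and are not part of the course material.

```python
# Schema of STUDENT(SID, SName, SAge, SGPA): an ordered list of attribute names.
STUDENT_SCHEMA = ("SID", "SName", "SAge", "SGPA")

# Hypothetical domains, one per attribute: dom(Ai) is modelled as a Python type.
STUDENT_DOMAINS = (int, str, int, float)

# A relation instance r(R) is a set of tuples, so a repeated tuple
# is stored only once and the order of rows carries no meaning.
student = {
    (1, "Alice", 20, 3.4),
    (2, "Bob", 22, 2.9),
    (1, "Alice", 20, 3.4),  # same tuple again: the set keeps one copy
}


def degree(schema):
    """Degree of a relation: the number of attributes in its schema."""
    return len(schema)


def conforms(row, domains):
    """True if every value vi belongs to the domain of its attribute Ai."""
    return len(row) == len(domains) and all(
        isinstance(value, domain) for value, domain in zip(row, domains)
    )


print(degree(STUDENT_SCHEMA))                              # 4
print(len(student))                                        # 2
print(all(conforms(t, STUDENT_DOMAINS) for t in student))  # True
```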


Relational model concept


• Relations are tables with some restrictions
• the order of rows is immaterial
• the order of columns is immaterial
• relations have no duplicate rows ⇒ each relation has one or more keys
• A relation can also be seen as a predicate, i.e., a set of properties or assertions.


Relational model concept


• A database is a collection of relations, but
• relational data structures are richer than tables
• a data model is not just data structures but also operations for manipulating the data
structures
• A data model provides mechanisms (languages) for defining
• data structures
• operations for retrieval and modification
• integrity constraints

Relational model concept


• The relational data model provides
• mechanisms for defining domains and the structure of relations
• several data-manipulation languages (DML): relational algebra, domain
relational calculus, tuple relational calculus, update operations
• mechanisms for specifying particular constraints (e.g., key, referential integrity)
and a language for specifying arbitrary constraints.


Relational model concept


• Relation = Schema + Value
• Relation schema: R(A1:D1, …, An:Dn)
• R is the name of the relation
• Domains (D1, …, Dn) are sets of atomic values; domains in R are not
necessarily all distinct
• Attributes A1, …, An are all distinct in R
• The structure (A1:D1, …, An:Dn) is the data type of the relation's tuples

Relational model concept


• Domains carry important structural information
• Domains define comparability of values: attribute values can only be
meaningfully compared if they range over the same domain (= type checking
in programming languages)
• Domains have been underused: in SQL and RDBMSs in general, domains
are restricted to the data types of traditional programming languages (e.g.,
integer, real, character) and date
• In more modern languages (e.g., object-oriented), domains are application
dependent (e.g., employee names, salaries)

Relational model concept


• Relational database ≈ collection of relations
• Relational database schema = relation schemas + integrity constraints
• Database value (extension) = collection of relation values


Relational model concept


• Several schemas
• conceptual: information content from the real world (typically, entity-relationship or
class diagrams of object models)
• logical: information expressed in the data model of the DBMS (typically relational)
• physical: information organized for storage media


Relational model concept


• Base and derived relations
• Base relations are those in the community schema; they are stored on disk
• Derived relations (views or external schemas) are defined by combining base
and other derived relations using operations of the relational model
• The definition of derived relations and their correspondence with base relations
are part of the database schema


Relational model concept


• Base and derived relations
• Derived relations can be presented as views to application programs
• Views are redundant and consistent with the underlying base relations
• Views simplify application programs
• Views may or may not be materialized (that is stored on disk): this is an
efficiency issue that should be under the control of the DBMS and
invisible to users

Relational model concept


• Base and derived relations
• Storing a view
• Introduces physical redundancy and complicates integrity enforcement
• Accelerates querying and slows updating.


Relational model concept


• Derived relations called snapshots
• Derived relations that are not synchronized at all times with base relations
• Refreshed from base relations at regular intervals
• Users of snapshots agree to work with data that is not up to date in order to gain
efficiency in access times (particularly for distributed data)
• To easily establish consistency at refreshing times, snapshots may be restricted
to read-only access.


Relational model constraints


• Constraint = any prescription (or assertion) on the schema (that is valid for
all database extensions) not defined in the data structure part of the schema
• Constraints cannot be deduced from the current extension of the database
(they are part of the database schema)


Relational model constraints


• Relational constraints can be classified as
• keys (super keys, candidate keys, primary key)
• referential integrity
• “ad-hoc” constraints (all the constraints specific to an application domain)


Relational model constraints


• Super key = one or more attributes that (together) possess the property of
unique tuple identification
• their values always identify at most one tuple in the relation
• uniqueness constraint = no two tuples with the same value for those attributes
• The set of all attributes of a relation is a super key (because the value of a
relation is a set of tuples)


Relational model constraints


• A key is a minimal super key, that is, a group of attributes that loses the
property of unique identification if any one attribute is removed from the
group.
• In general, a relation may have several keys; we call them candidate keys
• The definition of keys is intensional information (it belongs to the schema)
⇒ it must be satisfied by all legal extensions of the relation
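
As a hedged illustration of super keys and minimality, the sketch below tests attribute sets against one sample extension. The rows and the function names are assumptions made for this example. Note that the code can only tell whether a given extension is consistent with a key; it cannot establish that a key holds for the schema.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows of this extension agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def minimal_superkeys(rows, schema):
    """Attribute sets that identify rows uniquely and stop doing so
    as soon as any one attribute is removed."""
    found = []
    for size in range(1, len(schema) + 1):
        for attrs in combinations(schema, size):
            if any(set(key) <= set(attrs) for key in found):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(rows, attrs):
                found.append(attrs)
    return found


schema = ("SID", "SName", "SAge")
rows = [
    {"SID": 1, "SName": "Alice", "SAge": 20},
    {"SID": 2, "SName": "Bob", "SAge": 20},
    {"SID": 3, "SName": "Alice", "SAge": 22},
]

print(is_superkey(rows, schema))        # True
print(is_superkey(rows, ("SName",)))    # False
print(minimal_superkeys(rows, schema))  # [('SID',), ('SName', 'SAge')]
```

In this sample, (SName, SAge) happens to be unique, but a fourth student could easily break it. That is why keys are declared in the schema rather than inferred from the data.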


Relational model constraints


• The relational model expects each relation to have one primary key (SQL lets you declare one, but does not force every table to have it)
• If a relation has several keys, one of them is privileged as primary key
• The values of a primary key should always be known (while the values of
non-primary keys may be null)
• In practice, the primary key is the most useful candidate key, which is naturally
suggested by database design from the application domain (e.g., SID at a university)
• For relations that express a relationship (e.g., WorksOn), the primary key is the
combination of the primary keys of the entity relations involved in the relationship


Relational model constraints


• A constraint involving two relations R1 and R2 (or the same relation
twice) is called referential integrity:
• There must exist in R2 attribute(s) A2 (the foreign key) such that
• A2 has (have) the same domain as the primary key of R1
• each value of A2 in a tuple of R2 occurs as a value of the primary key in a tuple of R1
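
The referential integrity rule can be sketched as a check over two relations. This is a minimal illustration: the Department and Employee rows are invented, and treating a null reference as acceptable is an assumption of the sketch.

```python
def dangling_references(r2_rows, fk_attrs, r1_rows, pk_attrs):
    """Return the tuples of R2 whose A2 value matches no primary key of R1."""
    pk_values = {tuple(row[a] for a in pk_attrs) for row in r1_rows}
    dangling = []
    for row in r2_rows:
        fk_value = tuple(row[a] for a in fk_attrs)
        if None in fk_value:
            continue  # assumption: a null reference does not violate the rule
        if fk_value not in pk_values:
            dangling.append(row)
    return dangling


department = [{"DNumber": 5}, {"DNumber": 4}]   # R1, primary key DNumber
employee = [                                     # R2, A2 = DNo
    {"SSN": "1", "DNo": 5},
    {"SSN": "2", "DNo": 9},     # no department 9: violates the constraint
    {"SSN": "3", "DNo": None},
]

print(dangling_references(employee, ["DNo"], department, ["DNumber"]))
# [{'SSN': '2', 'DNo': 9}]
```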


Relational model constraints


• Referential integrity is typically relational:
• it is ubiquitous in relational DBs
• it results from expressing links with equality of values
• it expresses comparability information (like domains) for two different attributes
• it expresses information that, in an entity-relationship (ER) schema, is taken
care of by the data structure (entities and relationships).


Relational model constraints


• Ad-hoc constraints are application-dependent constraints of arbitrary
complexity. They can be structural or transition constraints.
• Structural constraints
• the salary of each employee is smaller than that of his/her supervisor
• department managers are employees of the department that they manage
• department managers are supervisors
• there are at least 4 employees in each department


Relational model constraints


• Transition constraints (concern two successive states of the database)
• salaries are non-decreasing
• the sex of an employee does not change


Names for constraints


• Structural aspects of the relational model are presented as
constraints
• Key integrity: each relation has a primary key + possibly other candidate
keys
• Domain integrity: attributes must obey the definition of their
domain (i.e., a specific set of values and of applicable operations)
• Entity integrity: primary-key values may not be null

Names for constraints


• Candidate key integrity (= key uniqueness): key values must be
unique
• Column integrity: a constraint that supplements domain integrity
(e.g., MgrStartDate is a date after 1970)
• Row integrity: concerns a single tuple (e.g., if BDate is before
1950, then Salary is at least 40000)


Operations on the Relational Model


• Data Definition Language (DDL): declare a relation, specify a constraint,
define physical structures
• Data Manipulation Language (DML) for access and retrieval
• Algebra, domain calculus, and tuple calculus are generally considered to
be part of the model
• SQL has redefined the corresponding operations and defined some
extensions, notably for aggregate functions

Operations on the Relational Model


• Update operations:
• Update of tuples in the current value of a relation (insert, delete, modify)
• Update of the schema: create or delete a relation, add or remove an
attribute


Operations on the Relational Model


• Programming tuple manipulation: insertion and deletion
• If an insertion violates a constraint, then either reject the insertion or correct
the violation
• If a deletion violates a referential constraint, then either reject the
deletion, or propagate the deletion to the tuples that reference the
deleted tuple, or set to null the attribute values that reference the
deleted tuple (unless these attributes are part of the primary key).
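
The three reactions to a deletion that breaks a referential constraint can be sketched as follows. The policy labels "reject", "cascade" and "set_null" are names chosen for this example, the rows are invented, and the sketch assumes that the referencing attribute is not part of the child's primary key and that propagation goes one level deep only.

```python
def delete_with_policy(parents, children, pk, fk, key_value, policy):
    """Delete the parent tuple whose pk equals key_value.

    Returns new (parents, children) lists; raises ValueError on rejection.
    """
    referencing = [c for c in children if c[fk] == key_value]
    if referencing and policy == "reject":
        raise ValueError("deletion rejected: tuple is still referenced")
    if policy == "cascade":
        # propagate the deletion to the referencing tuples
        children = [c for c in children if c[fk] != key_value]
    elif policy == "set_null":
        # keep the referencing tuples but null out their reference
        children = [
            {**c, fk: None} if c[fk] == key_value else c for c in children
        ]
    elif policy != "reject":
        raise ValueError(f"unknown policy: {policy}")
    parents = [p for p in parents if p[pk] != key_value]
    return parents, children


departments = [{"DNumber": 5}, {"DNumber": 4}]
employees = [{"SSN": "111", "DNo": 5}, {"SSN": "222", "DNo": 4}]

print(delete_with_policy(departments, employees, "DNumber", "DNo", 5, "cascade"))
# ([{'DNumber': 4}], [{'SSN': '222', 'DNo': 4}])
print(delete_with_policy(departments, employees, "DNumber", "DNo", 5, "set_null"))
# ([{'DNumber': 4}], [{'SSN': '111', 'DNo': None}, {'SSN': '222', 'DNo': 4}])
```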


Relational Algebra
• Relational algebra:
• A collection of operations each acting on one or two relations and
producing one relation as result,
• A language for combining those operations
• The algebra has played a central role in the relational model: algebraic
operations characterize high-level set-at-a-time access


Relational Algebra
• The algebra in practice
• It was never a real user language (calculus-based languages and SQL are
simpler)
• Its semantics is clear and a de facto standard
• A precise syntax for the algebra is more complicated than its semantics


Relational Algebra
• It consists of six fundamental operations
• Unary operations: selection, projection and rename
• Binary operations: union, difference and Cartesian product
• It can be extended with other operations:
• set intersection,
• natural join,
• outer join and inner join,
• … etc.


Relational Algebra: selection


• Also called Restriction
• Select tuples satisfying a condition
• σcondition(R) ={r ∈ R | condition(r)}
• Intuition: selection is a “horizontal” slice of relation
• Operational meaning: the condition is applied to every tuple; if it is
satisfied, the tuple is kept in the answer.


Relational Algebra: selection


• Possible Forms of Conditions in Selection
• Simple condition = {〈attribute〉〈comparison〉〈attribute〉 |
〈attribute〉〈comparison〉〈constant〉}
• Comparisons: =, !=, <, ≤, >, ≥
• Condition: combination of simple conditions with AND, OR, NOT
• Simple conditions are the most frequent
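
Selection with simple and compound conditions can be sketched in Python as a filter over the tuples of a relation. The Employee rows and the helper name `select` are assumptions made for this illustration.

```python
def select(relation, condition):
    """sigma_condition(R): keep the tuples of R that satisfy the condition."""
    return [row for row in relation if condition(row)]


employee = [
    {"FName": "John", "DNo": 5, "Salary": 30000},
    {"FName": "Alicia", "DNo": 4, "Salary": 25000},
    {"FName": "Ramesh", "DNo": 5, "Salary": 38000},
]

# Simple condition: <attribute> <comparison> <constant>
dept5 = select(employee, lambda r: r["DNo"] == 5)

# Compound condition built from simple conditions with AND and NOT
paid_outside_5 = select(
    employee, lambda r: r["Salary"] >= 25000 and not r["DNo"] == 5
)

print([r["FName"] for r in dept5])           # ['John', 'Ramesh']
print([r["FName"] for r in paid_outside_5])  # ['Alicia']
```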

Relational Algebra: selection


• Also called Restriction
• Example on a student table with some records


Relational Algebra: projection


• Projection
• Retain a subset of the attributes (columns) of a relation
• πattributes(relation)

• Intuition: projection is a “vertical” slice of relation


• Duplicate Removal: the result of a projection is a relation
• ⇒ projection involves duplicate removal
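
A minimal sketch of projection, including the duplicate removal it implies, could look like this. The sample rows and the helper name `project` are assumptions for the example.

```python
def project(relation, attrs):
    """pi_attrs(R): keep only the listed attributes and drop duplicate tuples."""
    result = []
    for row in relation:
        projected = {a: row[a] for a in attrs}
        if projected not in result:  # duplicate removal keeps the result a relation
            result.append(projected)
    return result


employee = [
    {"FName": "John", "DNo": 5},
    {"FName": "Alicia", "DNo": 4},
    {"FName": "Ramesh", "DNo": 5},
]

print(project(employee, ["DNo"]))  # [{'DNo': 5}, {'DNo': 4}]
```

Three tuples go in but only two come out, because two employees share the same department number.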

Relational Algebra: projection


• Many things are wrong with duplicate tuples:
• “If something is true, then saying it twice does not make it more true”
even if repetition is a proven pedagogical technique
• Identical things are indistinguishable, there is no need to represent them
twice
• Objects in the real world have only one thing in common: they are all
different.

Relational Algebra: projection


• Many things are wrong with duplicate tuples:
• Distinct things should have some value to distinguish them
• It is difficult to manipulate “duplicate tuples” except for counting,
averaging, … etc.
• The obvious modeling of identical objects is as a common description
plus the number of copies of the object.


Relational Algebra: combination


• Algebraic operations can be combined
• Nested form: πFName,LName,Salary(σDNo=5(Employee))
• Sequential form: Temp ← σDNo=5(Employee)
R(FirstName,LastName,Salary)←πFName,LName,Salary(Temp)
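
The nested and sequential forms above can be mirrored in Python. This is a sketch under assumptions: the helpers `select`, `project` and `rename` and the Employee rows are written for this example only.

```python
def select(relation, condition):
    return [row for row in relation if condition(row)]


def project(relation, attrs):
    result = []
    for row in relation:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result


def rename(relation, mapping):
    """Rename attributes according to mapping (oldname -> newname)."""
    return [{mapping.get(a, a): v for a, v in row.items()} for row in relation]


employee = [
    {"FName": "John", "LName": "Smith", "Salary": 30000, "DNo": 5},
    {"FName": "Alicia", "LName": "Zelaya", "Salary": 25000, "DNo": 4},
]

# Nested form: the selection result is passed directly to the projection.
nested = project(
    select(employee, lambda r: r["DNo"] == 5), ["FName", "LName", "Salary"]
)

# Sequential form: a named intermediate relation, then projection and renaming.
temp = select(employee, lambda r: r["DNo"] == 5)
r = rename(
    project(temp, ["FName", "LName", "Salary"]),
    {"FName": "FirstName", "LName": "LastName"},
)

print(nested)  # [{'FName': 'John', 'LName': 'Smith', 'Salary': 30000}]
print(r)       # [{'FirstName': 'John', 'LastName': 'Smith', 'Salary': 30000}]
```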


Relational Algebra: combination


• Several relational algebra operations may be needed to express a
given request:
• by nesting several algebraic operations within a single relational algebra
expression
• by applying operations one at a time in a sequence of steps and creating
named intermediate relations by assignment operations
• the correspondence between nested form and sequential form is
immediate

Relational Algebra: combination


• Nesting and closure
• the result of an algebraic operation is a relation
• algebraic operations can be nested like functions
• closure is essential for the full power of the algebra


Relational Algebra: combination


• Nesting and closure
• nesting is a classical functional composition
• closure makes nesting freely usable for combining operations: the
result of every algebraic operation is a relation and it can be used as
an operand of another algebraic operation


Relational Algebra: combination


• Nesting and closure
• closure can be violated by
• the definition of language structure (SQL does)
• operations whose result is not a relation (like “relations” with
duplicate tuples)


Relational Algebra: combination


• What is really needed is “compositionality”
• The result of a query can be used as argument for another query
• This is a version of what is called orthogonality (the generality of
combining pieces of the definition of a language)


Relational Algebra: set operations


Standard operations on sets are automatically applicable to relations
• Union compatibility
• relations have to be defined on the same domains
• type compatibility would be more adequate
• more precise definition: two relations R and S are union-compatible if there is a one-to-
one correspondence between attributes of R and attributes of S such that
corresponding attributes are associated with the same domain
• A mechanism for defining the attribute names of the result is needed

Relational Algebra: set operations


A relation is a set of tuples. Union, intersection, and difference on union-
compatible relations R and S have their usual meaning:
• R∪S = all tuples (without duplication) in R, in S, or in both
• R∩S = all tuples in both R and S (also equal to R−(R−S))
• R−S = all tuples in R but not in S
• Union compatibility is similar to type checking in programming
languages
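
Because a relation is a set of tuples, Python's built-in set type gives a direct, if simplified, illustration of these operations. The two relations below are invented and are union-compatible by construction (same number of attributes, same types).

```python
# Two union-compatible relations, each tuple being (number, letter).
R = {(1, "a"), (2, "b"), (3, "c")}
S = {(2, "b"), (3, "c"), (4, "d")}

union = R | S          # tuples in R, in S, or in both (no duplicates)
intersection = R & S   # tuples in both R and S
difference = R - S     # tuples in R but not in S

print(sorted(union))         # [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')]
print(sorted(intersection))  # [(2, 'b'), (3, 'c')]
print(sorted(difference))    # [(1, 'a')]

# Intersection can be derived from difference alone.
print(intersection == R - (R - S))  # True
```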

Relational Algebra: set operations


• Cartesian product
• The Cartesian product associates every tuple of the first relation
with every tuple of the second one
• The result relation has all the attributes of the operand relations: if
the attributes of the operand relations are not all distinct, some of
them must be renamed.


Relational Algebra: set operations


• Cartesian product
• The relational algebra thus needs a renaming operation, with, as a
possible syntax:
• rename(relation name, (oldname→newname, ...))
• Example: rename(DeptLocations, (DNumber→DNo))


Relational Algebra: set operations


• Cartesian product
• The Cartesian product associates every tuple of one relation
with every tuple of the other; on its own, this is rarely a useful
operation
• Combining it with a selection and a projection is more useful:
Temp ← Department × DeptLocations
πDNumber,DName,MgrSSN,MgrStartDate,DLocation(σDNumber=DNo(Temp))
• The result of these operations is a JOIN of relations
Department and DeptLocations that associates each department
with its locations
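
The Department and DeptLocations example can be traced step by step in Python. This is a sketch with invented sample rows; the MgrSSN and MgrStartDate attributes are left out to keep it short, and the helpers `rename` and `product` are written for the illustration.

```python
def rename(relation, mapping):
    """Rename attributes according to mapping (oldname -> newname)."""
    return [{mapping.get(a, a): v for a, v in row.items()} for row in relation]


def product(r, s):
    """R x S: pair every tuple of R with every tuple of S.
    Attribute names of R and S must be distinct."""
    return [{**x, **y} for x in r for y in s]


department = [
    {"DNumber": 5, "DName": "Research"},
    {"DNumber": 4, "DName": "Administration"},
]
dept_locations = [
    {"DNumber": 5, "DLocation": "Houston"},
    {"DNumber": 5, "DLocation": "Bellaire"},
    {"DNumber": 4, "DLocation": "Stafford"},
]

# Both relations have a DNumber attribute, so one of them is renamed first.
locations = rename(dept_locations, {"DNumber": "DNo"})

temp = product(department, locations)                            # Cartesian product
matched = [row for row in temp if row["DNumber"] == row["DNo"]]  # selection
result = [                                                       # projection
    {a: row[a] for a in ("DNumber", "DName", "DLocation")} for row in matched
]

print(len(temp))    # 6
print(len(result))  # 3
```

Only 3 of the 6 combinations pair a department with one of its own locations, which is why the product alone says little.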


Relational Algebra: Join


• Join of relations
• The restriction retains only the tuples of the Cartesian product where the
values of DNumber and of DNo are equal
• The projection suppresses the DNo attribute
• All these operations (Cartesian product, selection, and projection)
can be expressed as a single algebra operation: the join
• The join is a fundamental operation for meaningfully creating bigger
relations from smaller ones, but it is not always the inverse of projection.


Relational Algebra: Join


• Join of relations
• Join (∞) combines two relations into one on the basis of a condition
• R ∞condition S = {concat(r,s) | r ∈ R ∧ s ∈ S ∧ condition(r,s)}
• Remember that, if R and S have an attribute in common, then
some attributes of R and/or S have to be renamed


Relational Algebra: Join


• Join of relations
• The condition can be more general than a test for equality of 2
attribute values
• simple condition = 〈R’s attribute〉〈comparison〉〈S’s attribute〉
• comparison: =, !=, <, ≤, >, ≥
• condition: combination of simple conditions with AND

Relational Algebra: Join


• Join of relations
• Two ways of defining the semantics of joins
• declarative: create a tuple in the result for each pair of tuples in
the relation arguments that satisfy the condition
• operational(evaluation strategy): the condition is applied to
every tuple of the Cartesian product; if it is satisfied, the tuple
is kept in the answer.

Relational Algebra: Join


• Join of relations
• The join can be evaluated as a combination of Cartesian
product, selection, and projection
• But this is not an efficient evaluation strategy;
• There are various strategies for implementing joins;


Relational Algebra: Join


• Join of relations
• A basic method to implement the operation definition above
is with nested loops:
• R ∞A=B S =
    for each r ∈ R do
        for each s ∈ S do
            if r.A = s.B then
                concat(r,s) ⇒ result
            fi
        end
    end
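
The nested-loop method translates almost line for line into Python. In this sketch the relations, the attribute names A, B, X, Y and the function name are assumptions; the condition is passed as a function so that the same loop serves for equality and for more general comparisons.

```python
def nested_loop_join(r, s, condition):
    """Join R and S by testing the condition on every pair of tuples."""
    result = []
    for x in r:                 # for each r in R
        for y in s:             # for each s in S
            if condition(x, y):
                result.append({**x, **y})  # concat(r, s) => result
    return result


r_rows = [{"A": 1, "X": "p"}, {"A": 2, "X": "q"}]
s_rows = [{"B": 2, "Y": "u"}, {"B": 3, "Y": "v"}, {"B": 2, "Y": "w"}]

equi = nested_loop_join(r_rows, s_rows, lambda x, y: x["A"] == y["B"])
theta = nested_loop_join(r_rows, s_rows, lambda x, y: x["A"] < y["B"])

print(equi)
# [{'A': 2, 'X': 'q', 'B': 2, 'Y': 'u'}, {'A': 2, 'X': 'q', 'B': 2, 'Y': 'w'}]
print(len(theta))  # 4
```

Every one of the |R| × |S| pairs is examined, which is why this method is simple but not efficient for large relations.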

Relational Algebra: Join


• Kinds of Join
• Theta Join: it is the general join (when all the attributes of the operand relations
appear in the result of the join and the join condition is not simply a test of equality of
2 attributes)
• Equijoin: when the join condition is a simple equality (e.g., A = B)
• Natural Join = equijoin + only one of the attributes tested for equality is included in
the result.
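
A natural join can be sketched as an equijoin that keeps a single copy of the compared attributes. The sketch assumes the usual convention that the join attributes are those sharing the same name in both relations; the rows are invented.

```python
def natural_join(r, s):
    """Equijoin on all attribute names shared by R and S,
    keeping only one copy of each shared attribute."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]
    result = []
    for x in r:
        for y in s:
            if all(x[a] == y[a] for a in common):
                # merging the dicts keeps one copy of the equal attributes
                result.append({**x, **y})
    return result


employee = [{"FName": "John", "DNo": 5}, {"FName": "Alicia", "DNo": 4}]
department = [{"DNo": 5, "DName": "Research"}]

print(natural_join(employee, department))
# [{'FName': 'John', 'DNo': 5, 'DName': 'Research'}]
```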


Relational Algebra: Join


• Relative Order of Selection and Join
• Selection on an operand of the join, or selection “before” or
“inside” join
• Selection on the result of the join, or selection “after” or
“outside” join


Relational Algebra: Join


• Query optimization
• The two formulations of the query above are equivalent (it
should be clear that they produce the same result)
• The first one does the selection before the join (i.e., the result
of the selection serves as operand of the join), while the
second one evaluates the selection on the result of the join
• Such equivalences are frequent in the relational algebra
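
As an illustration of this kind of equivalence, the sketch below runs a selection after a join and the same selection before the join, then compares the results. The relations, the salary threshold and the helper names are assumptions made for the example.

```python
def select(relation, condition):
    return [row for row in relation if condition(row)]


def join(r, s, condition):
    return [{**x, **y} for x in r for y in s if condition(x, y)]


def same_dept(e, d):
    return e["DNo"] == d["DNumber"]


def high_paid(row):
    return row["Salary"] > 28000


employee = [
    {"FName": "John", "DNo": 5, "Salary": 30000},
    {"FName": "Alicia", "DNo": 4, "Salary": 25000},
    {"FName": "Ramesh", "DNo": 5, "Salary": 38000},
]
department = [
    {"DNumber": 5, "DName": "Research"},
    {"DNumber": 4, "DName": "Administration"},
]

# Selection "after" (outside) the join
selection_after = select(join(employee, department, same_dept), high_paid)

# Selection "before" (inside) the join: valid because the condition
# only involves attributes of employee
selection_before = join(select(employee, high_paid), department, same_dept)

print(selection_after == selection_before)  # True

# Pairs of tuples the join has to examine in each formulation
pairs_after = len(employee) * len(department)
pairs_before = len(select(employee, high_paid)) * len(department)
print(pairs_after, pairs_before)            # 6 4
```

Both forms give the same answer, but selecting first leaves the join with fewer pairs to examine.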

Relational Algebra: Join


• Query optimization
• Query optimizers of current relational technology are able to
perform comparative evaluation of performance and select a
good strategy


Relational Algebra: Join


• Relational Completeness
• A language is “relationally complete” if it has at least the power of the
relational algebra
• Relational completeness is the only widely-accepted measure of power
(besides computational completeness)
• Domain calculus and tuple calculus have the same power of expression
as the relational algebra

Relational Algebra: Join


• Equivalences (Theory)
• Not all operations of the algebra are independent
• {σ, π, ∪, −, ×} is a complete set: it has all the expressive power
of the algebra
• ∩,∞, and ÷ can be derived from them


Relational Algebra: Join


• Equivalences (Theory)
• R∩S = (R∪S) − ((R−S) ∪ (S−R)) = R − (R − S)
• R ∞condition S = σcondition(R×S)
• R ∗condition S = πattr(σcondition(R×S))
• … etc.

Relational Algebra: Join

• σc1∧c2(R) =σc1(σc2(R))
• σc1(σc2(R)) =σc2(σc1(R))
• If A⊆A1, then πA(R) =πA(πA1(R))
• πA(σc(R)) = σc(πA(R)) if attributes in c ⊆ attributes in A
• σc(R ∞ S) = σc(R) ∞ S if attributes in c ⊆ attributes in R

Relational Algebra: Join

• πA,B(R ∞c S) = πA(R) ∞c πB(S) if c involves only
attributes in A of R and in B of S
• πA,B(R ∞c S) = πA,B(πA,A1(R) ∞c πB,B1(S)) if c involves
attributes in A, A1 of R and in B, B1 of S


Relational Algebra: Join

• (R∞S)∞T=R∞(S∞T)
• σc(R∪S) =σc(R)∪σc(S)
• πA(R∪S) =πA(R)∪πA(S)


Relational Algebra: Join


• Outer Joins
• Outer Joins preserve information from the operands (outer joins are
“lossless”)
• Left Outer Join R ⟕ S: retains all tuples of the left operand relation
R; if, for a tuple of R, no matching tuple is found in S, the attribute
values corresponding to S in the result are set to null
• Right Outer Join R ⟖ S: retains all tuples of the right operand relation S
• Full Outer Join R ⟗ S: retains all tuples of both relations
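
A left outer join can be sketched as an ordinary join that pads unmatched tuples of the left operand with nulls (`None` in Python). The rows, the function name and the explicit `s_attrs` parameter are assumptions of this illustration.

```python
def left_outer_join(r, s, condition, s_attrs):
    """Keep every tuple of R; pad with None when no tuple of S matches."""
    result = []
    for x in r:
        matches = [y for y in s if condition(x, y)]
        if matches:
            result.extend({**x, **y} for y in matches)
        else:
            # no partner in S: the attributes of S are set to null
            result.append({**x, **{a: None for a in s_attrs}})
    return result


employee = [{"FName": "John", "DNo": 5}, {"FName": "Joyce", "DNo": 7}]
department = [{"DNumber": 5, "DName": "Research"}]

joined = left_outer_join(
    employee,
    department,
    lambda e, d: e["DNo"] == d["DNumber"],
    ["DNumber", "DName"],
)
print(joined)
# [{'FName': 'John', 'DNo': 5, 'DNumber': 5, 'DName': 'Research'},
#  {'FName': 'Joyce', 'DNo': 7, 'DNumber': None, 'DName': None}]
```

A right outer join can be obtained from the same function by swapping the two operands, and a full outer join by combining the left and right results.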