EE477 Lecture 2 - Relational Model
Relational Model
Data model
● A notation for describing data or information
● Consists of:
○ Structure of the data
○ Operations on the data
○ Constraints on the data
Structure of the data
● Referred to as a “conceptual model” of the data
● Higher level than “physical models”, i.e., data structures like arrays and lists
● Example: a relation consists of a schema, attributes, and tuples
Operations on the data
● Usually a limited set of operations that can be performed
○ Queries (operations that retrieve information)
○ Modifications (operations that change the database)
● This is a strength, not a weakness
○ Programmers can describe operations at a very high level
○ The DBMS implements them efficiently
○ Not easy to do when coding in C
● Example query:
SELECT *
FROM Movies
WHERE studioName = 'Disney'
AND year = 2013;
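To make the "high level" point concrete, here is a minimal sketch that runs the slide's query through Python's built-in sqlite3 module. The three-column Movies schema and the sample rows are assumptions made for illustration; the slide only shows the query itself.

```python
import sqlite3

# Hypothetical schema and sample rows, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movies (title TEXT, year INTEGER, studioName TEXT)")
conn.executemany(
    "INSERT INTO Movies VALUES (?, ?, ?)",
    [
        ("Frozen", 2013, "Disney"),
        ("Ponyo", 2008, "Ghibli"),
        ("Tangled", 2010, "Disney"),
    ],
)

# The query states what to retrieve; the DBMS decides how to scan the data.
rows = conn.execute(
    "SELECT * FROM Movies WHERE studioName = 'Disney' AND year = 2013"
).fetchall()
print(rows)
```

The same filter written by hand in C would need explicit file or array traversal, string comparison, and result buffering.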
Constraints on the data
● Limitations on the values the data may take
● Examples
○ Day of a week is an integer between 1 and 7
○ Age is larger than 0
○ Student IDs are unique
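The three example constraints can be sketched in SQL and checked with Python's sqlite3 module. The Students table and its column names are hypothetical; the slide lists the constraints only in words.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one column per constraint from the slide.
conn.execute(
    """
    CREATE TABLE Students (
        studentId INTEGER PRIMARY KEY,                     -- IDs are unique
        age       INTEGER CHECK (age > 0),                 -- age larger than 0
        classDay  INTEGER CHECK (classDay BETWEEN 1 AND 7) -- day of a week
    )
    """
)
conn.execute("INSERT INTO Students VALUES (1, 20, 3)")


def rejected(sql):
    """Return True if the DBMS refuses the statement as a constraint violation."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


duplicate_id_rejected = rejected("INSERT INTO Students VALUES (1, 21, 4)")
zero_age_rejected = rejected("INSERT INTO Students VALUES (2, 0, 4)")
bad_day_rejected = rejected("INSERT INTO Students VALUES (3, 21, 8)")
valid_row_rejected = rejected("INSERT INTO Students VALUES (4, 21, 7)")
print(duplicate_id_rejected, zero_age_rejected, bad_day_rejected, valid_row_rejected)
```

The point of declaring constraints in the schema is that the DBMS enforces them on every modification, rather than each application re-checking them.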
Data models
● Relational (most DBMSs)
● Key/Value (NoSQL)
● Graph (NoSQL)
● Document, i.e., semi-structured (NoSQL)
● Column-family (NoSQL)
● Array/Matrix (machine learning)
● Hierarchical (obsolete)
● Network (obsolete)
The relational model
● Structure
○ Based on tables (relations)
○ Looks like an array of structs in C, but this is just one possible implementation
○ In database systems, tables are not stored as main-memory structures; their implementation must take into account the need to access relations on disk
The relational model
● Operations
○ Relational algebra
○ E.g., all the rows where genre is “anime”
● Constraints
○ E.g., genre must be one of a fixed list of values; no two movies can have the same title
The semi-structured model
● Structure
○ Resembles trees or graphs, rather than tables or arrays
○ Represents data by hierarchically nested tagged elements
● Operations
○ Involve following a path from an element to its subelements
● Constraints
○ Involve types of values associated with tags
○ E.g., <Length> tag values are integers, each <Movie> element must have a <Length>
● Example:
<Movies>
  <Movie title="Oldboy">
    <Year>2003</Year>
    <Length>120</Length>
    <Genre>mystery</Genre>
  </Movie>
  <Movie title="Ponyo">
    <Year>2008</Year>
    …
</Movies>
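As a rough sketch of "following a path" and of checking a constraint on tags, the snippet below parses a small XML document with Python's standard xml.etree.ElementTree. The second movie contains only the element visible on the slide (the rest is elided there), so it deliberately fails the "must have a Length" check.

```python
import xml.etree.ElementTree as ET

# Sample document modeled on the slide; only fields shown there are included.
doc = """
<Movies>
  <Movie title="Oldboy">
    <Year>2003</Year>
    <Length>120</Length>
    <Genre>mystery</Genre>
  </Movie>
  <Movie title="Ponyo">
    <Year>2008</Year>
  </Movie>
</Movies>
"""
root = ET.fromstring(doc)

# Operation: follow the path Movies -> Movie -> Year for every movie.
years = {m.get("title"): int(m.findtext("Year")) for m in root.findall("Movie")}

# Constraint: every <Movie> must contain a <Length> element.
missing_length = [
    m.get("title") for m in root.findall("Movie") if m.find("Length") is None
]
print(years, missing_length)
```

Note that the check is written by the application here; a schema language for XML would let the constraint be declared instead.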
The key-value model
● Structure
○ (key, value) pairs
○ Key is a string or integer
○ Value can be any blob of data
● Operations
○ get(key), put(key, value)
○ Operations on values not supported
● Constraints
○ E.g., key is unique, value is not NULL
● Example:
key    value
1000   (oldboy, 2003)
1001   (ponyo, 2008)
1002   (frozen, 2013)
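A toy in-memory version of this model can be sketched in a few lines of Python. The class name KeyValueStore and the use of pickle to turn values into opaque blobs are assumptions for illustration, not a description of any particular key-value system.

```python
import pickle


class KeyValueStore:
    """Toy key-value store: the store never looks inside a value."""

    def __init__(self):
        self._data = {}  # a dict keeps each key unique

    def put(self, key, value):
        if value is None:
            raise ValueError("value must not be NULL")
        self._data[key] = pickle.dumps(value)  # stored as an opaque blob

    def get(self, key):
        return pickle.loads(self._data[key])


store = KeyValueStore()
store.put(1000, ("oldboy", 2003))
store.put(1001, ("ponyo", 2008))
store.put(1002, ("frozen", 2013))

# "Movies after 2005" is not an operation of the store: the application must
# fetch every value by key and inspect it on its own.
after_2005 = [k for k in (1000, 1001, 1002) if store.get(k)[1] > 2005]
print(store.get(1001), after_2005)
```

The list comprehension at the end is the cost of "operations on values not supported": filtering happens in application code, one get at a time.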
Comparison of modeling approaches
● Relational model
○ Simple and limited, but reasonably versatile
○ Limited, but useful operations
○ Efficient access to large data
○ A few lines of SQL can do the work of thousands of lines of C code
○ Preferred in DBMSs
● Semi-structured model
○ More flexible, but slower to query
● Key-value model
○ Even more flexible, but values cannot be queried
Basics of the relational model
● Relation: two-dimensional table containing data
● Schema: relation name and set of attributes
○ Movies(title, year, length, genre)
● Database schema: set of schemas for the relations of a database
● A tuple has one component for each attribute
○ (Oldboy, 2003, 120, mystery)
● The columns of the table are also called attributes or fields
Domains
● Each attribute has a domain
○ A particular elementary type, e.g., integer or string
○ Cannot be a type that can be broken down into components, e.g., set, list, or array
Equivalent representations of a relation
● A relation is a set of tuples (not a list)
● A schema is a set of attributes (not a list)
● Hence, the order of tuples or attributes of a relation is immaterial
Pop quiz
How many ways are there to represent this relation?
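The relation shown on the original slide is not available here, so the sketch below uses a hypothetical Movies relation with 3 tuples and 4 attributes (all values except the Oldboy row are made up). Since neither tuples nor attributes are ordered, a relation with n tuples and m attributes can be drawn as n! × m! different tables; the code enumerates them and confirms they all denote the same relation.

```python
from itertools import permutations
from math import factorial

attributes = ("title", "year", "length", "genre")
# Hypothetical instance; only the Oldboy row comes from the slides.
tuples = [
    ("Oldboy", 2003, 120, "mystery"),
    ("Ponyo", 2008, 100, "anime"),
    ("Frozen", 2013, 102, "animation"),
]

# Enumerate every drawing of the table: each column order with each row order.
tables = set()
for col_order in permutations(range(len(attributes))):
    header = tuple(attributes[i] for i in col_order)
    for row_order in permutations(tuples):
        body = tuple(tuple(row[i] for i in col_order) for row in row_order)
        tables.add((header, body))


def as_relation(header, body):
    """Forget both orderings: a set of tuples, each a set of (attribute, value)."""
    return frozenset(frozenset(zip(header, row)) for row in body)


relations = {as_relation(header, body) for header, body in tables}
expected = factorial(len(tuples)) * factorial(len(attributes))
print(len(tables), expected, len(relations))
```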
Relation instances
● A relation is not static
○ Tuples may be inserted, deleted, or updated
○ The schema can also change, but this can be expensive
● Instance: the set of tuples of a relation
● A conventional database system only maintains the current instance
Keys of relations
● Key: a set of attributes that uniquely identifies a record
● Many real-world databases use artificial keys such as IDs
○ Employee IDs, student IDs, license numbers, …
Keys of relations
● There can be multiple keys, but only one can be a primary key
Keys of relations
● Foreign key: attribute(s) whose value is a key of a record in another relation
Defining a relation schema in SQL
● SQL (Structured Query Language)
○ Principal language to describe and manipulate relational databases
○ Declarative
○ Supports
■ Data Definition Language (DDL): for declaring database schemas
■ Data Manipulation Language (DML): for querying and modifying the database
● Analogous to the distinction in C or Java between declaring data (DDL) and writing executable code (DML)
Relations in SQL
● Tables
○ Can be modified and queried
● Views
○ Relations defined by computation
○ Not stored, but constructed (in whole or in part) when needed
● Temporary tables
○ Constructed by SQL processor for queries and data modifications
○ Thrown away after use and not stored
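The difference between a stored table and a view can be sketched with sqlite3. The view name LongMovies, the 110-minute threshold, and the sample rows other than Oldboy are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Movies (title TEXT, year INTEGER, length INTEGER, genre TEXT)"
)
conn.execute("INSERT INTO Movies VALUES ('Oldboy', 2003, 120, 'mystery')")
conn.execute("INSERT INTO Movies VALUES ('Ponyo', 2008, 100, 'anime')")

# A view is a relation defined by a query; no rows are stored for it.
conn.execute(
    "CREATE VIEW LongMovies AS "
    "SELECT title, length FROM Movies WHERE length >= 110"
)
before = conn.execute("SELECT title FROM LongMovies ORDER BY title").fetchall()

# Changing the underlying table changes what the view returns next time.
conn.execute("INSERT INTO Movies VALUES ('Frozen', 2013, 115, 'animation')")
after = conn.execute("SELECT title FROM LongMovies ORDER BY title").fetchall()
print(before, after)
```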
Data types
● CHAR(n), VARCHAR(n)
○ If a CHAR(n) string has fewer than n characters, it is padded with spaces
○ E.g., if n = 4, 'foo' is stored as 'foo ' (one trailing space)
● BIT(n), BIT VARYING(n)
● BOOLEAN (can have TRUE, FALSE, and UNKNOWN)
● INT, SMALLINT (written SHORTINT in some textbooks)
● FLOAT, REAL, DOUBLE PRECISION, DECIMAL(n,d)
● DATE, TIME
Simple table declarations
● To create a table, use CREATE TABLE
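The slide's own example is not present in this text, so as a stand-in here is a minimal sketch that declares the MovieStar relation (the schema that appears later in these slides) and reads the declaration back. It uses SQLite through Python's sqlite3 module; SQLite accepts these type names but treats them more loosely than the SQL standard does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Declare the relation: a name plus a list of attributes with their types.
conn.execute(
    """
    CREATE TABLE MovieStar (
        name      CHAR(30),
        address   VARCHAR(30),
        gender    CHAR(1),
        birthdate DATE
    )
    """
)

# PRAGMA table_info is SQLite-specific; each row is
# (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(MovieStar)")]
print(columns)
```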
Modifying relation schemas
● To modify a table, use ALTER TABLE; to remove it entirely, use DROP TABLE
DROP TABLE R;
ALTER TABLE MovieStar ADD phone CHAR(16);
ALTER TABLE MovieStar DROP birthdate;
● After adding phone, existing tuples will have NULL values for attribute phone
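A short sketch of the NULL behavior, run against SQLite via Python's sqlite3 module. The sample tuple is made up. Only the ADD case is shown because dropping a column requires SQLite 3.35 or newer, which may not be the version bundled with a given Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MovieStar "
    "(name CHAR(30), address VARCHAR(30), gender CHAR(1), birthdate DATE)"
)
# Hypothetical tuple that exists before the schema change.
conn.execute(
    "INSERT INTO MovieStar VALUES ('Star A', '1 Example St', 'F', '1970-01-01')"
)

# Add a new attribute to the existing relation.
conn.execute("ALTER TABLE MovieStar ADD COLUMN phone CHAR(16)")

# The pre-existing tuple has NULL (None in Python) for the new attribute.
phone = conn.execute("SELECT phone FROM MovieStar").fetchone()[0]
print(phone)
```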
Default values
● If you prefer not to have NULLs, use default values
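The slide's example is missing from this text; the following is a sketch of how a DEFAULT clause might be used, with hypothetical default values ('?' and 'unlisted') and a made-up tuple.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE MovieStar (
        name   CHAR(30),
        gender CHAR(1)  DEFAULT '?',
        phone  CHAR(16) DEFAULT 'unlisted'
    )
    """
)

# Only name is supplied; the other attributes take their declared defaults
# instead of NULL.
conn.execute("INSERT INTO MovieStar (name) VALUES ('Star A')")
row = conn.execute("SELECT name, gender, phone FROM MovieStar").fetchone()
print(row)
```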
Declaring keys
● Option 1: declare a single attribute to be a key as part of its definition
● Option 2: add a separate declaration listing the attributes that form the key
○ Need to use this method for multiple-attribute keys
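Both styles of key declaration can be sketched as follows. The Studio table and the sample rows are assumptions for illustration; the composite key (title, year) on Movies is likewise a choice made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Style 1: the key is declared together with a single attribute.
conn.execute("CREATE TABLE Studio (name CHAR(30) PRIMARY KEY, address VARCHAR(255))")

# Style 2: a separate declaration, required when the key has several attributes.
conn.execute(
    """
    CREATE TABLE Movies (
        title  CHAR(100),
        year   INT,
        length INT,
        genre  CHAR(10),
        PRIMARY KEY (title, year)
    )
    """
)
conn.execute("INSERT INTO Movies VALUES ('Oldboy', 2003, 120, 'mystery')")
# Same title with a different year is a different key value, so it is accepted.
conn.execute("INSERT INTO Movies VALUES ('Oldboy', 2013, 104, 'mystery')")

try:
    # Same (title, year) pair as an existing tuple: rejected by the DBMS.
    conn.execute("INSERT INTO Movies VALUES ('Oldboy', 2003, 999, 'drama')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

movie_count = conn.execute("SELECT COUNT(*) FROM Movies").fetchone()[0]
print(duplicate_rejected, movie_count)
```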
Declaring keys
● Can also use UNIQUE instead of PRIMARY KEY to declare a key
● Some differences
○ Primary key attributes cannot have NULL values
○ There can only be one primary key, but there can be multiple unique keys
CREATE TABLE MovieStar (
    name CHAR(30),
    address VARCHAR(30),
    gender CHAR(1),
    birthdate DATE,
    UNIQUE (name, address)
);
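A small sketch of how UNIQUE behaves, using the MovieStar declaration with made-up tuples. It is run on SQLite, which for historical reasons does not enforce the no-NULL rule on most PRIMARY KEY columns, so the example sticks to demonstrating UNIQUE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE MovieStar (
        name      CHAR(30),
        address   VARCHAR(30),
        gender    CHAR(1),
        birthdate DATE,
        UNIQUE (name, address)
    )
    """
)
conn.execute("INSERT INTO MovieStar (name, address) VALUES ('Star A', '1 Example St')")

try:
    # Same (name, address) pair again: violates the UNIQUE declaration.
    conn.execute(
        "INSERT INTO MovieStar (name, address) VALUES ('Star A', '1 Example St')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Unlike primary key attributes, UNIQUE attributes may be NULL.
conn.execute("INSERT INTO MovieStar (name, address) VALUES ('Star A', NULL)")
null_rows = conn.execute(
    "SELECT COUNT(*) FROM MovieStar WHERE address IS NULL"
).fetchone()[0]
print(duplicate_rejected, null_rows)
```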
Declaring foreign-key constraints
● Declare attribute(s) of a relation to be a foreign key
○ The referenced attribute(s) of second relation must be UNIQUE or PRIMARY KEY
○ Values of the foreign key must appear in referenced attributes of some tuple
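The two rules above can be sketched with a pair of hypothetical relations, MovieExec and Studio, where Studio.presCertNum references MovieExec.certNum. Table names, column names, and rows are assumptions for illustration. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement

# The referenced attribute must be declared PRIMARY KEY or UNIQUE.
conn.execute("CREATE TABLE MovieExec (name CHAR(30), certNum INT PRIMARY KEY)")
conn.execute(
    """
    CREATE TABLE Studio (
        name        CHAR(30) PRIMARY KEY,
        presCertNum INT REFERENCES MovieExec(certNum)
    )
    """
)
conn.execute("INSERT INTO MovieExec VALUES ('Exec A', 456)")

# 456 appears as a certNum in MovieExec, so this tuple is accepted.
conn.execute("INSERT INTO Studio VALUES ('Studio A', 456)")

try:
    # 999 matches no MovieExec tuple, so the DBMS rejects the insertion.
    conn.execute("INSERT INTO Studio VALUES ('Studio B', 999)")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True

studio_count = conn.execute("SELECT COUNT(*) FROM Studio").fetchone()[0]
print(dangling_rejected, studio_count)
```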
Indexes
SELECT *
FROM Movies
WHERE year = 2008;
CREATE INDEX YearIndex ON Movies(year);
DROP INDEX YearIndex;
Inserting tuples
● A new tuple can be inserted into the relation R using an insertion statement.
○ For any missing attributes of R, the tuple has default values
○ If we provide values for all attributes, the list of attributes can be omitted
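Both forms of insertion can be sketched as follows. The DEFAULT 0 on length is an assumption added so that the effect of a declared default is visible; genre has no declared default, so it falls back to NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Movies "
    "(title TEXT, year INTEGER, length INTEGER DEFAULT 0, genre TEXT)"
)

# Attribute list given: the missing attributes (length, genre) get defaults.
conn.execute("INSERT INTO Movies (title, year) VALUES ('Ponyo', 2008)")

# Values for all attributes, in schema order: the attribute list can be omitted.
conn.execute("INSERT INTO Movies VALUES ('Oldboy', 2003, 120, 'mystery')")

rows = conn.execute("SELECT * FROM Movies ORDER BY year").fetchall()
print(rows)
```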
Deleting tuples
● Use a delete statement to delete every tuple satisfying a condition
○ The tuples to delete are described by a WHERE clause
○ Be careful: omitting the WHERE clause removes all tuples from table
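A brief sketch of both cases, with and without a WHERE clause. The reduced two-column Movies schema is an assumption for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movies (title TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO Movies VALUES (?, ?)",
    [("Oldboy", 2003), ("Ponyo", 2008), ("Frozen", 2013)],
)

# Delete every tuple that satisfies the condition.
conn.execute("DELETE FROM Movies WHERE year < 2005")
remaining = conn.execute("SELECT title FROM Movies ORDER BY year").fetchall()

# No WHERE clause: every tuple is removed, although the table itself remains.
conn.execute("DELETE FROM Movies")
count_after = conn.execute("SELECT COUNT(*) FROM Movies").fetchone()[0]
print(remaining, count_after)
```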
Updating tuples
● Change the components of existing tuples in the database
○ Multiple assignments are separated by commas
UPDATE Movies
SET length = 110, producer = 123
WHERE title = 'Ponyo'
AND year = 2008;
Tables discussion
● Attributes and tuples are not ordered
● Tables are “flat”
● Physical data independence
○ Tables can be implemented / stored on disk differently without affecting the application program
Table implementation
● This table can be stored “row major”, i.e., an array of objects
title year length genre
Table implementation
● or “column major”, i.e., one array per attribute
title year length genre
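The two layouts can be sketched with plain Python containers. The Ponyo row's length and genre are made-up values; real systems lay these structures out on disk pages rather than in Python lists.

```python
attributes = ("title", "year", "length", "genre")

# Row major: one object per tuple, so a whole tuple is read in one access.
rows = [
    ("Oldboy", 2003, 120, "mystery"),
    ("Ponyo", 2008, 100, "anime"),  # hypothetical length and genre
]

# Column major: one array per attribute, aligned by position.
columns = {name: [row[i] for row in rows] for i, name in enumerate(attributes)}

# Fetching one full tuple favors the row layout.
second_tuple = rows[1]

# Aggregating one attribute favors the column layout: only "length" is touched.
average_length = sum(columns["length"]) / len(columns["length"])

# Both layouts hold the same relation; zipping the columns restores the rows.
rebuilt_rows = list(zip(*(columns[name] for name in attributes)))
print(second_tuple, average_length, rebuilt_rows == rows)
```

Because of physical data independence, an application issuing SQL does not need to know which of the two layouts the DBMS chose.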