01-introduction
01-introduction
TO DAY ’ S AG E N DA
Course Logistics
Relational Model
Relational Algebra
WA I T L I S T
L E C T U R E RU L E S
C O U R S E OV E R V I E W
P RO J E C T S
C O U R S E LO G I S T I C S
C O U R S E LO G I S T I C S
P L AG I A R I S M WA R N I N G
DATA B A S E
DATA B A S E E X A M P L E
F L AT F I L E S T R AW M A N
F L AT F I L E S T R AW M A N
F L AT F I L E S T R AW M A N
F L AT F I L E S : DATA I N T E G R I T Y
F L AT F I L E S : I M P L E M E N TAT I O N
F L AT F I L E S : D U R A B I L I T Y
DATA B A S E M A N AG E M E N T S Y S T E M
DATA M O D E L S
DATA M O D E L S
E A R LY D B M S s
E A R LY D B M S s
E A R LY D B M S s
R E L AT I O N A L M O D E L
Key tenets:
→ Store database in simple data structures (relations).
→ Physical storage left up to the DBMS implementation.
→ Access data through high-level language, DBMS figures
out best execution strategy.
R E L AT I O N A L M O D E L
R E L AT I O N A L M O D E L
R E L AT I O N A L M O D E L : P R I M A R Y K E Y S
R E L AT I O N A L M O D E L : P R I M A R Y K E Y S
R E L AT I O N A L M O D E L : F O R E I G N K E Y S
R E L AT I O N A L M O D E L : F O R E I G N K E Y S
Artist(id, name, year, country)
id name year country
123 Wu-Tang Clan 1992 USA
ArtistAlbum(artist_id, album_id) 456 Notorious BIG 1992 USA
artist_id album_id 789 GZA 1990 USA
123 11
123 22 Album(id, name, year)
789 22 id name year
456 22 11 Enter the Wu-Tang 1993
22 St.Ides Mix Tape 1994
33 Liquid Swords 1995
DATA M A N I P U L AT I O N L A N G UAG E S ( D M L )
Procedural: ← Relational
→ The query specifies the (high-level) strategy Algebra
to find the desired result based on sets / bags.
R E L AT I O N A L A LG E B R A
R E L AT I O N A L A LG E B R A : S E L E C T
R(a_id,b_id)
Choose a subset of the tuples from a a_id b_id
R E L AT I O N A L A LG E B R A : P RO J E C T I O N
R(a_id,b_id)
Generate a relation with tuples that a_id
a1
b_id
101
contains only the specified attributes. a2 102
→ Can rearrange attributes’ ordering. a2 103
→ Can manipulate the values. a3 104
Πb_id-100,a_id(σa_id='a2'(R))
Syntax: A1,A2,…,An(R) b_id-100 a_id
2 a2
3 a2
R E L AT I O N A L A LG E B R A : U N I O N
R(a_id,b_id) S(a_id,b_id)
Generate a relation that contains all a_id b_id a_id b_id
a1 101 a3 103
tuples that appear in either only one a2 102 a4 104
or both input relations. a3 103 a5 105
(R ∪ S)
Syntax: (R ∪ S) a_id b_id
a1 101
a2 102
(SELECT * FROM R) a3 103
UNION ALL a3 103
(SELECT * FROM S); a4 104
a5 105
R E L AT I O N A L A LG E B R A : I N T E R S E C T I O N
R(a_id,b_id) S(a_id,b_id)
Generate a relation that contains only a_id b_id a_id b_id
a1 101 a3 103
the tuples that appear in both of the a2 102 a4 104
input relations. a3 103 a5 105
Syntax: (R ∩ S) (R ∩ S)
a_id b_id
a3 103
(SELECT * FROM R)
INTERSECT
(SELECT * FROM S);
R E L AT I O N A L A LG E B R A : D I F F E R E N C E
R(a_id,b_id) S(a_id,b_id)
Generate a relation that contains only a_id b_id a_id b_id
a1 101 a3 103
the tuples that appear in the first and a2 102 a4 104
not the second of the input relations. a3 103 a5 105
Syntax: (R – S) (R – S)
a_id b_id
a1 101
a2 102
(SELECT * FROM R)
EXCEPT
(SELECT * FROM S);
R E L AT I O N A L A LG E B R A : P RO D U C T
R(a_id,b_id) S(a_id,b_id)
Generate a relation that contains all a_id b_id a_id b_id
a1 101 a3 103
possible combinations of tuples from a2 102 a4 104
the input relations. a3 103 a5 105
(R × S)
Syntax: (R × S) R.a_id
a1
R.b_id
101
S.a_id
a3
S.b_id
103
a1 101 a4 104
a1 101 a5 105
SELECT * FROM R CROSS JOIN S; a2 102 a3 103
a2 102 a4 104
a2 102 a5 105
SELECT * FROM R, S; a3 103 a3 103
a3 103 a4 104
a3 103 a5 105
R E L AT I O N A L A LG E B R A : J O I N
R(a_id,b_id) S(a_id,b_id)
Generate a relation that contains all a_id b_id a_id b_id
tuples that are a combination of two a1 101 a3 103
R E L AT I O N A L A LG E B R A : E X T R A O P E R ATO R S
Rename (ρ)
Assignment (R←S)
Duplicate Elimination (δ)
Aggregation (γ)
Sorting (τ)
Division (R÷S)
O B S E R VAT I O N
R E L AT I O N A L M O D E L : Q U E R I E S
DATA M O D E L S
Relational
Key/Value
Graph
Document / Object ← Leading Alternative
Wide-Column / Column-family
Array / Matrix / Vectors
Hierarchical
Network
Multi-Value
15-445/645 (Fall 2022)
45
D O C U M E N T DATA M O D E L
Artist R1(id,…)
⨝
ArtistAlbum R2(artist_id,album_id)
⨝
Album R3(id,…)
D O C U M E N T DATA M O D E L
Artist R1(id,…)
⨝
ArtistAlbum R2(artist_id,album_id)
⨝
Album R3(id,…)
D O C U M E N T DATA M O D E L
CONCLUSION
NEXT CLASS
Modern SQL
→ Make sure you understand basic SQL before the lecture.