Lec2 Notes

The document discusses tabular data representations and relational algebra operations that can be performed on them. It introduces the concept of relations (tables) and schemas. Common relationships like one-to-many and many-to-many are explained. Relational algebra operations like projection, selection, cross product, and join are defined. Examples are provided to demonstrate how to use these operations to extract relevant records from multiple tables, such as finding band shows for a specific band. SQL is introduced as a popular embodiment of relational algebra that allows users to declaratively specify what they want without specifying how it will be achieved.

Uploaded by

hancocker


9/16/2019 Lecture 2

Recap diagram from last time. (Slide 1)

Going to start with how to represent data.

Tabular data is the norm -- could be CSVs, spreadsheets, databases or dataframes

Today we're going to talk about tabular data representations ("relations") and operations
over them ("relational algebra" + SQL)

Basic tabular representation

typed records
ordered / unordered?
nested?

Example:

bandfan.com

members
id
names
birthdays
addresses
emails

sam, 1/1/2000, 32 vassar st, srmadden

tim, 1/2/1990, 46 pumpkin st, timk
...

bands
id
name
genre
...

We call the names and types of fields in a table the "schema"
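
A minimal sketch of what a schema looks like in practice, using Python's built-in sqlite3 module. The column types below are assumptions (the notes list field names but not types), and the singular column names are chosen for illustration.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# The schema: field names plus declared types (the types are assumed here).
conn.execute("""
    CREATE TABLE members (
        id       INTEGER PRIMARY KEY,
        name     TEXT,
        birthday TEXT,   -- kept as text for simplicity
        address  TEXT,
        email    TEXT
    )
""")
conn.execute("CREATE TABLE bands (id INTEGER PRIMARY KEY, name TEXT, genre TEXT)")

# The two example records from the notes.
conn.executemany(
    "INSERT INTO members (name, birthday, address, email) VALUES (?, ?, ?, ?)",
    [
        ("sam", "1/1/2000", "32 vassar st", "srmadden"),
        ("tim", "1/2/1990", "46 pumpkin st", "timk"),
    ],
)

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
schema = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(members)")]
print(schema)
```

The printed list of (name, type) pairs is the schema of the members table as the database sees it.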

Key challenge:
data needs to capture *relationships* between multiple data sets

examples:
members are fans of bands
bands play in shows
...
employees in departments working on projects
musicians in bands signed with labels
students in classes in universities
cars made by manufacturers bought by customers
parents with children who attend school
patients of doctors in different hospitals
...

how to represent these relationships?

why is this complicated?

Different types of relationships:

One to many: each member is a fan of many bands
Many to many: each band plays multiple shows, multiple bands can play at a
show

How to represent this:

Try 1 (Show slide)

Member-band-fans
What's wrong with this representation?
Duplicate info - why is that bad?
Inconsistency
Wasted space
No ability to represent missing data
Add NULL?

Try 2:
Still redundant information

Try 3:
Eliminates redundancy

This is a general approach: for many to many relationships, create a relationship table
to eliminate redundancy
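
A sketch of the relationship-table approach, assuming a member_band_fans table with fanid and bandid columns (the column names follow the join expressions later in these notes; the sample rows are made up).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE bands   (id INTEGER PRIMARY KEY, name TEXT, genre TEXT);

    -- Relationship table: one row per (member, band) pair, nothing else.
    CREATE TABLE member_band_fans (
        fanid  INTEGER REFERENCES members(id),
        bandid INTEGER REFERENCES bands(id),
        PRIMARY KEY (fanid, bandid)
    );

    INSERT INTO members VALUES (1, 'sam'), (2, 'tim');
    INSERT INTO bands   VALUES (1, 'creed', 'rock'), (2, 'other band', 'pop');
    INSERT INTO member_band_fans VALUES (1, 1), (2, 1), (2, 2);
""")

# Each member name and band name is stored exactly once;
# the relationship table only holds ids.
pairs = conn.execute("""
    SELECT members.name, bands.name
    FROM member_band_fans mbf
    JOIN members ON members.id = mbf.fanid
    JOIN bands   ON bands.id   = mbf.bandid
    ORDER BY members.name, bands.name
""").fetchall()
print(pairs)
```

Renaming a band now touches one row in bands, so the inconsistency problem from Try 1 does not arise.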

Generally works but can get complicated when you start adding complex restrictions;
for example, suppose we wanted to allow each member to be a fan of just one band per
genre?

It's not possible to represent this in a single table without duplicating information, or
requiring me to connect several tables together to do it

What about one to many relationships? Show slide -- can add a reference column to
the original table
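
A sketch of the reference-column approach. The slide is not reproduced here, so the example is an assumption: each band is signed with exactly one label, and a label can have many bands. Table names, column names, and rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE labels (id INTEGER PRIMARY KEY, name TEXT);

    -- One label has many bands; each band row points at its single label.
    CREATE TABLE bands (
        id      INTEGER PRIMARY KEY,
        name    TEXT,
        labelid INTEGER REFERENCES labels(id)
    );

    INSERT INTO labels VALUES (1, 'label a'), (2, 'label b');
    INSERT INTO bands  VALUES (1, 'band x', 1), (2, 'band y', 1), (3, 'band z', 2);
""")

# Count the bands on each label by following the reference column.
bands_per_label = conn.execute("""
    SELECT labels.name, COUNT(*)
    FROM bands JOIN labels ON labels.id = bands.labelid
    GROUP BY labels.name
    ORDER BY labels.name
""").fetchall()
print(bands_per_label)
```

No separate relationship table is needed because each band has at most one label to point at.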

How to devise a schema? Most common way is to write down the nature of the
relationships (one to many, many to one), as well as the attributes, and then the tables
that represent it. Sometimes people use what's called an entity relationship diagram.

This is 95% of what you need to know about database theory….

Study break

Part II - Operations on Relations

We're going to study lots of different ways to manipulate tables -- and of course it's
possible to perform arbitrary transformations over them with programs.

Suppose we just want to focus on the problem of extracting a set of records of interest
from a collection of tables.

We need to find a way to extract columns and rows of interest, and a way to follow
paths from one table to another. A fancy name for this is a relational algebra.
Here, a relation is just a table with a schema, with unordered rows and no duplicates

Algebra just refers to the fact that we have a set of operations over relations that is
closed, i.e., each operation on a relation (or pair of relations) produces another relation.

Call a collection of relations a "database"

Main operations:

Projection (π(T,c1, …, cn)) -- select a subset of columns c1 .. cn

Selection (sel(T, pred)) -- select a subset of rows that satisfy pred
Cross Product (T1 x T2) -- combine two tables
Join (T1, T2, pred) = sel(T1 x T2, pred)
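
The four operations can be sketched in a few lines of Python. This is an illustration under assumptions, not how a real database implements them: a relation is a list of dicts, and column names are qualified with the table name (for example "bands.id") so that a cross product never has clashing keys. The sample rows are made up.

```python
from itertools import product

def dedup(rows):
    """Drop duplicate rows so the result behaves like a set-based relation."""
    seen, out = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out

def proj(table, *cols):
    """Projection: keep only the named columns."""
    return dedup({c: row[c] for c in cols} for row in table)

def sel(table, pred):
    """Selection: keep only the rows for which pred(row) is true."""
    return [row for row in table if pred(row)]

def cross(t1, t2):
    """Cross product: every row of t1 paired with every row of t2."""
    return [{**r1, **r2} for r1, r2 in product(t1, t2)]

def join(t1, t2, pred):
    """Join, defined exactly as sel(T1 x T2, pred)."""
    return sel(cross(t1, t2), pred)

bands = [
    {"bands.id": 1, "bands.name": "creed"},
    {"bands.id": 2, "bands.name": "other band"},
]
shows = [
    {"shows.bandid": 1, "shows.date": "2019-09-20"},
    {"shows.bandid": 2, "shows.date": "2019-09-21"},
    {"shows.bandid": 1, "shows.date": "2019-10-05"},
]

print(len(cross(bands, shows)))
print(proj(shows, "shows.bandid"))
print(join(bands, shows, lambda r: r["bands.id"] == r["shows.bandid"]))
```

Every function takes relations and returns a relation, which is the closure property described above.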

Example showing how join & select work -- find creed shows

Plus various set operations (UNION, DIFFERENCE, etc)

Notice that basic ops are all set oriented -- i.e., they produce another valid relation

Although we won't go into it much, one of the cool properties of these operations is that
they obey interesting algebraic identities that allow a system that executes relational
algebra expressions to choose the order in which it does work, for example:

sel reordering
Sel1(Sel2(A)) = Sel2(Sel1(A))
sel push down
Sel(A join B, predA and predB) = Sel(A, predA) join Sel(B, predB)
(valid when predA mentions only columns of A and predB only columns of B)
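
Both identities can be checked on small made-up relations. The push down check below assumes the filter splits into one part that mentions only A's columns and one part that mentions only B's columns; relations are lists of dicts, which is a choice made for this sketch.

```python
from itertools import product

def sel(table, pred):
    return [row for row in table if pred(row)]

def join(t1, t2, pred):
    return [{**r1, **r2} for r1, r2 in product(t1, t2) if pred({**r1, **r2})]

def as_set(table):
    """Order-insensitive view of a relation, for comparing results."""
    return {tuple(sorted(row.items())) for row in table}

A = [{"a.id": i, "a.x": i % 3} for i in range(6)]
B = [{"b.id": i, "b.aid": i % 6, "b.y": i % 2} for i in range(8)]

p1 = lambda r: r["a.x"] > 0            # mentions only A's columns
p2 = lambda r: r["a.id"] % 2 == 0      # mentions only A's columns
pb = lambda r: r["b.y"] == 1           # mentions only B's columns
on = lambda r: r["a.id"] == r["b.aid"] # join condition

# sel reordering: the order of two filters does not matter.
reorder_ok = as_set(sel(sel(A, p2), p1)) == as_set(sel(sel(A, p1), p2))

# sel push down: filter after the join vs. filter each input first.
late = sel(join(A, B, on), lambda r: p1(r) and pb(r))
early = join(sel(A, p1), sel(B, pb), on)
pushdown_ok = as_set(late) == as_set(early)

print(reorder_ok, pushdown_ok)
```

The early version joins smaller inputs, which is why a query optimizer prefers it when the identity applies.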

Find the dates of Creed shows

proj ( join (sel(bands,name="creed"), shows, shows.bandid = bands.id), shows.date)
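
The same expression evaluated step by step in Python, as a sketch. The helper functions, the qualified column names, and the rows in bands and shows are assumptions made for illustration.

```python
from itertools import product

def sel(table, pred):
    return [row for row in table if pred(row)]

def join(t1, t2, pred):
    return sel([{**r1, **r2} for r1, r2 in product(t1, t2)], pred)

def proj(table, *cols):
    # Build a set so duplicate output rows collapse, then sort for stable output.
    return sorted({tuple(row[c] for c in cols) for row in table})

bands = [
    {"bands.id": 1, "bands.name": "creed"},
    {"bands.id": 2, "bands.name": "other band"},
]
shows = [
    {"shows.id": 10, "shows.bandid": 1, "shows.date": "2019-09-20"},
    {"shows.id": 11, "shows.bandid": 2, "shows.date": "2019-09-21"},
    {"shows.id": 12, "shows.bandid": 1, "shows.date": "2019-10-05"},
]

# Innermost first: select the band, join it to shows, project the date.
creed = sel(bands, lambda r: r["bands.name"] == "creed")
creed_shows = join(creed, shows, lambda r: r["shows.bandid"] == r["bands.id"])
dates = proj(creed_shows, "shows.date")
print(dates)
```

Selecting the band before joining keeps the intermediate result small, since only one band row is paired with the shows table.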

Show data flow diagram slide

This suggests a natural implementation. We aren't going to talk much about
implementations of the low-level operators or executors, although we will revisit them a bit
later, but it's good to have a mental model of this.

Ex 2: Find the bands tim likes

Mbf = Member-band-fans

join(join(sel(fans, name='tim'), mbf, mbf.fanid = fans.id), bands, bands.id = mbf.bandid)

SQL -- most popular physical embodiment of relational algebra

Show a few example SQL queries (see SQL queries)
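
As a sketch, the two relational algebra examples above could be written in SQL roughly as follows and run with Python's built-in sqlite3 module. The table and column names follow the algebra expressions (bands, shows, fans, mbf); the rows are made up, and these are not necessarily the queries shown in class.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bands (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE shows (id INTEGER PRIMARY KEY, bandid INTEGER, date TEXT);
    CREATE TABLE fans  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE mbf   (fanid INTEGER, bandid INTEGER);

    INSERT INTO bands VALUES (1, 'creed'), (2, 'other band');
    INSERT INTO shows VALUES (10, 1, '2019-09-20'), (11, 2, '2019-09-21');
    INSERT INTO fans  VALUES (1, 'sam'), (2, 'tim');
    INSERT INTO mbf   VALUES (1, 1), (2, 1), (2, 2);
""")

# Find the dates of Creed shows.
creed_dates = conn.execute("""
    SELECT shows.date
    FROM bands JOIN shows ON shows.bandid = bands.id
    WHERE bands.name = 'creed'
""").fetchall()

# Find the bands tim likes.
tim_bands = conn.execute("""
    SELECT bands.name
    FROM fans
    JOIN mbf   ON mbf.fanid = fans.id
    JOIN bands ON bands.id  = mbf.bandid
    WHERE fans.name = 'tim'
    ORDER BY bands.name
""").fetchall()

print(creed_dates, tim_bands)
```

SELECT plays the role of projection, WHERE of selection, and JOIN ... ON of the join operator.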

Note that SQL is "declarative" -- we say what we want, not how to achieve it
Even for a simple selection, the system may be:
1) Iterating over the rows
2) Keeping the table sorted by primary key and doing a binary search
3) Keeping the data in some kind of tree structure and doing a logarithmic search
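
A sketch of the first two strategies on a made-up table of (id, name) rows. Both answer the same lookup; only the amount of work differs. The tree-based strategy is left out because Python's standard library has no balanced tree type.

```python
from bisect import bisect_left

# Made-up table: (id, name) rows, kept sorted by the primary key id.
rows = [(i, f"band{i}") for i in range(0, 1000, 2)]
keys = [r[0] for r in rows]

def scan_lookup(target):
    """Strategy 1: iterate over the rows until the key matches."""
    for row in rows:
        if row[0] == target:
            return row
    return None

def binary_lookup(target):
    """Strategy 2: binary search over the sorted primary keys."""
    i = bisect_left(keys, target)
    if i < len(keys) and keys[i] == target:
        return rows[i]
    return None

print(scan_lookup(498) == binary_lookup(498))
```

A caller cannot tell which strategy ran from the result alone, which is the point of a declarative interface.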

Note that as a user of a SQL database, you don't need to know how the system is
evaluating the query, or even what the physical representation of the data is.

SQL provides "physical data independence" -- of course there is some underlying
representation of the data, but no matter the representation, the same SQL queries will
still run over it.

This can be both a blessing and a curse -- cool because as a user you don't have to
worry about it, but bad because it can make understanding bad performance hard.

Show some examples of indexes / plans:

SELECT fans.name
FROM bands
JOIN band_likes bl ON bl.bandid = bands.id
JOIN fans ON fans.id = bl.fanid
WHERE bands.name = 'Justin Bieber'
Look at physical plan chosen
Note effect of creating an index on bands.name
For small bands table, has no effect
For larger table, will choose to use index
Depends on clustering
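
One way to look at a chosen plan, sketched with SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module. SQLite is an assumption here (the notes do not say which database was used in class), the rows and the index name idx_bands_name are made up, and the plan text differs between SQLite versions and between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bands (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE fans  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE band_likes (fanid INTEGER, bandid INTEGER);
""")
conn.executemany("INSERT INTO bands VALUES (?, ?)",
                 [(i, f"band{i}") for i in range(1, 1000)])
conn.execute("INSERT INTO bands VALUES (1000, 'Justin Bieber')")
conn.executemany("INSERT INTO fans VALUES (?, ?)", [(1, "sam"), (2, "tim")])
conn.executemany("INSERT INTO band_likes VALUES (?, ?)",
                 [(1, 1000), (2, 1000), (2, 1)])

QUERY = """
    SELECT fans.name
    FROM bands
    JOIN band_likes bl ON bl.bandid = bands.id
    JOIN fans ON fans.id = bl.fanid
    WHERE bands.name = 'Justin Bieber'
"""

def plan():
    # The last column of each EXPLAIN QUERY PLAN row is a readable plan step.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]

before_plan, before_rows = plan(), sorted(conn.execute(QUERY).fetchall())
conn.execute("CREATE INDEX idx_bands_name ON bands(name)")
after_plan, after_rows = plan(), sorted(conn.execute(QUERY).fetchall())

print(before_plan)
print(after_plan)
```

The query text and its results are identical before and after the index is created; only the plan is allowed to change, which is physical data independence in action.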
