Final Lec


Administrivia

Final Exam
Tuesday, 5/20, 5-8 pm
Cumulative, with stress on end-of-semester material
2 crib sheets
Final Review Session
Watch for announcement

Office Hours
Next week
Tentative office hours on 5/15, watch the web page

As you study...
"Reading maketh a full man; conference a ready
man; and writing an exact man."
-Francis Bacon
"If you want truly to understand something, try
to change it."
-Kurt Lewin
"I hear and I forget. I see and I remember. I do
and I understand."
-Chinese Proverb.
"Knowledge is a process of piling up facts;
wisdom lies in their simplification."
-Martin H. Fischer

Database Lessons to Live By

"If we do well here, we shall do well there: I can tell you
no more if I preach a whole year"
-John Edwin (1749-1790)

Recall Lecture 1!!

Lessons of Data Independence
High-level, declarative programming
Maintenance in the face of change
Automatic re-optimization

Data integrity
Declarative consistency (constraints, FDs)
Concurrent access, recovery from crashes.

Simplicity is Beautiful
The relational model is simple
simple query language means simple implementation model
basically just indexes, join algorithms, sorting, grouping!
simple data model means easy schema evolution
simple data model provides clean analysis of schemas (FDs & NFs are essentially automatic)
Every other structured data model has proved to be a wash
XML has found a niche, but not as a database
There's a reason that the backend of web search looks so much like a relational database.

Bulk Processing & I/O Go Together
Disks provide data a page at a time
Databases deal with data a set at a time
sets usually bigger than a page
means I/O costs are usually justified.
much better than other techniques, which are object-at-a-time
Set-at-a-time allows for optimization
can do bulk operations (e.g. sort or hash)
or can do things tuple-at-a-time (e.g. nested loops)
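
The contrast between tuple-at-a-time and bulk operations can be sketched with two ways of computing the same join. This is a minimal in-memory illustration, not a disk-based implementation; the function names and the sample `emps` and `depts` data are made up for the example.

```python
from collections import defaultdict

def nested_loops_join(r, s, key_r, key_s):
    """Tuple-at-a-time: compare every r tuple with every s tuple."""
    return [(a, b) for a in r for b in s if a[key_r] == b[key_s]]

def hash_join(r, s, key_r, key_s):
    """Bulk: build a hash table on r once, then probe it with each s tuple."""
    table = defaultdict(list)
    for a in r:
        table[a[key_r]].append(a)
    return [(a, b) for b in s for a in table.get(b[key_s], [])]

# Hypothetical sample data: (name, dept_id) and (dept_id, dept_name).
emps = [("alice", 10), ("bob", 20), ("carol", 10)]
depts = [(10, "db"), (20, "os"), (30, "ai")]

nl_result = nested_loops_join(emps, depts, 1, 0)
hj_result = hash_join(emps, depts, 1, 0)
```

Both functions return the same set of pairs. The nested loops version does |r| x |s| comparisons, while the hash version makes one pass over each input, which is the kind of choice an optimizer gets to make when the query is stated set-at-a-time.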

Optimize the Memory Hierarchy
DBMS worries about Disk vs. RAM
spend lotsa CPU cycles planning disk access
I/O cost hides the think time
Similar hierarchies exist in other parts of a computer
various caches on and off CPU chips
less time to spare optimizing here
Change is happening here!
Disk is the new tape
Flash is the new disk
RAM is really big

Query Processing is Predictable
Big queries take many predictable steps
unlike typical OS workloads, which depend on what small task users decide to do next
DBMSs can use this knowledge to optimize
For caching, prefetching, admission control, memory allocation, etc.
These lessons should be applied whenever you know your access patterns
again, especially for bulk operations!

Applied Algorithm Analysis

Know the practical costs of your algorithms
The optimizer needs to know anyway
How many disk I/Os are really needed to access a B+Tree?
In many applications, the bottlenecks determine the cost model
e.g. I/O is the traditional DB bottleneck
in another setting it might be network, or processor cache locality
this affects the practical analysis of the algorithm
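
As a rough worked example of the B+Tree question, the sketch below counts tree levels from the number of leaf pages and the fanout, then subtracts levels assumed to stay in the buffer pool. The function names, the fanout of 100, and the idea of "cached levels" as a single parameter are simplifying assumptions for illustration.

```python
def btree_levels(num_leaf_pages, fanout):
    """Levels from root to leaf (inclusive) for a tree with the given fanout."""
    if fanout < 2 or num_leaf_pages < 1:
        raise ValueError("need fanout >= 2 and at least one leaf page")
    levels, pages = 1, num_leaf_pages
    while pages > 1:
        pages = -(-pages // fanout)  # ceiling division: pages one level up
        levels += 1
    return levels

def lookup_ios(num_leaf_pages, fanout, cached_levels=0):
    """Disk I/Os for one lookup if the top `cached_levels` levels are in RAM."""
    return max(btree_levels(num_leaf_pages, fanout) - cached_levels, 0)

levels = btree_levels(1_000_000, 100)                  # one million leaf pages
ios = lookup_ios(1_000_000, 100, cached_levels=2)      # root and next level cached
```

Under these assumptions a tree over a million leaf pages has four levels, and a lookup costs about two disk I/Os once the top two levels are cached. The asymptotic answer, O(log n), hides exactly the details the optimizer cares about.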

Indexing Is Simple, Powerful
Hash indexes easy and quick for equality
worth reading about linear hashing in the text
Trees can be used for just about anything else!
each tree level partitions the dataset
labels in the tree direct query traffic to the right data
all you need to think about in designing a tree is how to partition, and how to label!
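
The "partition and label" view of a tree can be sketched as follows. Here the partitions are key ranges and the labels are separator keys, as in a B+Tree; the `Node` class and `search` function are hypothetical names, and there is no insertion or rebalancing in this sketch.

```python
from bisect import bisect_right

class Node:
    def __init__(self, labels, children):
        # labels[i] is the smallest key stored under children[i + 1].
        self.labels = labels
        # children are Node objects, or sorted lists of keys at the leaf level.
        self.children = children

def search(node, key):
    """Follow the labels down the tree, then check the leaf partition."""
    while isinstance(node, Node):
        node = node.children[bisect_right(node.labels, key)]
    return key in node

# A two-level tree over the keys 1, 3, 5, 7, 9, 11, 13, 15.
left = Node([5], [[1, 3], [5, 7]])
right = Node([13], [[9, 11], [13, 15]])
root = Node([9], [left, right])
```

Swapping in a different partitioning and labeling scheme (bounding boxes instead of key ranges, for instance) changes what the tree can index without changing the shape of the search loop.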

Not enough memory? Partition!
Traditional main-memory algorithms can be extended to disk-based algorithms
partition input (runs for sorting, partitions for hash-table)
process partitions (sort runs, hash partitions)
merge partitions (merge runs, concatenate partitions)
Sorting & hashing very similar!
their I/O patterns are dual
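
The partition / process / merge pattern can be sketched for both sorting and hashing. In this illustration Python lists stand in for runs and partitions that would live on disk, and `memory_limit` and `num_partitions` are assumed parameters; a real system would also bound the number of runs merged at once.

```python
import heapq

def external_sort(items, memory_limit):
    """Sort while never sorting more than `memory_limit` items at a time."""
    runs, buf = [], []
    for item in items:
        buf.append(item)                 # partition: fill one memory-sized run
        if len(buf) == memory_limit:
            runs.append(sorted(buf))     # process: sort the run
            buf = []
    if buf:
        runs.append(sorted(buf))
    return list(heapq.merge(*runs))      # merge: combine the sorted runs

def partitioned_distinct(items, num_partitions):
    """Remove duplicates by hashing into partitions, then deduplicating each."""
    partitions = [[] for _ in range(num_partitions)]
    for item in items:
        partitions[hash(item) % num_partitions].append(item)  # partition
    result = []
    for part in partitions:
        result.extend(set(part))         # process each partition on its own
    return result                        # merge: plain concatenation

sorted_items = external_sort([5, 2, 9, 1, 7, 3, 8], memory_limit=3)
distinct_items = partitioned_distinct([3, 1, 3, 2, 1, 4], num_partitions=2)
```

The duality shows up in where the work happens: sorting partitions cheaply and pays at merge time, while hashing pays at partition time and merges by simple concatenation.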

Declarative languages are great!
Simple: say what you want, not how to get it!
Should correctly convert to an imperative language
Codd's Theorem says rel. calc. = rel. alg.
no such theorem for text ranking :-(
If you can convert in different ways, you get to optimize!
hides complexity from user
accommodates changes in database without requiring applications to be recompiled.
Especially important when
App Rate of Change << Physical Rate of Change
A reborn trend in computing
Declarative networking, security, robotics, natural language processing, distributed systems,
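
A small sketch of "what, not how", using Python's built-in `sqlite3` module. The `emp` table, its rows, and the index name are invented for the example. The same question is answered once declaratively and once with a hand-written loop, and then the physical design changes underneath the declarative query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [("alice", "db", 120), ("bob", "db", 90),
     ("carol", "os", 150), ("dave", "db", 200)],
)

QUERY = "SELECT name FROM emp WHERE dept = 'db' AND salary > 100 ORDER BY name"

# Declarative: say what you want.
declarative = conn.execute(QUERY).fetchall()

# Imperative: spell out how to get it (scan, filter, sort).
imperative = []
for name, dept, salary in conn.execute("SELECT name, dept, salary FROM emp"):
    if dept == "db" and salary > 100:
        imperative.append((name,))
imperative.sort()

# Physical change: add an index. The declarative query text is untouched.
conn.execute("CREATE INDEX emp_dept ON emp (dept)")
after_index = conn.execute(QUERY).fetchall()
```

The imperative loop is locked into a full scan, while the declarative query can pick up the new index without any change to application code.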

SQL: The good, the bad, the ugly
SQL is very simple
SELECT..FROM..WHERE
Well...SQL is kind of tricky
aggregation, GROUP BY, HAVING
OK, OK. SQL is complicated!
duplicates & NULLs
Subqueries
dups/NULLs/subqueries/aggregation together!
Remember: SQL is not entirely declarative!!!
But, it beats the heck out of writing (and maintaining!) C++ or Java programs for every query
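
A few of the duplicate and NULL surprises can be reproduced with Python's built-in `sqlite3` module. The tables `t` and `u` and their contents are made up for the example, and the behavior shown is what SQLite does; other engines follow the same three-valued logic for these cases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (1,), (None,)])
conn.execute("CREATE TABLE u (b INTEGER)")
conn.execute("INSERT INTO u VALUES (2)")

def q(sql):
    return conn.execute(sql).fetchall()

count_star = q("SELECT COUNT(*) FROM t")[0][0]   # counts rows, NULLs included
count_a = q("SELECT COUNT(a) FROM t")[0][0]      # skips NULLs
eq_null = q("SELECT COUNT(*) FROM t WHERE a = NULL")[0][0]  # a = NULL is never true

dups = q("SELECT a FROM t WHERE a IS NOT NULL")              # duplicates kept
no_dups = q("SELECT DISTINCT a FROM t WHERE a IS NOT NULL")  # duplicates removed

# NOT IN against a subquery that returns a NULL yields no rows at all.
not_in = q("SELECT b FROM u WHERE b NOT IN (SELECT a FROM t)")
not_in_fixed = q(
    "SELECT b FROM u WHERE b NOT IN (SELECT a FROM t WHERE a IS NOT NULL)"
)
```

The `NOT IN` case is the classic trap: 2 is clearly not among the values of `a`, yet the comparison against NULL makes the predicate unknown, so the row is dropped.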

Query Operators & Optimization
Query operators are actually all similar:
Sorting, Hashing, Iteration
Query Optimization: 3-part harmony
define a plan space
estimate costs for plans
algorithm to search in the plan space for cheapest
Research on each of the 3 pieces goes on independently! (Usually)
Nice clean model for attacking a hard problem
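
The three parts can be sketched in a toy optimizer. Everything here is a simplifying assumption: the plan space is just left-deep join orders, the cost of a plan is the sum of its estimated intermediate result sizes, and the search is exhaustive. The table names, cardinalities, and selectivities are invented.

```python
from itertools import permutations

def estimate_cost(order, card, selectivity):
    """Part 2: sum of estimated intermediate result sizes for a join order."""
    joined = [order[0]]
    rows = card[order[0]]
    cost = 0.0
    for t in order[1:]:
        sel = 1.0
        for j in joined:
            # Missing entry means no join predicate, i.e. a cross product.
            sel *= selectivity.get(frozenset((j, t)), 1.0)
        rows = rows * card[t] * sel
        cost += rows
        joined.append(t)
    return cost

def best_plan(card, selectivity):
    """Parts 1 and 3: enumerate all left-deep orders, keep the cheapest."""
    return min(permutations(card),
               key=lambda o: estimate_cost(o, card, selectivity))

# Hypothetical statistics: A joins B, B joins C, no predicate between A and C.
card = {"A": 1000, "B": 10, "C": 100}
selectivity = {frozenset(("A", "B")): 0.01, frozenset(("B", "C")): 0.1}

plan = best_plan(card, selectivity)
plan_cost = estimate_cost(plan, card, selectivity)
bad_cost = estimate_cost(("A", "C", "B"), card, selectivity)
```

Any of the three pieces can be swapped independently: a bushy plan space, a histogram-based estimator, or a dynamic programming search would each slot in without touching the other two.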

Database Design
(And you thought SQL was confusing!)
This is not simple stuff!!
requires a lot of thought, a lot of tools
there's no cookbook to follow
decisions can make a huge difference down the road!
The basic steps we studied (conceptual design, schema refinement, physical design) break up the problem somewhat, but also interact with each other
Complexity in DB design pays off at query time, and in consistency

CC & Recovery: House Specialties
RDBMSs nailed concurrency and reliability
transactions & 2-phase locking
write-ahead-logging
details are tricky, worked out over 20 years!
Also models for relaxing transactions
Lower degrees of consistency
Other systems are now taking pieces
Journaling file systems
Transactional memories
Web infrastructure locking services (Chubby)
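
The core idea of write-ahead logging can be sketched in a few lines. This is a deliberately stripped-down, redo-only illustration with a hypothetical `TinyDB` class: a Python list stands in for the durable log, a dict stands in for volatile memory, and there is no undo, no checkpointing, and no locking, so it is far from the full protocol covered in class.

```python
class TinyDB:
    def __init__(self):
        self.log = []    # stands in for the durable log on disk
        self.data = {}   # stands in for volatile in-memory pages

    def write(self, txn, key, value):
        self.log.append(("update", txn, key, value))  # log record first...
        self.data[key] = value                        # ...then change the data

    def commit(self, txn):
        self.log.append(("commit", txn))

    def crash(self):
        self.data = {}   # volatile state is lost, the log survives

    def recover(self):
        """Redo the updates of committed transactions, in log order."""
        committed = {rec[1] for rec in self.log if rec[0] == "commit"}
        self.data = {}
        for rec in self.log:
            if rec[0] == "update" and rec[1] in committed:
                _, _, key, value = rec
                self.data[key] = value

db = TinyDB()
db.write("T1", "x", 1)
db.commit("T1")
db.write("T2", "x", 2)   # T2 never commits
db.write("T2", "y", 5)
db.crash()
db.recover()
```

After recovery only T1's update survives. The tricky details mentioned above start where this sketch stops: undoing uncommitted changes that did reach disk, bounding recovery time, and interleaving all of it with locking.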

The Rebirth of Information Retrieval
A lonely backwater in the 70s, 80s, early 90s
Now a driver of research and industry
We saw that it's easy to get working
But there's tons more!
Watering hole for ideas from databases, AI,
approximation algorithms, distributed systems,
power-efficient processors, HCI,
Kicking off the new generation of parallel dataflow
Pushing to yet another level of scalability
Always a game-changer

Databases: The natural way to leverage parallelism & distribution
The promise of CS research for the last 15 yrs:
There are millions of computers
They are spread all over the world
Harness them all: world's best supercomputer!
This was routinely disappointing
except for data-intensive applications (DBs, Web)
2 reasons for success
data-intensive apps easy to parallelize & distribute
lots of people want to share data
fewer people want to share computation!
The parallelism craze is BACK
Intel, AMD, etc need us to take advantage of parallelism
They have nothing else to do with all those transistors!

Google convinced people that bulk data analysis is cool

Map/Reduce
Incoming freshmen will get this in 61A and through the curriculum
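
For reference, the Map/Reduce programming model can be sketched in a single process. The `map_reduce` function below is an illustrative stand-in, not any particular framework's API: a real system would run mappers and reducers in parallel across machines and handle the grouping step over the network.

```python
from collections import defaultdict

def map_reduce(records, mapper, reducer):
    """Apply mapper to each record, group values by key, reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)   # the "shuffle": group by key
    return {key: reducer(key, values) for key, values in groups.items()}

def word_mapper(line):
    for word in line.lower().split():
        yield word, 1

def count_reducer(word, counts):
    return sum(counts)

counts = map_reduce(["a b a", "b c"], word_mapper, count_reducer)
```

The grouping step is the same partitioning idea used for hash-based query operators, which is part of why this style of bulk analysis parallelizes so easily.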

More, more, I'm still not satisfied
-- Tom Lehrer
Grad classes @ Berkeley
CS262A: a grad level intro to DBMS and OS research
CS286: grad DBMS course
read & discuss lots of research papers
See evolution of different communities on similar issues
undertake a research project -- often big successes!
CS298-12 Database group seminar
Upcoming seminar courses
Alon Halevy from Google will offer something in Fall 08

But wait, there's more!

Graduate study in databases
Used to be rare (Berkeley + Wisconsin)
You are living in the golden age:
Berkeley, Wisconsin, Stanford, MIT, Brown, Cornell, CMU, Maryland, Penn, Duke, Washington, Michigan, many others...
Tons of DB-related companies, lots of hiring
Search companies
DB elephants: IBM, Oracle, MS
Midstage DB startups: ANTs, Greenplum, Netezza
Early startups: Truviso, Streambase, Coral8, Vertica, Paraccel
Enterprise app firms: e.g., SAP, Salesforce
Every Web 2.0 company!
A note: ask for the job you want
E.g. not just engineering -- sales, marketing, R&D, management, etc.

Parting Thoughts
"Education is the ability to listen to almost
anything without losing your temper or your self-confidence."
-Robert Frost
"It is a miracle that curiosity survives formal
education."
-Albert Einstein
Humility...yet pride and scorn;
Instinct and study; love and hate;
Audacity...reverence. These must mate
-Herman Melville
"The only thing one can do with good advice is to
pass it on. It is never of any use to oneself."
-Oscar Wilde
