Course Materials
• Required: A First Course in Database
Systems, (2nd Edition), Jeffrey Ullman and
Jennifer Widom, Prentice Hall, 2002.
• Recommended: Database Management
Systems, Third Edition, Raghu Ramakrishnan
and Johannes Gehrke, McGraw-Hill, 2002.
Grading
• Homework (4-5 sets) 20%
• Projects 30%
– Use Microsoft Access to design a database in two
projects.
– The first project is on the entity-relationship (ER)
model.
– The second project is on relational algebra (RA)
and relational calculus (RC).
• Final 25%
– Exams in-class, closed-book, non-cumulative
• Late policy: 10% each day after the due date
• No cheating
Communication
• Web page:
http://www.cs.nwu.edu/~ychen/classes/cs317/
• Recitation: Tu, Th or Fri? 5-6pm, Room 381,
1890 Maple.
– The TA lectures on the homework and projects, and helps
you prepare for the exams.
• A newsgroup is available
– cs.317 (course announcements, and posting Q&A)
• Email the instructor and TA with questions that are
inappropriate for the newsgroup
• Course outline (see it online)
What Is a Database System?
• Database:
a very large, integrated collection of
data.
• Models a real-world enterprise
– Entities (e.g., teams, games)
– Relationships
(e.g., The Forty-Niners are playing in The Superbowl)
– More recently, also includes active components,
often called “business logic” (e.g., the BCS ranking
system)
• What if you wanted to find out which actors donated to
John Kerry’s presidential campaign?
• Try “actors donated to john kerry” in your favorite
search engine.
A “Database Query” Approach
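The query approach can be sketched with Python's built-in sqlite3 module: instead of matching keywords, the question is answered by joining structured data. The tables `actors` and `donations`, their columns, and every row below are made-up placeholders for illustration only; they are not real records.

```python
import sqlite3

# Hypothetical schema and placeholder rows (assumptions for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE actors (name TEXT PRIMARY KEY);
    CREATE TABLE donations (donor TEXT, recipient TEXT, amount INTEGER);
    INSERT INTO actors VALUES ('Actor A'), ('Actor B');
    INSERT INTO donations VALUES
        ('Actor A', 'John Kerry', 500),
        ('Actor B', 'Someone Else', 250),
        ('Non-Actor C', 'John Kerry', 100);
""")

def actors_who_donated(conn, recipient):
    """Return names that appear both as an actor and as a donor to `recipient`."""
    rows = conn.execute(
        """
        SELECT DISTINCT a.name
        FROM actors a
        JOIN donations d ON d.donor = a.name
        WHERE d.recipient = ?
        ORDER BY a.name
        """,
        (recipient,),
    ).fetchall()
    return [name for (name,) in rows]

kerry_donors = actors_who_donated(conn, "John Kerry")
print(kerry_donors)
```

The join is what a keyword search cannot express: it requires that the same person appear in both data sets, rather than that the words appear on the same page.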
Is a File System a DBMS?
• Thought Experiment 1:
– You and your project partner are editing the
same file.
– You both save it at the same time.
– Whose changes survive?
A) Yours B) Partner’s C) Both D) Neither E) ???
• Thought Experiment 2:
– You’re updating a file.
– The power goes out.
– Which of your changes survive?
A) All B) None C) All Since Last Save D) ???
• Q: How do you write programs over a subsystem when it
promises you only “???”?
A: Very, very carefully!!
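For contrast, a DBMS transaction gives a definite answer to "which changes survive?": all of them or none of them. The sketch below uses Python's built-in sqlite3 module. The `accounts` table, the `transfer` helper, and the `crash_midway` flag are hypothetical names for illustration, and the "failure" is a simulated exception, not a real power outage.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

def transfer(conn, src, dst, amount, crash_midway=False):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            if crash_midway:
                # Simulated failure between the two updates (an assumption
                # standing in for a crash).
                raise RuntimeError("simulated failure")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except RuntimeError:
        return False

def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

failed = transfer(conn, "alice", "bob", 40, crash_midway=True)
after_failure = balances(conn)   # the half-finished debit was rolled back
succeeded = transfer(conn, "alice", "bob", 40)
after_success = balances(conn)   # both updates were applied together
print(after_failure, after_success)
```

The application never sees a state in which the money has left one account but not arrived in the other, which is the guarantee a plain file update does not give.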
Current Commercial Outlook
• A major part of the software industry:
– Oracle, IBM, Microsoft, Sybase
– also Informix (now IBM), Teradata
– smaller players: Java-based DBMSs, devices, OO, …
• Well-known benchmarks (esp. TPC)
• Lots of related industries
– data warehouse, document management, storage,
backup, reporting, business intelligence, app
integration
• Relational products dominant and
evolving
– adapting for extensibility (user-defined types),
adding native XML support.
• Open Source coming on strong
– MySQL, PostgreSQL, BerkeleyDB
Why Study Databases?
Physical Schema
• Physical schema describes the files and indexes used.
• Physical data independence: protection from changes
in the physical structure of data.
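Physical data independence can be illustrated with a small sketch, assuming SQLite via Python's built-in sqlite3 module: adding an index changes the physical schema and the plan the system picks, but the query text and its result stay the same. The `Emp` rows and the index name `emp_loc_idx` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emp (eid INTEGER PRIMARY KEY, ename TEXT, loc TEXT)")
conn.executemany(
    "INSERT INTO Emp VALUES (?, ?, ?)",
    [(1, "Ann", "Chicago"), (2, "Bob", "Evanston"), (3, "Cy", "Chicago")],
)

# The application's query never changes, whatever the physical layout is.
QUERY = "SELECT eid, ename FROM Emp WHERE loc = ? ORDER BY eid"

def run(conn):
    """Execute the unchanged application query."""
    return conn.execute(QUERY, ("Chicago",)).fetchall()

def plan(conn):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("Chicago",)).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the detail text

result_before, plan_before = run(conn), plan(conn)

# Change the physical schema only: add an index on loc.
conn.execute("CREATE INDEX emp_loc_idx ON Emp (loc)")

result_after, plan_after = run(conn), plan(conn)

print(result_before == result_after)  # same answer from the same query text
print(plan_before)
print(plan_after)
```

The exact plan wording depends on the SQLite version, so the sketch only compares results and prints the plans for inspection.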
Queries, Query Plans, and Operators
• Three example queries, each with the operators in its
query plan (listed top-down):
– Query 1:
    SELECT eid, ename, title
    FROM Emp E
    WHERE E.sal > $50K
  Plan: π, Select, Emp
– Query 2:
    SELECT E.loc, AVG(E.sal)
    FROM Emp E
    GROUP BY E.loc
    HAVING Count(*) > 5
  Plan: π, Having, Group(agg), Emp
– Query 3:
    SELECT COUNT DISTINCT (E.eid)
    FROM Emp E, Proj P, Asgn A
    WHERE E.eid = A.eid
    AND P.pid = A.pid
    AND E.loc <> P.loc
  Plan: Count distinct, π, Join, Join, over Emp, Proj, Asgn
• System handles query plan generation & optimization;
ensures correct execution.
• DB: Employees, Projects, Assignments
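The three queries can be tried out with Python's built-in sqlite3 module; this is a minimal sketch, not the course's setup. The column types and all sample rows are assumptions. `$50K` is written as 50000, `COUNT DISTINCT (E.eid)` as standard `COUNT(DISTINCT E.eid)`, and the HAVING threshold is a parameter so the tiny sample returns a row (the slide uses 5).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Emp  (eid INTEGER PRIMARY KEY, ename TEXT, title TEXT,
                       sal INTEGER, loc TEXT);
    CREATE TABLE Proj (pid INTEGER PRIMARY KEY, loc TEXT);
    CREATE TABLE Asgn (eid INTEGER, pid INTEGER);
    -- Made-up sample rows (assumptions for illustration only).
    INSERT INTO Emp VALUES
        (1, 'Ann', 'Engineer', 60000, 'Chicago'),
        (2, 'Bob', 'Analyst',  40000, 'Chicago'),
        (3, 'Cy',  'Manager',  80000, 'Evanston'),
        (4, 'Dee', 'Engineer', 50000, 'Chicago');
    INSERT INTO Proj VALUES (10, 'Chicago'), (20, 'Evanston');
    INSERT INTO Asgn VALUES (1, 10), (1, 20), (2, 20), (3, 20), (4, 10);
""")

def high_earners(conn, min_sal=50000):
    """Query 1: selection (WHERE) followed by projection (the SELECT list)."""
    return conn.execute(
        "SELECT eid, ename, title FROM Emp E WHERE E.sal > ? ORDER BY eid",
        (min_sal,),
    ).fetchall()

def avg_salary_by_location(conn, min_count=5):
    """Query 2: group by location, keep groups larger than min_count."""
    return conn.execute(
        """
        SELECT E.loc, AVG(E.sal)
        FROM Emp E
        GROUP BY E.loc
        HAVING COUNT(*) > ?
        ORDER BY E.loc
        """,
        (min_count,),
    ).fetchall()

def remote_worker_count(conn):
    """Query 3: employees assigned to a project located somewhere else."""
    return conn.execute(
        """
        SELECT COUNT(DISTINCT E.eid)
        FROM Emp E, Proj P, Asgn A
        WHERE E.eid = A.eid
          AND P.pid = A.pid
          AND E.loc <> P.loc
        """
    ).fetchone()[0]

q1 = high_earners(conn)
q2 = avg_salary_by_location(conn, min_count=2)  # lowered for the small sample
q3 = remote_worker_count(conn)
print(q1, q2, q3)
```

Each function only states what result is wanted; choosing the join order and access paths is left to the system, as the slide notes.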
Advantages of a DBMS
• Data independence
• Efficient data access
• Data integrity & security
• Data administration
• Concurrent access, crash recovery
• Reduced application development time
• So why not use them always?
– Expensive/complicated to set up & maintain
– This cost & complexity must be offset by need
– General-purpose, not suited for special-purpose tasks (e.g.
text search!)
Databases make these folks
happy ...
• DBMS vendors, programmers
– Oracle, IBM, MS, Sybase, …
• End users in many fields
– Business, education, science, …
• DB application programmers
– Build enterprise applications on top of DBMSs
– Build web services that run off DBMSs
• Database administrators (DBAs)
– Design logical/physical schemas
– Handle security and authorization
– Data availability, crash recovery
– Database tuning as needs evolve