Introduction To Information Systems SSC, Semester 6

This document provides an introduction to an information systems course being taught at EPFL. It outlines the instructors, communication methods, textbooks, and course materials. The course aims to teach relational database management systems with an emphasis on web applications. It will build upon a database course from the University of Washington, adapting some of the slides, exercises, and project ideas. The outline discusses what database management systems are, why they are useful compared to file-based data storage, and basic concepts like transactions.

Uploaded by

tbengua
Copyright
© Attribution Non-Commercial (BY-NC)

Introduction to Information Systems
SSC, Semester 6

Lecture 01

1
Staff
• Instructors:
– Karl Aberer, BC 108, karl aberer at epfl ch
– Philippe Cudré-Mauroux, BC 114, philippe cudre-mauroux at epfl ch
– Office hours: by appointment
• TAs:
– Gleb Skobeltsyn (exercises)
– Martin Rubli (project)

2
Communications
• Web page: lsirww.epfl.ch
– Lectures will be available here
– Homeworks and solutions will be posted here
– The project description and resources will be here

• Newsgroup:
– epfl.ic.cours.IIS

3
Textbook
Main textbook:

• Databases and Transaction Processing: An Application-Oriented Approach,
Philip M. Lewis, Arthur Bernstein, Michael Kifer, Addison-Wesley 2002.

4
Other Texts
Many classic textbooks (any of them will do)
• Database Systems: The Complete Book, Hector Garcia-Molina,
Jeffrey Ullman, Jennifer Widom
• Database Management Systems, Ramakrishnan
• Fundamentals of Database Systems, Elmasri, Navathe
• Database Systems, Date (7th edition)
• Modern Database Management, Hoffer (4th edition)
• Database System Concepts, Silberschatz (4th edition)

5
Material on the Web
SQL Intro
• SQL for Web Nerds, by Philip Greenspun,
http://philip.greenspun.com/sql/
Java Technology
• java.sun.com
WWW Technology
• www.w3c.org
6
The Course
• Goal: Teaching RDBMS (standard) with a
strong emphasis on the Web
• Fortunately, others have already done it
– Alon Halevy, Dan Suciu, Univ. of Washington
– http://www.cs.washington.edu/education/courses/cse444/
– http://www.acm.org/sigmod/record/issues/0309/4.AlonLevy.pdf
7
Acknowledgement
• Builds on the University of Washington course
– many slides
– many exercises
– ideas for the project
• Main difference
– less theory
– will use real Web data in the project

8
Outline for Today’s Lecture
• Overview of database systems
• Course Outline
• First Steps in SQL

9
What is behind this Web Site?
• http://immo.search.ch/
• Search on a large database
• Specify search conditions
• Many users
• Updates
• Access through a web interface

10
Database Management Systems
Database Management System = DBMS
• A collection of files that store the data
• A big C program written by someone else
that accesses and updates those files for you
Relational DBMS = RDBMS
• Data files are structured as relations (tables)

11
Where are RDBMS used ?
• Backend for traditional “database”
applications
– EPFL administration
• Backend for large Websites
– Immosearch
• Backend for Web services
– Amazon

12
Example of a Traditional
Database Application
Suppose we are building a system
to store the information about:
• students
• courses
• professors
• who takes what, who teaches what

13
Can we do it without a DBMS ?
Sure we can! Start by storing the data in files:

students.txt courses.txt professors.txt

Now write C or Java programs to implement specific tasks
14
Doing it without a DBMS...
• Enroll “Mary Johnson” in “CSE444”:
Write a C/Java program to do the following:
    Read ‘students.txt’
    Read ‘courses.txt’
    Find & update the record “Mary Johnson”
    Find & update the record “CSE444”
    Write “students.txt”
    Write “courses.txt”
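
The steps above can be sketched in a few lines. This is only an illustration in Python rather than the C/Java the slide mentions; the semicolon-separated record layout, the function name `enroll`, and the sample records are assumptions made for the example, not part of the course material.

```python
import os
import tempfile


def enroll(data_dir, student, course):
    """Enroll a student by rewriting both flat files (no DBMS involved)."""
    students_path = os.path.join(data_dir, "students.txt")
    courses_path = os.path.join(data_dir, "courses.txt")

    # Read both files completely into memory.
    with open(students_path) as f:
        students = [line.rstrip("\n").split(";") for line in f]
    with open(courses_path) as f:
        courses = [line.rstrip("\n").split(";") for line in f]

    # Find and update the student record and the course record.
    for record in students:
        if record[0] == student:
            record.append(course)
    for record in courses:
        if record[0] == course:
            record.append(student)

    # Write both files back in full.
    with open(students_path, "w") as f:
        f.writelines(";".join(r) + "\n" for r in students)
    with open(courses_path, "w") as f:
        f.writelines(";".join(r) + "\n" for r in courses)


def demo():
    """Run one enrollment in a temporary directory and return both files."""
    with tempfile.TemporaryDirectory() as d:
        with open(os.path.join(d, "students.txt"), "w") as f:
            f.write("Mary Johnson\nCharles\n")
        with open(os.path.join(d, "courses.txt"), "w") as f:
            f.write("CSE444\nCSE142\n")
        enroll(d, "Mary Johnson", "CSE444")
        with open(os.path.join(d, "students.txt")) as f:
            students_after = f.read()
        with open(os.path.join(d, "courses.txt")) as f:
            courses_after = f.read()
        return students_after, courses_after
```

Note that every enrollment reads and rewrites both files in full, and nothing ties the two writes together.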
15
Problems without a DBMS...
• System crashes:
    Read ‘students.txt’
    Read ‘courses.txt’
    Find & update the record “Mary Johnson”
    Find & update the record “CSE444”
    CRASH!
    Write “students.txt”
    Write “courses.txt”
– What is the problem?
• Large data sets (say 50GB)
– Why is this a problem?
• Simultaneous access by many users
– Lock students.txt – what is the problem?
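
One way to see the crash problem is to simulate it. The sketch below assumes the crash lands between the two writes, which is the damaging case; the exception class `SimulatedCrash`, the record layout, and the sample data are hypothetical and only stand in for a real failure.

```python
import os
import tempfile


class SimulatedCrash(Exception):
    """Stands in for a power failure or a killed process."""


def enroll_with_crash(data_dir):
    # First write succeeds: the student record now mentions the course.
    with open(os.path.join(data_dir, "students.txt"), "w") as f:
        f.write("Mary Johnson;CSE444\n")
    # The "machine" dies before courses.txt is rewritten,
    # so that file keeps its old content.
    raise SimulatedCrash("crash between the two writes")


def demo():
    """Return the content of both files after the interrupted update."""
    with tempfile.TemporaryDirectory() as d:
        with open(os.path.join(d, "students.txt"), "w") as f:
            f.write("Mary Johnson\n")
        with open(os.path.join(d, "courses.txt"), "w") as f:
            f.write("CSE444\n")
        try:
            enroll_with_crash(d)
        except SimulatedCrash:
            pass
        with open(os.path.join(d, "students.txt")) as f:
            students = f.read()
        with open(os.path.join(d, "courses.txt")) as f:
            courses = f.read()
        return students, courses
```

After the simulated crash the two files disagree: the student file says Mary takes CSE444, the course file does not list her, and nothing on disk records that an update was in progress.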
16
Enter a DBMS
“Two tier system” or “client-server”

[Diagram: Data files ↔ Database server (someone else’s C program) ↔ connection (ODBC, JDBC) ↔ Applications]
17
Functionality of a DBMS
The programmer sees SQL, which has two components:
• Data Definition Language - DDL
• Data Manipulation Language - DML
– query language

Behind the scenes the DBMS has:


• Query engine
• Query optimizer
• Storage management
• Transaction Management (concurrency, recovery)
18
How the Programmer Sees the
DBMS
• Start with DDL to create tables:
    CREATE TABLE Students (
        Name CHAR(30),
        SSN CHAR(9) PRIMARY KEY NOT NULL,
        Category CHAR(20)
    ) . . .

• Continue with DML to populate tables:

    INSERT INTO Students
    VALUES('Charles', '123456789', 'undergraduate')
    . . . .
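
To try the two statements without installing a database server, one option is Python's built-in `sqlite3` module with an in-memory database. This is a sketch under that assumption; SQLite is not one of the systems named in the course, and the second row ('Dan' with a repeated SSN) is invented here to show the primary key at work.

```python
import sqlite3


def build_students():
    """Create the Students table and insert the row from the slide."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute(
        """
        CREATE TABLE Students (
            Name     CHAR(30),
            SSN      CHAR(9) PRIMARY KEY NOT NULL,
            Category CHAR(20)
        )
        """
    )
    conn.execute(
        "INSERT INTO Students VALUES (?, ?, ?)",
        ("Charles", "123456789", "undergraduate"),
    )
    conn.commit()
    return conn


def demo():
    conn = build_students()
    rows = conn.execute("SELECT Name, SSN, Category FROM Students").fetchall()
    # A second row with the same SSN violates the primary key.
    try:
        conn.execute(
            "INSERT INTO Students VALUES (?, ?, ?)",
            ("Dan", "123456789", "graduate"),
        )
        duplicate_rejected = False
    except sqlite3.IntegrityError:
        duplicate_rejected = True
    conn.close()
    return rows, duplicate_rejected
```

The DBMS, not the application, rejects the duplicate key, which is one of the checks a file-based solution would have to code by hand.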

19
How the Programmer Sees the
DBMS
• Tables:
Students:
SSN          Name     Category
123-45-6789  Charles  undergrad
234-56-7890  Dan      grad
…            …

Takes:
SSN          CID
123-45-6789  CSE444
123-45-6789  CSE444
234-56-7890  CSE142
…

Courses:
CID     Name               Quarter
CSE444  Databases          fall
CSE541  Operating systems  winter

• Still implemented as files, but behind the scenes can
be quite complex
“data independence” = separate logical view
from physical implementation
20
Transactions
• Enroll “Mary Johnson” in “CSE444”:
    BEGIN TRANSACTION;
    INSERT INTO Takes
    SELECT Students.SSN, Courses.CID
    FROM Students, Courses
    WHERE Students.name = 'Mary Johnson' and
          Courses.name = 'CSE444'
    -- More updates here....
    IF everything-went-OK
    THEN COMMIT;
    ELSE ROLLBACK

If the system crashes, the transaction is still either committed or aborted
21
Transactions
• A transaction = sequence of statements that either
all succeed, or all fail
• Transactions have the ACID properties:
A = atomicity (a transaction should be done or undone completely)
C = consistency (a transaction should transform a system from one
consistent state to another consistent state)
I = isolation (each transaction should happen independently of other
transactions)
D = durability (completed transactions should remain permanent)
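
Atomicity can be observed with a small experiment. The sketch below uses Python's `sqlite3` module, whose connection object commits when a `with` block succeeds and rolls back when an exception escapes it. The `Enrolled` counter column, the `fail_midway` flag, and the sample rows are hypothetical additions for the example.

```python
import sqlite3


def enroll(conn, student_name, course_id, fail_midway=False):
    """Apply two related changes atomically; roll back on any error."""
    try:
        # `with conn` commits on success and rolls back if an exception escapes.
        with conn:
            conn.execute(
                "INSERT INTO Takes SELECT SSN, ? FROM Students WHERE Name = ?",
                (course_id, student_name),
            )
            if fail_midway:
                raise RuntimeError("simulated failure before commit")
            conn.execute(
                "UPDATE Courses SET Enrolled = Enrolled + 1 WHERE CID = ?",
                (course_id,),
            )
        return True
    except RuntimeError:
        return False


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Students (SSN CHAR(9) PRIMARY KEY, Name CHAR(30));
        CREATE TABLE Courses  (CID CHAR(6) PRIMARY KEY, Enrolled INTEGER);
        CREATE TABLE Takes    (SSN CHAR(9), CID CHAR(6));
        INSERT INTO Students VALUES ('123456789', 'Mary Johnson');
        INSERT INTO Courses  VALUES ('CSE444', 0);
        """
    )
    failed = enroll(conn, "Mary Johnson", "CSE444", fail_midway=True)
    takes_after_failure = conn.execute("SELECT COUNT(*) FROM Takes").fetchone()[0]
    ok = enroll(conn, "Mary Johnson", "CSE444")
    takes_after_success = conn.execute("SELECT COUNT(*) FROM Takes").fetchone()[0]
    enrolled = conn.execute("SELECT Enrolled FROM Courses").fetchone()[0]
    conn.close()
    return failed, takes_after_failure, ok, takes_after_success, enrolled
```

In the failing run the row already inserted into Takes disappears again, so the database never shows half of an enrollment.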

22
Queries
• Find all courses that “Mary” takes

    SELECT C.name
    FROM Students S, Takes T, Courses C
    WHERE S.name = 'Mary' and
          S.ssn = T.ssn and T.cid = C.cid

• What happens behind the scenes?
– Query processor figures out how to answer the
query efficiently.
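
The query can be run as-is against a small test database. The sketch below again assumes Python's `sqlite3` module; the shortened SSNs and the sample rows are made up for the example, and an `ORDER BY` is added only to make the output order predictable.

```python
import sqlite3


def courses_taken_by(conn, student_name):
    """Return the names of all courses the given student takes."""
    rows = conn.execute(
        """
        SELECT C.name
        FROM Students S, Takes T, Courses C
        WHERE S.name = ? AND S.ssn = T.ssn AND T.cid = C.cid
        ORDER BY C.name
        """,
        (student_name,),
    ).fetchall()
    return [name for (name,) in rows]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Students (ssn TEXT PRIMARY KEY, name TEXT, category TEXT);
        CREATE TABLE Courses  (cid TEXT PRIMARY KEY, name TEXT, quarter TEXT);
        CREATE TABLE Takes    (ssn TEXT, cid TEXT);
        INSERT INTO Students VALUES ('111', 'Mary', 'undergrad'),
                                    ('222', 'Dan', 'grad');
        INSERT INTO Courses  VALUES ('CSE444', 'Databases', 'fall'),
                                    ('CSE541', 'Operating systems', 'winter');
        INSERT INTO Takes    VALUES ('111', 'CSE444'), ('111', 'CSE541'),
                                    ('222', 'CSE444');
        """
    )
    result = courses_taken_by(conn, "Mary")
    conn.close()
    return result
```

The caller states only which rows are wanted; the order in which the three tables are scanned and joined is left to the DBMS.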
23
Queries, behind the scenes
Declarative SQL query:

    SELECT C.name
    FROM Students S, Takes T, Courses C
    WHERE S.name = 'Mary' and
          S.ssn = T.ssn and T.cid = C.cid

Imperative query execution plan:

[Figure: plan tree with node labels, top to bottom: sname; cid=cid; sid=sid; name=“Mary”; leaves Students, Takes, Courses]

The optimizer chooses the best execution plan for a query
24
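
To make the contrast with the declarative query concrete, here is one possible imperative plan written out by hand: select the student, join with Takes, join with Courses, project the course name. It is a sketch only. The hash join is a choice made for this example, not necessarily what a real optimizer would pick, and the tuples are invented sample data.

```python
STUDENTS = [("111", "Mary"), ("222", "Dan")]                           # (ssn, name)
TAKES = [("111", "CSE444"), ("111", "CSE541"), ("222", "CSE444")]      # (ssn, cid)
COURSES = [("CSE444", "Databases"), ("CSE541", "Operating systems")]   # (cid, name)


def select(rows, predicate):
    """Selection: keep only the rows that satisfy the predicate."""
    return [r for r in rows if predicate(r)]


def hash_join(left, right, left_key, right_key):
    """Join: index the right input by key, then probe it with each left row."""
    index = {}
    for r in right:
        index.setdefault(right_key(r), []).append(r)
    return [l + r for l in left for r in index.get(left_key(l), [])]


def project(rows, columns):
    """Projection: keep only the listed column positions."""
    return [tuple(r[c] for c in columns) for r in rows]


def run_plan():
    marys = select(STUDENTS, lambda s: s[1] == "Mary")
    # Rows are now (ssn, name, ssn, cid).
    with_takes = hash_join(marys, TAKES, lambda s: s[0], lambda t: t[0])
    # Rows are now (ssn, name, ssn, cid, cid, course_name).
    with_courses = hash_join(with_takes, COURSES, lambda r: r[3], lambda c: c[0])
    return project(with_courses, [5])
```

Filtering on the name first keeps the intermediate results small; joining all three tables before filtering would give the same answer with more work, which is the kind of trade-off an optimizer weighs.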


Database Systems
• The big commercial database vendors:
– Oracle
– IBM (with DB2)
– Microsoft (SQL Server)
– Sybase
• Some free database systems (Unix) :
– Postgres
– MySQL
– Predator

25
Databases and the Web
• Accessing databases through web interfaces
– Java programming interface (JDBC)
– Embedding into HTML pages (JSP)
– Access through the HTTP protocol (Web Services)
• Using Web document formats for data
definition and manipulation
– XML, XQuery, XPath
– XML databases and messaging systems
26
Database Integration
• Combining data from different databases
– collection of data (wrapping)
– combination of data and generation of new views on the
data (mediation)
• Problem: heterogeneity
– access, representation, content
• Example revisited
– http://immo.search.ch/
– http://www.swissimmo.ch
27
Other Trends in Databases
• Industrial
– Object-relational databases
– Main memory database systems
– Data warehousing and mining
• Research
– Peer-to-peer data management
– Stream data management
– Mobile data management

28
Course Outline
(Details on the Web)
Part I
• SQL (Chapter 6)
• The relational data model (Chapter 3)
• Database design (Chapters 2, 3, 7)
• XML, XPath, XQuery
Part II
• Indexes (Chapter 13)
• Transactions and Recovery (Chapter 17 - 18)
Exam

29
Structure
• Prerequisites:
– Programming courses
– Data structures
• Work & Grading:
– Homeworks (4): 0%
– Exam (like homeworks): 50%
– Project: 50% (see next) – each phase graded separately

30
The Project
• Models the real data management needs of a Web
company
– Phase 1: Modelling and Data Acquisition
– Phase 2: Data integration and Applications
– Phase 3: Services
• "One can only start to appreciate database systems
by actually trying to use one" (Halevy)
• Any SW/IT company will love you for these skills 

31
The Project – Side Effects
• Trains your soft skills
– team work
– deal with bugs, poor documentation, …
– produce with limited time resources
– project management and reporting
• Results useful for you personally
– Demo:
• Project should be fun 
32
Practical Concerns
• New course, expect some hiccups
• Important to keep to the time schedule
• Communication through Web
• Newsgroup
• Student committee for regular feedback
(2 volunteers)

33
Week  Date        Lecture                  Exercise                  Project Deadlines
1     11.03.2005  Introduction, Basic SQL  Project Presentation
2     18.03.2005  Advanced SQL             Ex. 1: SQL (on machines)
      25.03.2005  Easter
      01.04.2005  Easter
3     08.04.2005  Conceptual Modelling     Correction Ex. 1
4     15.04.2005  Database Programming
5     22.04.2005  Functional Dependencies  Ex. 2: FD and RA          Phase 1
6     29.04.2005  Relational Algebra
7     06.05.2005  Introduction to XML      Corr. Ex. 2 / Ex. 3: XML
8     13.05.2005  XML Query                                          Phase 2
9     20.05.2005  Web Services             Correction Ex. 3
10    27.05.2005  Concurrency              Ex. 4: Transactions
11    03.06.2005  Recovery
      08.06.2005                                                     Phase 3
12    10.06.2005  Database Heterogeneity   Correction Ex. 4
13    17.06.2005  Indexing

34
So what is this course about,
really ?
A bit of everything !
• Languages: SQL, XPath, XQuery
• Data modeling
• Theory ! (Functional dependencies, normal
forms)
• Algorithms and data structures (in the second half)
• Lots of implementation and hacking for the
project
• Most importantly: how to meet Real World needs
35
