Lecture 2

This document provides an overview of database management systems (DBMS). It defines key terms like data, database, information and metadata. It explains that a DBMS is a tool for creating, managing and manipulating large amounts of data efficiently over long periods of time. It also discusses some properties a good DBMS should have, like being atomic, consistent, isolated and durable (ACID properties). Finally, it outlines some common applications of database systems and different data models like hierarchical, network and relational models.

Uploaded by

Shabana Hafeez
Copyright
© Attribution Non-Commercial (BY-NC)
Course Outline

DBMS

Definitions
• Data: Meaningful facts, text, graphics, images, sound, video segments
• Database: An organized collection of logically related data
• Information: Data processed to be useful in decision making
• Metadata: Properties/characteristics that describe data

What is a database (DB)?
• A collection of data that exists over a long period of time, often many years
• Managed through a database management system

What is a database management system (DBMS)?
• A powerful tool for creating and managing [and manipulating] large amounts of data [(several gigabytes)] efficiently and allowing it to persist over long periods of time, safely
• Focus on secondary, rather than main, memory
• Powerful, but simple, programming interface

A Simple Data Management Problem
Suppose we want to save names, phone numbers, ...
• Solution 1 (paper based)
  - A blank notebook OR a phone/address book
  - Entries recorded by pen, in time order
• Advantages
  - Cheap, simple, private, reliable, space efficient
• Disadvantages
  - Hard to search, update, share, expand
  - Hard to add information, e.g. email addresses

Another approach
The Traditional File Processing Environment
Use of Notepad, MS Word, MS Excel

DBMS vs. 'just a file system'
• DBMSs evolved from file systems
• file systems also store large amounts of data over a long period of time in secondary memory
• however, file systems
  - can lack efficient access
  - have no direct support for queries
  - limit organization to directory creation and hierarchical organization
  - have no sophisticated support for concurrency
  - do not ensure durability
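
The "no direct support for queries" point can be made concrete with a small sketch. It stores the same made-up phone list twice: once as plain text lines that the program must scan and parse itself, and once in an in-memory SQLite table that is queried declaratively. The contact data, the table name `contacts`, and both helper functions are assumptions made for illustration; SQLite only stands in for a DBMS here.

```python
import sqlite3

# Hypothetical contact data, invented for this illustration.
contacts = [("Ayesha", "555-0101"), ("Bilal", "555-0102"), ("Chen", "555-0103")]

# File-style storage: one "name,phone" line per entry.
lines = [f"{name},{phone}" for name, phone in contacts]

def find_in_lines(lines, wanted):
    """Linear scan: the application parses and compares every line itself."""
    for line in lines:
        name, phone = line.split(",")
        if name == wanted:
            return phone
    return None

# DBMS-style storage: declare the table once, then state what is wanted.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT PRIMARY KEY, phone TEXT)")
conn.executemany("INSERT INTO contacts VALUES (?, ?)", contacts)

def find_in_db(conn, wanted):
    """Declarative lookup: the DBMS decides how to locate the row."""
    row = conn.execute(
        "SELECT phone FROM contacts WHERE name = ?", (wanted,)
    ).fetchone()
    return row[0] if row else None

file_result = find_in_lines(lines, "Bilal")
db_result = find_in_db(conn, "Bilal")
print(file_result, db_result)
```

Both lookups return the same phone number. The difference is that the file version hard-codes the storage format and the search strategy, while the database version leaves both to the DBMS.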

ACID properties
(All good DBMSs should guarantee these)
• Atomicity
  - should not be able to execute half of an operation
  - either all or none of the effects of a transaction are made permanent
• Consistency
  - The consistency property ensures that any transaction the database performs will take it from one consistent state to another.
  - there should be no surprises in the world, e.g., GPA > 4.0, balance < 0; cats should never have more than 1 tail!
  - the effect of concurrent transactions is equivalent to some serial execution
  - use constraints, triggers, active DB elements (context-free)
• Isolation
  - Isolation refers to the requirement that other operations cannot access data that has been modified during a transaction that has not yet completed.
  - concurrency control
  - transactions should not be able to observe the partial effects of other transactions
  - use locks (whole relations or individual tuples?)
• Durability
  - Durability is the ability of the DBMS to recover the committed transaction updates against any kind of system failure (hardware or software).
  - if power goes out, nothing bad should happen
  - once accepted, the effects of a transaction are permanent (until, of course, changed by another transaction)
  - use logs
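
Atomicity and consistency can be sketched together with Python's built-in `sqlite3` module. The table `accounts`, the account names, and the `transfer` helper below are hypothetical and exist only for this illustration. A `CHECK` constraint encodes the rule that a balance must never be negative, and the transfer deliberately credits before it debits, so that a failed debit would leave a visible half-finished operation if the transaction were not rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Consistency rule expressed as a constraint: balance must never go below 0.
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money in one transaction; True on commit, False on rollback."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Credit first, then debit, so a failed debit would be a visible
            # "half operation" unless the whole transaction is undone.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

ok_small = transfer(conn, "alice", "bob", 30)   # within alice's balance
ok_large = transfer(conn, "alice", "bob", 500)  # would make alice negative
print(ok_small, ok_large, balances(conn))
```

The second transfer violates the constraint on the debit, so the credit that already ran is rolled back with it and the balances stay as the first transfer left them. Isolation and durability are not exercised by this single-connection, in-memory sketch.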

Applications of database systems
• reservation systems, banking systems
• network simulations / experiments
• record/book keeping (corporate, university, medical), statistics
• bioinformatics, e.g., gene databases
• criminal justice
  - fingerprint matching
  - how do you encode 'looks like'?
• multimedia systems
  - require terabytes (10^12 bytes) of storage
  - tertiary storage devices, e.g., CDs, DVDs
  - image/audio/video retrieval
  - streaming, interactivity
• satellite imaging; can require petabytes (10^15 bytes) of storage
• the web
  - client-server and multi-tier architectures
  - almost all data-intensive websites are database-driven; IMDB.com is an exception
• information integration
  - over the web
  - legacy systems; must deal with issues of
    - synonymy: different words having the same meaning, e.g., coffee shop vs. café
    - polysemy: same word (homonym) having different meanings, e.g., shot
  - data warehouses
  - data mining (KDD, Knowledge Discovery in Databases), e.g., association rules: 'diapers → beer'; we pass these on to the marketing folks
• in sum, databases are everywhere!
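
As a rough illustration of the association-rule example mentioned under data mining, the sketch below computes the two standard measures for a rule such as diapers → beer: support (how often both items occur together) and confidence (how often beer occurs given that diapers do). The baskets are invented for this example and the function names are assumptions, not part of the lecture.

```python
# Hypothetical market baskets, invented for illustration only.
baskets = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(lhs, rhs, baskets):
    """Estimate of P(rhs in basket | lhs in basket) for the rule lhs -> rhs."""
    return support(lhs | rhs, baskets) / support(lhs, baskets)

rule_support = support({"diapers", "beer"}, baskets)
rule_confidence = confidence({"diapers"}, {"beer"}, baskets)
print(rule_support, rule_confidence)
```

With these five baskets, two contain both items and three contain diapers, so the rule has support 2/5 and confidence 2/3. Real mining systems search over many candidate rules rather than scoring one by hand.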

Three classical data models
• hierarchical model
• network model
  - each tuple is a separate record
  - no separation between logical and physical views
  - used record-at-a-time languages
  - too low-level
• relational model
  - most popular and successful model
  - de facto standard for databases
  - (relational) databases are one of the most popular success stories of simple theoretical ideas
• semistructured data and XML
  - semistructured data is self-describing
  - web data tends to be semistructured
  - in between structured and unstructured data (free text)
  - the study of the storage and retrieval of unstructured data is called IR (Information Retrieval)

Main themes of relational database management systems (RDBMSs)
• data stored in a relation (for now, a table), e.g., a simple relation
• attributes (columns), tuples (rows, records)
• 2-tuple = pair, 3-tuple = triple, m-tuple
• clean separation between logical and physical views, ANSI/SPARC architecture, 3-tier organization of databases (layers of abstraction):
  - views
  - relations
  - physical storage
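
The vocabulary above (relation, attributes, tuples, and a view layered over a relation) can be sketched with an in-memory SQLite database. The `Students` table, its columns and rows, and the `CpsStudents` view are illustrative assumptions. Physical storage is left entirely to SQLite, which is the point of the logical/physical separation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: a relation with four attributes (columns).
conn.execute(
    "CREATE TABLE Students (id INTEGER PRIMARY KEY, name TEXT, major TEXT, gpa REAL)"
)
# Each inserted row is one tuple of the relation (here, 4-tuples).
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?)",
    [
        (1, "Ayesha", "CPS", 3.9),
        (2, "Bilal", "MTH", 3.5),
        (3, "Chen", "CPS", 3.2),
    ],
)

# View level: a named query that exposes only part of the relation.
conn.execute(
    "CREATE VIEW CpsStudents AS SELECT id, name FROM Students WHERE major = 'CPS'"
)

cursor = conn.execute("SELECT * FROM Students ORDER BY id")
attributes = [column[0] for column in cursor.description]  # column names
tuples = cursor.fetchall()                                  # list of rows
view_rows = conn.execute("SELECT * FROM CpsStudents ORDER BY id").fetchall()

print(attributes)
print(tuples)
print(view_rows)
```

Nothing in the code says how or where the rows are laid out on disk. Users of the view see only `id` and `name`, regardless of how `Students` itself is stored.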

Contd...
• gives rise to powerful, yet declarative, relation-at-a-time query languages, e.g., SQL (Structured Query Language; pronounced 'sequel')
• a simple SQL query illustrating the SELECT-FROM-WHERE construct
  - SELECT id FROM Students WHERE major = 'CPS' AND GPA > 3.7;
• relational query languages (QLs) are declarative
  - you specify what you want, not how to get it (à la PROLOG)
  - e.g., SQL
• closure property
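
The SELECT-FROM-WHERE query from the slide can be run as-is against a small table. The sketch below uses Python's `sqlite3` module with made-up rows; the column types are assumptions, since the slide gives only the query. It also hints at the closure property: because the result of a query is itself a relation, it can be used as the input of another query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (id INTEGER PRIMARY KEY, major TEXT, GPA REAL)")
# Hypothetical rows, invented for illustration.
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [(1, "CPS", 3.9), (2, "CPS", 3.5), (3, "MTH", 3.8), (4, "CPS", 3.8)],
)

# The query from the slide: it states what is wanted, not how to find it.
query = "SELECT id FROM Students WHERE major = 'CPS' AND GPA > 3.7"
honor_ids = [row[0] for row in conn.execute(query + " ORDER BY id")]

# Closure: the query result is a relation, so it can feed another query.
honor_count = conn.execute(
    f"SELECT COUNT(*) FROM ({query}) AS honors"
).fetchone()[0]

print(honor_ids, honor_count)
```

No loop or index is named anywhere in the query. How the matching rows are located is left to the DBMS, which is what "declarative" means here.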

How can we study database systems?
• design of databases, i.e., how do you structure your data in a database?
  - entity-relationship (E/R) model
  - relational model
• database programming
  - how do you use a DBMS?
  - study query languages such as SQL
• database system implementation, i.e., how do you build the next Oracle?

Reference
• http://academic.udayton.edu/SaverioPerugini/courses/cps430/lecture_notes/introduction/intro.html
• J.D. Ullman and J. Widom. A First Course in Database Systems. Prentice Hall, Upper Saddle River, NJ, Second edition, 2002.
