0% found this document useful (0 votes)
30 views

Database Management Systems Week 1

Database Management Systems

Uploaded by

matt.0.porter
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
30 views

Database Management Systems Week 1

Database Management Systems

Uploaded by

matt.0.porter
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 20

Adapted for SEN2104 – DBMS

Week 1: Introduction

Database System Concepts


©Silberschatz, Korth and Sudarshan
Database Management System (DBMS)

▪ DBMS contains information about a particular enterprise


▪ Collection of interrelated data
▪ Set of programs to access the data
▪ An environment that is both convenient and efficient to use
▪ Database Applications:
▪ Banking: transactions
▪ Airlines: reservations, schedules
▪ Universities: registration, grades
▪ Sales: customers, products, purchases
▪ Online retailers: order tracking, customized recommendations
▪ Manufacturing: production, inventory, orders, supply chain
▪ Human resources: employee records, salaries, tax deductions
▪ Databases can be very large.
▪ Databases touch all aspects of our lives

2
University Database Example
▪ Application program examples
▪ Add new students, instructors, and courses
▪ Register students for courses, and generate class rosters
▪ Assign grades to students, compute grade point
averages (GPA) and generate transcripts
▪ In the early days, database applications were built directly on
top of file systems

3
Drawbacks of using file systems to store data

▪ Data redundancy and inconsistency


▪ Multiple file formats, duplication of information in different files
▪ Difficulty in accessing data
▪ Need to write a new program to carry out each new task
▪ Data isolation
▪ Multiple files and formats
▪ Integrity problems
▪ Integrity constraints (e.g., account balance > 0) become “buried”
in program code rather than being stated explicitly
▪ Hard to add new constraints or change existing ones

4
Drawbacks of using file systems to store data (Cont.)

▪ Atomicity of updates
▪ Failures may leave database in an inconsistent state with partial
updates carried out
▪ Example: Transfer of funds from one account to another should
either complete or not happen at all
▪ Concurrent access by multiple users
▪ Concurrent access needed for performance
▪ Uncontrolled concurrent accesses can lead to inconsistencies
▪ Example: Two people reading a balance (say 100) and
updating it by withdrawing money (say 50 each) at the same
time
▪ Security problems
▪ Hard to provide user access to some, but not all, data

Database systems offer solutions to all the above problems

5
Relational Model
▪ All the data is stored in various tables.
▪ Example of tabular data in the relational model Columns

Rows

6
A Sample Relational Database

7
Database Design
The process of designing the general structure of the database:

▪ Logical Design – Deciding on the database schema.


Database design requires that we find a “good” collection of
relation schemas.
▪ Business decision – What attributes should we record in
the database?
▪ Computer Science decision – What relation schemas
should we have and how should the attributes be
distributed among the various relation schemas?
▪ Physical Design – Deciding on the physical layout of the
database

8
Database Design (Cont.)
▪ Is there any problem with this relation?

9
Design Approaches
▪ Need to come up with a methodology to ensure that each of the
relations in the database is “good”
▪ Two ways of doing so:
▪ Entity Relationship Model
▪ Models an enterprise as a collection of entities and
relationships
▪ Represented diagrammatically by an entity-relationship
diagram:
▪ Normalization Theory
▪ Formalize what designs are bad, and test for them

10
Data Definition Language (DDL)
▪ Specification notation for defining the database schema
Example: create table instructor (
ID char(5),
name varchar(20),
dept_name varchar(20),
salary numeric(8,2))
▪ DDL compiler generates a set of table templates stored in a data dictionary
▪ Data dictionary contains metadata (i.e., data about data)
▪ Database schema
▪ Integrity constraints
▪ Primary key (ID uniquely identifies instructors)
▪ Authorization
▪ Who can access what

11
Data Manipulation Language (DML)
▪ Language for accessing and manipulating the data organized
by the appropriate data model
▪ DML also known as query language
▪ Two classes of languages
▪ Pure – used for proving properties about computational
power and for optimization
▪ Relational Algebra
▪ Tuple relational calculus
▪ Domain relational calculus
▪ Commercial – used in commercial systems
▪ SQL is the most widely used commercial language

12
SQL

▪ The most widely used commercial language


▪ To be able to compute complex functions SQL is usually
embedded in some higher-level language
▪ Application programs generally access databases through one of
▪ Language extensions to allow embedded SQL
▪ Application program interface (e.g., ODBC/JDBC) which allow
SQL queries to be sent to a database

13
Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation

14
Query Processing (Cont.)
▪ Alternative ways of evaluating a given query
▪ Equivalent expressions
▪ Different algorithms for each operation
▪ Cost difference between a good and a bad way of evaluating a
query can be enormous
▪ Need to estimate the cost of operations
▪ Depends critically on statistical information about relations
which the database must maintain
▪ Need to estimate statistics for intermediate results to compute
cost of complex expressions

15
Data Analysis

▪ Decision-support systems are used to make business decisions,


often based on data collected by on-line transaction-processing
systems.
▪ Examples of business decisions:
▪ What items to stock?
▪ What insurance premium to change?
▪ To whom to send advertisements?
▪ Examples of data used for making decisions
▪ Retail sales transaction details
▪ Customer profiles (income, age, gender, etc.)

16
Data Analysis (Cont.)
▪ Data analysis tasks are simplified by specialized tools and SQL
extensions
▪ Example tasks
▪ For each product category and each region, what were the
total sales in the last quarter and how do they compare with
the same quarter last year
▪ As above, for each product category and each customer
category
▪ Statistical analysis packages (e.g., : S++) can be interfaced with
databases
▪ Statistical analysis is a large field, but not covered here
▪ Data mining seeks to discover knowledge automatically in the form of
statistical rules and patterns from large databases.
▪ A data warehouse archives information gathered from multiple
sources, and stores it under a unified schema, at a single site.
▪ Important for large businesses that generate data from multiple
divisions, possibly at multiple sites
▪ Data may also be purchased externally

17
History of Database Systems
• 1950s and early 1960s:
▪ Data processing using magnetic tapes for storage
▪ Tapes provided only sequential access
▪ Punched cards for input
• Late 1960s and 1970s:
▪ Hard disks allowed direct access to data
▪ Network and hierarchical data models in widespread use
▪ Ted Codd defines the relational data model
▪ Would win the ACM Turing Award for this work
▪ IBM Research begins System R prototype
▪ UC Berkeley (Michael Stonebraker) begins Ingres prototype
▪ Oracle releases first commercial relational database
▪ High-performance (for the era) transaction processing

18
History of Database Systems (Cont.)

• 1980s:
▪ Research relational prototypes evolve into commercial systems
▪SQL becomes industrial standard
▪ Parallel and distributed database systems
▪Wisconsin, IBM, Teradata
▪ Object-oriented database systems
• 1990s:
▪ Large decision support and data-mining applications
▪ Large multi-terabyte data warehouses
▪ Emergence of Web commerce

19
History of Database Systems (Cont.)

• 2000s
▪ Big data storage systems
▪ Google BigTable, Yahoo PNuts, Amazon,
▪ “NoSQL” systems.
▪ Big data analysis: beyond SQL
▪ Map reduce and friends
• 2010s
▪ SQL reloaded
▪ SQL front end to Map Reduce systems
▪ Massively parallel database systems
▪ Multi-core main-memory databases

20

You might also like