Chap 1

Uploaded by

siri velpula

Organization of the Book

Database System Concepts, 7th Edition


Overview
Chapter 1 provides a general overview of the nature and purpose of database systems
Part 1: Relational Databases
Chapter 2 introduces the relational model of data
Chapter 3 provides a basic introduction to SQL
Chapters 4-5 describe more advanced SQL features
Part 2: Database Design
Chapter 7 provides an overview of the database-design process
Chapter 8 introduces the theory of relational-database design
Chapter 9 covers application design and development
Part 3: Data Storage and Querying
Chapter 10 deals with disk, file, and file-system structure
Part 4: Transaction Management
Chapters 14-15 deal with transactions and concurrency control
Part 8: Object-based DB and XML
Chapter 22 covers object-based databases
Chapter 23 covers the XML standard for data representation
CSCI 5333
Database Management Systems
(DBMS)

Chapter 1: Contents
• Concepts & History of Database System
• Drawbacks of Database in a Text File
• Data Abstraction and Models
• Database Language and Users
• Overall System Architecture
Database Management Systems

• Information is possibly the most important resource an organization can have. An organization:
– has a LARGE amount of information
– must have some way to define its structure
– must have some way to enforce the constraints
– must store it efficiently in a data store
– must be able to manipulate it efficiently
– must keep it safe and secure
• data reliability (e.g., a RAID system)
Database Management System (DBMS)
• Collection of data = Database (DB)
• A set of interrelated data and the programs to access those data is called a DBMS
• A DBMS provides an environment that is convenient and efficient to use for data retrieval and storage
[Figure: several programs accessing the data in a database through the DBMS]
History of Database Systems
• 1950s and early 1960s:
– Data processing using magnetic tapes for storage
• Tapes provide only sequential access
– Punched cards for input
• Late 1960s and 1970s:
– Hard disks allow direct access to data
– Network and hierarchical data models in widespread use
– Ted Codd defines the relational data model
• Later wins the ACM Turing Award for this work
• IBM Research begins System R prototype
• UC Berkeley begins Ingres prototype
– High-performance (for the era) transaction processing
History of Database Systems (cont.)
• 1980s:
– Research relational prototypes evolve into commercial systems
• SQL becomes the industry standard
– Parallel and distributed database systems
– Object-oriented database systems
• 1990s:
– Large decision support and data-mining applications
– Large multi-terabyte data warehouses
– Emergence of Web commerce
• Early 2000s:
– XML and XQuery standards
– Automated database administration
• Later 2000s:
– Giant data storage systems
• Google BigTable, Yahoo PNuts, Amazon, ...
Database Applications
• A DBMS contains information about a particular enterprise and provides an environment that is both convenient and efficient to use
– Banking: all transactions
– Airlines: reservations, schedules
– Universities: registration, grades
– Sales: customers, products, purchases
– Online retailers: order tracking, customized
recommendations
– Manufacturing: production, inventory, orders, supply chain
– Human resources: employee records, salaries, tax
deductions
• Databases touch all aspects of our lives
Database Management System
• Why are DBMS so important?
• Consider a possible way of storing customer and savings-account information:
– text files for the data
– executable programs written in popular programming languages
[Figure: programs PrintBal.exe, AddAC.exe, AddCust.exe, Debit.exe, and Credit.exe operating on the files SavingAC.txt and CustInfo.txt]
Difficulties
• Data Redundancy
– Same information duplicated in different files
– higher storage and access costs
• Data Inconsistency
– copies of the same data don’t agree
– which copy is right?

Program       File           Fields stored
AddAC.exe     SavingAC.txt   Customer Name, Customer Phone, Account Num, Amount
AddCust.exe   CustInfo.txt   Customer Name, Customer Address, Customer Phone, Customer Email
Difficulties
• Difficulty in accessing data
– Find all customers in Houston
[Figure: CustInf.txt, a file of unstructured text, read by separate programs GetAllCust.exe, GetCustLoc.exe, and Get < 1000.exe]
Difficulties
• Data Isolation
– Because data are scattered in various files, and the files may be in different formats, writing new application programs to retrieve the appropriate data is difficult.
SavingAC.txt   CustInfo.txt   CheckAC.txt
R1;R2;R3;      R1,R2,R3,      R1 R2 R3
R4;R5;R6;      R4,R5,R6,      R4 R5 R6
R7;R8          R7,R8          R7 R8
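A minimal Python sketch of the data-isolation problem: each file uses a different separator, so every program that reads it needs its own parsing code. The file contents are the placeholder records R1 to R8 from the slide; the function names are hypothetical.

```python
def parse_semicolon(text):
    """SavingAC.txt style: every field ends with ';'."""
    return [field for field in text.replace("\n", "").split(";") if field]


def parse_comma(text):
    """CustInfo.txt style: every field ends with ','."""
    return [field for field in text.replace("\n", "").split(",") if field]


def parse_whitespace(text):
    """CheckAC.txt style: fields are separated by blanks and newlines."""
    return text.split()


saving = "R1;R2;R3;\nR4;R5;R6;\nR7;R8"
custinfo = "R1,R2,R3,\nR4,R5,R6,\nR7,R8"
checking = "R1 R2 R3\nR4 R5 R6\nR7 R8"

# Three formats, three parsers, one logical result.
records = [parse_semicolon(saving), parse_comma(custinfo), parse_whitespace(checking)]
```

A new application that needs data from all three files has to know all three formats, which is exactly what makes it difficult to write.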
Difficulties
• Integrity Problems
– data values stored in the database must satisfy
consistency constraints
• Example:
– Account balance must not fall below $100
– Probation student can’t take more than 2 courses
– enforced by code in programs
– new constraints? modify existing constraints?
– must re-code, re-compile etc.
– more difficult when constraints involve several
data items from different files
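For contrast, a minimal sketch of how a DBMS can enforce such a constraint declaratively, so that no application program has to be re-coded. It uses Python's built-in sqlite3 module; the table name, the column names, and the $100 threshold (including its direction) are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The constraint is declared once, in the schema, not in every program.
conn.execute(
    "create table account ("
    " acc_number text primary key,"
    " balance integer not null check (balance >= 100))"
)
conn.execute("insert into account values ('A-101', 500)")

try:
    # Any program that tries to violate the constraint is stopped by the DBMS.
    conn.execute("update account set balance = 50 where acc_number = 'A-101'")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

balance = conn.execute("select balance from account").fetchone()[0]
```

Changing the rule then means changing one table definition rather than every program that touches the file.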
Difficulties
• Atomicity
– if failure happens, data must be restored to the consistent
state prior to failure
– operation must be atomic i.e. happen fully or not at all
– difficult to ensure atomicity in a traditional file-processing
system
[Figure: Transfer.exe moves $50 from SavingAC.txt (-$50) to CheckAC.txt (+$50); a failure (BOOM) during the transfer leaves the balance wrong in the checking account]
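A minimal sketch of an atomic transfer, assuming the two accounts live in one SQLite table instead of two text files. The table layout, the starting balances, and the simulated failure are assumptions; only the $50 transfer comes from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table account (name text primary key, balance integer)")
conn.executemany("insert into account values (?, ?)",
                 [("saving", 500), ("checking", 200)])
conn.commit()


def transfer(conn, amount, fail_midway=False):
    """Move `amount` from saving to checking, fully or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "update account set balance = balance - ? where name = 'saving'",
                (amount,))
            if fail_midway:
                raise RuntimeError("BOOM")  # simulated crash between the two updates
            conn.execute(
                "update account set balance = balance + ? where name = 'checking'",
                (amount,))
    except RuntimeError:
        pass  # the debit has already been rolled back


transfer(conn, 50, fail_midway=True)
after_failure = dict(conn.execute("select name, balance from account"))

transfer(conn, 50)
after_success = dict(conn.execute("select name, balance from account"))
```

After the failed attempt both balances are unchanged; after the successful one the $50 has left one account and arrived in the other.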
Difficulties
• Concurrent access done correctly: balance correct
– SavingAC.txt: balance = $500
– WithD.exe (-$50) sees balance = $500; balance becomes $450
– WithD.exe (-$100) sees balance = $450; balance becomes $350
Difficulties
• Concurrent access done incorrectly: balance incorrect
– SavingAC.txt: balance = $500
– WithD.exe (-$50) sees balance = $500
– WithD.exe (-$100) also sees balance = $500
– Final balance = $450 or $400, depending on which program writes last; the correct balance is $350
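The two schedules can be replayed deterministically in a few lines of Python. This is only a simulation of the interleaving (no real threads or files are involved), and the function names are hypothetical.

```python
def withdraw_serial(balance, amounts):
    """Each program reads the balance only after the previous one has written."""
    for amount in amounts:
        seen = balance           # read
        balance = seen - amount  # write
    return balance


def withdraw_interleaved(balance, amounts):
    """Every program reads the balance before any program writes (lost update)."""
    seen = [balance for _ in amounts]   # all programs read the same value
    for value, amount in zip(seen, amounts):
        balance = value - amount        # each write overwrites the previous one
    return balance


correct = withdraw_serial(500, [50, 100])
wrong_a = withdraw_interleaved(500, [50, 100])   # the -$100 program writes last
wrong_b = withdraw_interleaved(500, [100, 50])   # the -$50 program writes last
```

In the interleaved case one withdrawal is silently lost, which is why the final balance depends on which program happens to write last.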
Difficulties
• Security
– Not every user of the database system should be able to access all the data.
– prevent users, even authorized ones, from accessing data they don’t need
– E.g., banking system:
• Application programs are added in an ad hoc manner
• Enforcing security constraints is difficult
Difficulties
• These problems are not just found in file-
processing type databases
• Also found in DBMS
• However, DBMS have facilities to help
overcome these difficulties
• Database systems offer solutions to all the
above problems
• These facilities coupled with sound database
design help to overcome these issues
Data Abstraction
• One purpose of a DBMS is to provide users with abstract views of the data
• Efficiency requires complex data structures
• This complexity can be hidden through different ‘views’ of the data
• Several layers of abstraction
– Physical level:
• describes how a record (e.g., customer) is stored.
– Logical level:
• describes what data is stored in the database, and the relationships among the data.
– View level:
• application programs hide details of data types.
• views can also hide information (such as an employee’s salary, SSN) for security purposes.
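A minimal sketch of the view level, using Python's built-in sqlite3 module. The employee table, its columns, the sample row, and the view name are assumptions made for illustration. SQLite has no user accounts, so the sketch shows only the hiding of columns, not the access control a multi-user DBMS would add on top.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Logical level: the full record, including sensitive columns.
conn.execute(
    "create table employee (id integer primary key, name text, salary integer, ssn text)")
conn.execute("insert into employee values (1, 'Alice', 90000, '000-00-0000')")

# View level: the same data with salary and SSN left out.
conn.execute("create view employee_public as select id, name from employee")

cursor = conn.execute("select * from employee_public")
visible_columns = [column[0] for column in cursor.description]
rows = cursor.fetchall()
```

An application that is given only employee_public never sees the salary or SSN columns, even though they are stored in the same database.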
Data Abstraction
[Figure: three levels of data abstraction. View level (View 1, View 2, View 3, View 4): only certain data visible. Logical level: what data is stored. Physical level: how the data is stored.]
Data Abstraction

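• Programming analogy
typedef struct {
    int cusnum;
    char *cusname;
    char *cusaddress;
} customer;
[Figure: the user and the application sit at the view level, the customer type declaration corresponds to the logical level, and the record stored as a series of bytes corresponds to the physical level]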
Schemas and Instances
• The overall design (logical structure) of the database is known as the schema
– Physical schema: database design at the physical level
– Logical schema: database design at the logical level
• The actual content of the database at a particular point in time is known as an instance
– The information in a database changes over time: deletion, insertion, modification
Schemas and Instances
• Programming analogy
– OOP: Class = Schema, Object = Instance
typedef struct {
    int cusnum;
    char *cusname;
    char *cusaddress;
} customer;        /* Schema */
customer cus1;     /* Instance */
• Physical Data Independence
– The ability to modify the physical schema without changing the logical schema
– Applications depend on the logical schema
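A minimal sketch of physical data independence, assuming an SQLite table modeled on the customer struct above. Adding an index changes how the data is stored and searched, but neither the table definition nor the query text changes. The sample rows and the index name are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table customer (cusnum integer, cusname text, cusaddress text)")
conn.executemany("insert into customer values (?, ?, ?)",
                 [(1, "Coburn", "Ireland"), (2, "Roy", "Scotland")])

# The application depends only on the logical schema.
query = "select cusname from customer where cusaddress = 'Scotland'"
before = conn.execute(query).fetchall()

# Physical change only: a new access structure is added behind the scenes.
conn.execute("create index customer_by_address on customer (cusaddress)")
after = conn.execute(query).fetchall()
```

The application code that issues the query does not need to be touched when the index is added or dropped.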
Database Design
• The process of designing the general structure of
the database:
• Logical Design – Deciding on the database
schema. Database design requires that we find a
“good” collection of relation schemas.
– Business decision – What attributes should we record in
the database?
– Computer Science decision – What relation schemas
should we have and how should the attributes be
distributed among the various relation schemas?
• Physical Design – Deciding on the physical layout of the database
Data Models
• Underlying structure of a database
• Collection of tools and techniques that
describe
– data
– data relationships
– data semantics
– data constraints
• We have two major types of data models:
– Object Based Logical Model
– Record Based Logical Model
Data Models

• Object-based logical models; the most popular are:
– Entity-relationship data model (ER model)
– Object-based data model (object-oriented and object-relational)
• Record-based logical models; the most popular are:
– Relational model
– Network model
– Hierarchical model
Data Models
• Entity-Relationship Model
– based on perception of the real world
– translate the enterprise into a blueprint
– consists of:
• entities
• attributes
• relationships
• constraints
– cardinality ratio
– participation constraints
– existence dependency
Data Models

[Figure: entity Customer with attributes Name, Address, and Phone Number (e.g., 001-223-984), connected by a relationship to entity Saving Account with attributes A/C number and Balance]
Data Models

[Figure: equivalent E-R diagram. Entity customer (name, address, phone-number) is linked through the relationship depositor to entity account (acc-number, balance)]
Data Models
• Object-based data Model
– Object-Oriented Model
• collection of objects
• values stored in instance variables
• bodies of code called methods
• message passing
• each object has its own unique identity
• object store can be used to persist objects for later manipulation
Data Models
• Object-based data Model
– Object-relational Model
• An object-relational database (ORD), or object-
relational database management system
(ORDBMS), is a database management system
(DBMS) similar to a relational database, but with
an object-oriented database model:
– objects, classes and inheritance are directly supported
in database schemas and in the query language.
• An object-relational database can be said to provide
a middle ground between relational databases
and object-oriented databases (object database).
Data Models
• Record-Based Logical Models
– describes data at logical and view levels
– also provides description of the
implementation
– structured in fixed-format records
• relational model
• network model
• hierarchical model

Data Models
• Relational Model
– uses a collection of tables to represent data and relationships
– each table has a number of columns, each with a unique name (fields)
– each row comprises one record entry (one instance of an entity)
[Figure: a table with its columns and rows labeled]
Data Models
• Network Model
– collection of records (C/C++ structure)
– each instance of record contains a single record
– relationships represented as links (pointers)

[Figure: customer records (coburn, ireland, 322323), (clinton, w house, 322349), and (roy, scotland, 234234), linked by pointers to account records (2665, 2000), (3243, 300000), and (4543, 50000)]
Data Models
• Hierarchical Model
– similar to network model
– records organized as collections of trees

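[Figure: records arranged as trees. Customer records (coburn, ireland, 322323) and (clinton, w house, 322349) are parents of account records (2665, 2000) and (3243, 300000)]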
Semistructured Data model
XML: Extensible Markup Language
• Defined by the WWW Consortium (W3C)
• Originally intended as a document markup
language not a database language
• The ability to specify new tags, and to create nested
tag structures made XML a great way to exchange
data, not just documents
• XML has become the basis for all new generation
data interchange formats.
• A wide variety of tools are available for parsing,
browsing and querying XML documents/data
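A minimal sketch of XML used for data rather than documents, parsed with Python's standard xml.etree.ElementTree module. The tag names and values are made up for illustration, borrowing the customer and account example used earlier in these slides.

```python
import xml.etree.ElementTree as ET

# User-defined, nested tags: a customer element containing an account element.
document = """
<bank>
  <customer>
    <name>Coburn</name>
    <account>
      <acc-number>2665</acc-number>
      <balance>2000</balance>
    </account>
  </customer>
</bank>
"""

root = ET.fromstring(document)

# Query the nested structure with simple path expressions.
balances = {
    customer.findtext("name"): int(customer.findtext("account/balance"))
    for customer in root.findall("customer")
}
```

The tags carry the meaning of each value, so the same text can be exchanged between programs that agree on nothing but the tag names.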
Database Languages
• A DBMS provides two types of language
– One to specify the schema and create the database
– One to express database queries and updates
1. Data-Definition Language (DDL)
– Schema is specified by a set of definitions expressed in the DDL
– Result is a set of tables stored in the data dictionary
– The data dictionary is a file that contains metadata (data about data)
2. Data-Manipulation Language (DML)
– Language for accessing and manipulating the data organized by the appropriate data model: data retrieval, insertion, deletion, modification
– DML is also known as a query language
– Physical level → efficiency
– Higher levels → ease of use
Database Languages
• Example of SQL DDL
create table student
    (id      char(10) not null,
     name    varchar(30) not null,
     degree  varchar(10),
     address char(50),
     primary key (id),
     check (degree in ('Bachelors', 'Masters', 'Doctorate')));
• The DDL compiler generates a set of table templates stored in a data dictionary. The data dictionary contains metadata (data about data)
– Database schema
– Integrity constraints
• Primary key (id uniquely identifies students)
– Authorization
• Who can access what
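A minimal sketch that runs this DDL through Python's built-in sqlite3 module and then reads the result back out of SQLite's own data dictionary (the sqlite_master table and the table_info pragma). Other systems expose their catalog differently. The rejected sample row is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table student
        (id      char(10) not null,
         name    varchar(30) not null,
         degree  varchar(10),
         address char(50),
         primary key (id),
         check (degree in ('Bachelors', 'Masters', 'Doctorate')))
""")

# Metadata (data about data): which tables exist and which columns they have.
tables = [row[0] for row in conn.execute(
    "select name from sqlite_master where type = 'table'")]
columns = [row[1] for row in conn.execute("pragma table_info(student)")]

# The integrity constraint recorded by the DDL is enforced on every insert.
try:
    conn.execute("insert into student values ('1', 'Ann', 'Diploma', null)")
    check_enforced = False
except sqlite3.IntegrityError:
    check_enforced = True
```

Note that SQL string literals use single quotes; the typographic double quotes that often appear on slides are not valid SQL.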
Database Languages
• Two classes of languages
– Procedural – user specifies what data is required and how to get those data.
– Declarative (nonprocedural) – user specifies what data is required without specifying how to get those data
• SQL: widely used nonprocedural query language
• Example of SQL DML:
• Query-1:
    select name, id
    from student
    where degree = 'Bachelors';
• Query-2:
    select name, address, degree
    from student
    where id = '021000040';
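A minimal sketch contrasting the two classes of language on Query-1. The procedural version is a plain Python loop; the declarative version is the SQL query run through the built-in sqlite3 module. The sample students are made up, and an order by clause is added to the SQL only so that the two results can be compared row for row.

```python
import sqlite3

students = [
    ("021000040", "Ann", "Masters", "Houston"),
    ("021000041", "Bob", "Bachelors", "Dallas"),
    ("021000042", "Cy", "Bachelors", "Austin"),
]

# Procedural: the program spells out how to find the rows (scan, test, collect).
procedural = []
for sid, name, degree, address in students:
    if degree == "Bachelors":
        procedural.append((name, sid))

# Declarative: the query says only what is wanted; the DBMS decides how to get it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table student (id text primary key, name text, degree text, address text)")
conn.executemany("insert into student values (?, ?, ?, ?)", students)
declarative = conn.execute(
    "select name, id from student where degree = 'Bachelors' order by id").fetchall()
```

Both produce the same rows, but only the procedural version would have to be rewritten if the data were stored or indexed differently.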
Database Languages
Popular database languages
• SQL (Structured Query Language)
• Quel
• Datalog
• QBE (Query by Example)
Database Users
• Administrator (DBA)
• Coordinates all the activities of the database system; the database administrator has a good understanding of the enterprise’s information resources and needs.
– schema definition
– storage structure
– granting authorization for data access
– schema and physical organization modification
– integrity-constraint specification
– routine maintenance
• Database backup
• Ensure free space
• Performance of the database
– Eliminate expensive tasks
Database Users
• Application Programmers
– interact with the system through DML calls embedded in a host language
– use a DML pre-compiler
– may also interact with databases through bridges, e.g., ODBC, JDBC
• Sophisticated Users
– interact with the database using a database query language
– queries are submitted to the query processor
Database Users
• Specialized Users
– sophisticated users who write specialized database applications
• expert systems
• graphical databases
• Naïve Users
– unsophisticated users who interact with the database system through one or more application programs
• Example: bank teller who needs to transfer $50 from account A to account B.
• people accessing a database over the web, bank tellers, clerical staff
Transaction Management
• A transaction is a collection of operations that
performs a single logical function in a database
application
• Transaction-management component ensures
that the database remains in a consistent (correct)
state despite system failures (e.g., power failures
and operating system crashes) and transaction
failures.
• Concurrency-control manager controls the
interaction among the concurrent transactions, to
ensure the consistency of the database.
Storage Management
• Storage manager is a program module that provides
the interface between the low-level data stored in the
database and the application programs and queries
submitted to the system.
• The storage manager is responsible for the following tasks:
– Interaction with the file manager
– Efficient storing, retrieving and updating of data
• Issues:
– Storage access
– File organization
– Indexing and hashing
Storage Management
• Storage Manager Components
– Authorization and Integrity Manager
• tests integrity constraints and checks user authorization
– Transaction Manager
• ensures database remains in consistent state despite system failure
and usually incorporates concurrency-control manager which
controls the interaction among the concurrent transactions, to
ensure the consistency of the database.
– File Manager
• manages allocation of space on disk storage
– Buffer Manager
• responsible for fetching data from disk storage into main memory
• decides what data to cache
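A minimal sketch of what "decides what data to cache" can mean in practice. It assumes a least-recently-used (LRU) replacement policy, which the slides do not name; real buffer managers may use other policies. The class name, the capacity, and the fake block reader are all hypothetical.

```python
from collections import OrderedDict


class BufferPool:
    """Keeps at most `capacity` disk blocks in memory, evicting the least recently used."""

    def __init__(self, capacity, read_block):
        self.capacity = capacity
        self.read_block = read_block   # function: block number -> block contents
        self.frames = OrderedDict()    # block number -> contents, oldest first
        self.disk_reads = 0

    def get(self, block_no):
        if block_no in self.frames:
            self.frames.move_to_end(block_no)  # mark as most recently used
            return self.frames[block_no]
        self.disk_reads += 1                   # miss: go to "disk"
        data = self.read_block(block_no)
        self.frames[block_no] = data
        if len(self.frames) > self.capacity:
            self.frames.popitem(last=False)    # evict the least recently used block
        return data


# Stand-in for the file manager: returns a fake block instead of reading a disk.
pool = BufferPool(capacity=2, read_block=lambda n: f"block-{n}")
for block in [1, 2, 1, 3, 2]:
    pool.get(block)
```

With room for two blocks, the second request for block 1 is served from memory, while block 2 has been evicted by the time it is requested again and must be read a second time.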
Database Architecture

• The architecture of a database system is greatly influenced by the underlying computer system on which the database is running:
– Centralized
– Client-server
– Parallel (multi-processor)
– Distributed
Overall System Structure
• Query Processor Components
– DML Compiler
• translates DML statements into low-level instructions the query
evaluation engine understands
• may try to optimize user queries
– Embedded DML Pre-compiler
• converts DML statements embedded in an application program to
normal calls in the host language
• interacts with DML Compiler
– DDL Interpreter
• interprets DDL statements and records them in the data dictionary
– Query Evaluation Engine
• executes low-level instructions generated by the DML Compiler
Overall System Structure

Thank You All For Your
Attention
