CH 1
● Computational Tasks:
– Airline reservation
– Banking
– Watching online video
– Writing a paper
– Training and running a machine learning model
– Searching a document
– Searching a file system
– Searching the internet
Formal Database Advantages
● Scalability
● Multiple Access
● Data Security
● Uniform Data Governance
● Efficiency
● Reliability
● Data Integrity
● Crash Recovery
● Maintainability
● Physical Data Independence
● Atomicity
● Data Centralization
Informal Database Advantages
● Flexibility
● Schema changes
What context is each one better for?
OLTP (Online Transaction Processing)
OLAP (Online Analytical Processing)
What context is each one better for?
Outline
Database-System Applications
Purpose of Database Systems
View of Data
Database Languages
Database Design
Database Engine
Database Architecture
Database Users and Administrators
History of Database Systems
Database Systems
DBMS contains information about a particular enterprise
Collection of interrelated data
Set of programs to access the data
An environment that is both convenient and efficient to use
Database systems are used to manage collections of data that are:
Highly valuable
Relatively large
Accessed by multiple users and applications, often at the same
time.
A modern database system is a complex software system whose task is
to manage a large, complex collection of data.
Databases touch all aspects of our lives
Database Applications Examples
▪ Enterprise Information
Sales: customers, products, purchases
Accounting: payments, receipts, assets
Human Resources: employees, salaries, payroll taxes
▪ Manufacturing: management of production, inventory, orders, supply
chain.
▪ Banking and finance
customer information, accounts, loans, and banking transactions.
Credit card transactions
Finance: sales and purchases of financial instruments (e.g., stocks
and bonds); storing real-time market data
▪ Universities: registration, grades
Database Applications Examples (Cont.)
In the early days, database applications were built directly on top of file
systems, which led to problems with:
▪ Atomicity of updates
Failures may leave database in an inconsistent state with partial updates
carried out
Example: Transfer of funds from one account to another should either
complete or not happen at all
▪ Concurrent access by multiple users
Concurrent access needed for performance
Uncontrolled concurrent accesses can lead to inconsistencies
Ex: Two people reading a balance (say 100) and updating it by
withdrawing money (say 50 each) at the same time
▪ Security problems
Hard to provide user access to some, but not all, data
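The atomicity and concurrency problems above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular DBMS implements transactions. It assumes Python's built-in sqlite3 module, and the table name account, the account IDs "A" and "B", and all function names are hypothetical. The first two functions replay the balance example (100, two withdrawals of 50) with and without a bad interleaving; transfer shows a funds transfer that either completes or leaves no partial update behind.

```python
import sqlite3

def lost_update_demo(initial=100, amount=50):
    """Replay the bad interleaving: both users read before either writes."""
    balance = initial
    read_a = balance            # user A reads the balance
    read_b = balance            # user B reads the same (soon stale) balance
    balance = read_a - amount   # A writes back its result
    balance = read_b - amount   # B overwrites it, so A's withdrawal is lost
    return balance

def serialized_demo(initial=100, amount=50):
    """Same two withdrawals, but each read-then-write runs without interleaving."""
    balance = initial
    for _ in range(2):
        balance = balance - amount
    return balance

def transfer(conn, src, dst, amount):
    """Move funds atomically: both updates commit, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            (remaining,) = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (src,)
            ).fetchone()
            if remaining < 0:
                # Fail halfway through: the debit above must not survive.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False

def balances(conn):
    """Return {account id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))

# Hypothetical two-account bank held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 0)])
conn.commit()
```

Calling lost_update_demo() leaves a balance of 50 even though 100 was withdrawn in total, while serialized_demo() leaves 0. A call such as transfer(conn, "A", "B", 60) succeeds once; a second identical call fails partway and is rolled back, so balances(conn) is unchanged by it.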
Ted Codd, Turing Award 1981
[Figure: A Sample Relational Database]
Levels of Abstraction
Physical level: describes how a record (e.g., instructor) is stored.
Logical level: describes data stored in database, and the relationships
among the data.
type instructor = record
    ID : string;
    name : string;
    dept_name : string;
    salary : integer;
end;
View level: application programs hide details of data types. Views can
also hide information (such as an employee’s salary) for security
purposes.
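One way to see the logical and view levels side by side is a small sketch with Python's built-in sqlite3 module. The table mirrors the instructor record; the view name instructor_public, the helper column_names, and the sample row are made up for illustration. The physical level does not appear in the code at all: how SQLite lays out the record on disk is hidden from the program.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: declare what an instructor record contains.
conn.execute("""
    CREATE TABLE instructor (
        ID        TEXT,
        name      TEXT,
        dept_name TEXT,
        salary    INTEGER
    )
""")
# Made-up sample row, for illustration only.
conn.execute(
    "INSERT INTO instructor VALUES (?, ?, ?, ?)",
    ("10101", "Example Name", "Comp. Sci.", 65000),
)
conn.commit()

# View level: expose everything except salary (hypothetical view name).
conn.execute(
    "CREATE VIEW instructor_public AS "
    "SELECT ID, name, dept_name FROM instructor"
)

def column_names(conn, relation):
    """List the column names visible through a table or view."""
    cursor = conn.execute(f"SELECT * FROM {relation}")
    return [description[0] for description in cursor.description]
```

An application given only instructor_public sees the ID, name, and dept_name columns; salary is visible only through the underlying table.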
View of Data
An architecture for a database system
Instances and Schemas
Pros and Cons
▪ A database system is partitioned into modules that deal with each of the
responsibilities of the overall system.
▪ The functional components of a database system can be divided into:
The storage manager
The query processor component
The transaction management component
Storage Manager
▪ A program module that provides the interface between the low-level data
stored in the database and the application programs and queries
submitted to the system.
▪ The storage manager is responsible for the following tasks:
Interaction with the OS file manager
Efficient storing, retrieving and updating of data
▪ The storage manager components include:
Authorization and integrity manager
Transaction manager
File manager
Buffer manager
Storage Manager (Cont.)
▪ Centralized databases
One to a few cores, shared memory
▪ Client-server
One server machine executes work on behalf of multiple client
machines.
▪ Parallel databases
Many-core shared memory
Shared disk
Shared nothing
▪ Distributed databases
Geographical distribution
Schema/data heterogeneity
Database Architecture (Centralized/Shared-Memory)
Database Applications
Functions of a database administrator (DBA):
▪ Schema definition
▪ Storage structure and access-method definition
▪ Schema and physical-organization modification
▪ Granting of authorization for data access
▪ Routine maintenance
Periodically backing up the database
Ensuring that enough free disk space is available for normal
operations, and upgrading disk space as required
Monitoring jobs running on the database
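A few of the tasks above (schema definition, schema modification, and a periodic backup) can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the department table, its columns, the sample row, and the columns helper are hypothetical, and SQLite has no user accounts, so granting of authorization is not shown.

```python
import sqlite3

live = sqlite3.connect(":memory:")

# Schema definition: create a (hypothetical) department table.
live.execute("CREATE TABLE department (dept_name TEXT, building TEXT)")
live.execute("INSERT INTO department VALUES ('Comp. Sci.', 'Hypothetical Hall')")
live.commit()

# Schema modification: add a column without rebuilding the table by hand.
live.execute("ALTER TABLE department ADD COLUMN budget INTEGER")
live.commit()

# Routine maintenance: copy the whole database into a backup connection.
backup = sqlite3.connect(":memory:")
live.backup(backup)

def columns(conn, table):
    """Return the column names of a table, in declaration order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
```

After the backup call, columns(backup, "department") lists the new budget column as well, and the existing row carries a NULL budget. In practice the backup target would be a file on separate storage rather than a second in-memory database.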
History of Database Systems
1980s:
Research relational prototypes evolve into commercial systems
SQL becomes the industry standard
Parallel and distributed database systems
Wisconsin, IBM, Teradata
Object-oriented database systems
1990s:
Large decision support and data-mining applications
Large multi-terabyte data warehouses
Emergence of Web commerce
History of Database Systems (Cont.)
2000s:
Big data storage systems
Google BigTable, Yahoo PNuts, Amazon,
“NoSQL” systems.
Big data analysis: beyond SQL
MapReduce and friends
2010s:
SQL reloaded
SQL front end to MapReduce systems
Massively parallel database systems
Multi-core main-memory databases
End of Chapter 1