Chapter 1
First 6.5 weeks: by Dr. Liu Jie, Xiamen University Malaysia.
Textbook for the first 7 weeks:
Database System Concepts, 7th edition, by Silberschatz, Korth, Sudarshan.
Chap1. Introduction + MySQL installation
Chap2. Introduction to the Relational Model
Chap3. Introduction to SQL
Chap4. Intermediate SQL
Chap5. Advanced SQL
Chap9. Application: HTML+PHP+MySQL
• Reference Book for Chap9: Learning PHP, MySQL & JavaScript, A Step-by-Step Guide to Creating
Dynamic Websites, 6th edition, by R. Nixon.
Another Application: An e-commerce example from Kaggle
• https://www.kaggle.com/datasets/olistbr/brazilian-ecommerce
Grading Policy:
4 projects: 60% (submit a .sql file through Canvas).
• Project 1: assigned after Chap3 (query a single table).
• Project 2: assigned after Chap9 (query multiple tables).
• Projects 3 and 4: assigned by Dr. Yang.
Final exam: 40% (closed book, no laptop, A4-sized helpsheet).
Chapter 1: Introduction
§ Database-System Applications
§ Purpose of Database Systems
§ View of Data
§ Database Languages
§ Database Design
§ Database Engine
§ Database Architecture
§ Database Users and Administrators
§ History of Database Systems
§ Appendix: Install and run MySQL
Database System Concepts - 7th Edition 1.3 ©Silberschatz, Korth and Sudarshan
Database Systems
§ DBMS (Database Management System) contains information about a
particular enterprise
• Collection of interrelated data
• Set of programs to access the data
• An environment that is both convenient and efficient to use
§ Database systems are used to manage collections of data that are:
• Highly valuable
• Relatively large
• Accessed by multiple users and applications, often at the same time.
§ A modern database system is a complex software system whose task is to
manage a large, complex collection of data.
§ Databases touch all aspects of our lives
Database Applications Examples
§ Enterprise Information
• Sales: customers, products, purchases
• Accounting: payments, receipts, assets
• Human Resources: Information about employees, salaries, payroll
taxes.
§ Manufacturing: management of production, inventory, orders, supply
chain.
§ Banking and finance
• Customer information, accounts, loans, and banking transactions
• Credit card transactions
• Finance: sales and purchases of financial instruments (e.g., stocks
and bonds); storing real-time market data
§ Universities: registration, grades
Database Applications Examples (Cont.)
An example that we will go through later
Purpose of Database Systems
In the early days, database applications were built directly on top of file
systems, which led to:
Purpose of Database Systems (Cont.)
§ Atomicity of updates
• Failures may leave database in an inconsistent state with partial
updates carried out
• Example: Transfer of funds from one account to another should either
complete or not happen at all. All or Nothing.
§ Concurrent access by multiple users
• Concurrent access needed for performance
• Uncontrolled concurrent accesses can lead to inconsistencies
§ Ex: Two people reading a balance (say 100) and updating it by
withdrawing money (say 50 each) at the same time
§ Security problems
• Hard to provide user access to some, but not all, data
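The all-or-nothing behaviour of the funds transfer can be sketched in Python. This is only an illustration: it uses the standard-library sqlite3 module as a stand-in for a full database system, and the account table, the transfer function, and the balances are hypothetical.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src))
            # Hypothetical integrity rule: no overdrafts.
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (src,)).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst))
        return True
    except ValueError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 0)])
conn.commit()

ok_first = transfer(conn, "A", "B", 50)    # completes: A=50, B=50
ok_second = transfer(conn, "A", "B", 80)   # fails midway and is rolled back
balances = dict(conn.execute("SELECT id, balance FROM account"))
print(ok_first, ok_second, balances)
```

The second transfer fails after its first update has already run, yet the rollback leaves both balances exactly as they were before it started.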
University Database Example
Install MySQL and create university database
(appendix)
View of Data
Categories of Data Models
§ Relational model. (most widely used data model)
• a collection of tables (e.g., three tables: instructor, student, advisor)
§ Entity-Relationship data model (chapter 6)
• mainly for database design.
[E-R diagram: instructor (ID, name, salary) is connected to student (ID, name, tot_cred) by the advisor relationship.]
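A minimal sketch of the three tables, assuming SQLite (through Python's sqlite3 module) in place of MySQL. The advisor column names s_ID and i_ID and all sample rows are assumptions made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE instructor (ID TEXT PRIMARY KEY, name TEXT, salary INTEGER);
CREATE TABLE student    (ID TEXT PRIMARY KEY, name TEXT, tot_cred INTEGER);
-- advisor relates a student to an instructor (column names are assumed)
CREATE TABLE advisor (
    s_ID TEXT PRIMARY KEY REFERENCES student(ID),
    i_ID TEXT REFERENCES instructor(ID)
);
INSERT INTO instructor VALUES ('i1', 'Instructor One', 90000);
INSERT INTO student VALUES ('s1', 'Student One', 30), ('s2', 'Student Two', 12);
INSERT INTO advisor VALUES ('s1', 'i1'), ('s2', 'i1');
""")

# Follow the advisor table to pair each student with an instructor.
rows = conn.execute("""
SELECT student.name, instructor.name
FROM advisor
JOIN student    ON student.ID = advisor.s_ID
JOIN instructor ON instructor.ID = advisor.i_ID
ORDER BY student.ID
""").fetchall()
print(rows)
```

The relationship from the diagram becomes an ordinary table whose rows hold the IDs of the two entities it connects.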
Relational Model
[Figure: a sample table with its columns and rows labeled.]
Ted Codd, Turing Award 1981
A Sample Relational Database
Semi-Structured Data
Example from Chapter 8
XML: extensible markup language; uses user-defined tags to mark up information.
<purchase_order>
  <identifier> P-101 </identifier>
  <purchaser>
    <name> Cray Z. Coyote </name>
    <address> Route 66, Mesa Flats, Arizona 86047, USA </address>
  </purchaser>
  <supplier>
    <name> Acme Supplies </name>
    <address> 1 Broadway, New York, NY, USA </address>
  </supplier>
  <itemlist>
    <item>
      <identifier> RS1 </identifier>
      <description> Atom powered rocket sled </description>
      <quantity> 2 </quantity>
      <price> 199.95 </price>
    </item>
    <item>…</item>
  </itemlist>
  <total_cost> 429.85 </total_cost>
  ….
</purchase_order>
Python Dictionary
Semi-Structured Data
https://www.digitalocean.com/community/tutorials/an-introduction-to-json
XML: extensible markup language; uses tags to mark up information.
<users>
  <user>
    <username>SammyShark</username> <location>Indian Ocean</location>
  </user>
  <user>
    <username>JesseOctopus</username> <location>Pacific Ocean</location>
  </user>
  <user>
    <username>DrewSquid</username> <location>Atlantic Ocean</location>
  </user>
  <user>
    <username>JamieMantisShrimp</username> <location>Pacific Ocean</location>
  </user>
</users>
JSON: JavaScript Object Notation; a collection of key:value pairs.
{"users": [
  {"username" : "SammyShark", "location" : "Indian Ocean"},
  {"username" : "JesseOctopus", "location" : "Pacific Ocean"},
  {"username" : "DrewSquid", "location" : "Atlantic Ocean"},
  {"username" : "JamieMantisShrimp", "location" : "Pacific Ocean"}
]}
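Both notations carry the same records. A small Python sketch, using only the standard-library json and xml.etree.ElementTree modules and a shortened two-user copy of the data above, parses each form into a list of dictionaries:

```python
import json
import xml.etree.ElementTree as ET

xml_text = """
<users>
  <user><username>SammyShark</username><location>Indian Ocean</location></user>
  <user><username>JesseOctopus</username><location>Pacific Ocean</location></user>
</users>
"""
json_text = """
{"users": [
  {"username": "SammyShark", "location": "Indian Ocean"},
  {"username": "JesseOctopus", "location": "Pacific Ocean"}
]}
"""

# XML: walk the tree and turn each <user> element into a dictionary.
root = ET.fromstring(xml_text.strip())
from_xml = [{child.tag: child.text for child in user}
            for user in root.findall("user")]

# JSON: the parser already returns Python dictionaries and lists.
from_json = json.loads(json_text)["users"]

same_records = from_xml == from_json
print(same_records)
```

JSON maps directly onto Python dictionaries and lists, while the XML tree has to be walked element by element to reach the same structure.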
Levels of Abstraction
• Logical level: describes what data are stored, and the relationships
among the data. (Physical data independence.)
type instructor = record
ID : string;
name : string;
dept_name : string;
salary : integer;
end;
• View level: application programs hide details of data types. Views can
also hide information (such as an employee’s salary) for security
purposes.
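The view level can be illustrated with a SQL view that hides the salary attribute. This is a sketch only: it runs on SQLite through Python's sqlite3 module, and the view name instructor_public and the sample row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Logical level: the instructor record type from the slide, as a table.
CREATE TABLE instructor (ID TEXT, name TEXT, dept_name TEXT, salary INTEGER);
INSERT INTO instructor VALUES ('i1', 'Instructor One', 'Dept A', 90000);
-- View level: expose every attribute except salary.
CREATE VIEW instructor_public AS
    SELECT ID, name, dept_name FROM instructor;
""")

cur = conn.execute("SELECT * FROM instructor_public")
columns = [d[0] for d in cur.description]
rows = cur.fetchall()
print(columns, rows)
```

A user who is allowed to query only the view sees names and departments but never the salary column.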
Physical level: how to store variable-length records
(Chapter 13.2.2)
Physical level: Slotted Page Structure
(Chapter 13.2.2)
[Figure: slotted page layout: block header | free space ("grow towards here") | records]
§ Slotted page header contains (here, a page is just a block which can
be 4 to 8 kilobytes):
• number of record entries
• end of free space in the block
• location and size of each record
§ Records can be moved around within a page to keep them contiguous
with no empty space between them; entry in the header must be
updated.
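The header fields and the compaction step can be sketched as a toy Python class. The class name SlottedPage and its methods are hypothetical, and the sketch simplifies in one important way: the header is kept as a Python list instead of being stored inside the block itself.

```python
class SlottedPage:
    """Toy slotted page: records are packed at the end of the block."""

    def __init__(self, size=4096):
        self.data = bytearray(size)
        self.slots = []          # header: (offset, length) of each record
        self.free_end = size     # header: end of free space in the block

    def insert(self, record: bytes) -> int:
        """Place a record at the end of free space; return its slot number."""
        if len(record) > self.free_end:
            raise ValueError("not enough free space in this block")
        self.free_end -= len(record)
        self.data[self.free_end:self.free_end + len(record)] = record
        self.slots.append((self.free_end, len(record)))
        return len(self.slots) - 1

    def read(self, slot: int) -> bytes:
        offset, length = self.slots[slot]
        return bytes(self.data[offset:offset + length])

    def delete(self, slot: int) -> None:
        """Delete a record and move others so that no gap remains."""
        offset, length = self.slots[slot]
        # Records lying between free_end and the deleted one shift by `length`.
        self.data[self.free_end + length:offset + length] = \
            self.data[self.free_end:offset]
        self.slots[slot] = (-1, 0)   # mark the header entry as deleted
        # Update the header entries of the records that moved.
        self.slots = [(o + length, n) if 0 <= o < offset else (o, n)
                      for o, n in self.slots]
        self.free_end += length

page = SlottedPage(size=64)
a = page.insert(b"alpha")
b = page.insert(b"bravo!")
c = page.insert(b"charlie")
page.delete(b)
print(page.read(a), page.read(c), page.free_end)
```

Because callers hold slot numbers rather than byte offsets, the record stored in slot c can move inside the page without any outside reference to it changing.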
View of Data
[Figure: the three levels of data abstraction, from top to bottom: view level, logical level, physical level.]
Instances and Schemas
§ The overall database design is called the database schema. The information stored in
the database at a particular moment is called an instance of the database.
• Similar to types and variables in programming languages.
• Database schema = variable declaration; database instance = value of a variable.
• struct: blueprint, object: instance
• class: blueprint, object: instance
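A short sketch of the distinction, assuming SQLite through Python's sqlite3 module (the helper names schema and instance and the sample row are hypothetical): the schema is fixed by the table definition, while the instance changes with every insert.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (ID TEXT, name TEXT, tot_cred INTEGER)")

def schema(conn):
    """Column names and declared types, as fixed by CREATE TABLE."""
    return [(row[1], row[2])
            for row in conn.execute("PRAGMA table_info(student)")]

def instance(conn):
    """The rows stored in the table at this moment."""
    return conn.execute("SELECT * FROM student ORDER BY ID").fetchall()

schema_before, instance_before = schema(conn), instance(conn)
conn.execute("INSERT INTO student VALUES ('s1', 'Student One', 30)")
schema_after, instance_after = schema(conn), instance(conn)

print(schema_before == schema_after)     # the schema is unchanged
print(instance_before, instance_after)   # the instance is not
```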
Instances and Schemas
§ Logical Schema – the overall logical structure of the database
• Example: The database consists of information about a set of
customers and accounts in a bank and the relationship between
them
§ Analogous to type information of a variable in a program
§ Physical schema – the overall physical structure of the database
Physical Data Independence
Data Definition Language (DDL)
Data Definition Language: MySQL Example
metadata example: MySQL
Data Manipulation Language (DML)
DML example: MySQL
SQL Query Language
Database Access from Application Program
§ SQL does not support actions such as input from users, output to
displays, or communication over the network.
§ Such computations and actions must be written in a host language, such
as C/C++, Java or Python, with embedded SQL queries that access the
data in the database.
§ Application programs are programs that are used to interact with the
database in this fashion.
Access databases in Python (details will be presented in chapter 5)
§ Many open source libraries allow Python to interact with a MySQL database.
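A minimal sketch of an application program in Python. To stay self-contained it uses the standard-library sqlite3 module as a stand-in for a MySQL driver; drivers that follow the Python DB-API (PEP 249) offer a similar connect, cursor, and execute pattern, though connection arguments and the placeholder style may differ. The function name instructors_in and the sample rows are hypothetical.

```python
import sqlite3

def instructors_in(conn, dept_name):
    """Host-language code: take an input value, run SQL, return the rows."""
    cur = conn.cursor()
    # The '?' placeholder passes the input as data, not as SQL text.
    cur.execute(
        "SELECT name, salary FROM instructor "
        "WHERE dept_name = ? ORDER BY name",
        (dept_name,))
    return cur.fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor (ID TEXT, name TEXT, dept_name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?, ?)",
    [("i1", "Instructor One", "Dept A", 90000),
     ("i2", "Instructor Two", "Dept B", 80000),
     ("i3", "Instructor Three", "Dept A", 70000)])
conn.commit()

result = instructors_in(conn, "Dept A")
for name, salary in result:   # output to the display is done by Python, not SQL
    print(f"{name}: {salary}")
```

SQL selects the data; reading the input and formatting the output are left to the host language.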
Database Design
Database Engine
§ A database system is partitioned into modules that deal with each of the
responsibilities of the overall system.
§ The functional components of a database system can be divided into
• The storage manager,
• The query processor component,
• The transaction management component.
Database Architecture
(Centralized/Shared-Memory)
§ Storage manager: the interface between the data stored on disk and the
queries submitted to the system.
• Buffer manager: fetches data from disk into main memory and decides
which data to keep cached in main memory.
• File manager: allocates space on disk storage and manages the data
structures used to store the data.
• Authorization and integrity manager: tests for the satisfaction of
integrity constraints and manages who can access what.
• Transaction manager: ensures that the database stays consistent and
that concurrent operations do not conflict.
§ Disk storage
• Data dictionary: metadata, e.g., the name of each relation and the
names of the attributes of each relation.
• Statistical data: statistics about the data, e.g., the number of tuples
in a relation r, the size of a tuple of relation r, the number of distinct
values that appear in relation r for attribute A, ……
• Indices: for fast retrieval of data.
§ Query processor
• DDL interpreter: DDL deals with the schema of the tables (creating and
deleting tables, adding and deleting columns). The interpreter records
these definitions in the data dictionary.
• DML compiler: DML covers selecting, updating, inserting, and deleting
data. The DML compiler translates DML into a low-cost evaluation plan,
which is then executed by the query evaluation engine.
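The evaluation plan chosen for a DML statement can be inspected directly. The sketch below assumes SQLite (through Python's sqlite3 module) and its EXPLAIN QUERY PLAN statement; MySQL has its own EXPLAIN with a different output format. The helper name plan and the index name are hypothetical, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor (ID TEXT, name TEXT, dept_name TEXT, salary INTEGER)")

def plan(conn, sql):
    """Return the textual steps of the plan SQLite chose for `sql`."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT name FROM instructor WHERE dept_name = 'Dept A'"
plan_without_index = plan(conn, query)

# DDL: an index gives the optimizer a cheaper alternative to choose from.
conn.execute("CREATE INDEX idx_instructor_dept ON instructor (dept_name)")
plan_with_index = plan(conn, query)

print(plan_without_index)   # a full scan of the table
print(plan_with_index)      # a search that uses the index
```

The query text is identical in both calls; only the plan changes, which is the sense in which a DML statement says what to retrieve and leaves the how to the query processor.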
Storage Manager
§ A program module that provides the interface between the low-level data stored
in the database and the application programs and queries submitted to the
system.
§ The storage manager is responsible for the following tasks:
• Interaction with the OS file manager
• Efficient storing, retrieving, and updating of data
§ Goal: to minimize the number of block transfers between the disk and memory by
maximizing the chance that a block is already in memory when it is required.
§ The storage manager components:
• Authorization and integrity manager
• Transaction manager
• File manager
• Buffer manager
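How a buffer manager reduces block transfers can be sketched with a small cache in Python. The least-recently-used (LRU) replacement policy is an assumption chosen for this illustration, not something stated on the slide, and the class name BufferManager and the dictionary standing in for the disk are hypothetical.

```python
from collections import OrderedDict

class BufferManager:
    """Keeps up to `capacity` blocks in memory; evicts the least recently used."""

    def __init__(self, read_block, capacity):
        self.read_block = read_block   # function: block number -> bytes ("disk")
        self.capacity = capacity
        self.pool = OrderedDict()      # block number -> contents, oldest first
        self.disk_reads = 0

    def get(self, block_no):
        if block_no in self.pool:
            self.pool.move_to_end(block_no)   # hit: now the most recently used
            return self.pool[block_no]
        self.disk_reads += 1                  # miss: one block transfer
        if len(self.pool) >= self.capacity:
            self.pool.popitem(last=False)     # evict the least recently used
        self.pool[block_no] = self.read_block(block_no)
        return self.pool[block_no]

disk = {n: f"block {n}".encode() for n in range(10)}
buf = BufferManager(disk.__getitem__, capacity=3)
for n in [1, 2, 1, 3, 1, 4, 2]:
    buf.get(n)
print(buf.disk_reads, "disk reads for 7 block requests")
```

Two of the seven requests are served from memory because block 1 is still in the pool when it is asked for again.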
Why do we want to minimize the number of block
transfers between the disk and memory?
1 ms = 1000000 ns
Storage Manager (Cont.)
Query Processor
Query Processing
[Figure: query processing. Labels: query, optimizer, statistics about data, execution plan, evaluation engine, data, output.]
Transaction Management
Database Architecture
(chapter 20)
§ Centralized databases
• One to a few cores, shared memory, run on a single computer.
§ Client-server
• One server machine executes work on behalf of multiple client
machines.
§ Parallel databases
• Many-core shared memory
• Shared disk
• Shared nothing (each node consists of a processor, memory, and one
or more disks; nodes communicate over a network)
§ Distributed databases
• Geographical distribution
• Schema/data heterogeneity
Database Applications
Two-tier and three-tier architectures
Database Administrator
A person who has central control over the system is called a database
administrator (DBA). Functions of a DBA include:
§ Schema definition
§ Storage structure and access-method definition
§ Schema and physical-organization modification
§ Granting of authorization for data access
§ Routine maintenance
§ Periodically backing up the database
§ Ensuring that enough free disk space is available for normal
operations, and upgrading disk space as required
§ Monitoring jobs running on the database
History of Database Systems
History of Database Systems (Cont.)
§ 1980s:
• Research relational prototypes evolve into commercial systems
§ SQL becomes industrial standard
• Parallel and distributed database systems
§ Wisconsin, IBM, Teradata
• Object-oriented database systems
§ 1990s:
• Large decision support and data-mining applications
• Large multi-terabyte data warehouses
• Emergence of Web commerce
History of Database Systems (Cont.)
§ 2000s
• Big data storage systems
§ Google BigTable, Yahoo PNuts, Amazon,
§ “NoSQL” systems.
• Big data analysis: beyond SQL
§ Map reduce and friends
§ 2010s
• SQL reloaded
§ SQL front end to Map Reduce systems
§ Massively parallel database systems
§ Multi-core main-memory databases
End of Chapter 1
Homework:
• Install MySQL on your laptop or desktop and create the university
database. Instructions are given in chapter1_appendix.pdf
• Optional: Think about and then read the solutions to Questions
1.3, 1.7, 1.10, 1.15 on Page 32 of the textbook.