Lectures Week1

This document provides an overview of the CS203 Database Systems course taught by Dr. Syed Asif Raza at FAST-NUCES, Karachi Campus. The course covers fundamental database concepts, principles, and techniques from a user perspective. It focuses on database functionality rather than implementation. The document outlines the instructor's background and research interests, course objectives, reference materials, grading scheme, and contact information. It also provides examples of database applications and compares manual, file-based, and database approaches to data management.

Uploaded by

Syed Asif Raza

CS203

Database Systems
Week 1

Dr. Syed Asif Raza, Assistant Professor


Department of Computer Science
FAST-NUCES, Karachi Campus
[email protected]

Courtesy:
Welcome!
About me
 PIMSAT, Karachi
 BS Information Technology ‘06

 SZABIST, Karachi
 MS Computer Science ‘12

 University of Science and Technology (UST), South Korea


 PhD Computer Science ’18

 Research Interests
 Software-Defined Networking (SDN), Network Function Virtualization (NFV),
Cloud Computing, Virtualization
 Network Security, Blockchain Technology, HPC/HTC

 Industry Experiences
 Korea Institute of Science and Technology Information (KISTI), S. Korea
 Fermi National Accelerator Lab (FNAL), USA
 NESCOM HQ, Islamabad,
 National Telecommunication Corporation (NTC) HQ, Islamabad
 NADRA RHQ, Pakistan
Overview
 This course provides students with the essential concepts, principles, and
techniques of modern database systems from a user perspective.

 Focuses on the functionalities that are offered by database systems and
not on the methods to implement them.

 The course teaches students the ability to develop a solution for a
real-world data management problem that requires the application of the
theories and practices developed in class.

 From a theoretical point of view, this course covers the essential
principles for the design, analysis, and use of computerized database
systems.

 The design and techniques of conceptual modeling, database modeling,
database system architecture, and user/program interfaces are presented
in a unified way.
Objectives and goals

 To understand the underlying concepts of databases and
database management systems (DBMS)
 To introduce the concept of relational data models
 To analyze and design a database application or
information system
 To gain experience with SQL
 To implement a database
Reference Books
 Textbook (or Laboratory Manual for Laboratory Courses):
 Ramez Elmasri & Shamkant B. Navathe, Database Systems,
Models, Languages, Design and Application Programming, 7th Edition,
2016.

 Reference Books
 Thomas Connolly, Carolyn Begg, Database Systems: A practical
approach to design, implementation and Management, 6th Edition,
2015.
 C.J. Date, An Introduction to Database Systems, 8th Edition, 2004
Grading/Assignments/Project
 Grade breakdown
 Term exams (1 & 2) 30%
 Assignments/Quiz 10%
 Project 10%
 Final exam 45%
 Class participation 5%
 All assignments must be submitted in LaTeX format
 Final Project & Term paper
 2~3 members per group
 Plagiarized work will be marked zero.
 Passing Marks = 50
 Class attendance should be >= 80%.
 No student will be admitted more than 10 minutes after the
start of class.
Contact and Course Logistics
 Instructor: Dr. Syed Asif Raza
 Email: [email protected]
 Contact Hours:
 2:00-3:30pm
 Monday & Tue.
 Office hours
 Office: Computer Science building, Basement 2, room #9

 Course Website
 https://slate.nu.edu.pk
 Check often for announcements
 Assignments/Projects
 Discussion/Help
Database Applications: Examples

 Supermarkets?
 Credit cards?
 Travel agents?
 Library?
 Insurance?
 University?
 Etc.
Manual filing systems
 Works well

 While the number of items to be stored is small

 Even with a large number of items, when they only need to be
stored and retrieved
File-based Systems

 Early attempt to computerize manual filing system

 Collection of application programs that perform
services for the end users (e.g. reports)

 Each program defines and manages its own data


File-based Systems
 Consider the DreamHome example for file-based
system:
 Sales department: responsible for selling and renting
properties

 Contract Department: responsible for handling lease
agreements
Sales department
 PropertyForRent
 propertyNo, street, city, type, rooms, ownerNo

 Client
 clientNo, fName, lName, telNo, prefType, maxRent

 PrivateOwner
 ownerNo, fName, lName, address, telNo
Lease Department
 Lease
 leaseNo, propertyNo, clientNo, rent, paymentMethod, deposit,
paid, rentStart, rentFinish, duration
 PropertyForRent
 propertyNo, street, city, postcode, type, rooms, rent
 Client
 clientNo, fName, lName, telNo, prefType
Limitations of file-based systems
 Separation and isolation of data
 Each program maintains its own data

 Users of one program may be unaware of potentially useful data held by
other programs
 For example: producing a list of all houses that match the
requirements of the clients needs data from more than one file
 Duplication of data
 Decentralized approach taken by each department

 Same data in different programs

 Waste of space

 Data dependence
 File structure is defined in program code

 Incompatible file formats


 Programs written in different languages

 New requirements need new programs
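
The data dependence problem can be sketched in a few lines. This is only an illustration under assumptions: the fixed-width layouts, field widths, and the sample record below are made up and are not part of the DreamHome case study. It shows what happens when one department changes its file format while another program still has the old structure written into its code.

```python
# Hypothetical fixed-width layout hard-coded inside the Sales program.
SALES_LAYOUT = [("propertyNo", 5), ("street", 15), ("city", 10)]

# Hypothetical layout used by the Contracts program, which widened "street".
CONTRACTS_LAYOUT = [("propertyNo", 5), ("street", 25), ("city", 10)]


def make_record(values, layout):
    """Pad each value to its field width to build one fixed-width line."""
    return "".join(str(values[name]).ljust(width) for name, width in layout)


def parse_record(line, layout):
    """Slice a fixed-width line into named fields according to a layout."""
    record, pos = {}, 0
    for name, width in layout:
        record[name] = line[pos:pos + width].strip()
        pos += width
    return record


# Made-up sample record.
row = {"propertyNo": "PA14", "street": "16 Holhead", "city": "Aberdeen"}

# Reading a file with the layout it was written in works.
sales_view = parse_record(make_record(row, SALES_LAYOUT), SALES_LAYOUT)

# Reading the Contracts file with the Sales layout silently misreads it:
# the city field now falls inside the padding of the wider street field.
misread = parse_record(make_record(row, CONTRACTS_LAYOUT), SALES_LAYOUT)

print(sales_view)
print(misread)
```

Because the structure lives in each program rather than alongside the data, every program that touches the file has to be changed whenever the format changes.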


Database approach!

 Definition of data is embedded in application programs

 No separate or independent storage of data
 No control over access and manipulation of data beyond
that imposed by the application programs

Result!
Database Management System (DBMS)
What Is a Database System?
Basic Definitions
 Database:
 A collection of related data.

 Data:
 Known facts that can be recorded and have an implicit meaning.

 Mini-world:
 Some part of the real world about which data is stored in a
database. For example, student grades and transcripts at a
university.
 Database Management System (DBMS):
 A software package/ system to facilitate the creation and
maintenance of a computerized database.
 Database System:
 The DBMS software together with the data itself. Sometimes, the
applications are also included.
Database: What
 Database
 is a collection of related data and its metadata organized in a structured
format
 for optimized information management

 Database Management System (DBMS)


 is software that enables easy creation, access, and modification of
databases
 for efficient and effective database management

 Database System
 is an integrated system of hardware, software, people, procedures, and data
 that define and regulate the collection, storage, management, and use of
data within a database environment
Database: Why
 Purpose of Database
 Optimizes data management
 Transforms data into information
 Importance of Database Design
 Defines the database’s expected use
 different approach needed for different types of databases
 Avoid data redundancy & ensure data integrity
 data is accurate and verifiable
 Poorly designed database generates errors
 leads to bad decisions
 can lead to failure of organization

 Functions of DBMS/Database System


 Stores data and related data entry forms, report definitions, etc.
 Hides the complexities of relational database model from the user
 facilitates the construction/definition of data elements and their relationships
 enables data transformation and presentation
 Enforces data integrity
 Implements data security management
 access, privacy, backup & restoration
Database: How
 Planning & Analysis
 Assess
 Goal of the organization
 Database environment
 existing hardware, software, raw data, data processing procedures
 Identify
 Database needs
 what database can do to further the goal of the organization
 User needs and characteristics
 who the users are, what they want to do, how they envision doing it
 Database system requirements
 what the database system should do to satisfy the database and user needs
 Design
 From conceptual design to a detailed system specification

 Implementation
 Create the database

 Maintenance
 Troubleshoot, update, streamline the database
Business Rules
 What
 Brief, precise, and unambiguous descriptions of operations in an
organization
 based on policies, procedures, or principles within a specific organization
 help to create and enforce actions within that organization’s environment
 apply to any organization that stores and uses data to generate information
 Why
 Enhance understanding & facilitate communication
 Standardize company’s view of data
 Constitute a communications tool between users and designers
 Allow designer to understand business process as well as the nature, role, and
scope of data
 Promote creation of an accurate data model

 How (sources)
 Interviews
 Company managers
 Policy makers
 Department managers
 End users
 Written documentation
 Procedures, Standards, Operations manuals
 Observation
 Business operations
Database: User-centered
 Perspective
 The user is always right. If there is a problem with the use of the system,
the system is the problem, not the user.

 Compliance
 The user has the right to a system that performs exactly as promised.

 Instruction
 The user has the right to easy-to-use instructions (user guides, online or
contextual help, error messages) for understanding and utilizing a system to
achieve desired goals and recover efficiently and gracefully from problem
situations.

 Usability
 The user should be the master of software and hardware technology, not
vice-versa. Products should be natural and intuitive to use.
Database: Data Models
 Importance
 Abstraction of complex real-world data structures in relatively simple
(graphical) representations
 Facilitate interaction among the designer, the applications
programmer, and the end user

 Basic Building Blocks


 Entity
 thing about which data are to be collected and stored
 Attribute
 a characteristic of an entity
 Relationship
 describes an association among entities
 Constraint
 restrictions placed on the data
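
A minimal sketch of the four building blocks, using Python's built-in sqlite3 module. The table names, column names, and sample rows are assumptions chosen for illustration; they do not come from the course material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE department (                 -- entity
    dept_id INTEGER PRIMARY KEY,          -- attribute
    name    TEXT NOT NULL                 -- attribute
);
CREATE TABLE student (                    -- entity
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    gpa        REAL CHECK (gpa BETWEEN 0.0 AND 4.0),    -- constraint
    major_dept INTEGER REFERENCES department(dept_id)   -- relationship
);
""")
conn.execute("INSERT INTO department VALUES (1, 'Computer Science')")
conn.execute("INSERT INTO student VALUES (100, 'Ayesha', 3.4, 1)")


def try_insert(row):
    """Try to insert a student row and report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


bad_gpa = try_insert((101, "Bilal", 5.0, 1))    # violates the CHECK constraint
bad_dept = try_insert((102, "Sara", 3.0, 99))   # refers to a missing department
student_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]

print(bad_gpa, bad_dept, student_count)
```

The two rejected rows show constraints doing their job: the restriction is stated once in the schema, not in every program that inserts data.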
Evolution of Data Models

 Timeline

[Figure: timeline from the 1960s to 2000+ covering the File-based, Hierarchical, Network, Relational, Entity-Relationship, Object-oriented, and Web-based models]
Simplified database system environment
Database Management System
- manages interaction between end users and database

Database Systems: Design, Implementation, & Management: Rob & Coronel


Database System Environment

 Hardware
 Software
- OS
- DBMS
- Applications
 People
 Procedures
 Data



Typical DBMS Functionality
 Define a particular database in terms of its data types,
structures, and constraints
 Construct or Load the initial database contents on a
secondary storage medium
 Manipulating the database:
 Retrieval: Querying, generating reports
 Modification: Insertions, deletions and updates to its content
 Accessing the database through Web applications
 Processing and Sharing by a set of concurrent users and
application programs – yet, keeping all data valid and
consistent
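
The define, construct, and manipulate steps can be sketched with Python's built-in sqlite3 module. The course table and its rows are assumptions made up for this example (only CS203 appears in these slides).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Define: data types, structure, and constraints.
conn.execute("""
    CREATE TABLE course (
        cid     TEXT PRIMARY KEY,
        cname   TEXT NOT NULL,
        credits INTEGER CHECK (credits > 0)
    )
""")

# Construct: load the initial contents.
conn.executemany(
    "INSERT INTO course VALUES (?, ?, ?)",
    [("CS203", "Database Systems", 3),
     ("CS101", "Programming", 4),
     ("MT101", "Calculus", 3)],
)

# Manipulate (retrieval): query the data.
three_credit = [row[0] for row in conn.execute(
    "SELECT cid FROM course WHERE credits = 3 ORDER BY cid")]

# Manipulate (modification): update and delete.
conn.execute("UPDATE course SET credits = 4 WHERE cid = 'MT101'")
conn.execute("DELETE FROM course WHERE cid = 'CS101'")
remaining = conn.execute(
    "SELECT cid, credits FROM course ORDER BY cid").fetchall()

print(three_credit)
print(remaining)
```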
Typical DBMS Functionality
 Other features:
 Protection or Security measures to prevent unauthorized
access
 “Active” processing to take internal actions on data
 Presentation and Visualization of data
 Maintaining the database and associated programs over the
lifetime of the database application
 Called database, software, and system maintenance
Example of a Database
(with a Conceptual Data Model)
 Mini-world for the example:
 Part of a UNIVERSITY environment.
 Some mini-world entities:
 STUDENTs
 COURSEs
 SECTIONs (of COURSEs)
 (academic) DEPARTMENTs
 INSTRUCTORs
Example of a Database
(with a Conceptual Data Model)
 Some mini-world relationships:
 SECTIONs are of specific COURSEs

 STUDENTs take SECTIONs

 COURSEs have prerequisite COURSEs

 INSTRUCTORs teach SECTIONs

 COURSEs are offered by DEPARTMENTs

 STUDENTs major in DEPARTMENTs

 Note: The above entities and relationships are typically
expressed in a conceptual data model, such as the
ENTITY-RELATIONSHIP data model (see later)
Example of a simple database
Main Characteristics of the Database
Approach
 Self-describing nature of a database system:
 A DBMS catalog stores the description of a particular
database (e.g. data structures, types, and constraints)
 The description is called meta-data.
 This allows the DBMS software to work with different database
applications.
 Insulation between programs and data:
 Called program-data independence.
 Allows changing data structures and storage organization
without having to change the DBMS access programs.
Example of a simplified database catalog
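
As a rough illustration of a self-describing database, SQLite keeps its catalog in a table named sqlite_master and exposes column descriptions through PRAGMA table_info. The student table and the describe helper below are assumptions made up for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (sid TEXT PRIMARY KEY, name TEXT NOT NULL, gpa REAL)")

# The catalog is itself queryable: list the tables stored in this database.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]


def describe(connection, table):
    """Return (column name, declared type) pairs read from the catalog.

    The table name is interpolated directly, so pass trusted names only.
    """
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2])
            for row in connection.execute(f"PRAGMA table_info({table})")]


columns = describe(conn, "student")

print(tables)
print(columns)
```

The describe helper knows nothing about students. It works for any table because the description is stored with the data, which is the point of the meta-data bullet above.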
Main Characteristics of the Database
Approach (continued)
 Data Abstraction:
 A data model is used to hide storage details and present the
users with a conceptual view of the database.
 Programs refer to the data model constructs rather than data
storage details
 Support of multiple views of the data:
 Each user may see a different view of the database, which
describes only the data of interest to that user.
Main Characteristics of the Database
Approach (continued)
 Sharing of data and multi-user transaction
processing:
 Allowing a set of concurrent users to retrieve from and to
update the database.
 Concurrency control within the DBMS guarantees that each
transaction is correctly executed or aborted
 Recovery subsystem ensures each completed transaction has
its effect permanently recorded in the database
 OLTP (Online Transaction Processing) is a major part of
database applications. This allows hundreds of concurrent
transactions to execute per second.
Describing Data: Data Models
 A data model is a collection of concepts for
describing data.

 A schema is a description of a particular collection of
data, using a given data model.

 The relational model of data is the most widely used
model today.
 Main concept: relation, basically a table with rows and
columns.
 Every relation has a schema, which describes the
columns, or fields.
Levels of Abstraction
 Views describe how users
see the data.

 Conceptual schema
defines logical structure

 Physical schema describes
the files and indexes used.

 (sometimes called the
ANSI/SPARC model)

[Figure: Users at the top; View 1, View 2, View 3; Conceptual Schema; Physical Schema; DB at the bottom]
Example: University Database
 Conceptual schema:
 Students(sid: string, name: string, login: string, age: integer, gpa: real)
 Courses(cid: string, cname: string, credits: integer)
 Enrolled(sid: string, cid: string, grade: string)
 External Schema (View):
 Course_info(cid: string, enrollment: integer)
 Physical schema:
 Relations stored as unordered files.
 Index on first column of Students.
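
One possible way to write the three levels of this example down, sketched with Python's built-in sqlite3 module. The relation and column names follow the slide; the index name, the view definition, the SQLite type choices, and all sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual schema: the three relations.
CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL);
CREATE TABLE Courses  (cid TEXT, cname TEXT, credits INTEGER);
CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

-- Physical schema: an index on the first column of Students.
CREATE INDEX idx_students_sid ON Students(sid);

-- External schema: a view that exposes only course id and enrollment count.
CREATE VIEW Course_info AS
    SELECT c.cid AS cid, COUNT(e.sid) AS enrollment
    FROM Courses c LEFT JOIN Enrolled e ON e.cid = c.cid
    GROUP BY c.cid;
""")

# Made-up sample data.
conn.executemany("INSERT INTO Students VALUES (?, ?, ?, ?, ?)", [
    ("s1", "Ayesha", "ayesha@cs", 20, 3.4),
    ("s2", "Bilal", "bilal@cs", 21, 2.9),
])
conn.executemany("INSERT INTO Courses VALUES (?, ?, ?)", [
    ("CS101", "Programming", 4),
    ("CS203", "Database Systems", 3),
    ("MT101", "Calculus", 3),
])
conn.executemany("INSERT INTO Enrolled VALUES (?, ?, ?)", [
    ("s1", "CS203", "A"), ("s2", "CS203", "B"), ("s1", "CS101", "A"),
])

# A user of the external schema sees enrollment counts, not student rows.
course_info = conn.execute(
    "SELECT cid, enrollment FROM Course_info ORDER BY cid").fetchall()
print(course_info)
```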
Data Independence
 Applications insulated from
how data is structured and
stored.
 Logical data independence:
Protection from changes in
logical structure of data.
 Physical data independence:
Protection from changes in
physical structure of data.

 Q: Why are these particularly
important for DBMS?
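
A small sketch of both kinds of independence, using Python's built-in sqlite3 module. The view name Student_names, the split into Student_core and Student_grades, and the sample rows are assumptions made up for illustration. The application only ever runs APP_QUERY against the view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid TEXT, name TEXT, gpa REAL)")
conn.executemany("INSERT INTO Students VALUES (?, ?, ?)",
                 [("s1", "Ayesha", 3.4), ("s2", "Bilal", 2.9)])
conn.execute("CREATE VIEW Student_names AS SELECT sid, name FROM Students")

# The application's query never changes in this example.
APP_QUERY = "SELECT name FROM Student_names ORDER BY name"
before = conn.execute(APP_QUERY).fetchall()

# Physical change: add an index. Storage and access paths change, results do not.
conn.execute("CREATE INDEX idx_students_name ON Students(name)")
after_physical = conn.execute(APP_QUERY).fetchall()

# Logical change: split Students into two relations and redefine the view.
conn.executescript("""
DROP VIEW Student_names;
CREATE TABLE Student_core   (sid TEXT, name TEXT);
CREATE TABLE Student_grades (sid TEXT, gpa REAL);
INSERT INTO Student_core   SELECT sid, name FROM Students;
INSERT INTO Student_grades SELECT sid, gpa  FROM Students;
DROP TABLE Students;
CREATE VIEW Student_names AS SELECT sid, name FROM Student_core;
""")
after_logical = conn.execute(APP_QUERY).fetchall()

print(before, after_physical, after_logical)
```

In both cases the schema changed underneath the application, and the application code did not have to.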
Queries, Query Plans, and Operators
SELECT eid, ename, title
FROM Emp E
WHERE E.sal > $50K
[Plan: Select over Emp]

SELECT E.loc, AVG(E.sal)
FROM Emp E
GROUP BY E.loc
HAVING Count(*) > 5
[Plan: Having over Group(agg) over Emp]

SELECT COUNT(DISTINCT E.eid)
FROM Emp E, Proj P, Asgn A
WHERE E.eid = A.eid
AND P.pid = A.pid
AND E.loc <> P.loc
[Plan: Count distinct over Join over Join, reading Proj, Emp, Asgn]

 System handles query plan
generation & optimization;
ensures correct execution.

[Relations: Employees (Emp), Projects (Proj), Assignments (Asgn)]

• Issues: view reconciliation, operator ordering, physical operator
choice, memory management, access path (index) use, …
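
The three example queries on this slide can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the slide gives only the table and column names used in the queries, so the pname column, the column types, and every data row below are made up, and the salary threshold is written as 50000.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Emp  (eid INTEGER, ename TEXT, title TEXT, sal INTEGER, loc TEXT);
CREATE TABLE Proj (pid INTEGER, pname TEXT, loc TEXT);
CREATE TABLE Asgn (eid INTEGER, pid INTEGER);
""")

# Six employees in Karachi (salaries 45000..70000) and two in Lahore.
emps = [(i, f"emp{i}", "Engineer", 40000 + 5000 * i, "Karachi")
        for i in range(1, 7)]
emps += [(7, "emp7", "Manager", 90000, "Lahore"),
         (8, "emp8", "Analyst", 30000, "Lahore")]
conn.executemany("INSERT INTO Emp VALUES (?, ?, ?, ?, ?)", emps)
conn.executemany("INSERT INTO Proj VALUES (?, ?, ?)",
                 [(1, "P1", "Karachi"), (2, "P2", "Lahore"), (3, "P3", "Lahore")])
conn.executemany("INSERT INTO Asgn VALUES (?, ?)",
                 [(1, 1), (2, 2), (2, 3), (7, 1), (7, 2)])

# Query 1: selection.
high_paid = [row[0] for row in conn.execute(
    "SELECT eid, ename, title FROM Emp E WHERE E.sal > 50000 ORDER BY eid")]

# Query 2: grouping with an aggregate and a HAVING filter.
big_locations = conn.execute("""
    SELECT E.loc, AVG(E.sal) FROM Emp E
    GROUP BY E.loc HAVING COUNT(*) > 5
""").fetchall()

# Query 3: two joins, counting employees assigned to a project elsewhere.
away_count = conn.execute("""
    SELECT COUNT(DISTINCT E.eid)
    FROM Emp E, Proj P, Asgn A
    WHERE E.eid = A.eid AND P.pid = A.pid AND E.loc <> P.loc
""").fetchone()[0]

print(high_paid, big_locations, away_count)
```

Nothing in the queries says which join runs first or whether an index is used. Those decisions are left to the system, as the slide notes.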
Concurrency Control
 Concurrent execution of user programs: key to good DBMS
performance.
 Disk accesses frequent, pretty slow

 Keep the CPU working on several programs concurrently.

 Interleaving actions of different programs: trouble!


 e.g., account-transfer & print statement at same time

 DBMS ensures such problems don’t arise.


 Users/programmers can pretend they are using a
single-user system. (called “Isolation”)
 Thank goodness! Don’t have to program “very, very
carefully”.
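
The account-transfer and print-statement problem can be simulated without a DBMS. The sketch below is a toy: the account names, balances, and the generator-based interleaving are assumptions made up for illustration, and no real concurrency control is involved.

```python
def transfer_steps(accounts, src, dst, amount):
    """Perform a transfer one action at a time.

    Yielding after each action lets the caller interleave other work,
    the way a scheduler would interleave two programs.
    """
    accounts[src] -= amount
    yield "debited"
    accounts[dst] += amount
    yield "credited"


def statement_total(accounts):
    """The 'print statement' program: report the total across all accounts."""
    return sum(accounts.values())


# Serial schedule: the transfer finishes before the statement runs.
serial = {"A": 100, "B": 50}
for _ in transfer_steps(serial, "A", "B", 30):
    pass
serial_total = statement_total(serial)

# Interleaved schedule: the statement runs between the debit and the credit.
interleaved = {"A": 100, "B": 50}
steps = transfer_steps(interleaved, "A", "B", 30)
next(steps)                                   # debit done, credit still pending
interleaved_total = statement_total(interleaved)
for _ in steps:                               # let the transfer finish
    pass
final_total = statement_total(interleaved)

print(serial_total, interleaved_total, final_total)
```

The interleaved statement reports money that has left one account and not yet arrived in the other. Isolation means the DBMS prevents a program from ever observing that intermediate state.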
Transactions: ACID Properties
 Key concept is a transaction: a sequence of database actions
(reads/writes).

 DBMS ensures atomicity (all-or-nothing property) even if system
crashes in the middle of a Xact.
 Each transaction, executed completely, must take the DB between
consistent states or must not run at all.
 DBMS ensures that concurrent transactions appear to run in isolation.
 DBMS ensures durability of committed Xacts even if system crashes.

 Note: can specify simple integrity constraints on the data. The DBMS
enforces these.
 Beyond this, the DBMS does not understand the semantics of the
data.
 Ensuring that a single transaction (run alone) preserves
consistency is largely the user’s responsibility!
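
Atomicity and a simple integrity constraint can be seen together in a short sketch using Python's built-in sqlite3 module. The account table, the balances, and the transfer helper are assumptions made up for this example. The CHECK clause is the kind of simple constraint the DBMS enforces on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account (
        name    TEXT PRIMARY KEY,
        balance INTEGER CHECK (balance >= 0)   -- simple integrity constraint
    )
""")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(connection, src, dst, amount):
    """Move amount from src to dst as one transaction; True if it committed."""
    try:
        # "with connection" commits on success and rolls back if an
        # exception escapes the block.
        with connection:
            connection.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            connection.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "A", "B", 30)        # both updates are applied
failed = transfer(conn, "A", "B", 500)   # the debit violates the CHECK
balances = dict(conn.execute("SELECT name, balance FROM account"))

print(ok, failed, balances)
```

In the failed transfer the credit to B had already been executed when the debit was rejected, and the rollback removed it again: all or nothing.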
Ensuring Transaction Properties
 DBMS ensures atomicity (all-or-nothing property) even if
system crashes in the middle of a Xact.
 DBMS ensures durability of committed Xacts even if system
crashes.
 Idea: Keep a log (history) of all actions carried out by the
DBMS while executing a set of Xacts:
 Before a change is made to the database, the corresponding
log entry is forced to a safe location.
 After a crash, the effects of partially executed transactions are
undone using the log. Effects of committed transactions are
redone using the log.
 trickier than it sounds!
The Log

 The following actions are recorded in the log:


 Ti writes an object: the old value and the new value.

 Log record must go to disk before the changed page!


 Ti commits/aborts: a log record indicating this action.
 Log is often duplexed and archived on “stable” storage.
 All log related activities (and in fact, all concurrency control
related activities such as lock/unlock, dealing with deadlocks
etc.) are handled transparently by the DBMS.
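
A toy version of log-based recovery, to make the undo and redo idea concrete. The record format, the recover function, and the sample values are assumptions made up for illustration. Real recovery managers are considerably more involved, and this is not the algorithm of any particular DBMS.

```python
def recover(disk, log):
    """Rebuild a consistent state from the log after a crash.

    First redo every logged write in order, then undo, in reverse order,
    the writes of transactions that have no commit record.
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    db = dict(disk)

    for rec in log:                          # redo phase
        if rec[0] == "write":
            _, txn, key, old, new = rec
            db[key] = new

    for rec in reversed(log):                # undo phase
        if rec[0] == "write" and rec[1] not in committed:
            _, txn, key, old, new = rec
            db[key] = old

    return db


# Each write record carries the old value and the new value.
log = [
    ("write", "T1", "A", 100, 70),
    ("write", "T1", "B", 50, 80),
    ("commit", "T1"),
    ("write", "T2", "A", 70, 20),            # T2 had not committed at the crash
]

# Disk at crash time: T2's uncommitted write to A reached disk,
# while T1's committed write to B did not.
disk_at_crash = {"A": 20, "B": 50}

recovered = recover(disk_at_crash, log)
print(recovered)
```

The sketch only works because every write was logged before it could reach disk, which is why the log record must be forced out before the changed page.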
Structure of a DBMS
 A typical DBMS has a
layered architecture.
 The figure does not show
the concurrency control and
recovery components.
 Each database system has
its own variations.

[Figure: layers from top to bottom: Query Optimization and Execution; Relational Operators; Files and Access Methods; Buffer Management; Disk Space Management; DB. Side note: these layers must consider concurrency control and recovery.]
Advantages of a DBMS

 Data independence
 Efficient data access
 Data integrity & security
 Data administration
 Concurrent access, crash recovery
 Reduced application development time
 So why not use them always?
 Expensive/complicated to set up & maintain

 This cost & complexity must be offset by need

 General-purpose, not suited for special-purpose tasks (e.g. text
search!)
Databases make these folks happy ...
 DBMS vendors, programmers
 Oracle, IBM, MS, Sybase, …

 End users in many fields


 Business, education, science, …

 DB application programmers
 Build enterprise applications on top of DBMSs

 Build web services that run off DBMSs

 Database administrators (DBAs)


 Design logical/physical schemas

 Handle security and authorization

 Data availability, crash recovery

 Database tuning as needs evolve

…must understand how a DBMS works


Summary (part 1)

 DBMS used to maintain, query large datasets.


 can manipulate data and exploit semantics
 Other benefits include:
 recovery from system crashes,
 concurrent access,
 quick application development,
 data integrity and security.
 Levels of abstraction provide data independence
 Key when d(app)/dt << d(platform)/dt, i.e. when applications change
much more slowly than the underlying platform
Summary, cont.
 DBAs, DB developers the
bedrock of the information
economy

 DBMS R&D represents a broad,
fundamental branch of the science
of computation
