CS-220 Database Systems (Fall 21) : Muneer Ahmad

This document provides an overview of the CS-220 Database Systems course for Fall 2021. It introduces the instructor, Muneer Ahmad, and lists some key course details. These include required textbooks, the class policy on attendance and grading breakdown, and lecture objectives to understand database systems and their importance. It also provides a brief introduction to databases and database management systems, explaining what they are and some recent trends in large-scale databases like YouTube, self-driving cars, and ATMs.

Uploaded by

Ramish Saeed

Lecture 1
CS-220 Database Systems (Fall 21)
BESE-11B

Lecture Outline
▸ Class Introduction
▸ Class Policies
▸ What is a Database?
▸ Recent Trends in Database

About the Instructor
Muneer Ahmad
Associate Professor in Computer Science
PhD (Universiti Teknologi PETRONAS, UTP, Malaysia)
▸ Research interests
▹ Data Science, Machine Learning, Signal Processing, Bioinformatics, Applied Research
▸ Consultation Timing
▹ 12:00 noon - 1:00 pm, Monday
▸ Email
[email protected]

Course Books
▸ Text Book:
▹ Silberschatz, Korth and Sudarshan (2010): Database System Concepts 6/E, McGraw-Hill
▸ Reference Books:
▹ Hoffer, Prescott, and McFadden (2008): Modern Database Management 9/E, Prentice Hall
▹ Ramakrishnan and Gehrke (2003): Database Management Systems 3/E, McGraw-Hill

Reading the books is essential for understanding the lectures.

Class Policy - Lecture Resources
▸ LMS:
▹ Course Outline (the structure of the course and the project details will be explained soon)
▹ Lectures
▹ Assignments
▹ Labs/Lab Submissions
▸ Qalam:
▹ Attendance (strict rules: missing two consecutive classes can result in a warning; provide justification with solid proof for any missed session)
▹ Grading

Class Policy - Course Breakup
▸ Tentative Breakup
▸ Lecture (3 Credit Hrs./Week, 75%)
▹ Quizzes (announced in advance, so no retakes): 10%
▹ Assignments (submission mostly on LMS): 10%
▹ OHT-1 & 2: 20% each
▹ Final Exam: 40%
▸ Lab (3 Hrs./Week, 25%)
▹ Lab Tasks (75%)
▹ Lab Project (30%)

Zero tolerance for plagiarism

Lecture Objectives
▸ To understand database systems and their importance in modern-day applications.
▸ To know your course

What is a Database?

Data - An Asset
▸ Data is a collection of raw facts and figures.
▸ Data is important because useful findings can be extracted from it.
▸ When data is processed to extract useful findings, it becomes information.

Data → Processing → Information

▸ Data is an important asset for an organization.
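
The data → processing → information step above can be illustrated with a minimal Python sketch. The marks, the student labels, and the pass threshold are all hypothetical values chosen for illustration; they do not come from the lecture.

```python
from statistics import mean

# Raw data: hypothetical exam marks, one record per student (assumed values).
raw_marks = [
    {"student": "A", "marks": 72},
    {"student": "B", "marks": 45},
    {"student": "C", "marks": 88},
    {"student": "D", "marks": 39},
]

PASS_MARK = 50  # assumed threshold, not from the lecture


def summarize(records):
    """Process raw records into findings a decision-maker can use."""
    scores = [r["marks"] for r in records]
    passed = [r["student"] for r in records if r["marks"] >= PASS_MARK]
    return {
        "average": mean(scores),
        "highest": max(scores),
        "pass_rate": len(passed) / len(records),
        "passed": passed,
    }


info = summarize(raw_marks)
print(info)
```

The list of records is the data; the dictionary returned by `summarize` is the information extracted from it.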

File Processing System
▸ Before databases, companies maintained their records through computer-based file processing systems.
▸ Data was stored in files on disks and tapes.
▸ Each department had its own set of files, not related/linked to other departments' files.
▸ This system has many drawbacks.

File Processing System Drawbacks
▸ Data redundancy and inconsistency
▹ Multiple file formats, duplication of information in different files
▸ Difficulty in accessing data
▹ Need to write a new program to carry out each new task
▸ Data isolation: multiple files and formats
▸ Integrity problems
▹ Integrity constraints (e.g., account balance > 0) become "buried" in program code rather than being stated explicitly
▹ Hard to add new constraints or change existing ones
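
As a contrast to constraints buried in program code, the sketch below states the slide's example constraint (account balance > 0) explicitly in a table definition, using Python's standard `sqlite3` module. The table name, column names, and the `try_update` helper are hypothetical and only serve the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The constraint is declared once, in the schema, instead of being
# re-implemented inside every program that touches the data.
conn.execute(
    """
    CREATE TABLE account (
        id      INTEGER PRIMARY KEY,
        owner   TEXT NOT NULL,
        balance INTEGER NOT NULL CHECK (balance > 0)
    )
    """
)
conn.execute("INSERT INTO account (owner, balance) VALUES ('alice', 100)")
conn.commit()


def try_update(new_balance):
    """Return True if the database accepts the new balance, False otherwise."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "UPDATE account SET balance = ? WHERE owner = 'alice'",
                (new_balance,),
            )
        return True
    except sqlite3.IntegrityError:
        # Raised by SQLite when the CHECK constraint is violated.
        return False


accepted = try_update(40)    # satisfies balance > 0
rejected = try_update(-10)   # violates balance > 0
balance = conn.execute(
    "SELECT balance FROM account WHERE owner = 'alice'"
).fetchone()[0]
print(accepted, rejected, balance)
```

Changing the rule later means altering one table definition rather than hunting through application programs.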

File Processing System Drawbacks (continued)
▸ Atomicity of updates
▹ Failures may leave the database in an inconsistent state with partial updates carried out
▹ Example: a transfer of funds from one account to another should either complete or not happen at all
▸ Concurrent access by multiple users
▹ Concurrent access is needed for performance
▹ Uncontrolled concurrent accesses can lead to inconsistencies
▹ Example: two people reading a balance (say 100) and updating it by withdrawing money (say 50 each) at the same time
▸ Security problems
▹ Hard to provide user access to some, but not all, data

Database systems offer solutions to all the above problems.
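
The atomicity example above (a funds transfer that must either complete or not happen at all) can be sketched with a transaction in Python's standard `sqlite3` module. The account names, the starting balances, and the `SimulatedCrash` exception are hypothetical; the crash is injected by hand to stand in for a real failure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 20)])
conn.commit()


class SimulatedCrash(Exception):
    """Stands in for a failure between the two updates (assumption)."""


def transfer(src, dst, amount, crash_midway=False):
    # `with conn` commits if the block finishes and rolls back if it raises,
    # so the debit and the credit are applied together or not at all.
    with conn:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )
        if crash_midway:
            raise SimulatedCrash("failure after debit, before credit")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )


def balances():
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))


try:
    transfer("A", "B", 30, crash_midway=True)
except SimulatedCrash:
    pass
after_crash = balances()      # the partial debit was rolled back

transfer("A", "B", 30)
after_success = balances()    # both updates were applied
print(after_crash, after_success)
```

In a plain file processing system, the equivalent of the rollback would have to be written by hand in every program that updates the files.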

Database
▸ A database is an organized collection of related data that is stored in an efficient and compact manner.
▸ Database Applications:
▹ Banking: transactions
▹ Airlines: reservations, schedules
▹ Universities: registration, grades
▹ Human resources: employee records, salaries, tax deductions
▹ Etc.
▸ Databases can be very large and can touch all aspects of our lives.

Database Management System - DBMS
▸ A DBMS or Database System comprises:
▹ A collection of interrelated data (database)
▹ A set of programs to access the data (tools for managing data)
▹ An environment that is both convenient and efficient to use (interface to access and manage the database)
▸ A few important concepts/features/components related to database systems are:
▹ Data Abstraction
▹ Instances & Schemas
▹ Data Independence
▹ Data Models
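
One of the concepts listed above, instances and schemas, can be made concrete with a small sketch using Python's standard `sqlite3` module: the schema is the declared structure of a table, while an instance is the set of rows it holds at a given moment. The `student` table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema: the structure of the data, declared once.
conn.execute(
    "CREATE TABLE student (reg_no TEXT PRIMARY KEY, name TEXT NOT NULL, cgpa REAL)"
)


def schema():
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]


def instance():
    # Instance: the actual rows stored at this moment.
    return conn.execute(
        "SELECT reg_no, name, cgpa FROM student ORDER BY reg_no"
    ).fetchall()


schema_before = schema()
instance_t0 = instance()      # empty instance right after creation

conn.execute("INSERT INTO student VALUES ('001', 'Ali', 3.4)")
conn.execute("INSERT INTO student VALUES ('002', 'Sara', 3.8)")
conn.commit()

instance_t1 = instance()      # a different instance of the same schema
schema_after = schema()
print(schema_after, instance_t1)
```

The instance changes with every insert, while the schema stays the same until someone deliberately alters the table.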

Recent Trends in Database

Which things are there? How are they different from a database?
▸ Difference b/w database & data structure
▹ What about data structures from algorithms class?
▹ Spreadsheets? Text files?
▸ So many design choices these days?
▹ SQL? Big data? (MapReduce, Hadoop, Spark)
▹ Learning systems? (TensorFlow, deep learning)

Recent Trends
○ Petascale data systems
○ Building new data systems, products
○ Scaled to billions of consumers, billions of ad $s, millions of web publishers, trillions of data rows, million QPS systems
○ E.g., AdSense, Search, Dremel/BigQuery, Gmail/Google Apps, Sitemaps, Warp, Google Maps, etc.
○ Let's explore some of the recent examples from the perspective of the database system

Example: YouTube DB
[Diagram: Unpack YouTube DB]
▸ Read: result list, video & description, # views, likes
▸ Learn: related videos, relevant ads
▸ Modify: upload, like, review

Every minute on the Internet

Example: Self Driving Cars
[Diagram: Unpack Cars DB]
▸ Read: front panel metrics, speed, distance, ETA
▸ Learn / Modify: collisions, traffic signals, road models, drive models

Example: Unpack ATM DB
Transaction: two possible orderings of the same steps
▹ Read Balance → Give money → Update Balance
vs
▹ Read Balance → Update Balance → Give money
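
To see why the ordering and isolation of these steps matter, the sketch below first replays, deterministically, two withdrawals of 50 that both read a balance of 100 before either one writes (the numbers come from the concurrent-access example earlier in the lecture). It then shows one possible remedy using Python's standard `sqlite3` module: check and update the balance in a single statement, and hand out money only if that statement succeeded. The function names, the table layout, and the no-overdraft rule are assumptions for illustration, not the lecture's prescribed design.

```python
import sqlite3


def unsafe_interleaving(start=100, amount=50):
    """Replay two withdrawals that both read before either one writes."""
    balance = start
    read_1 = balance            # user 1 reads the balance
    read_2 = balance            # user 2 reads the same, now stale, balance
    balance = read_1 - amount   # user 1 writes
    balance = read_2 - amount   # user 2 overwrites: one update is lost
    cash_given = 2 * amount     # both users still received money
    return balance, cash_given


def safe_withdraw(conn, account_id, amount):
    """Update the balance first; give money only if the update succeeded."""
    with conn:
        cur = conn.execute(
            "UPDATE account SET balance = balance - ? "
            "WHERE id = ? AND balance >= ?",   # assumed rule: no overdraft
            (amount, account_id, amount),
        )
        return cur.rowcount == 1   # True means it is safe to dispense cash


lost_update = unsafe_interleaving()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()

results = [safe_withdraw(conn, 1, 50) for _ in range(3)]
final_balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
print(lost_update, results, final_balance)
```

In the unsafe replay the stored balance ends at 50 even though 100 was paid out; in the second version the third withdrawal is refused because the balance has already reached 0.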

Goals of Databases
▸ Platform to store, manage data
▸ Supporting: Scale, Speed, Stability, Evolution, Reliability, Cost, Operations efficiency
[Diagram: Any DB, with Read, Learn, and Modify operations]

Goals of Data systems
▸ Connect one/many DBs for custom system
▸ Tune for custom system (like a Lego block)
▹ Store current data (e.g., lot of reads)
▹ Optimize historical data (e.g., logs)
▹ Run batch workloads (e.g., training)

When to build a custom data system?

How?
Example: Mobile Game
[Diagram components: Game App, Real-Time User Events, DB (DBMS, "DB v0"), Report & Share, Business/Product Analysis]
▸ App designer
▹ Q1: 1000 users/sec?
▹ Q2: Offline?
▹ Q3: Support v1, v1' versions?
▸ Systems designer
▹ Q7: How to model/evolve game data?
▹ Q8: How to scale to millions of users?
▹ Q9: When machines die, restore game state gracefully?
▸ Product/Biz designer
▹ Q4: Which user cohorts?
▹ Q5: Next features to build? Experiments to run?
▹ Q6: Predict ads demand?

How?
Example: Mobile Game, data system "v1" on Cloud
[Diagram components: Game App, Real-Time User Events, DB, Data Processing (E-T-L, Dataflow), Analytics Engine (BigQuery), Data Exploration (Cloud Datalab), Report & Share, Business/Product Analysis]
1 Log user actions
2 Store in DB, after Extract-Transform-Load
3 Run queries in a petascale analytics system
4 Visualize query results
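
The four numbered steps can be mimicked at toy scale in Python. This is only a sketch under loud assumptions: an in-memory SQLite database stands in for both the cloud DB and the analytics engine, the JSON event format is invented for the example, and a text bar chart stands in for real visualization tooling.

```python
import json
import sqlite3

# Step 1: logged user actions, one JSON line per event (hypothetical format).
raw_log = [
    '{"user": "u1", "event": "level_up", "ts": 10}',
    '{"user": "u2", "event": "purchase", "ts": 12, "amount": 5}',
    'not json',   # a corrupt line that extraction should skip
    '{"user": "u1", "event": "purchase", "ts": 15, "amount": 3}',
]


def extract(lines):
    """Extract: parse each log line, skipping lines that are not valid JSON."""
    for line in lines:
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue


def transform(events):
    """Transform: normalize every event into a fixed-shape row."""
    for e in events:
        yield (e["user"], e["event"], int(e["ts"]), float(e.get("amount", 0)))


def load(conn, rows):
    """Load: store the cleaned rows in the database (step 2)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events "
        "(user TEXT, event TEXT, ts INTEGER, amount REAL)"
    )
    with conn:
        conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", rows)


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(raw_log)))

# Step 3: run an analytical query over the stored events.
revenue = conn.execute(
    "SELECT user, SUM(amount) FROM events "
    "WHERE event = 'purchase' GROUP BY user ORDER BY user"
).fetchall()

# Step 4: "visualize" the result as a text bar chart.
for user, total in revenue:
    print(f"{user:4} {'#' * int(total)} {total}")
```

Keeping extract, transform, and load as separate functions mirrors the separate boxes in the diagram, so each stage can be swapped out independently.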

How?
Example: Mobile Game, data system "v2" Cloud + Local
[Diagram components: Game App, Local DB, Real-Time User Events, Data Sync, DB, Data Processing (MySQL, Dataflow), Analytics Engine (BigQuery), Data Exploration (Cloud Datalab), Report & Share, Business/Product Analysis]
0 Log user actions in local DB
1 Data sync to cloud
2 Store in DB, after ETL
3 Run queries in a petabyte-scale analytics system
4 Visualize query results
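
Steps 0 and 1 of the "v2" design (log locally, then sync to the cloud) might look roughly like the sketch below. It assumes a single device, uses two in-memory SQLite databases to stand in for the local and cloud stores, and tracks progress with a hypothetical `synced` flag; a real system would also need to handle multiple devices, conflicts, and network failures.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events ("
        "id INTEGER PRIMARY KEY, user TEXT, event TEXT, "
        "synced INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


local = make_db()   # stands in for the on-device DB
cloud = make_db()   # stands in for the cloud DB


def log_locally(user, event):
    """Step 0: record the action on the device, even when offline."""
    with local:
        local.execute(
            "INSERT INTO events (user, event) VALUES (?, ?)", (user, event)
        )


def sync_to_cloud():
    """Step 1: copy rows not yet synced, then mark them as synced locally."""
    rows = local.execute(
        "SELECT id, user, event FROM events WHERE synced = 0 ORDER BY id"
    ).fetchall()
    if not rows:
        return 0
    with cloud:
        # INSERT OR IGNORE keeps a retried sync from duplicating rows.
        cloud.executemany(
            "INSERT OR IGNORE INTO events (id, user, event, synced) "
            "VALUES (?, ?, ?, 1)",
            rows,
        )
    with local:
        local.executemany(
            "UPDATE events SET synced = 1 WHERE id = ?", [(r[0],) for r in rows]
        )
    return len(rows)


log_locally("u1", "level_up")
log_locally("u1", "purchase")
first_sync = sync_to_cloud()
second_sync = sync_to_cloud()   # nothing new to send
log_locally("u1", "logout")
third_sync = sync_to_cloud()
cloud_count = cloud.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(first_sync, second_sync, third_sync, cloud_count)
```

Writing to the cloud before marking rows as synced means a failure between the two steps causes a harmless retry rather than lost events.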


Thank You!
