Lecture1-Introduction

The document is an introduction to databases and database management systems (DBMS), explaining the importance of data management in various fields. It outlines the functionalities of a DBMS, including data creation, protection, and analysis, as well as the need for efficient data handling in applications like online book purchasing platforms. The course will cover both user and system perspectives on DBMS, including practical projects to build applications that utilize databases.

Uploaded by

dogiathuyasd18
Copyright
© © All Rights Reserved

Instructor: Krystian Wojtkiewicz
School of Computer Science and Engineering
International University, VNU-HCMC

Lecture 1: INTRODUCTION
Blackboard IU
MS Teams
Please check frequently for updates!
ACKNOWLEDGEMENT
The following slides are referenced from Dr. Sudeepa Roy, Duke University.
Learn about “databases” or data management…
Data are individual facts, statistics, or items of information, often numeric. In a more technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects, while a datum is a single value of a single variable.
https://en.wikipedia.org/
▪ In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage.
https://en.wikipedia.org/
WHAT IS A DATABASE MANAGEMENT SYSTEM (DBMS)?
▪ A database management system (DBMS) is the software that interacts with end users, applications, and the database itself to capture and analyze the data. The DBMS makes it possible for end users to create, protect, read, update and delete data in a database.
https://en.wikipedia.org/
Why do we care about data?
“… The three years of gathering and analyzing data culminated in what U.S. Sailing calls their “Rio Weather Playbook,” a body of critical information about each of the seven courses only available to the U.S. team…”
(FiveThirtyEight, “Will Data Help U.S. Sailing Get Back On The Olympic Podium?”, Aug 15, 2016)
Data = Money, Information, Power, Fun
in Science, Business, Politics, Security, Sports, Education, …
Wait... don’t we need to take a Machine Learning or Statistics course for those things?
Yes, but... we also need to manage this (huge or not-so-huge) data!
• E.g., your own version of a book purchase platform (like a mini-Amazon)
• Large data! (think about all the books in the world, or even just those in English)
• How do we start?

* You are going to do something similar in the course project!


▪ At least two types of users:
  ▪ Database admin (assuming they own all copies of all the books)
  ▪ Users who purchase books
▪ Let’s proceed with these two only

▪ Other people:
  ▪ Sellers
  ▪ HR
  ▪ Finance
  ▪ Warehouse handlers
  ▪ …
▪ i.e., what does the interface look like? (think about Amazon)

1. Search for books
   • With author, title, topic, price range, …
2. Purchase books
3. Bookmark/add to wishlist
1. Return the books that match a search (e.g., by author)
2. Check that the payment method is valid
3. Update the no. of copies as books are sold
4. Keep track of the total money it has
5. Add new books as they are published
6. …
▪ Should be able to handle a large amount of data
▪ Should be efficient and easy to use (e.g., search with authors as well as title)
▪ If there is a crash or loss of power, information should not be lost or inconsistent
  ▪ Imagine a crash happens in the middle of a user’s transaction: the user has paid, but the book has not been purchased
▪ No surprises with multiple users logged in at the same time
  ▪ Imagine one last copy of a book that two users are trying to purchase at the same time
▪ Easy to update and program
  ▪ For the admin
How about C++, Java, or Python?
On data stored in large files
James Morgan#Durham, NC
... ...
A tale of two cities#Charles Dickens#3.50#7
To Kill a Mockingbird#Harper Lee#7.20#1
Les Miserables#Victor Hugo#12.80#2
... ...

• Text files – for books, customers, …
• Books listed with title, author, price, and no. of copies
• Fields separated by #’s
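As a minimal sketch of what "fields separated by #'s" means for a program, the snippet below splits one line of the books file into typed fields. The `Book` class and the `parse_book` function are hypothetical names introduced only for illustration; the sketch also assumes that no title or author contains a `#`.

```python
from dataclasses import dataclass


@dataclass
class Book:
    # Field order follows the slide: title, author, price, no. of copies.
    title: str
    author: str
    price: float
    copies: int


def parse_book(line: str) -> Book:
    """Split one '#'-separated line of the books file into typed fields."""
    title, author, price, copies = line.rstrip("\n").split("#")
    return Book(title, author, float(price), int(copies))


book = parse_book("To Kill a Mockingbird#Harper Lee#7.20#1")
print(book.title, book.author, book.price, book.copies)
```

Even this small step already pushes work onto the programmer: the file itself does not say which field is which or what type it has.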
James Morgan#Durham, NC
... ...
A tale of two cities#Charles Dickens#3.50#7
To Kill a Mockingbird#Harper Lee#7.20#1
Les Miserables#Victor Hugo#12.80#2
... ...

• James Morgan wants to buy “To Kill a Mockingbird”
• A simple script
  • Scan through the books file
  • Look for the line containing “To Kill a Mockingbird”
  • Check if the no. of copies is >= 1
  • Bill James $7.20 and reduce the no. of copies by 1
• Better idea than scanning? Binary search! Keep the file sorted on titles
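The simple scan script could look roughly like the sketch below. It is only one possible reading of the steps: `buy_by_title` is a hypothetical name, the match is an exact comparison on the title field, and the demo writes the sample records to a throwaway temporary file.

```python
import os
import tempfile


def buy_by_title(path: str, title: str) -> float:
    """Scan the books file, sell one copy of `title`, and return the price billed."""
    with open(path, encoding="utf-8") as f:
        lines = f.read().splitlines()

    for i, line in enumerate(lines):
        book_title, author, price, copies = line.split("#")
        if book_title == title:
            if int(copies) < 1:
                raise ValueError(f"{title!r} is out of stock")
            # Reduce the no. of copies by 1 for the matching record.
            lines[i] = "#".join([book_title, author, price, str(int(copies) - 1)])
            # The whole file is rewritten just to change one number.
            with open(path, "w", encoding="utf-8") as f:
                f.write("\n".join(lines) + "\n")
            return float(price)
    raise KeyError(f"{title!r} not found")


# Demo on a throwaway copy of the sample data.
demo_path = os.path.join(tempfile.mkdtemp(), "books.txt")
with open(demo_path, "w", encoding="utf-8") as f:
    f.write(
        "A tale of two cities#Charles Dickens#3.50#7\n"
        "To Kill a Mockingbird#Harper Lee#7.20#1\n"
        "Les Miserables#Victor Hugo#12.80#2\n"
    )

billed = buy_by_title(demo_path, "To Kill a Mockingbird")
print(billed)
```

Note what the sketch does not handle: a crash between reading and rewriting the file, or two buyers running the script at the same moment.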
What if he changes the “query” and wants to buy a book by Victor Hugo?
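A small sketch of why sorting on titles only helps one kind of query. It assumes the records are already loaded into memory as tuples (a simplification; the slides talk about a file on disk), and `find_by_title` and `find_by_author` are hypothetical helper names.

```python
from bisect import bisect_left

# Records kept sorted on title, as suggested for binary search.
books = sorted([
    ("A tale of two cities", "Charles Dickens", 3.50, 7),
    ("To Kill a Mockingbird", "Harper Lee", 7.20, 1),
    ("Les Miserables", "Victor Hugo", 12.80, 2),
])
titles = [b[0] for b in books]


def find_by_title(title):
    """Binary search on the sorted titles: O(log n) comparisons."""
    i = bisect_left(titles, title)
    if i < len(titles) and titles[i] == title:
        return books[i]
    return None


def find_by_author(author):
    """The sort order on titles does not help here: every record is checked."""
    return [b for b in books if b[1] == author]


print(find_by_title("To Kill a Mockingbird"))
print(find_by_author("Victor Hugo"))
```

Supporting both queries efficiently would need a second sorted copy or a separate lookup structure per field, all of which must be kept in sync by hand on every update.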
• Should be able to handle a large amount of data
• Should be efficient and easy to use (e.g., search with authors as well as title)
• If there is a crash or loss of power, information should not be lost or inconsistent
  • Imagine a crash happens in the middle of a user’s transaction: the user has paid, but the book has not been purchased
• No surprises with multiple users logged in at the same time
  • Imagine one last copy of a book that two users are trying to purchase at the same time
• Easy to update and program
  • For the admin
• Try to open a 10-100 GB file
• Try to search on both (author and title) in a large flat file
• Imagine a programmer’s task
• Imagine adding a new book or updating Copies (+ allow search) on a 10-100 GB text file
• DBMS = Database Management System

In an easy-to-code, efficient, and robust way
• Should be able to handle a large amount of data
• Should be efficient and easy to use (e.g., search with authors as well as title)
• If there is a crash or loss of power, information should not be lost or inconsistent
  • Imagine a crash happens in the middle of a user’s transaction: the user has paid, but the book has not been purchased
• No surprises with multiple users logged in at the same time
  • Imagine one last copy of a book that two users are trying to purchase at the same time
• Easy to update and program
  • For the admin
* We will learn these in the course!
Note: The answer is not always the “standard” DBMS (called a Relational DBMS), but we need to know the pros and cons of all the alternatives
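To make the "last copy" and "paid but not purchased" scenarios concrete, here is a hedged sketch using SQLite through Python's standard `sqlite3` module. This is an illustrative choice, not necessarily the DBMS used in the course. The `books` and `sales` tables and the `buy` function are hypothetical, and a single in-memory connection cannot show real concurrency; the point is only the shape of the code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books (title TEXT PRIMARY KEY, author TEXT, price REAL, copies INTEGER)"
)
conn.execute("CREATE TABLE sales (buyer TEXT, title TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?, ?)",
    [
        ("A tale of two cities", "Charles Dickens", 3.50, 7),
        ("To Kill a Mockingbird", "Harper Lee", 7.20, 1),
        ("Les Miserables", "Victor Hugo", 12.80, 2),
    ],
)
conn.commit()


def buy(conn, buyer, title):
    """Sell one copy and record the sale as a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        # The stock check and the decrement happen in one statement,
        # so a copy is only taken if at least one is left.
        cur = conn.execute(
            "UPDATE books SET copies = copies - 1 WHERE title = ? AND copies >= 1",
            (title,),
        )
        if cur.rowcount == 0:
            return False  # unknown title or no copies left
        price = conn.execute(
            "SELECT price FROM books WHERE title = ?", (title,)
        ).fetchone()[0]
        conn.execute("INSERT INTO sales VALUES (?, ?, ?)", (buyer, title, price))
        return True


first = buy(conn, "James Morgan", "To Kill a Mockingbird")
second = buy(conn, "Another user", "To Kill a Mockingbird")
print(first, second)
```

The stock update and the sales record are committed together or not at all, and the second attempt on the last copy is refused. Compare this with the flat-file script, where the programmer would have to build all of that by hand.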
▪ How can a user use a DBMS? (programmer’s/designer’s perspective)
  ▪ Run queries, update data (SQL, Relational Algebra)
  ▪ Design a good database (ER diagram, normalization)
  ▪ Use different types of data (mostly relational, also XML/JSON)
▪ How does a DBMS work? (system’s or admin’s perspective)
  ▪ Storage, index
  ▪ Query processing, join algorithms, query optimizations
  ▪ Transactions: recovery and concurrency control
▪ Glimpse of advanced topics and other DBMSs
  ▪ NoSQL, Spark (big data)
  ▪ Data mining
▪ Hands-on experience in class projects by building an end-to-end website or an app that runs on a database
Thank you for your attention!
