Lecture1-Introduction
MS Teams
▪ In computing, a database is an organized
collection of data stored and accessed
electronically. Small databases can be stored
on a file system, while large databases are
hosted on computer clusters or cloud storage.
https://en.wikipedia.org/
WHAT IS A DATABASE
MANAGEMENT SYSTEM (DBMS)?
▪ A database management system (DBMS) is
the software that interacts with end users,
applications, and the database itself to
capture and analyze the data. The DBMS
makes it possible for end users to create,
protect, read, update and delete data in a
database.
https://en.wikipedia.org/
Why do we care about data?
… The three years of
gathering and analyzing
data culminated in what
U.S. Sailing calls their “Rio
Weather Playbook,” a body
of critical information
about each of the seven
courses only available to
the U.S. team…
— FiveThirtyEight, “Will Data
Help U.S. Sailing Get Back On
The Olympic Podium?”
Aug 15, 2016
Data = Money, Information, Power, Fun
in Science, Business, Politics, Security, Sports, Education, ….
Wait.. don’t we need to take a Machine Learning or
Stat course for those things?
Yes, but..
... we also need to manage this (huge or not-so-huge) data!
• E.g., your own version of a book purchase platform (like
a mini-Amazon)
• Large data! (think about all the books in the world, or
even just those in English)
• How do we start?
▪ Other people:
▪ Sellers
▪ HR
▪ Finance
▪ Warehouse handlers
▪ …
▪ i.e., what does the interface look like? (think about Amazon)
1. Return books matching a search by author
6. ….
▪ Should be able to handle a large amount of data
▪ Should be efficient and easy to use (e.g., search with authors as well as title)
▪ If there is a crash or loss of power, information should not be lost or inconsistent
▪ Imagine a crash in the middle of a transaction: the user has paid the
money, but the purchase of the book has not been recorded
▪ No surprises with multiple users logged in at the same time
▪ Imagine one last copy of a book that two users are trying to purchase at the same
time
▪ Easy to update and program
▪ For the admin
How about C++, Java, or Python?
On data stored in large files
James Morgan#Durham, NC
... ...
A tale of two cities#Charles Dickens#3.50#7
To Kill a Mockingbird#Harper Lee#7.20#1
Les Miserables#Victor Hugo#12.80#2
... ...
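To make the "plain files plus Python" option concrete, here is a minimal sketch that parses the `#`-separated book lines above and searches them by author. The field meanings (title, author, price, copies in stock) and the names `Book`, `parse_book`, and `search_by_author` are assumptions made for illustration; the slides do not define them.

```python
from dataclasses import dataclass

# Sample lines in the '#'-separated layout shown above.
# Assumed field order: title, author, price, copies in stock.
BOOK_LINES = [
    "A tale of two cities#Charles Dickens#3.50#7",
    "To Kill a Mockingbird#Harper Lee#7.20#1",
    "Les Miserables#Victor Hugo#12.80#2",
]


@dataclass
class Book:
    title: str
    author: str
    price: float
    copies: int


def parse_book(line: str) -> Book:
    """Split one line on '#' and convert the numeric fields."""
    title, author, price, copies = line.rstrip("\n").split("#")
    return Book(title, author, float(price), int(copies))


def search_by_author(lines, author):
    """Linear scan: every line is parsed and compared on each search."""
    wanted = author.lower()
    return [b for b in map(parse_book, lines) if b.author.lower() == wanted]


matches = search_by_author(BOOK_LINES, "Harper Lee")
print([b.title for b in matches])
```

Even this small version hints at the problems listed next: each search reads the whole file, a title containing `#` would break the parser, and nothing protects the file if the program crashes halfway through rewriting it.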
In an easy-to-code, efficient, and robust way
• Should be able to handle a large amount of data
• Should be efficient and easy to use (e.g., search with
authors as well as title)
• If there is a crash or loss of power, information should not be
lost or inconsistent
• Imagine a crash in the middle of a transaction: the user has
paid the money, but the purchase of the book has not been recorded
• No surprises with multiple users logged in at the same time
• Imagine one last copy of a book that two users are trying to
purchase at the same time
• Easy to update and program
• For the admin
* We will learn these in the course!
Note: The answer is not always the “standard” DBMS (called a Relational DBMS),
but we need to know the pros and cons of all the alternatives
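As a rough illustration of the "last copy" and "crash in the middle of a purchase" requirements, the sketch below uses Python's built-in `sqlite3` module as a stand-in for a relational DBMS. The table layout and the `purchase` helper are hypothetical, and a single in-memory connection only imitates two users buying one after the other; real concurrency control is a later topic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT PRIMARY KEY, price REAL, copies INTEGER)")
conn.execute("CREATE TABLE orders (buyer TEXT, title TEXT)")
conn.execute("INSERT INTO books VALUES ('To Kill a Mockingbird', 7.20, 1)")
conn.commit()


def purchase(conn, buyer, title):
    """Decrement stock and record the order as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            cur = conn.execute(
                "UPDATE books SET copies = copies - 1 "
                "WHERE title = ? AND copies > 0",
                (title,),
            )
            if cur.rowcount == 0:
                # No row was updated, so no copy was available.
                raise LookupError("out of stock")
            conn.execute("INSERT INTO orders VALUES (?, ?)", (buyer, title))
        return True
    except LookupError:
        return False


first = purchase(conn, "alice", "To Kill a Mockingbird")
second = purchase(conn, "bob", "To Kill a Mockingbird")
print(first, second)
```

The stock update and the order record either both take effect or neither does, which is the guarantee that is hard to get from hand-written file code.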
▪ How can a user use a DBMS (programmer’s/designer’s
perspective)
▪ Run queries, update data (SQL, Relational Algebra)
▪ Design a good database (ER diagram, normalization)
▪ Use different types of data (Mostly relational, also
XML/JSON)
▪ How does a DBMS work (system’s or admin’s perspective)
▪ Storage, index
▪ Query processing, join algorithms, query optimizations
▪ Transactions: recovery and concurrency control
▪ Glimpse of advanced topics and other DBMS
▪ NoSQL, Spark (big data)
▪ Data mining
▪ Hands-on experience in class projects by building an
end-to-end website or an app that runs on a database
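For a first taste of "run queries, update data" from the programmer's side, here is a small sketch using Python's built-in `sqlite3` module. The `books` table, its columns, and the index name are assumptions chosen to match the earlier book example, not a schema prescribed by the course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books (title TEXT, author TEXT, price REAL, copies INTEGER)"
)
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?, ?)",
    [
        ("A tale of two cities", "Charles Dickens", 3.50, 7),
        ("To Kill a Mockingbird", "Harper Lee", 7.20, 1),
        ("Les Miserables", "Victor Hugo", 12.80, 2),
    ],
)
# An index lets the DBMS find rows by author without scanning the whole table.
conn.execute("CREATE INDEX idx_books_author ON books(author)")

# Query: declarative search by author.
rows = conn.execute(
    "SELECT title, price FROM books WHERE author = ? ORDER BY title",
    ("Victor Hugo",),
).fetchall()
print(rows)

# Update: restock one title.
conn.execute(
    "UPDATE books SET copies = copies + 10 WHERE title = ?",
    ("To Kill a Mockingbird",),
)
conn.commit()
restocked = conn.execute(
    "SELECT copies FROM books WHERE title = ?", ("To Kill a Mockingbird",)
).fetchone()[0]
print(restocked)
```

The query states what is wanted, and the DBMS decides how to find it; storage, indexes, and query processing from the system's perspective explain what happens underneath.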
Thank you for your attention!