SQL Database
Data Science Bootcamp
Why should we learn databases?
Databases
Spreadsheets Dashboard
What can databases do?
Some databases also
have data visualization
features
Store Analyze Present
SQL is a fundamental skill for data analysts
SQL is easy to learn. Quick win!
Average salary in USA
Data Analyst with SQL Skills Salary | PayScale
Convert to THB
* 30 / 2.5
You can find your net
worth using this free tool
There are many versions of SQL
We’ll teach PostgreSQL too :)
SQLite stays close to
standard (pure) SQL
State of Data Science and Machine Learning 2020 | Kaggle
We have been using SQL for 50 years
Structured Query Language
Query (verb): to ask a question about our data
How do databases work?
Query
Result
Every company has databases
(99.99%)
What does a database look like?
You are already familiar with
database technology
Customers Sales Invoices Products
At the heart of SQL is selecting the data you want
Select columns
Filter rows
ER Diagram (Entity Relationship Diagram)
SQLite Tutorial - An Easy Way to Master SQLite Fast
Primary and foreign keys
Table: Albums
Column      Type
Albumid     Integer
Title       Text
Artistid    Integer

Table: Artists
Column      Type
Artistid    Integer
Name        Text
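The two tables above can be sketched with Python's built-in sqlite3 module. This is a minimal sketch, not the course's own setup: the sample rows ('Artist A', 'Album X') are made up for illustration, and it assumes Artistid in Albums is the foreign key pointing at Artists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE artists (
    artistid INTEGER PRIMARY KEY,   -- primary key
    name     TEXT
);
CREATE TABLE albums (
    albumid  INTEGER PRIMARY KEY,   -- primary key
    title    TEXT,
    artistid INTEGER REFERENCES artists (artistid)  -- foreign key
);
INSERT INTO artists VALUES (1, 'Artist A');   -- hypothetical sample row
INSERT INTO albums  VALUES (10, 'Album X', 1);
""")

# An album that points at a non-existent artist should be rejected.
try:
    conn.execute("INSERT INTO albums VALUES (11, 'Album Y', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
```

The foreign key is what lets us join Albums back to Artists later on.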
SQL clauses we use in our data analyst role
Clause                What does it do?
SELECT                Select columns
FROM                  Choose the table
JOIN                  Join multiple tables
WHERE                 Filter rows
Aggregate functions   AVG, SUM, MIN, MAX, COUNT
GROUP BY              Compute statistics by group
HAVING                Filter groups
ORDER BY              Sort data
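To see how the clauses fit together, here is a small sketch that uses all of them in one query. The customers and invoices tables and their rows are hypothetical sample data, created in memory with Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical sample data for illustration only
CREATE TABLE customers (customerid INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE invoices  (invoiceid INTEGER PRIMARY KEY, customerid INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'USA'), (2, 'USA'), (3, 'Canada'), (4, 'France');
INSERT INTO invoices VALUES
    (1, 1, 10.0), (2, 1, 20.0), (3, 2, 30.0),
    (4, 3, 5.0),  (5, 3, 15.0), (6, 4, 8.0);
""")

rows = conn.execute("""
SELECT A.country, COUNT(*) AS n_invoices, SUM(B.total) AS revenue  -- columns + aggregates
FROM customers A                                                   -- table
JOIN invoices B ON A.customerid = B.customerid                     -- join
WHERE B.total >= 8                                                 -- filter rows
GROUP BY A.country                                                 -- one row per country
HAVING SUM(B.total) > 10                                           -- filter groups
ORDER BY revenue DESC                                              -- sort
""").fetchall()

for row in rows:
    print(row)
```

WHERE removes individual invoices before grouping, while HAVING removes whole groups after the totals are computed.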
The first part will cover all the SQL basics :)
Select all columns
SELECT * FROM customers;
customers is the table name. Write SQL keywords in upper case and close the statement with ;
Select specific columns
SELECT
firstname,
lastname,
email,
country
FROM customers;
Choose specific columns
Filter rows with the WHERE clause
SELECT *
FROM customers
WHERE country = 'USA';
The condition is country = 'USA'
SELECT *
FROM customers
WHERE country IN ('USA', 'Canada', 'United Kingdom');
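The select and filter steps above can be tried end to end with a small in-memory database. This is a sketch under stated assumptions: the four customer rows are invented sample data, and only the column names come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical sample rows; column names follow the slides
CREATE TABLE customers (firstname TEXT, lastname TEXT, email TEXT, country TEXT);
INSERT INTO customers VALUES
    ('Ann', 'Lee', 'ann@example.com', 'USA'),
    ('Bob', 'Roy', 'bob@example.com', 'Canada'),
    ('Cat', 'Kim', 'cat@example.com', 'Brazil'),
    ('Dan', 'Fox', 'dan@example.com', 'United Kingdom');
""")

# Select specific columns and keep only rows that match one value.
usa = conn.execute("""
SELECT firstname, country
FROM customers
WHERE country = 'USA'
""").fetchall()

# IN keeps rows that match any value in the list.
multi = conn.execute("""
SELECT firstname
FROM customers
WHERE country IN ('USA', 'Canada', 'United Kingdom')
ORDER BY firstname
""").fetchall()

print(usa)
print(multi)
```

Note that string values use straight single quotes; curly quotes pasted from slides will cause a syntax error.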
Review join types
Inner Join, Left Join, Right Join, Full Join
Inner and Left Joins account for around 90-95% of our work
Inner join
Table 1          +  Table 2           =  Result Set
PK_ID  Name         FK_ID  Major         PK_ID  Name   Major
1      David        1      Econ          1      David  Econ
2      John         2      Econ          2      John   Econ
3      Mary         5      Data          5      Kevin  Data
4      Anna         12     Engineer
5      Kevin        35     Mkt
Left join
Table 1          +  Table 2           =  Result Set
PK_ID  Name         FK_ID  Major         PK_ID  Name   Major
1      David        1      Econ          1      David  Econ
2      John         2      Econ          2      John   Econ
3      Mary         5      Data          3      Mary   NULL
4      Anna         12     Engineer      4      Anna   NULL
5      Kevin        35     Mkt           5      Kevin  Data
Full join
Table 1          +  Table 2           =  Result Set
PK_ID  Name         FK_ID  Major         PK_ID  Name   Major
1      David        1      Econ          1      David  Econ
2      John         2      Econ          2      John   Econ
3      Mary         5      Data          3      Mary   NULL
4      Anna         12     Engineer      4      Anna   NULL
5      Kevin        35     Mkt           5      Kevin  Data
                                         12     NULL   Engineer
                                         35     NULL   Mkt
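The three result sets above can be reproduced with the sample rows from the slides. This is a sketch: the table names table1 and table2 are placeholders, and because older SQLite versions have no FULL JOIN, the full join here is emulated with a LEFT JOIN plus the unmatched rows of table2 (a swapped-in technique, not the FULL JOIN keyword itself).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (pk_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE table2 (fk_id INTEGER, major TEXT);
INSERT INTO table1 VALUES (1,'David'), (2,'John'), (3,'Mary'), (4,'Anna'), (5,'Kevin');
INSERT INTO table2 VALUES (1,'Econ'), (2,'Econ'), (5,'Data'), (12,'Engineer'), (35,'Mkt');
""")

# Inner join: only rows with a match in both tables.
inner = conn.execute("""
SELECT A.pk_id, A.name, B.major
FROM table1 A
JOIN table2 B ON A.pk_id = B.fk_id
ORDER BY 1
""").fetchall()

# Left join: every row of table1, NULL where table2 has no match.
left = conn.execute("""
SELECT A.pk_id, A.name, B.major
FROM table1 A
LEFT JOIN table2 B ON A.pk_id = B.fk_id
ORDER BY 1
""").fetchall()

# Emulated full join: the left join plus table2 rows with no match in table1.
full = conn.execute("""
SELECT A.pk_id, A.name, B.major
FROM table1 A
LEFT JOIN table2 B ON A.pk_id = B.fk_id
UNION ALL
SELECT B.fk_id, NULL, B.major
FROM table2 B
LEFT JOIN table1 A ON A.pk_id = B.fk_id
WHERE A.pk_id IS NULL
ORDER BY 1
""").fetchall()

print(len(inner), len(left), len(full))
```

Python shows SQL NULL as None, so the unmatched rows appear as (3, 'Mary', None) and (12, None, 'Engineer').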
Join example
This query joins two tables, customers and invoices:
SELECT A.*, B.*
FROM customers A
JOIN invoices B
ON A.customerid = B.customerid;
Join example
It's easy to join more than two tables (4-5 tables is quite normal):
SELECT
A.*,
B.*,
C.*,
D.*
FROM table1 A
JOIN table2 B ON A.id = B.id
JOIN table3 C ON B.id = C.id
JOIN table4 D ON C.id = D.id;
Warning! The following slides cover advanced
SQL topics. Be prepared, and we hope you enjoy them.
Repl.it
Repl.it online editor
Essential command lines
Command    What does it do?
.help      List all available commands
.open      Open a database file
.read      Read (run) a SQL script file
.mode      Change the output display mode
.header    Show column names in the terminal
.table     Show the names of the tables in the database
.schema    Show the schema of all tables
.import    Import a CSV file as a table
We can also run SQL in terminal
We call this “TERMINAL”
Join syntax
Alias or shorter name
SELECT A.*, B.*
FROM customers A
JOIN invoices B
ON A.customerid = B.customerid;
PK = FK
Join using
SELECT A.*, B.*
FROM customers A
JOIN invoices B
USING (customerid);
Use USING when the join column has
the same name in both tables
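A quick way to compare ON and USING is to run both against the same data. The sketch below uses hypothetical customers and invoices rows and SELECT * (rather than A.*, B.*) so the difference in output columns is visible: in SQLite, USING returns the shared column only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical sample data for illustration only
CREATE TABLE customers (customerid INTEGER PRIMARY KEY, firstname TEXT);
CREATE TABLE invoices  (invoiceid INTEGER PRIMARY KEY, customerid INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO invoices  VALUES (100, 1, 9.9), (101, 1, 5.0), (102, 2, 7.5);
""")

# Join with ON: PK = FK is spelled out.
on_cur = conn.execute("""
SELECT * FROM customers A
JOIN invoices B ON A.customerid = B.customerid
""")
on_cols = [d[0] for d in on_cur.description]
on_rows = on_cur.fetchall()

# Join with USING: works because both tables call the column customerid.
using_cur = conn.execute("""
SELECT * FROM customers A
JOIN invoices B USING (customerid)
""")
using_cols = [d[0] for d in using_cur.description]
using_rows = using_cur.fetchall()

print(on_cols)
print(using_cols)
```

Both queries match the same rows; only the column list differs.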
Join more than one column
SELECT A.*, B.*
FROM tableA A
JOIN tableB B
ON A.customerid = B.customerid
AND A.country = B.country;
Use AND to join on more than one column
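To see why the extra column matters, compare the row counts with one join key and with two. The rows in tableA and tableB below are invented for illustration; customer 1 deliberately appears under two countries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical sample data: customerid alone is not unique here
CREATE TABLE tableA (customerid INTEGER, country TEXT, name TEXT);
CREATE TABLE tableB (customerid INTEGER, country TEXT, amount INTEGER);
INSERT INTO tableA VALUES (1, 'USA', 'Ann'), (1, 'Canada', 'Ann CA'), (2, 'USA', 'Bob');
INSERT INTO tableB VALUES (1, 'USA', 100), (1, 'Canada', 50), (2, 'Canada', 70);
""")

# One key: every customerid match is paired, including mismatched countries.
one_key = conn.execute("""
SELECT COUNT(*)
FROM tableA A
JOIN tableB B
ON A.customerid = B.customerid
""").fetchone()[0]

# Two keys: a row pairs up only when customerid AND country both match.
two_keys = conn.execute("""
SELECT COUNT(*)
FROM tableA A
JOIN tableB B
ON A.customerid = B.customerid
AND A.country = B.country
""").fetchone()[0]

print(one_key, two_keys)
```

Joining on too few columns silently multiplies rows, which is a common source of inflated totals.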
Review CASE syntax
SELECT
CASE
WHEN company IS NULL THEN 'End Customers'
ELSE 'Corporate'
END AS segment
FROM customers;
WHEN takes the condition, THEN gives the value if true, ELSE gives the value if false
Case + Aggregate Functions
SELECT
CASE
WHEN company IS NULL THEN 'End Customers'
ELSE ‘Corporate’
END AS segment,
COUNT(*) AS N
FROM customers
GROUP BY 1;
Count customers in each segment
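The CASE plus COUNT pattern above can be checked on a tiny table. The five customer rows are hypothetical; the query is the one from the slide with an ORDER BY added so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical sample data: three customers without a company, two with one
CREATE TABLE customers (customerid INTEGER PRIMARY KEY, company TEXT);
INSERT INTO customers VALUES
    (1, NULL), (2, 'Acme'), (3, NULL), (4, 'Globex'), (5, NULL);
""")

segments = conn.execute("""
SELECT
    CASE
        WHEN company IS NULL THEN 'End Customers'
        ELSE 'Corporate'
    END AS segment,
    COUNT(*) AS N
FROM customers
GROUP BY 1      -- 1 refers to the first column in SELECT (segment)
ORDER BY 1
""").fetchall()

print(segments)
```

GROUP BY 1 saves repeating the whole CASE expression in the GROUP BY clause.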
Intro to subqueries
Outer Query
Inner Query
The inner query runs first
SELECT firstname, lastname, country FROM (
SELECT * FROM customers
WHERE country IN ('USA', 'United Kingdom', 'Canada')
)
ORDER BY 3 DESC;
The outer query runs afterwards, on the inner query's result
We often use subqueries in the WHERE clause
SELECT * FROM tracks
WHERE bytes = (
SELECT MAX(bytes) FROM tracks
);
Find max bytes
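The subquery above can be run step by step to see what each part returns. This sketch uses a hypothetical three-row tracks table; only the column name bytes comes from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical sample data for illustration only
CREATE TABLE tracks (trackid INTEGER PRIMARY KEY, name TEXT, bytes INTEGER);
INSERT INTO tracks VALUES (1, 'a', 100), (2, 'b', 300), (3, 'c', 200);
""")

# The inner query on its own returns a single value.
max_bytes = conn.execute("SELECT MAX(bytes) FROM tracks").fetchone()[0]

# The outer query uses that value as its filter.
largest = conn.execute("""
SELECT * FROM tracks
WHERE bytes = (
    SELECT MAX(bytes) FROM tracks
)
""").fetchall()

print(max_bytes)
print(largest)
```

The subquery keeps the filter correct even when the data changes, unlike a hard-coded number.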
Intro to window functions
Two things you need to know
about window functions
1. They go in the SELECT clause
2. They create new columns in
the result table
Aggregate vs. Window functions
Aggregate function: returns a single value
Window function: returns a column
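The difference is easy to see by running the same function both ways. This is a sketch with a hypothetical three-row invoices table, and it assumes the SQLite bundled with your Python is version 3.25 or newer, which is when window functions were added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical sample data for illustration only
CREATE TABLE invoices (invoiceid INTEGER PRIMARY KEY, total REAL);
INSERT INTO invoices VALUES (1, 10.0), (2, 20.0), (3, 30.0);
""")

# Aggregate function: three rows collapse into one value.
aggregate = conn.execute("SELECT AVG(total) FROM invoices").fetchall()

# Window function: all three rows stay, with the average as a new column.
window = conn.execute("""
SELECT invoiceid, total, AVG(total) OVER () AS avg_total
FROM invoices
ORDER BY invoiceid
""").fetchall()

print(aggregate)
print(window)
```

Keeping every row is what makes comparisons such as total versus avg_total possible in a single query.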
Window function syntax
WINDOW_FUNCTION( ) OVER( PARTITION BY … ORDER BY … )
Function Name Create window
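As a concrete instance of this syntax, the sketch below numbers customers within each country. The three customer rows are hypothetical, and it assumes SQLite 3.25 or newer for window-function support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical sample data for illustration only
CREATE TABLE customers (firstname TEXT, country TEXT);
INSERT INTO customers VALUES ('Ann', 'USA'), ('Bob', 'USA'), ('Cid', 'Canada');
""")

numbered = conn.execute("""
SELECT
    firstname,
    country,
    ROW_NUMBER() OVER (
        PARTITION BY country   -- restart the numbering for each country
        ORDER BY firstname     -- order rows inside each country
    ) AS rowNum
FROM customers
ORDER BY country, firstname
""").fetchall()

print(numbered)
```

PARTITION BY splits the rows into windows, and ORDER BY decides the order inside each window.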
The easiest window function
SELECT
firstname,
lastname,
ROW_NUMBER() OVER() AS rowNum
FROM customers;
Common window functions
Function        What does it do?
ROW_NUMBER()    Creates a row number column, running from 1 to n
RANK()          Creates a ranking column (leaves gaps after ties)
DENSE_RANK()    Creates a ranking column (no gaps after ties)
LAG()           Creates a column with the lagged value (t-1)
LEAD()          Creates a column with the lead value (t+1)
NTILE()         Creates a segment column that groups records into buckets
SUM() OVER()    Creates a running total column (with ORDER BY inside OVER)
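The functions in the table can be compared side by side in one query. This is a sketch on a hypothetical four-row sales table (two days tie on amount so RANK and DENSE_RANK differ), assuming SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical sample data: days 2 and 3 tie on amount
CREATE TABLE sales (day INTEGER PRIMARY KEY, amount INTEGER);
INSERT INTO sales VALUES (1, 10), (2, 30), (3, 30), (4, 20);
""")

rows = conn.execute("""
SELECT
    day,
    amount,
    ROW_NUMBER() OVER (ORDER BY day)         AS row_num,
    RANK()       OVER (ORDER BY amount DESC) AS rnk,
    DENSE_RANK() OVER (ORDER BY amount DESC) AS dense_rnk,
    LAG(amount)  OVER (ORDER BY day)         AS prev_amount,
    LEAD(amount) OVER (ORDER BY day)         AS next_amount,
    SUM(amount)  OVER (ORDER BY day)         AS running_total,
    NTILE(2)     OVER (ORDER BY day)         AS half
FROM sales
ORDER BY day
""").fetchall()

for row in rows:
    print(row)
```

LAG has no previous row on the first day and LEAD has no next row on the last day, so both return NULL (None in Python) there.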
Bootcamp Live 02
Intro to Databases with
Website: https://datarockie.com