Key terms
Structured Query Language: A language for querying and manipulating data.
Data Definition Language: Create and modify structures in the database (tables, views, indexes, etc.).
Data Manipulation Language: SELECT, INSERT, UPDATE, DELETE, etc. Used to store, modify, retrieve, delete and update data in the database.
Data Control Language: Rights, permissions and other controls of the database system.
Data: The information that is stored in the database.
Database: A collection of data.
Schema: The structure of the database: the tables, views, indexes, etc.
Table: A collection of data organized in rows and columns.
Row: A single record in a table.
Column: A single field in a table.
Primary Key: A unique identifier for a row in a table.
Foreign Key: A field in a table that is a primary key in another table.
View: A virtual table that is the result of a query.
Index: A data structure that improves the speed of data retrieval.
Query: A request for data or information from a database table or combination of tables.
Query Language: A language for requesting information from a database.
Postgres: A relational database management system.
Database Client: A program that allows you to connect to a Postgres database and run queries.
Database Server: A program that runs on a computer and manages the database.
Database Management System: A program that manages the database.
Database Administrator: A person who manages the database.
Database Developer: A person who creates and maintains the database.
Database User: A person who uses the database.

LEGEND
Easy: Intro to SQL, Querying Data, Filtering Data
Medium: Simple Functions, Conditional Expressions, Date Functions, Aggregate Functions, Joins, SQL Constraints
Hard: Subqueries, CTEs, Views, Window Functions

Intro to SQL, Creating Tables
CREATE TABLE population
(
    country VARCHAR(50),
    total_population INT
);

Intro to SQL, Best Practices
-- Standard SQL comment
/*
Multi Line comment
Line 1
Line 2
*/
CREATE TABLE population
(country VARCHAR(50), total_population INT);

Intro to SQL, Inserting Data
-- Values must be listed in the same order as the columns
INSERT INTO population
    (country, total_population)
VALUES
    ('China', 1425671352),
    ('United States', 339996564);

Intro to SQL, Selecting Data
SELECT country, total_population
FROM population;

SELECT *
FROM population;

Intro to SQL, Filtering Data
SELECT total_population
FROM population
WHERE country = 'China';

Intro to SQL, Updating Data
UPDATE population
SET total_population = 341234567
WHERE country = 'China';

Intro to SQL, Deleting Data
DELETE
FROM population
WHERE country = 'Russia';

Intro to SQL, Dropping Tables
DROP TABLE population;

Intro to SQL, Select Statement overview
SELECT Column Names
FROM Table Name
WHERE Condition
GROUP BY Column Names /* Groups rows using column values */
HAVING Condition /* Filter groups */
ORDER BY Column Names
LIMIT Number of Records;

Intro to SQL, Column Alias
SELECT UPPER(country) AS country_in_caps
FROM population;

Intro to SQL, SELECT DISTINCT
SELECT DISTINCT
    country, total_population
FROM population;

Intro to SQL, FROM
SELECT Column Names
FROM
    • Table Name
    • View Name
    • Temp Table Name
    • Join Conditions
    • Query Result

Intro to SQL, ORDER BY
SELECT continent, country, total_population
FROM population
ORDER BY
    continent ASC,
    country DESC;

Intro to SQL, LIMIT
SELECT *
FROM population
ORDER BY country ASC
LIMIT 2;

Intro to SQL, WHERE
Operator   Description
=          Equal
<          Less than
>          Greater than
<=         Less than or equal to
>=         Greater than or equal to
<>         Not equal
!=         Not equal
BETWEEN    Between the given range
LIKE       Like the given pattern
IN         Within the list of values given

SELECT total_population
FROM population
WHERE country = 'China';

SELECT country
FROM population
WHERE total_population > 1000000000;

SELECT total_population
FROM population
WHERE country != 'China';

Intro to SQL, WHERE – BETWEEN
SELECT *
FROM population
WHERE total_population
    BETWEEN 300000000 AND 400000000;

Intro to SQL, WHERE – LIKE
Wildcard         Description
%                Zero, one or more characters
_ (underscore)   Exactly one character

SELECT *
FROM population
WHERE country
    LIKE 'B%';

SELECT *
FROM population
WHERE country
    LIKE 'Br%';

SELECT *
FROM population
WHERE country
    LIKE '%ia';

SELECT *
FROM population
WHERE country
    LIKE 'In%ia';

SELECT *
FROM population
WHERE country
    LIKE 'In_ia';

SELECT *
FROM population
WHERE country
    LIKE '%i_';

Intro to SQL, WHERE – IN
SELECT *
FROM population
WHERE country
    IN ('India', 'China');

Intro to SQL, WHERE – Multiple Conditions
SELECT *
FROM population
WHERE continent = 'Asia'
    AND total_population > 250000000;

SELECT *
FROM population
WHERE continent = 'Americas'
    OR total_population > 100000000;

SELECT *
FROM population
WHERE continent = 'Asia'
    AND (total_population > 250000000
        OR total_population < 100000000);

Intro to SQL, WHERE – Negated Conditions
SELECT *
FROM population
WHERE country
    NOT IN ('India', 'China');

SELECT *
FROM population
WHERE country
    NOT LIKE '%ia';

Filters in UPDATE
UPDATE population
SET total_population = 3421234567
WHERE country LIKE 'United%';

Filters in DELETE
DELETE FROM population
WHERE total_population < 100000000;

String Functions - UPPER
SELECT UPPER('population');

SELECT UPPER(country)
FROM population
WHERE country IN ('India', 'China', 'Brazil');

String Functions - LOWER
SELECT LOWER('population');

String Functions - CONCAT
SELECT CONCAT('Steve', 'Jobs'); -- This outputs: SteveJobs
SELECT CONCAT('Steve', ' ', 'Jobs'); -- This outputs: Steve Jobs
SELECT 'Steve' || ' ' || 'Jobs'; -- This outputs: Steve Jobs

String Functions - REPLACE
SELECT REPLACE('SQL stuff', 'SQL', 'Python');
-- This outputs: Python stuff

String Functions - TRIM
SELECT TRIM('  SQL  ');
-- This outputs: SQL
SELECT TRIM('  SQL stuff  ');
-- This outputs: SQL stuff

String Functions – SUBSTR(string, start, length)
SELECT SUBSTR('SQL stuff', 1, 3);
-- This outputs: SQL
SELECT SUBSTR('SQL stuff', 5);
-- This outputs: stuff

String Functions – INSTR(String, Substring)
SELECT INSTR('SQL stuff', 'stuff');
-- This outputs: 5, the position where the substring starts

String Functions – LENGTH
SELECT LENGTH('SQL stuff');
-- This outputs: 9

Math Functions – ABS
SELECT ABS(-123);
-- This outputs: 123
SELECT ABS(456);
-- This outputs: 456

Math Functions – POWER
SELECT POWER(5, 2);
-- This outputs: 25

Math Functions – SQRT
SELECT SQRT(25);
-- This outputs: 5

Math Functions – ROUND
SELECT ROUND(123.45677, 2);
-- This outputs: 123.46

Math Functions – CEIL
SELECT CEIL(21.2);
-- This outputs: 22

Math Functions – MOD
SELECT MOD(27, 5);
-- This outputs: 2

Math Functions – FLOOR
SELECT FLOOR(21.2);
-- This outputs: 21

Conditional Functions - IIF
SELECT country, total_population,
    IIF(total_population > 250000, 'High', 'Medium') AS category
FROM population;

Simple CASE Expression
SELECT country, total_population,
    CASE country
        WHEN 'India' THEN 'Asia'
        WHEN 'China' THEN 'Asia'
        WHEN 'United States' THEN 'Americas'
        WHEN 'Nigeria' THEN 'Africa'
        WHEN 'Brazil' THEN 'Americas'
        WHEN 'Russia' THEN 'Europe'
        ELSE 'Unknown'
    END AS continent
FROM population;

Searched CASE Expression
-- A searched CASE has no operand after CASE; each WHEN holds a full condition.
-- The THEN labels are illustrative (they are missing in the source).
SELECT country, total_pop,
    CASE
        WHEN total_pop > 250 THEN 'High'
        WHEN total_pop > 150 THEN 'Medium'
        WHEN total_pop <= 150 THEN 'Low'
    END AS category
FROM population;

What is NULL - Filtering
SELECT * FROM population
WHERE total_population IS NULL;

SELECT * FROM population
WHERE total_population IS NOT NULL;

What is NULL - Insertion
INSERT INTO population
    (country, continent, total_population)
VALUES
    ('China', 'Asia', NULL);

What is NULL - UPDATE
UPDATE population
SET continent = NULL
WHERE country = 'Russia';

Conditional Functions - IFNULL
SELECT IFNULL(NULL, 'Test'); -- This will output 'Test'
SELECT IFNULL('First', 'Test'); -- This will output 'First'
SELECT country, continent,
    IFNULL(total_population, 'Unknown') AS total_population
FROM population;

Conditional Functions - COALESCE
SELECT COALESCE(NULL, NULL, 300, 400); -- This will output 300
SELECT COALESCE(100, NULL, 300, 400); -- This will output 100
SELECT country,
    COALESCE(total_pop_2023, total_pop_2022, total_pop_2021)
        AS total_population
FROM population;

Conditional Functions - NULLIF
SELECT NULLIF(0, 0); -- This will output NULL (the arguments are equal)
SELECT NULLIF(100, 200); -- This will output 100 (the arguments differ, so the first is returned)
SELECT country, continent,
    NULLIF(total_population, 0) AS total_population
FROM population;

Date Manipulation Functions
Modifiers
• NNN days
• NNN months
• NNN years
• start of month
• start of year
• weekday N
• localtime
• utc
Datestring
• YYYY-MM-DD
• YYYY-MM-DD HH:MM
• YYYY-MM-DD HH:MM:SS
• YYYY-MM-DD HH:MM:SS.SSS
• YYYY-MM-DDTHH:MM
• YYYY-MM-DDTHH:MM:SS
• YYYY-MM-DDTHH:MM:SS.SSS
• now

SELECT DATE('now');
SELECT DATE('2024-05-02');
SELECT DATE('2024-05-02', '2 days');
-- This will output 2024-05-04
SELECT DATE('2024-05-02', '-2 days');
-- This will output 2024-04-30
SELECT DATE('2024-05-02', 'start of year');
-- This will output 2024-01-01
SELECT DATE('2024-05-02', 'start of year', '5 days');
-- This will output 2024-01-06

Date Formatting Function
Format   Description
%d       Day of the month 01-31
%m       Month 01-12
%Y       Year 0000-9999
%W       Week of the year 00-53
%H       Hour 00-24
%M       Minute 00-59
%S       Seconds 00-59
%j       Day of the year 001-366

SELECT STRFTIME('%m', '2024-05-02');
-- This will output 05
SELECT STRFTIME('%Y', '2024-05-02');
-- This will output 2024
SELECT STRFTIME('%d/%m/%Y', '2024-05-02');
-- This will output 02/05/2024
SELECT STRFTIME('%H:%M', '2024-05-02 10:00');
-- This will output 10:00

Aggregate Functions – MAX
-- NULLs in the specified column are ignored
SELECT MAX(total_population) FROM population;

SELECT
    MAX(
        ROUND(
            CAST(total_population AS REAL) / 1000000000, 2
        )
    )
FROM population;

Aggregate Functions – MIN
-- NULLs in the specified column are ignored
SELECT MIN(total_population) FROM population;

Aggregate Functions – SUM
-- NULLs in the specified column are ignored
SELECT SUM(total_population) FROM population;

Aggregate Functions – AVG
-- NULLs in the specified column are ignored
SELECT AVG(total_population) FROM population;

Aggregate Functions – COUNT
Variations
• COUNT(*)
• COUNT(Column Name)
• COUNT(Literal Value)
• COUNT(Expression)
• COUNT(DISTINCT)

SELECT COUNT(*) FROM population;
SELECT COUNT(*) FROM population WHERE total_population > 100;
SELECT COUNT(country) FROM population;
-- Outputs: 10
SELECT COUNT(total_population) FROM population;
-- NULLs in the specified column are ignored, so this outputs 9, not 10
SELECT COUNT(1) FROM population;
-- Outputs: 10; the literal 1 is never NULL, so every row is counted, as with COUNT(*)

Aggregate Functions – COUNT DISTINCT
-- NULLs in the specified column are ignored
SELECT COUNT(DISTINCT continent) FROM population;

GROUP BY
SELECT continent FROM population
WHERE continent != 'Asia'
GROUP BY continent;

SELECT continent,
    COUNT(*) AS no_of_countries,
    MAX(total_population) AS highest_population
FROM population
GROUP BY continent;

SELECT continent, income_level,
    COUNT(*) AS no_of_countries
FROM population
GROUP BY continent, income_level;

HAVING
SELECT continent, COUNT(*) AS no_of_countries
FROM population GROUP BY continent HAVING COUNT(*) > 1;

-- Referring to the alias in HAVING works in SQLite and MySQL, but not in PostgreSQL
SELECT continent, COUNT(*) AS no_of_countries
FROM population GROUP BY continent
HAVING no_of_countries > 1;

JOIN
-- Conceptually, a join starts from the cartesian product of the two tables,
-- so a 2-row table inner joined with a 3-row table can give up to 6 rows
SELECT cnty.name AS cnty_name,
    cnty.cont_name,
    cnty.population AS cnty_population,
    cont.population
FROM continent AS cont JOIN country AS cnty
    ON cont.name = cnty.cont_name;

-- The same query with table aliases declared without AS
SELECT cnty.name AS cnty_name,
    cnty.cont_name,
    cnty.population AS cnty_population,
    cont.population
FROM continent cont
JOIN country cnty
    ON cont.name = cnty.cont_name;

Inner Join (default join)
SELECT <select_list> FROM TableA A
INNER JOIN TableB B
ON A.Key = B.Key;

Left Outer Join
SELECT <select_list> FROM TableA A
LEFT JOIN TableB B
ON A.Key = B.Key;

Right Outer Join
SELECT <select_list> FROM TableA A
RIGHT JOIN TableB B
ON A.Key = B.Key;

Full Outer Join
SELECT <select_list> FROM TableA A
FULL OUTER JOIN TableB B
ON A.Key = B.Key;

Left Exclusive Join
SELECT <select_list> FROM TableA A
LEFT JOIN TableB B
ON A.Key = B.Key
WHERE B.Key IS NULL;

Right Exclusive Join
SELECT <select_list> FROM TableA A
RIGHT JOIN TableB B
ON A.Key = B.Key
WHERE A.Key IS NULL;

Full Outer Exclusive Join
SELECT <select_list> FROM TableA A
FULL OUTER JOIN TableB B
ON A.Key = B.Key
WHERE A.Key IS NULL
OR B.Key IS NULL;

JOIN with Filters
SELECT * FROM continent cont
JOIN country cnty
ON cont.name = cnty.cont_name
AND cnty.population <= 100;

LEFT OUTER JOIN with Filters
**PLEASE NOTE: CASE 1 and CASE 2 can produce different results. In CASE 1 the WHERE filter runs after the join, so continents without a matching country (whose cnty columns are NULL) are removed. In CASE 2 the filter is part of the join condition, so every continent is kept and unmatched continents appear with NULL values in the cnty columns. Remember the order of operations: ON is applied before WHERE.**

CASE 1: As a WHERE clause
SELECT * FROM continent cont
LEFT JOIN country cnty
ON cont.name = cnty.cont_name
WHERE cnty.population <= 100;

CASE 2: In the ON clause
SELECT * FROM continent cont
LEFT JOIN country cnty
ON cont.name = cnty.cont_name
AND cnty.population <= 100;

Managing Tables
Table Definition
CREATE TABLE population(
    country VARCHAR(50),
    total_population INT
);
Constraints:
• NOT NULL
• UNIQUE
• CHECK
• DEFAULT

Managing Tables – NOT NULL CONSTRAINT
CREATE TABLE population(
    country VARCHAR(50) NOT NULL,
    total_population INT
);
-- This will prevent the below
INSERT INTO population (country, total_population)
VALUES (NULL, 1000000);
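A minimal sketch of how the NOT NULL constraint above behaves, assuming a fresh database with no existing `population` table. The sample rows are made up for illustration. Only the constrained column rejects NULL; `total_population` has no constraint and still accepts it.

```sql
-- Same table definition as in the NOT NULL section
CREATE TABLE population (
    country VARCHAR(50) NOT NULL,
    total_population INT
);

-- Accepted: both values are present (illustrative sample data)
INSERT INTO population (country, total_population)
VALUES ('India', 1000000);

-- Accepted: total_population is not declared NOT NULL
INSERT INTO population (country, total_population)
VALUES ('Nauru', NULL);

-- Rejected with a constraint error, so it is left commented out here
-- INSERT INTO population (country, total_population)
-- VALUES (NULL, 1000000);

-- Only the two accepted rows are stored
SELECT COUNT(*) AS accepted_rows FROM population;
```

Uncommenting the third INSERT should make the script stop with a NOT NULL violation on `country`, leaving the table unchanged.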
Managing Tables – DEFAULT CONSTRAINT
CREATE TABLE population(
    country VARCHAR(50) NOT NULL,
    total_population INT DEFAULT 0
);
-- Omitting total_population causes the inserted row to be ('India', 0)
INSERT INTO population (country)
VALUES ('India');
-- An explicit NULL would be stored as NULL, not replaced by the default

Managing Tables – CHECK CONSTRAINT
CREATE TABLE population(
    country VARCHAR(50) NOT NULL,
    total_population INT CHECK (total_population <= 1500)
);
-- This will prevent the below
INSERT INTO population (country, total_population)
VALUES ('India', 200000);

Managing Tables – UNIQUE CONSTRAINT
CREATE TABLE population(
    country VARCHAR(50) UNIQUE,
    total_population INT);
-- This will prevent the below if India is already in the table
INSERT INTO population (country, total_population)
VALUES ('India', 200000);

Relational Database Management System (RDBMS)
CREATE TABLE country(
    id INT PRIMARY KEY,
    name VARCHAR(50) NOT NULL,
    population INT NOT NULL);

CREATE TABLE city(
    id INT PRIMARY KEY,
    name VARCHAR(50) NOT NULL,
    population INT NOT NULL,
    country_id INT NOT NULL,
    FOREIGN KEY(country_id) REFERENCES country(id)
);

Modifying a Table – Rename a Table
ALTER TABLE continent
RENAME TO continents;

Modifying a Table – Add Column
ALTER TABLE continent
ADD COLUMN area INT; -- This adds a column called 'area'

Modifying a Table – Rename Column
ALTER TABLE continent
RENAME COLUMN name TO cont_name;
-- This renames the column from 'name' to 'cont_name'

Modifying a Table – Change Data Type
-- MySQL syntax; PostgreSQL uses ALTER COLUMN population TYPE BIGINT
ALTER TABLE continent
MODIFY COLUMN population BIGINT;

Modifying a Table – Drop Column
ALTER TABLE continent
DROP COLUMN population;

Modifying a Table – Drop an entire table
DROP TABLE continent;

Modifying a Table – Delete all rows / some of them
DELETE FROM continent;
DELETE FROM continent WHERE name = 'Europe';

Subquery – Filtering the data
-- Let us compare:
SELECT country
FROM population
WHERE continent = 'Asia'
    AND total_population < 500000;
-- And
SELECT country
FROM ( SELECT country, total_population
       FROM population
       WHERE continent = 'Asia')
WHERE total_population < 500000;
-- Both return the same rows, as does the version with an aliased subquery:
SELECT country
FROM ( SELECT country, total_population
       FROM population
       WHERE continent = 'Asia') cnty_asia
WHERE cnty_asia.total_population < 500000;

Subquery – aggregate functions
SELECT country
FROM population
WHERE total_population > (SELECT AVG(total_population)
                          FROM population);

Subquery – joins
SELECT cnty.country, cnty.continent,
    cnty.total_population AS cnty_population,
    cont.cont_population
FROM population cnty
JOIN (
    SELECT continent,
        SUM(total_population) AS cont_population
    FROM population
    GROUP BY continent
) cont
ON cnty.continent = cont.continent;

Correlated Subquery
Simple Subquery: Standalone SQL executed only once.
Correlated Subquery: Depends on the main query and is executed for each record from the result of the main query.
-- Here we go through each record from the population table
-- Then we get the highest population from that record's continent
-- Then we check whether the populations from the previous 2 steps are equal
Eg:
SELECT m.country, m.continent, m.total_population
FROM population m
WHERE m.total_population = (SELECT MAX(s.total_population)
                            FROM population s
                            WHERE s.continent = m.continent);

Common Table Expressions (CTEs)
WITH cte_continent AS
    (SELECT continent,
        SUM(total_population) AS cont_population
    FROM population
    GROUP BY continent),
cte_world AS
    (SELECT SUM(total_population) AS world_population
    FROM population)
SELECT cnty.country,
    cnty.continent,
    cnty.total_population AS cnty_population,
    cont.cont_population,
    world.world_population
FROM population cnty
JOIN cte_continent cont
    ON cnty.continent = cont.continent
CROSS JOIN cte_world world; -- cte_world has a single row, so no join condition is needed

WITH cte_continent AS
    (SELECT continent,
        SUM(total_population) AS cont_population
    FROM population
    GROUP BY continent),
cte_continent_avg AS
    (SELECT AVG(cont_population) AS cont_avg_population
    FROM cte_continent)
SELECT cnty.country,
    cnty.continent,
    cnty.total_population AS cnty_population,
    cont.cont_population,
    cont_avg.cont_avg_population
FROM population cnty
JOIN cte_continent cont
    ON cnty.continent = cont.continent
CROSS JOIN cte_continent_avg cont_avg;

Views – Purpose
• Data security and organizing complex queries
-- Create view
CREATE VIEW View Name
AS
SELECT Column Names
FROM Table Names;
-- Drop view
DROP VIEW View Name;
-- Create or replace view
CREATE OR REPLACE VIEW View Name
AS
SELECT Column Names
FROM Table Names;

Views – Creating one
CREATE VIEW vw_population
AS
SELECT country, continent,
    ROUND(CAST(total_population AS REAL)/1000000, 2) AS total_population_mil
FROM population;

Views – Accessing one
SELECT country,
    continent,
    total_population_mil
FROM vw_population;

Views – Create views from CTEs
WITH cte_continent AS
    (SELECT continent,
        SUM(total_population) AS cont_population
    FROM population
    GROUP BY continent)
SELECT cnty.country,
    cnty.continent,
    cnty.total_population AS cnty_population,
    cont.cont_population
FROM population cnty
JOIN cte_continent cont
    ON cnty.continent = cont.continent;

CREATE VIEW vw_continent
AS
SELECT continent,
    SUM(total_population) AS cont_population
FROM population
GROUP BY continent;

Views – Using views in subquery
CREATE VIEW vw_continent
AS
SELECT continent,
    SUM(total_population) AS cont_population
FROM population
GROUP BY continent;

SELECT cnty.country,
    cnty.continent,
    cnty.total_population AS cnty_population,
    cont.cont_population
FROM population cnty
JOIN vw_continent cont
    ON cnty.continent = cont.continent;

Views – Creating views from joins
CREATE VIEW vw_country_with_continent
AS
SELECT cnty.country, cnty.continent,
    cnty.population AS cnty_population,
    cont.population AS cont_population
FROM continent cont
JOIN country cnty
    ON (cont.name = cnty.cont_name);

Window Functions – Intro
• Operates on a group of rows (a window) and returns a value for each input row based on that group of rows.

Window Functions – AVG
-- Eg: no windows
SELECT sales.company, sales.year, sales.quarter, sales.volume,
    avg_sales.avg_volume
FROM sales
JOIN (
    SELECT company, year,
        AVG(volume) AS avg_volume
    FROM sales
    GROUP BY company, year) avg_sales
ON sales.company = avg_sales.company
    AND sales.year = avg_sales.year;

-- Eg: same thing with windows
SELECT company, year, quarter, volume,
    AVG(volume) OVER(PARTITION BY company, year) AS avg_volume
FROM sales;

Window Functions – SUM
-- Without ORDER BY in the window, every row gets the full company/year total
SELECT company, year, quarter, volume,
    SUM(volume) OVER(PARTITION BY company, year) AS sales_to_date
FROM sales;

Window Functions – RANK
-- Ranks highest sales as 1 and lowest as 4
SELECT company, year, quarter, volume,
    RANK() OVER(PARTITION BY company, year
                ORDER BY volume DESC) AS sales_rank
FROM sales;

Window Functions – Aggregate, ranking and analytical functions
Aggregate functions   Ranking functions   Analytical functions
• SUM                 • RANK              • LEAD
• AVG                 • DENSE_RANK        • LAG
• MIN                 • ROW_NUMBER        • FIRST_VALUE
• MAX                 • NTILE             • LAST_VALUE
• COUNT

Window Functions – Keywords
SELECT company, year, quarter, volume,
    SUM(volume) OVER(PARTITION BY company, year
                     ORDER BY quarter
                     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS sales_to_date
FROM sales;

Window Functions – Aggregate, SUM OVER
SUM(volume) OVER()
SUM(volume) OVER(PARTITION BY company, year)

Window Functions – Ranking, RANK OVER
RANK() OVER(ORDER BY volume DESC)
RANK() OVER(PARTITION BY company, year ORDER BY volume DESC)

Window Functions – Analytical, LAG OVER
-- LAG(expr, [offset], [default])
LAG(volume) OVER(ORDER BY company, year, quarter)

Window Functions – Window Frame
SELECT company, year, quarter, volume,
    SUM(volume) OVER(PARTITION BY company, year
                     ORDER BY quarter
                     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS sales_to_date
FROM sales;

Window Functions – Window Frame, Typical structure
ROWS | RANGE BETWEEN Lower Boundary AND Upper Boundary

Lower Boundary values:
• CURRENT ROW
• N PRECEDING
• N FOLLOWING

Window Functions – Window Frame, CURRENT & FOLLOWING
-- Covers the current row and the 2 rows after it
ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING

Window Functions – Window Frame, PRECEDING & FOLLOWING
-- Covers the row before the current row, the current row and the row after it
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING

Window Functions – Window Frame, CURRENT & UNBOUNDED
-- Covers the current row and every row after it, up to the last row of the partition
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING

Window Functions – Window Frame, example
SUM(volume) OVER(
    ORDER BY year, quarter
    ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING)
AS rolling_half_yly
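To make the effect of a window frame concrete, here is a small sketch that builds a `sales` table with the columns used in the window function examples and compares two frames side by side. The company name and volumes are invented sample data, and the script assumes no `sales` table exists yet.

```sql
-- Hypothetical sample data: one company, four quarters of one year
CREATE TABLE sales (
    company VARCHAR(50),
    year    INT,
    quarter INT,
    volume  INT
);

INSERT INTO sales (company, year, quarter, volume) VALUES
    ('Acme', 2023, 1, 10),
    ('Acme', 2023, 2, 20),
    ('Acme', 2023, 3, 30),
    ('Acme', 2023, 4, 40);

SELECT company, year, quarter, volume,
    -- Frame: first row of the partition up to the current row (running total)
    SUM(volume) OVER(PARTITION BY company, year
                     ORDER BY quarter
                     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS sales_to_date,
    -- Frame: current row plus the next row (two-quarter rolling sum)
    SUM(volume) OVER(PARTITION BY company, year
                     ORDER BY quarter
                     ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING) AS rolling_half_yly
FROM sales
ORDER BY company, year, quarter;

-- quarter | volume | sales_to_date | rolling_half_yly
--    1    |   10   |      10       |        30
--    2    |   20   |      30       |        50
--    3    |   30   |      60       |        70
--    4    |   40   |     100       |        40
```

The last row shows why the frame matters: quarter 4 has no following row inside the partition, so its rolling sum covers only its own volume.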