0% found this document useful (0 votes)
33 views

SQL For Data Analysis Cheat Sheet A4

The document provides an overview of common SQL commands and functions for analyzing data. It covers basic queries and selections, aggregation, grouping, joining, rounding, computations, and troubleshooting. Examples are given for each concept to demonstrate usage.

Uploaded by

alialiye33333
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
33 views

SQL For Data Analysis Cheat Sheet A4

The document provides an overview of common SQL commands and functions for analyzing data. It covers basic queries and selections, aggregation, grouping, joining, rounding, computations, and troubleshooting. Examples are given for each concept to demonstrate usage.

Uploaded by

alialiye33333
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 3

SQL for Data Analysis Cheat Sheet

SQL GROUP BY ORDER BY JOIN


SQL, or Structured Query Language, is a language for talking to PRODUCT Fetch product names sorted by the price column in the default JOIN is used to fetch data from multiple tables. To get the names
databases. It lets you select specific data and build complex ASCending order: of products purchased in each order, use:
name category
reports. Today, SQL is a universal language of data, used in SELECT name SELECT
practically all technologies that process data. Knife Kitchen FROM product orders.order_date,
Pot Kitchen category count ORDER BY price [ASC]; product.name AS product,
amount
SELECT Mixer Kitchen Kitchen 3
FROM orders
Jeans Clothing Clothing 3 Fetch product names sorted by the price column in JOIN product
Fetch the id and name columns from the product table:
Sneakers Clothing Electronics 2 DESCending order: ON product.id = orders.product_id;
SELECT id, name
SELECT name
FROM product; Leggings Clothing FROM product
Smart TV Electronics ORDER BY price DESC;
Concatenate the name and the description to fetch the full Learn more about JOINs in our interactive SQL JOINs course.
Laptop Electronics
description of the products:
SELECT name || ' - ' || description COMPUTATIONS
FROM product;
AGGREGATE FUNCTIONS
Use +, -, *, / to do basic math. To get the number of seconds in a
week:
INSERT
Count the number of products: SELECT 60 * 60 * 24 * 7; To insert data into a table, use the INSERT command:
Fetch names of products with prices above 15:
SELECT COUNT(*) -- result: 604800 INSERT INTO category
SELECT name
FROM product; VALUES
FROM product
(1, 'Home and Kitchen'),
WHERE price > 15;
(2, 'Clothing and Apparel');
ROUNDING NUMBERS
Fetch names of products with prices between 50 and 150: Count the number of products with non-null prices: Round a number to its nearest integer:
SELECT name SELECT COUNT(price) SELECT ROUND(1234.56789);
You may specify the columns to which the data is added. The
FROM product FROM product; -- result: 1235
remaining columns are filled with predefined default values or
WHERE price BETWEEN 50 AND 150; NULLs.
Round a number to two decimal places: INSERT INTO category (name)
Count the number of unique category values: VALUES ('Electronics');
Fetch names of products that are not watches: SELECT ROUND(AVG(price), 2)
SELECT COUNT(DISTINCT category)
SELECT name FROM product
FROM product;
FROM product WHERE category_id = 21;
WHERE name != 'watch'; -- result: 124.56
UPDATE
Get the lowest and the highest product price: To update the data in a table, use the UPDATE command:
Fetch names of products that start with a 'P' or end with an SELECT MIN(price), MAX(price) UPDATE category
's': FROM product; TROUBLESHOOTING SET
SELECT name is_active = true,
FROM product
INTEGER DIVISION name = 'Office'
WHERE name LIKE 'P%' OR name LIKE '%s'; In PostgreSQL and SQL Server, the / operator performs integer
WHERE name = 'Ofice';
Find the total price of products for each category: division for integer arguments. If you do not see the number of
SELECT category, SUM(price) decimal places you expect, it is because you are dividing between
Fetch names of products that start with any letter followed by FROM product two integers. Cast one to decimal:
'rain' (like 'train' or 'grain'):
SELECT name
GROUP BY category; 123 / 2 -- result: 61 DELETE
CAST(123 AS decimal) / 2 -- result: 61.5
To delete data from a table, use the DELETE command:
FROM product
DELETE FROM category
WHERE name LIKE '_rain';
Find the average price of products for each category whose WHERE name IS NULL;
average is above 3.0: DIVISION BY 0
Fetch names of products with non-null prices: SELECT category, AVG(price) To avoid this error, make sure the denominator is not 0. You may
SELECT name FROM product use the NULLIF() function to replace 0 with a NULL, which
Check out our interactive course How to INSERT, UPDATE, and
FROM product GROUP BY category results in a NULL for the entire expression:
DELETE Data in SQL.
WHERE price IS NOT NULL; HAVING AVG(price) > 3.0; count / NULLIF(count_all, 0)

LearnSQL.com is owned by Vertabelo SA


Learn the basics of SQL in our interactive SQL Basics course. vertabelo.com | CC BY-NC-ND Vertabelo SA
SQL for Data Analysis Cheat Sheet
DATE AND TIME SORTING CHRONOLOGICALLY INTERVALS EXTRACTING PARTS OF DATES
There are 3 main time-related types: date, time, and Using ORDER BY on date and time columns sorts rows An interval measures the difference between two points in time. The standard SQL syntax to get a part of a date is
timestamp. Time is expressed using a 24-hour clock, and it can chronologically from the oldest to the most recent: For example, the interval between 2023-07-04 and 2023-07- SELECT EXTRACT(YEAR FROM order_date)
be as vague as just hour and minutes (e.g., 15:30 – 3:30 p.m.) or SELECT order_date, product, quantity 06 is 2 days. FROM sales;
as precise as microseconds and time zone (as shown below): FROM sales
ORDER BY order_date;
2021-12-31 14:39:53.662522-05 To define an interval in SQL, use this syntax:
You may extract the following fields:
date time INTERVAL '1' DAY
YEAR, MONTH, DAY, HOUR, MINUTE, and SECOND.
timestamp order_date product quantity
YYYY-mm-dd HH:MM:SS.ssssss±TZ 2023-07-22 Laptop 2 The syntax consists of three elements: the INTERVAL keyword, a
quoted value, and a time part keyword. You may use the following The standard syntax does not work In SQL Server.
14:39:53.662522-05 is almost 2:40 p.m. CDT (e.g., in 2023-07-23 Mouse 3 Use the DATEPART(part, date) function instead.
time parts: YEAR, MONTH, DAY, HOUR, MINUTE, and SECOND.
Chicago; in UTC it'd be 7:40 p.m.). The letters in the above 2023-07-24 Sneakers 10 SELECT DATEPART(YEAR, order_date)
example represent: FROM sales;
2023-07-24 Jeans 3
Adding intervals to date and time values
In the date part: In the time part: 2023-07-25 Mixer 2 You may use + or - to add or subtract an interval to date or
YYYY – the 4-digit HH – the zero-padded hour in a 24-
year. hour clock.
timestamp values. GROUPING BY YEAR AND MONTH
mm – the zero-padded MM – the minutes. Use the DESCending order to sort from the most recent to the Find the count of sales by month:
month (01—January SS – the seconds. Omissible. oldest: Subtract one year from 2023-07-05: SELECT
through 12— ssssss – the smaller parts of a SELECT order_date, product, quantity SELECT CAST('2023-07-05' AS TIMESTAMP) EXTRACT(YEAR FROM order_date) AS year,
December). second – they can be expressed FROM sales - INTERVAL '1' year; EXTRACT(MONTH FROM order_date) AS month,
dd – the zero-padded using 1 to 6 digits. Omissible. ORDER BY order_date DESC; -- result: 2022-07-05 00:00:00 COUNT(*) AS count
day. ±TZ – the timezone. It must start FROM sales
with either + or -, and use two GROUP BY
Find customers who placed the first order within a month from
digits relative to UTC. Omissible. COMPARING DATE AND TIME the registration date:
year,
month
SELECT id
CURRENT DATE AND TIME VALUES FROM customers
ORDER BY
year
Find out what time it is: You may use the comparison operators <, <=, >, >=, and = to WHERE first_order_date >
month;
SELECT CURRENT_TIME; compare date and time values. Earlier dates are less than later registration_date + INTERVAL '1' month;
ones. For example, 2023-07-05 is "less" than 2023-08-05.
Get today's date: Filtering events to those in the last 7 days year month count
SELECT CURRENT_DATE; Find sales made in July 2023: To find the deliveries scheduled for the last 7 days, use: 2022 8 51
In SQL Server: SELECT order_date, product_name, quantity SELECT delivery_date, address
SELECT GETDATE(); FROM sales 2022 9 58
FROM sales
WHERE order_date >= '2023-07-01' WHERE delivery_date <= CURRENT_DATE 2022 10 62
Get the timestamp with the current date and time: AND order_date < '2023-08-01'; AND delivery_date >= CURRENT_DATE 2022 11 76
SELECT CURRENT_TIMESTAMP; - INTERVAL '7' DAY; 2022 12 85
Find customers who registered in July 2023:
2023 1 71
CREATING DATE AND TIME VALUES SELECT registration_timestamp, email Note: In SQL Server, intervals are not implemented – use the
FROM customer 2023 2 69
To create a date, time, or timestamp, write the value as a string DATEADD() and DATEDIFF() functions.
and cast it to the proper type. WHERE registration_timestamp >= '2023-07-01'
SELECT CAST('2021-12-31' AS date); AND registration_timestamp < '2023-08-01';
SELECT CAST('15:31' AS time); Filtering events to those in the last 7 days Note that you must group by both the year and the month.
EXTRACT(MONTH FROM order_date) only extracts the
SELECT CAST('2021-12-31 23:59:29+02' Note: Pay attention to the end date in the query. The upper in SQL Server month number (1, 2, ..., 12). To distinguish between months from
AS timestamp); bound '2023-08-01' is not included in the range. The To find the sales made within the last 7 days, use:
different years, you must also group by year.
SELECT CAST('15:31.124769' AS time); timestamp '2023-08-01' is actually the timestamp '2023- SELECT delivery_date, address
08-01 00:00:00.0'. The comparison operator < is used to FROM sales
Be careful with the last example – it is interpreted as 15 minutes ensure the selection is made for all timestamps less than '2023- WHERE delivery_date <= GETDATE()
More about working with date and time values in our
31 seconds and 124769 microseconds! It is always a good idea to 08-01 00:00:00.0', that is, all timestamps in July 2023, AND delivery_date >=
interactive Standard SQL Functions course.
write 00 for hours explicitly: '00:15:31.124769'. even those close to the midnight of August 1, 2023. DATEADD(DAY, -7, GETDATE());

LearnSQL.com is owned by Vertabelo SA


LearnSQL.com – Practical SQL Courses for Teams and Individuals vertabelo.com | CC BY-NC-ND Vertabelo SA
SQL for Data Analysis Cheat Sheet
CASE WHEN GROUP BY EXTENSIONS COALESCE RANKING
CASE WHEN lets you pass conditions (as in the WHERE clause), GROUPING SETS COALESCE replaces the first NULL argument with a given value. Rank products by price:
evaluates them in order, then returns the value for the first It is often used to display labels with GROUP BY extensions. SELECT RANK() OVER(ORDER BY price), name
GROUPING SETS lets you specify multiple sets of columns to
condition met. SELECT region, FROM product;
group by in one query.
COALESCE(product, 'All'), RANKING FUNCTIONS
SELECT region, product, COUNT(order_id)
SELECT COUNT(order_id) RANK – gives the same rank for tied values, leaves gaps.
FROM sales
name, FROM sales DENSE_RANK – gives the same rank for tied values without gaps.
GROUP BY
CASE GROUP BY ROLLUP (region, product); ROW_NUMBER – gives consecutive numbers without gaps.
GROUPING SETS ((region, product), ());
WHEN price > 150 THEN 'Premium' region product count
WHEN price > 100 THEN 'Mid-range' region product count name rank dense_rank row_number
USA Laptop 10
ELSE 'Standard' USA Laptop 10 Jeans 1 1 1
END AS price_category USA Mouse 5
USA Mouse 5 GROUP BY (region, product) Leggings 2 2 2
FROM product; USA All 15
UK Laptop 6 Leggings 2 2 3
UK Laptop 6
Here, all products with prices above 150 get the Premium label, NULL NULL 21 GROUP BY () – all rows Sneakers 4 3 4
those with prices above 100 (and below 150) get the Mid-range UK All 6
Sneakers 4 3 5
label, and the rest receives the Standard label. CUBE All All 21
Sneakers 4 3 6
CUBE generates groupings for all possible subsets of the GROUP
CASE WHEN and GROUP BY BY columns. COMMON TABLE EXPRESSIONS T-Shirt 7 4 7
SELECT region, product, COUNT(order_id) A common table expression (CTE) is a named temporary result set
You may combine CASE WHEN and GROUP BY to compute
FROM sales that can be referenced within a larger query. They are especially RUNNING TOTAL
object statistics in the categories you define.
GROUP BY CUBE (region, product); useful for complex aggregations and for breaking down large A running total is the cumulative sum of a given value and all
SELECT
queries into more manageable parts. preceding values in a column.
CASE
region product count WITH total_product_sales AS ( SELECT date, amount,
WHEN price > 150 THEN 'Premium'
USA Laptop 10 SELECT product, SUM(profit) AS total_profit SUM(amount) OVER(ORDER BY date)
WHEN price > 100 THEN 'Mid-range'
FROM sales AS running_total
ELSE 'Standard' USA Mouse 5 GROUP BY region, product
GROUP BY product FROM sales;
END AS price_category,
UK Laptop 6 )
COUNT(*) AS products MOVING AVERAGE
FROM product USA NULL 15
GROUP BY region SELECT AVG(total_profit) A moving average (a.k.a. rolling average, running average) is a
GROUP BY price_category; UK NULL 6 technique for analyzing trends in time series data. It is the
FROM total_product_sales;
NULL Laptop 16 average of the current value and a specified number of preceding
Count the number of large orders for each customer using CASE GROUP BY product Check out our hands-on courses on Common Table values.
WHEN and SUM(): NULL Mouse 5 Expressions and GROUP BY Extensions. SELECT date, price,
SELECT NULL NULL 21 GROUP BY () – all rows AVG(price) OVER(
customer_id, WINDOW FUNCTIONS ORDER BY date
SUM( ROLLUP Window functions compute their results based on a sliding ROWS BETWEEN 2 PRECEDING
CASE WHEN quantity > 10 ROLLUP adds new levels of grouping for subtotals and grand window frame, a set of rows related to the current row. Unlike AND CURRENT ROW
THEN 1 ELSE 0 END totals. aggregate functions, window functions do not collapse rows. ) AS moving_averge
) AS large_orders SELECT region, product, COUNT(order_id) COMPUTING THE PERCENT OF TOTAL WITHIN A GROUP FROM stock_prices;
FROM sales FROM sales SELECT product, brand, profit,
GROUP BY customer_id; GROUP BY ROLLUP (region, product); (100.0 * profit /
DIFFERENCE BETWEEN TWO ROWS (DELTA)
SUM(profit) OVER(PARTITION BY brand) SELECT year, revenue,
... or using CASE WHEN and COUNT(): region product count LAG(revenue) OVER(ORDER BY year)
) AS perc
SELECT USA Laptop 10 FROM sales; AS revenue_prev_year,
customer_id, revenue -
USA Mouse 5 GROUP BY region, product product brand profit perc
COUNT( LAG(revenue) OVER(ORDER BY year)
CASE WHEN quantity > 10 UK Laptop 6 Knife Culina 1000 25 AS yoy_difference
THEN order_id END USA NULL 15 Pot Culina 3000 75 FROM yearly_metrics;
) AS large_orders GROUP BY region
UK NULL 6 Doll Toyze 2000 40 Learn about SQL window functions in our interactive Window
FROM sales
NULL NULL 21 GROUP BY () – all rows Car Toyze 3000 60 Functions course.
GROUP BY customer_id;

LearnSQL.com is owned by Vertabelo SA


Learn more in our interactive Creating Basic SQL Reports course vertabelo.com | CC BY-NC-ND Vertabelo SA

You might also like