The document provides an overview of common SQL commands and functions for analyzing data. It covers basic queries and selections, aggregation, grouping, joining, rounding, computations, and troubleshooting. Examples are given for each concept to demonstrate usage.
The document provides an overview of common SQL commands and functions for analyzing data. It covers basic queries and selections, aggregation, grouping, joining, rounding, computations, and troubleshooting. Examples are given for each concept to demonstrate usage.
SQL, or Structured Query Language, is a language for talking to PRODUCT Fetch product names sorted by the price column in the default JOIN is used to fetch data from multiple tables. To get the names databases. It lets you select specific data and build complex ASCending order: of products purchased in each order, use: name category reports. Today, SQL is a universal language of data, used in SELECT name SELECT practically all technologies that process data. Knife Kitchen FROM product orders.order_date, Pot Kitchen category count ORDER BY price [ASC]; product.name AS product, amount SELECT Mixer Kitchen Kitchen 3 FROM orders Jeans Clothing Clothing 3 Fetch product names sorted by the price column in JOIN product Fetch the id and name columns from the product table: Sneakers Clothing Electronics 2 DESCending order: ON product.id = orders.product_id; SELECT id, name SELECT name FROM product; Leggings Clothing FROM product Smart TV Electronics ORDER BY price DESC; Concatenate the name and the description to fetch the full Learn more about JOINs in our interactive SQL JOINs course. Laptop Electronics description of the products: SELECT name || ' - ' || description COMPUTATIONS FROM product; AGGREGATE FUNCTIONS Use +, -, *, / to do basic math. To get the number of seconds in a week: INSERT Count the number of products: SELECT 60 * 60 * 24 * 7; To insert data into a table, use the INSERT command: Fetch names of products with prices above 15: SELECT COUNT(*) -- result: 604800 INSERT INTO category SELECT name FROM product; VALUES FROM product (1, 'Home and Kitchen'), WHERE price > 15; (2, 'Clothing and Apparel'); ROUNDING NUMBERS Fetch names of products with prices between 50 and 150: Count the number of products with non-null prices: Round a number to its nearest integer: SELECT name SELECT COUNT(price) SELECT ROUND(1234.56789); You may specify the columns to which the data is added. The FROM product FROM product; -- result: 1235 remaining columns are filled with predefined default values or WHERE price BETWEEN 50 AND 150; NULLs. Round a number to two decimal places: INSERT INTO category (name) Count the number of unique category values: VALUES ('Electronics'); Fetch names of products that are not watches: SELECT ROUND(AVG(price), 2) SELECT COUNT(DISTINCT category) SELECT name FROM product FROM product; FROM product WHERE category_id = 21; WHERE name != 'watch'; -- result: 124.56 UPDATE Get the lowest and the highest product price: To update the data in a table, use the UPDATE command: Fetch names of products that start with a 'P' or end with an SELECT MIN(price), MAX(price) UPDATE category 's': FROM product; TROUBLESHOOTING SET SELECT name is_active = true, FROM product INTEGER DIVISION name = 'Office' WHERE name LIKE 'P%' OR name LIKE '%s'; In PostgreSQL and SQL Server, the / operator performs integer WHERE name = 'Ofice'; Find the total price of products for each category: division for integer arguments. If you do not see the number of SELECT category, SUM(price) decimal places you expect, it is because you are dividing between Fetch names of products that start with any letter followed by FROM product two integers. Cast one to decimal: 'rain' (like 'train' or 'grain'): SELECT name GROUP BY category; 123 / 2 -- result: 61 DELETE CAST(123 AS decimal) / 2 -- result: 61.5 To delete data from a table, use the DELETE command: FROM product DELETE FROM category WHERE name LIKE '_rain'; Find the average price of products for each category whose WHERE name IS NULL; average is above 3.0: DIVISION BY 0 Fetch names of products with non-null prices: SELECT category, AVG(price) To avoid this error, make sure the denominator is not 0. You may SELECT name FROM product use the NULLIF() function to replace 0 with a NULL, which Check out our interactive course How to INSERT, UPDATE, and FROM product GROUP BY category results in a NULL for the entire expression: DELETE Data in SQL. WHERE price IS NOT NULL; HAVING AVG(price) > 3.0; count / NULLIF(count_all, 0)
LearnSQL.com is owned by Vertabelo SA
Learn the basics of SQL in our interactive SQL Basics course. vertabelo.com | CC BY-NC-ND Vertabelo SA SQL for Data Analysis Cheat Sheet DATE AND TIME SORTING CHRONOLOGICALLY INTERVALS EXTRACTING PARTS OF DATES There are 3 main time-related types: date, time, and Using ORDER BY on date and time columns sorts rows An interval measures the difference between two points in time. The standard SQL syntax to get a part of a date is timestamp. Time is expressed using a 24-hour clock, and it can chronologically from the oldest to the most recent: For example, the interval between 2023-07-04 and 2023-07- SELECT EXTRACT(YEAR FROM order_date) be as vague as just hour and minutes (e.g., 15:30 – 3:30 p.m.) or SELECT order_date, product, quantity 06 is 2 days. FROM sales; as precise as microseconds and time zone (as shown below): FROM sales ORDER BY order_date; 2021-12-31 14:39:53.662522-05 To define an interval in SQL, use this syntax: You may extract the following fields: date time INTERVAL '1' DAY YEAR, MONTH, DAY, HOUR, MINUTE, and SECOND. timestamp order_date product quantity YYYY-mm-dd HH:MM:SS.ssssss±TZ 2023-07-22 Laptop 2 The syntax consists of three elements: the INTERVAL keyword, a quoted value, and a time part keyword. You may use the following The standard syntax does not work In SQL Server. 14:39:53.662522-05 is almost 2:40 p.m. CDT (e.g., in 2023-07-23 Mouse 3 Use the DATEPART(part, date) function instead. time parts: YEAR, MONTH, DAY, HOUR, MINUTE, and SECOND. Chicago; in UTC it'd be 7:40 p.m.). The letters in the above 2023-07-24 Sneakers 10 SELECT DATEPART(YEAR, order_date) example represent: FROM sales; 2023-07-24 Jeans 3 Adding intervals to date and time values In the date part: In the time part: 2023-07-25 Mixer 2 You may use + or - to add or subtract an interval to date or YYYY – the 4-digit HH – the zero-padded hour in a 24- year. hour clock. timestamp values. GROUPING BY YEAR AND MONTH mm – the zero-padded MM – the minutes. Use the DESCending order to sort from the most recent to the Find the count of sales by month: month (01—January SS – the seconds. Omissible. oldest: Subtract one year from 2023-07-05: SELECT through 12— ssssss – the smaller parts of a SELECT order_date, product, quantity SELECT CAST('2023-07-05' AS TIMESTAMP) EXTRACT(YEAR FROM order_date) AS year, December). second – they can be expressed FROM sales - INTERVAL '1' year; EXTRACT(MONTH FROM order_date) AS month, dd – the zero-padded using 1 to 6 digits. Omissible. ORDER BY order_date DESC; -- result: 2022-07-05 00:00:00 COUNT(*) AS count day. ±TZ – the timezone. It must start FROM sales with either + or -, and use two GROUP BY Find customers who placed the first order within a month from digits relative to UTC. Omissible. COMPARING DATE AND TIME the registration date: year, month SELECT id CURRENT DATE AND TIME VALUES FROM customers ORDER BY year Find out what time it is: You may use the comparison operators <, <=, >, >=, and = to WHERE first_order_date > month; SELECT CURRENT_TIME; compare date and time values. Earlier dates are less than later registration_date + INTERVAL '1' month; ones. For example, 2023-07-05 is "less" than 2023-08-05. Get today's date: Filtering events to those in the last 7 days year month count SELECT CURRENT_DATE; Find sales made in July 2023: To find the deliveries scheduled for the last 7 days, use: 2022 8 51 In SQL Server: SELECT order_date, product_name, quantity SELECT delivery_date, address SELECT GETDATE(); FROM sales 2022 9 58 FROM sales WHERE order_date >= '2023-07-01' WHERE delivery_date <= CURRENT_DATE 2022 10 62 Get the timestamp with the current date and time: AND order_date < '2023-08-01'; AND delivery_date >= CURRENT_DATE 2022 11 76 SELECT CURRENT_TIMESTAMP; - INTERVAL '7' DAY; 2022 12 85 Find customers who registered in July 2023: 2023 1 71 CREATING DATE AND TIME VALUES SELECT registration_timestamp, email Note: In SQL Server, intervals are not implemented – use the FROM customer 2023 2 69 To create a date, time, or timestamp, write the value as a string DATEADD() and DATEDIFF() functions. and cast it to the proper type. WHERE registration_timestamp >= '2023-07-01' SELECT CAST('2021-12-31' AS date); AND registration_timestamp < '2023-08-01'; SELECT CAST('15:31' AS time); Filtering events to those in the last 7 days Note that you must group by both the year and the month. EXTRACT(MONTH FROM order_date) only extracts the SELECT CAST('2021-12-31 23:59:29+02' Note: Pay attention to the end date in the query. The upper in SQL Server month number (1, 2, ..., 12). To distinguish between months from AS timestamp); bound '2023-08-01' is not included in the range. The To find the sales made within the last 7 days, use: different years, you must also group by year. SELECT CAST('15:31.124769' AS time); timestamp '2023-08-01' is actually the timestamp '2023- SELECT delivery_date, address 08-01 00:00:00.0'. The comparison operator < is used to FROM sales Be careful with the last example – it is interpreted as 15 minutes ensure the selection is made for all timestamps less than '2023- WHERE delivery_date <= GETDATE() More about working with date and time values in our 31 seconds and 124769 microseconds! It is always a good idea to 08-01 00:00:00.0', that is, all timestamps in July 2023, AND delivery_date >= interactive Standard SQL Functions course. write 00 for hours explicitly: '00:15:31.124769'. even those close to the midnight of August 1, 2023. DATEADD(DAY, -7, GETDATE());
LearnSQL.com is owned by Vertabelo SA
LearnSQL.com – Practical SQL Courses for Teams and Individuals vertabelo.com | CC BY-NC-ND Vertabelo SA SQL for Data Analysis Cheat Sheet CASE WHEN GROUP BY EXTENSIONS COALESCE RANKING CASE WHEN lets you pass conditions (as in the WHERE clause), GROUPING SETS COALESCE replaces the first NULL argument with a given value. Rank products by price: evaluates them in order, then returns the value for the first It is often used to display labels with GROUP BY extensions. SELECT RANK() OVER(ORDER BY price), name GROUPING SETS lets you specify multiple sets of columns to condition met. SELECT region, FROM product; group by in one query. COALESCE(product, 'All'), RANKING FUNCTIONS SELECT region, product, COUNT(order_id) SELECT COUNT(order_id) RANK – gives the same rank for tied values, leaves gaps. FROM sales name, FROM sales DENSE_RANK – gives the same rank for tied values without gaps. GROUP BY CASE GROUP BY ROLLUP (region, product); ROW_NUMBER – gives consecutive numbers without gaps. GROUPING SETS ((region, product), ()); WHEN price > 150 THEN 'Premium' region product count WHEN price > 100 THEN 'Mid-range' region product count name rank dense_rank row_number USA Laptop 10 ELSE 'Standard' USA Laptop 10 Jeans 1 1 1 END AS price_category USA Mouse 5 USA Mouse 5 GROUP BY (region, product) Leggings 2 2 2 FROM product; USA All 15 UK Laptop 6 Leggings 2 2 3 UK Laptop 6 Here, all products with prices above 150 get the Premium label, NULL NULL 21 GROUP BY () – all rows Sneakers 4 3 4 those with prices above 100 (and below 150) get the Mid-range UK All 6 Sneakers 4 3 5 label, and the rest receives the Standard label. CUBE All All 21 Sneakers 4 3 6 CUBE generates groupings for all possible subsets of the GROUP CASE WHEN and GROUP BY BY columns. COMMON TABLE EXPRESSIONS T-Shirt 7 4 7 SELECT region, product, COUNT(order_id) A common table expression (CTE) is a named temporary result set You may combine CASE WHEN and GROUP BY to compute FROM sales that can be referenced within a larger query. They are especially RUNNING TOTAL object statistics in the categories you define. GROUP BY CUBE (region, product); useful for complex aggregations and for breaking down large A running total is the cumulative sum of a given value and all SELECT queries into more manageable parts. preceding values in a column. CASE region product count WITH total_product_sales AS ( SELECT date, amount, WHEN price > 150 THEN 'Premium' USA Laptop 10 SELECT product, SUM(profit) AS total_profit SUM(amount) OVER(ORDER BY date) WHEN price > 100 THEN 'Mid-range' FROM sales AS running_total ELSE 'Standard' USA Mouse 5 GROUP BY region, product GROUP BY product FROM sales; END AS price_category, UK Laptop 6 ) COUNT(*) AS products MOVING AVERAGE FROM product USA NULL 15 GROUP BY region SELECT AVG(total_profit) A moving average (a.k.a. rolling average, running average) is a GROUP BY price_category; UK NULL 6 technique for analyzing trends in time series data. It is the FROM total_product_sales; NULL Laptop 16 average of the current value and a specified number of preceding Count the number of large orders for each customer using CASE GROUP BY product Check out our hands-on courses on Common Table values. WHEN and SUM(): NULL Mouse 5 Expressions and GROUP BY Extensions. SELECT date, price, SELECT NULL NULL 21 GROUP BY () – all rows AVG(price) OVER( customer_id, WINDOW FUNCTIONS ORDER BY date SUM( ROLLUP Window functions compute their results based on a sliding ROWS BETWEEN 2 PRECEDING CASE WHEN quantity > 10 ROLLUP adds new levels of grouping for subtotals and grand window frame, a set of rows related to the current row. Unlike AND CURRENT ROW THEN 1 ELSE 0 END totals. aggregate functions, window functions do not collapse rows. ) AS moving_averge ) AS large_orders SELECT region, product, COUNT(order_id) COMPUTING THE PERCENT OF TOTAL WITHIN A GROUP FROM stock_prices; FROM sales FROM sales SELECT product, brand, profit, GROUP BY customer_id; GROUP BY ROLLUP (region, product); (100.0 * profit / DIFFERENCE BETWEEN TWO ROWS (DELTA) SUM(profit) OVER(PARTITION BY brand) SELECT year, revenue, ... or using CASE WHEN and COUNT(): region product count LAG(revenue) OVER(ORDER BY year) ) AS perc SELECT USA Laptop 10 FROM sales; AS revenue_prev_year, customer_id, revenue - USA Mouse 5 GROUP BY region, product product brand profit perc COUNT( LAG(revenue) OVER(ORDER BY year) CASE WHEN quantity > 10 UK Laptop 6 Knife Culina 1000 25 AS yoy_difference THEN order_id END USA NULL 15 Pot Culina 3000 75 FROM yearly_metrics; ) AS large_orders GROUP BY region UK NULL 6 Doll Toyze 2000 40 Learn about SQL window functions in our interactive Window FROM sales NULL NULL 21 GROUP BY () – all rows Car Toyze 3000 60 Functions course. GROUP BY customer_id;
LearnSQL.com is owned by Vertabelo SA
Learn more in our interactive Creating Basic SQL Reports course vertabelo.com | CC BY-NC-ND Vertabelo SA