Advanced SQL Querying
With SQL Pocket Guide Author Alice Zhao
Quizzes & Assignments to test and reinforce key concepts, with step-by-step solutions
Interactive demos to keep you engaged and apply your skills throughout the course
THE SITUATION
You’ve just been hired as a Data Analyst Intern for Major League Baseball (MLB), which has recently gotten access to a large amount of historical player data
THE ASSIGNMENT
You have access to decades’ worth of data including player statistics like schools attended, salaries, teams played for, height and weight, and more
Your task is to use advanced SQL querying techniques to track how player statistics have changed over time and across different teams in the league
We’ll cover general SQL syntax, but the demos will be in MySQL
• The SQL concepts taught in this course will apply to any relational database management system (Oracle,
PostgreSQL, SQL Server, SQLite, etc.), so you are welcome to use the SQL editor of your choice
In this section we’ll discuss where you can write SQL code, walk through the MySQL &
MySQL Workbench installation process and help you load the data for this course
Topics: Where to Write SQL Code, Installing MySQL & MySQL Workbench, Getting Started with MySQL Workbench, Loading Data for This Course
• Pick a place to write SQL code
• Install MySQL and MySQL Workbench (optional)
• Create a MySQL connection to be able to start writing SQL code (optional)
• Load the tables for this course into your SQL editor
There are many options when it comes to picking a place to write SQL code:
If you’d like to see the exact same interface as the demos in this course, you’ll
need to download & install two programs:
1 RDBMS: MySQL
2 SQL Editor: MySQL Workbench
2 Select the following menu options and download the DMG Archive version
• Version: 9.1.0 Innovation (latest version)
• Operating System: macOS
• OS Version: x86 or ARM (if you’re on an M1 / M2 / M3 Mac)
3 No need to Login or Sign Up, just click “No thanks, just start my download”
5 Click through each install step, leaving defaults unless you need customized settings
• NOTE: Make sure you store your root password somewhere, you’ll need this later!
2 Select the following menu options and download the DMG Archive version
• Operating System: macOS
• OS Version: x86 or ARM (if you’re on an M1 / M2 / M3 Mac)
3 No need to Login or Sign Up, just click “No thanks, just start my download”
4 Find the install file in your downloads, then double click to open the installer package
5 Drag the MySQL Workbench icon into the Applications folder icon
2 Select the following menu options and download the MSI Installer version
• Version: 9.1.0 Innovation (latest version)
• Operating System: Microsoft Windows
4 Find the install file in your downloads, then double click to open the installer package
5 Click through each step, leaving defaults unless you need customized settings
• Choose Setup Type: Typical
• On the last step, make sure Run MySQL Configurator is checked and click Finish
6 In the MySQL Configurator pop up window, click through each step, leaving defaults unless you need customized settings
• NOTE: Make sure you store your root password somewhere, you’ll need this later!
2 Select the following menu options and download the MSI Installer version
• Operating System: Microsoft Windows
3 No need to Login or Sign Up, just click “No thanks, just start my download”
4 Find the install file in your downloads, then double click to open the installer package
5 Click through each step, leaving defaults unless you need customized settings
MySQL Workbench looks slightly different on Mac vs. PC, but everything you need is found in the same place
• While the course is recorded on a Mac, you should have no problem keeping up on a PC
You have two options when it comes to loading data for this course:
This code will work in any RDBMS
In this section we’ll quickly review the basics of SELECT statements so we’re on the
same page going into advanced querying concepts
Column(s) to display
The Big 6
In addition to the Big 6, there are common SQL keywords used in queries
These are popular keywords found in the SELECT clause:
These are popular keywords found in the WHERE clause:
Comparison operators include
=, !=, <>, <, <=, >, >=
These are other popular keywords:
DESC stands for “descending”, while
the default order is ASC (ascending)
Case statements use the following syntax to do IF-ELSE logic within SQL:
CASE WHEN … THEN … WHEN … THEN … ELSE … END
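As an illustration of the CASE syntax above, here is a minimal sketch. It runs the SQL through Python's built-in sqlite3 module on an in-memory database so it is self-contained; the course demos use MySQL, and the `happiness` table, its columns, and the score thresholds are made up for this example.

```python
import sqlite3

# In-memory database with a small, hypothetical happiness table
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE happiness (country TEXT, score REAL);
    INSERT INTO happiness VALUES ('Finland', 7.8), ('Brazil', 6.1), ('Chad', 4.3);
""")

# CASE checks each WHEN in order and returns the first matching THEN,
# falling back to ELSE when nothing matches
case_rows = conn.execute("""
    SELECT country,
           CASE WHEN score >= 7 THEN 'high'
                WHEN score >= 5 THEN 'medium'
                ELSE 'low'
           END AS score_band
    FROM happiness
    ORDER BY score DESC
""").fetchall()
print(case_rows)
```

Because the WHEN conditions are evaluated top to bottom, a score of 7.8 lands in 'high' even though it also satisfies the second condition.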
In this section we’ll talk about combining multiple tables in a single SQL query, such as
using JOINs to add new columns from related tables, and UNIONs to add new rows
Simple queries will return data from a single table, but in practice it’s helpful to
combine multiple tables to analyze data properly
Multi-Table
Analysis
JOIN Basics
JOIN Variations
UNION Basics
There are two ways to combine multiple tables into a single table for analysis:
• JOIN adds related columns from one table to another, based on common columns
• UNION stacks the rows from multiple tables with the same column structure
The continent and population columns were added using a JOIN, based on the matching country column
The rows from the two tables with the same columns were stacked using a UNION
JOIN syntax labels: left table, left table alias, right table, and the join condition (column(s) in the left table to join by, column(s) in the right table to join by)
INNER: Returns records that exist in BOTH tables, and excludes unmatched records from either table
LEFT: Returns ALL records from the LEFT table, and any matching records from the RIGHT table
(INNER and LEFT are the most common)
RIGHT: Returns ALL records from the RIGHT table, and any matching records from the LEFT table
(RIGHT is less often used in practice; switch the tables and use a LEFT JOIN instead)
FULL OUTER: Returns ALL records from BOTH tables, including non-matching records
While INNER and LEFT JOINs are supported in all RDBMS’s, RIGHT and FULL OUTER are not; for example, MySQL does not support FULL OUTER JOINs, and SQLite only added RIGHT and FULL OUTER JOINs in version 3.39
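To make the difference between the two most common JOIN types concrete, here is a small sketch using Python's built-in sqlite3 module (an assumption made so the example is self-contained; the course demos use MySQL). The `scores` and `countries` tables and their rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scores (country TEXT, score REAL);
    CREATE TABLE countries (country TEXT, continent TEXT);
    INSERT INTO scores VALUES ('Finland', 7.8), ('Brazil', 6.1), ('Atlantis', 5.0);
    INSERT INTO countries VALUES ('Finland', 'Europe'), ('Brazil', 'South America'), ('Japan', 'Asia');
""")

# INNER JOIN: only countries that appear in BOTH tables
inner_rows = conn.execute("""
    SELECT s.country, c.continent
    FROM scores s
    INNER JOIN countries c ON s.country = c.country
    ORDER BY s.country
""").fetchall()

# LEFT JOIN: every row from scores, with NULL where countries has no match
left_rows = conn.execute("""
    SELECT s.country, c.continent
    FROM scores s
    LEFT JOIN countries c ON s.country = c.country
    ORDER BY s.country
""").fetchall()

print(inner_rows)
print(left_rows)
```

The unmatched left-table row ('Atlantis') is dropped by the INNER JOIN but kept by the LEFT JOIN with a NULL continent, which Python shows as `None`.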
Results Preview
NEW MESSAGE
October 31, 2024
Hi there,
You can join tables on multiple columns by using “AND” in the join condition
Use table aliases as a best practice when working with multiple tables
You can join more than two tables as long as you specify the columns that link
the tables together
A self join lets you join a table with itself, and typically involves two steps:
1. Combine a table with itself based on a matching column
2. Filter on the resulting rows based on some criteria
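The two self join steps can be sketched as follows. This is an illustrative example run through Python's built-in sqlite3 module rather than MySQL, and the `happiness` table with its region and score values is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE happiness (country TEXT, region TEXT, score REAL);
    INSERT INTO happiness VALUES
        ('Finland', 'Europe', 7.8),
        ('Denmark', 'Europe', 7.6),
        ('France',  'Europe', 6.7),
        ('Brazil',  'Americas', 6.1);
""")

# Step 1: join the table with itself on the matching region column
# Step 2: keep only pairs where the second country has a lower score
self_join_rows = conn.execute("""
    SELECT h1.country, h2.country AS lower_scoring_country
    FROM happiness h1
    INNER JOIN happiness h2 ON h1.region = h2.region
    WHERE h2.score < h1.score
    ORDER BY h1.country, h2.country
""").fetchall()
print(self_join_rows)
```

The two aliases (`h1`, `h2`) are what let the same table play both the left and right role in the join.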
Results Preview
NEW MESSAGE
November 1, 2024
Hi again,
A cross join returns all combinations of rows within two or more tables
PRO TIP: Cross joins can produce very large outputs, so be careful using this
on larger tables to avoid performance issues (in general, they are less common)
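A quick sketch of why cross join outputs grow fast: the result has one row per combination, so the row count is the product of the table sizes. The example uses Python's built-in sqlite3 module for portability, and the `sizes` and `colors` tables are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sizes (size TEXT);
    CREATE TABLE colors (color TEXT);
    INSERT INTO sizes VALUES ('S'), ('M');
    INSERT INTO colors VALUES ('red'), ('blue');
""")

# CROSS JOIN has no join condition: every size is paired with every color
cross_rows = conn.execute("""
    SELECT s.size, c.color
    FROM sizes s
    CROSS JOIN colors c
    ORDER BY s.size, c.color
""").fetchall()
print(cross_rows)       # 2 sizes x 2 colors = 4 rows
```

Two tables of 1,000 rows each would produce 1,000,000 rows the same way, which is the performance concern the tip refers to.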
PRO TIP: If you know there are no duplicate values in the two tables
you’re combining, a UNION ALL will run much faster than a UNION
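The difference between UNION and UNION ALL can be seen in a minimal sketch like the one below, which runs through Python's built-in sqlite3 module. The two single-column tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE top_2023 (country TEXT);
    CREATE TABLE top_2024 (country TEXT);
    INSERT INTO top_2023 VALUES ('Finland'), ('Brazil');
    INSERT INTO top_2024 VALUES ('Brazil'), ('Japan');
""")

# UNION stacks the rows and removes duplicates
union_rows = conn.execute("""
    SELECT country FROM top_2023
    UNION
    SELECT country FROM top_2024
    ORDER BY country
""").fetchall()

# UNION ALL stacks the rows and keeps duplicates (no de-duplication step)
union_all_rows = conn.execute("""
    SELECT country FROM top_2023
    UNION ALL
    SELECT country FROM top_2024
    ORDER BY country
""").fetchall()

print(union_rows)
print(union_all_rows)
```

'Brazil' appears once in the UNION result and twice in the UNION ALL result; skipping the duplicate check is where UNION ALL saves work.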
A JOIN combines data from two or more tables based on related column(s)
• Multiple JOINs can be written within the FROM clause of a single query aka SELECT statement
The main JOIN types are INNER, LEFT, RIGHT, and FULL OUTER
• INNER returns matches from both tables, LEFT includes everything from the left table, RIGHT includes
everything from the right table, and FULL OUTER returns all rows from both tables
Self joins and cross joins are additional JOIN options you can use
• Self joins are useful for side-by-side comparisons of rows within the same table
• CROSS JOINs return all combinations of rows within two or more tables, but are less commonly used
UNION and UNION ALL stack the results of two or more queries
• UNION removes duplicate rows and UNION ALL keeps them, which makes UNION ALL the faster option of the two
In this section we’ll cover subqueries and common table expressions (CTEs), which
are different ways of working with nested queries
A subquery is a query nested within a main query, and is typically used for
solving a problem in multiple steps
EXAMPLE Return all countries that have an above average happiness score
Step 1: Calculate the average happiness score
Step 2: Return all rows with a happiness score greater than the first query result
EXAMPLE Return the difference between each country’s happiness score and the average
This subquery lets you subtract the average happiness score from each row
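Both subquery examples can be sketched as below. The SQL is run through Python's built-in sqlite3 module so the example is self-contained; the `happiness` table and its scores are invented values chosen so the average is easy to check by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE happiness (country TEXT, score REAL);
    INSERT INTO happiness VALUES ('Finland', 8.0), ('Brazil', 6.0), ('Chad', 4.0);
""")

# Subquery in the WHERE clause: the inner query computes the average (6.0),
# then the outer query keeps rows above it
above_avg = conn.execute("""
    SELECT country, score
    FROM happiness
    WHERE score > (SELECT AVG(score) FROM happiness)
""").fetchall()

# Subquery in the SELECT clause: subtract the average from every row
diff_from_avg = conn.execute("""
    SELECT country,
           score - (SELECT AVG(score) FROM happiness) AS diff_from_avg
    FROM happiness
    ORDER BY score DESC
""").fetchall()

print(above_avg)
print(diff_from_avg)
```

In both queries the inner SELECT returns a single value, which is what allows it to be used in a comparison or in arithmetic.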
Results Preview
NEW MESSAGE
November 4, 2024
Hello,
Our product team plans on evaluating our product prices later
this week to see if any adjustments need to be made for next
year.
Can you give me a list of our products from most to least
expensive, along with how much each product differs from
the average unit price?
Thanks!
Mandy
EXAMPLE Return each country’s happiness score for the year alongside the country’s average happiness score
Queries can contain multiple subqueries as long as each one has a different alias
Results Preview
NEW MESSAGE
November 5, 2024
Hello,
Our inventory management team would like to review the
products produced by each factory.
Can you give me a list of our factories, along with the names
of the products they produce and the number of products
they produce?
Thanks!
Mandy
This subquery filters the grouped regional data
Keywords like ANY, ALL, and EXISTS can provide more specific filtering logic
EXAMPLE Return happiness scores that are greater than ANY / ALL of the current happiness scores
Only 5 rows are returned
EXAMPLE Only return happiness scores for countries that EXIST in the inflation rates table
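The EXISTS example might look like the sketch below. It covers EXISTS only, because the example is run with Python's built-in sqlite3 module and SQLite does not implement the ANY / ALL keywords (MySQL does). Table names, columns, and rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE happiness (country TEXT, score REAL);
    CREATE TABLE inflation (country TEXT, rate REAL);
    INSERT INTO happiness VALUES ('Finland', 7.8), ('Brazil', 6.1), ('Chad', 4.3);
    INSERT INTO inflation VALUES ('Finland', 4.3), ('Chad', 10.8);
""")

# EXISTS keeps an outer row when the correlated subquery finds at least one match
exists_rows = conn.execute("""
    SELECT h.country, h.score
    FROM happiness h
    WHERE EXISTS (
        SELECT 1
        FROM inflation i
        WHERE i.country = h.country
    )
    ORDER BY h.country
""").fetchall()
print(exists_rows)
```

The subquery's SELECT list does not matter for EXISTS (hence `SELECT 1`); only whether any row comes back.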
Results Preview
NEW MESSAGE
November 6, 2024
Hello,
Our Wicked Choccy’s factory has some extra bandwidth, and
we’d like to see if there are any lower priced products that
they can help produce going forward.
Can you help us identify products that have a unit price less
than the unit price of all products from Wicked Choccy’s?
Please include which factory is currently producing them as
well.
Thanks!
Mandy
A common table expression (CTE) creates a named, temporary output that can
be referenced within another query
EXAMPLE Return each country’s happiness score for the year alongside the country’s average happiness score
The subquery version and the CTE version return the same outputs!
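A CTE version of this example could be sketched as follows, using Python's built-in sqlite3 module to run the SQL. The `happiness` table, the CTE name `avg_scores`, and all values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE happiness (country TEXT, year INTEGER, score REAL);
    INSERT INTO happiness VALUES
        ('Finland', 2022, 8.0), ('Finland', 2023, 7.0),
        ('Brazil',  2022, 6.0), ('Brazil',  2023, 6.5);
""")

# The WITH clause names a temporary result (avg_scores) that the main
# query can then reference like a table
cte_rows = conn.execute("""
    WITH avg_scores AS (
        SELECT country, AVG(score) AS avg_score
        FROM happiness
        GROUP BY country
    )
    SELECT h.country, h.year, h.score, a.avg_score
    FROM happiness h
    LEFT JOIN avg_scores a ON h.country = a.country
    ORDER BY h.country, h.year
""").fetchall()
print(cte_rows)
```

Moving the CTE body into the FROM clause as an aliased subquery would give the same result; the CTE form mainly reads top to bottom more easily.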
EXAMPLE For each country, return countries from the same region with a lower happiness score in 2023
Results Preview
NEW MESSAGE
November 7, 2024
Hello,
The sales director wants a list of our biggest orders. In
addition to sending over a list of all the orders over $200,
could you also tell him the number of orders over $200?
Thanks!
Mandy
You can use multiple CTEs in a query, and even combine them with subqueries
Results Preview
NEW MESSAGE
November 8, 2024
Hi again –
Regarding my earlier message, could you rewrite your code
using CTEs instead of subqueries? Thanks!
---
Our inventory management team would like to review the
products produced by each factory.
Can you give me a list of our factories, along with the names of the
products they produce and the number of products they produce?
A recursive CTE is a query that references itself, which is useful for generating
sequences and working with hierarchical data
WITH RECURSIVE cte_name AS (
    SELECT ...        -- Start with an anchor member
    UNION ALL         -- UNION or UNION ALL are used as connectors
    SELECT ...        -- And a recursive member that references the CTE
    FROM cte_name
)
SELECT * FROM cte_name;
The syntax is slightly different in each RDBMS
EXAMPLE Return daily stock prices, including dates with missing prices
Notice the missing dates
Step 1: Generate a column of dates
This stops the recursion after reaching this date
Step 2: Join with the stock prices table
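The two steps can be sketched as below. The example runs on SQLite through Python's built-in sqlite3 module, so it uses SQLite's `date(dt, '+1 day')` to advance the date; in MySQL the date arithmetic would be written differently (for example with `INTERVAL 1 DAY`). The `stock_prices` table, its column names, and the prices are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stock_prices (price_date TEXT, price REAL);
    INSERT INTO stock_prices VALUES ('2024-11-01', 10.0), ('2024-11-04', 12.0);
""")

filled_rows = conn.execute("""
    WITH RECURSIVE dates(dt) AS (
        SELECT '2024-11-01'                   -- anchor member: first date
        UNION ALL
        SELECT date(dt, '+1 day')             -- recursive member: add one day
        FROM dates
        WHERE dt < '2024-11-04'               -- stops the recursion at this date
    )
    SELECT d.dt, s.price
    FROM dates d
    LEFT JOIN stock_prices s ON d.dt = s.price_date
    ORDER BY d.dt
""").fetchall()
print(filled_rows)
```

The LEFT JOIN keeps every generated date, so days without a price show up with a NULL price instead of disappearing from the output.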
This sets the highest-ranking employee as the anchor
PRO TIP: Recursive CTEs aren’t very common, so instead of memorizing syntax, keep in mind the general concepts of generating sequences and returning hierarchies
Temporary tables and views are other options for querying the results of a query
• Both subqueries and CTEs only exist for the duration of the query
• Temporary tables exist for a session and views continue to exist until modified or dropped
Temporary Tables: temporary data storage within a session; exists only during a session; requires more permissions; great for referencing multiple times in a session
If you find yourself referencing the same CTE, consider a temporary table
Temporary tables & views are other options for using the results of a query
• Subqueries and CTEs exist only for the duration of a query, and require minimal permissions to create
• Temporary tables exist for the duration of a session and views exist indefinitely, but they often require
additional permissions to create and maintain
In this section, we’ll break down each component of a window function, introduce
common window functions, and preview some of their applications
This is a window
With a GROUP BY: a calculation is made for each group, and there is one row per country
With a window function: a calculation is applied to each window, and the original row granularity is kept
Window function syntax labels:
• OVER: States that this is a window function (required)
• ORDER BY: How each window should be sorted before the function is applied (optional in MySQL, PostgreSQL, SQLite; required in Oracle, SQL Server)
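A minimal window function sketch is shown below. It assumes ROW_NUMBER() as the function and a hypothetical `happiness` table, and runs through Python's built-in sqlite3 module (window functions need SQLite 3.25 or later, which recent Python builds typically bundle).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE happiness (country TEXT, year INTEGER, score REAL);
    INSERT INTO happiness VALUES
        ('Finland', 2022, 8.0), ('Finland', 2023, 7.0),
        ('Brazil',  2022, 6.0), ('Brazil',  2023, 6.5);
""")

# OVER marks this as a window function
# PARTITION BY splits the rows into one window per country
# ORDER BY sorts the rows inside each window before numbering them
window_rows = conn.execute("""
    SELECT country, year, score,
           ROW_NUMBER() OVER (PARTITION BY country ORDER BY score DESC) AS row_num
    FROM happiness
    ORDER BY country, row_num
""").fetchall()
print(window_rows)
```

All four input rows are still present in the output, which is the "original row granularity is kept" behavior; a GROUP BY on country would have collapsed them to two rows.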
An ORDER BY allows you to specify the order of the rows within your windows
• You can order in ASC (default) or DESC order by one or more columns
Results Preview
NEW MESSAGE
November 12, 2024
Hello,
Thanks!
Mandy
There are many functions to choose from when writing window functions:
Value Relative to a Row
• LEAD
• LAG
Statistical Functions
• NTILE
• CUME_DIST
• PERCENT_RANK
Results Preview
NEW MESSAGE
November 13, 2024
Hello,
Could you create a product rank field that returns a 1 for the
most popular product in an order, 2 for second most, and so
on? Please take a look at the results preview to get an idea of
what they’d like the ranking to look like.
Thanks!
Mandy
There are three different ways of extracting a particular value within a window:
• FIRST_VALUE() extracts the first value in a window, in sequential row order
• LAST_VALUE() extracts the last value
• NTH_VALUE() extracts the value at a specified position
Extract the first value for each gender ordered by number of babies in a subquery
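The FIRST_VALUE() pattern described above might be sketched like this. The `baby_names` table and its counts are invented, and the SQL is run through Python's built-in sqlite3 module (SQLite 3.25+ for window functions).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE baby_names (gender TEXT, name TEXT, babies INTEGER);
    INSERT INTO baby_names VALUES
        ('F', 'Olivia', 100), ('F', 'Emma', 90),
        ('M', 'Liam', 120),   ('M', 'Noah', 110);
""")

# The subquery attaches the most popular name per gender to every row;
# the outer query keeps only the rows that are that top name
top_names = conn.execute("""
    SELECT gender, name
    FROM (
        SELECT gender, name,
               FIRST_VALUE(name) OVER (PARTITION BY gender ORDER BY babies DESC) AS top_name
        FROM baby_names
    ) AS ranked
    WHERE name = top_name
    ORDER BY gender
""").fetchall()
print(top_names)
```

One caveat worth knowing: when the window has an ORDER BY, the default frame ends at the current row, so LAST_VALUE() usually needs an explicit frame such as `ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING` to see the true last row.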
Results Preview
NEW MESSAGE
November 14, 2024
Hello,
The sales team is going to try to see if they can bundle them
with some other products to increase units sold within each
order.
Thanks!
Mandy
ORDER BY in subquery is not needed and can be omitted
LEAD() and LAG() allow you to return the value from the next and previous row,
respectively, within each window
Return the prior year’s happiness score for each country in a CTE
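As a sketch of the LAG() pattern, the query below returns each year's score next to the prior year's score and the change between them. It uses Python's built-in sqlite3 module (SQLite 3.25+), and the table and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE happiness (country TEXT, year INTEGER, score REAL);
    INSERT INTO happiness VALUES
        ('Finland', 2021, 7.5), ('Finland', 2022, 8.0), ('Finland', 2023, 7.0);
""")

# LAG(score) looks one row back within each country's window, ordered by year;
# the first year has no prior row, so it returns NULL
lag_rows = conn.execute("""
    WITH prior AS (
        SELECT country, year, score,
               LAG(score) OVER (PARTITION BY country ORDER BY year) AS prior_score
        FROM happiness
    )
    SELECT country, year, score, prior_score, score - prior_score AS score_change
    FROM prior
    ORDER BY country, year
""").fetchall()
print(lag_rows)
```

Swapping LAG for LEAD would pull the next year's score instead, with the NULL appearing on the last row of each window.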
Results Preview
NEW MESSAGE
November 15, 2024
Hello,
We’d like to look into how orders have changed over time for
each customer.
Could you produce a table that contains info about each
customer and their orders, the number of units in each order,
and the change in units from order to order?
Thanks!
Mandy
EXAMPLE View the top 25% of happiness scores for each region
Because we specified NTILE(4), the rows in each window are divided into 4 groups of 25% each, with 1 representing the top percentile group, and 4 the bottom
NTILE is supported in SQLite 3.25 and later, along with the other window functions
Return the percentiles in a CTE
Then simply filter for the top percentile (25%)
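The NTILE approach above can be sketched as follows, with the percentile groups computed in a CTE and then filtered. The example runs through Python's built-in sqlite3 module (SQLite 3.25+); the regions, countries, and scores are placeholder data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE happiness (region TEXT, country TEXT, score REAL);
    INSERT INTO happiness VALUES
        ('Europe', 'A', 8.0), ('Europe', 'B', 7.0), ('Europe', 'C', 6.0), ('Europe', 'D', 5.0),
        ('Asia',   'E', 6.5), ('Asia',   'F', 6.0), ('Asia',   'G', 5.5), ('Asia',   'H', 5.0);
""")

# NTILE(4) splits each region's rows into 4 equal-sized groups,
# numbered 1 (highest scores) to 4 (lowest scores)
top_quartile = conn.execute("""
    WITH tiles AS (
        SELECT region, country, score,
               NTILE(4) OVER (PARTITION BY region ORDER BY score DESC) AS score_quartile
        FROM happiness
    )
    SELECT region, country, score
    FROM tiles
    WHERE score_quartile = 1
    ORDER BY region
""").fetchall()
print(top_quartile)
```

The filter has to live outside the CTE because a window function's result cannot be referenced in the WHERE clause of the same SELECT that computes it.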
Results Preview
NEW MESSAGE
November 18, 2024
Hello,
The customer engagement team would like to create a
rewards program for our top 1% of customers.
Could you pull a list of the top 1% of customers in terms of
how much they’ve spent with us?
Thanks!
Mandy
ORDER BY in CTE is not needed and can be omitted
In this section, we’ll go over commonly used functions by data type, including numeric,
datetime, string functions, and more
Numeric
Functions
Datetime
Functions
String Functions
NULL Functions
PRO TIP: While SQL keywords and function names are case insensitive, it’s a best practice to capitalize functions so they stand out,
similar to clauses. If you’re using a SQL editor, they will automatically be highlighted in a different color.
Function Basics
Aggregate
• Applies a calculation to many rows of data and returns a single value
• Often used alongside a GROUP BY
Window
• Performs calculations across a window of rows
General
• Performs calculations on all individual values within a column
EXAMPLE Applying a log transform to the population of each country
This is a nested function
We’ll discuss how to handle NULL values like these later in the section
Dividing by 1.0 turns an integer into a float (includes decimals)
The CAST / CONVERT functions only change the data type for the duration of the query, not permanently
Some RDBMS’s like SQL Server also provide CONVERT as an alternative to CAST
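A short sketch of these type conversions, run on SQLite through Python's built-in sqlite3 module. Note that the integer-division behavior shown here is SQLite's (PostgreSQL and SQL Server behave similarly); MySQL's `/` operator already returns a decimal result, so this part is an assumption about the RDBMS in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each expression is evaluated per query; nothing stored is changed
int_div, float_div, cast_div, cast_text = conn.execute("""
    SELECT 7 / 2,                      -- integer / integer truncates in SQLite
           7 / 2.0,                    -- a float operand gives a float result
           CAST(7 AS REAL) / 2,        -- CAST converts the integer to a float first
           CAST('42' AS INTEGER) + 1   -- CAST converts the text to a number
""").fetchone()
print(int_div, float_div, cast_div, cast_text)
```

Since CAST only applies to the query's output, running the same SELECT without it would return the original data types again.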
Results Preview
NEW MESSAGE
November 19, 2024
Hello,
Our market research team is interested in seeing how many
customers have spent $0-$10 on our products, $10-$20, and
so on for every $10 range.
Could you generate this table for them?
Thanks!
Mandy
Category   Function             Description
Current    CURRENT_DATE         Current date
Current    CURRENT_TIMESTAMP    Current date and time
Extract    YEAR, MONTH, etc.    Extract a specific portion of the datetime value
Extract    DAYOFWEEK            Extract the day of the week from the datetime value
Datetime and string functions vary widely by RDBMS. The functions in this section are
for MySQL, but you may need to look up the specific function in your RDBMS
Results Preview
NEW MESSAGE
November 20, 2024
Hello,
The market research team wants to do a deep dive on the Q2
2024 orders data we currently have.
Can you pull that data for them?
In addition, they also requested that we include a ship_date
column for them that’s 2 days after the order_date.
Thanks for your help!
Mandy
Results Preview
NEW MESSAGE
November 21, 2024
Hello,
We’re updating our product_ids to include the factory name
and product name.
Here’s what we’re thinking it should look like.
Could you write the SQL code to produce this?
Thank you!
Mandy
Since this extracts the text before the first space, it doesn’t work for strings with a single word
Regular expressions allow you to find patterns in your text – they are language agnostic, which means they work in
multiple languages like SQL and Python (they take extra processing power, so they aren’t recommended for large datasets)
Results Preview
NEW MESSAGE
November 22, 2024
Hi again,
The marketing team has kicked off an initiative to simplify our
product names, starting with our Wonka Bars products.
Could you remove “Wonka Bar” from any products that
contain the term?
Thank you!
Mandy
To do a simple IF-ELSE NULL check:
• Use the IFNULL function (NVL in Oracle, not supported in PostgreSQL)
To do more complex IF-ELSE NULL checks:
• Use the COALESCE function (supported in most modern RDBMS’s)
PRO TIP: COALESCE is the more flexible of the two: it lets you do multiple NULL checks and returns the first non-NULL value
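The difference between the two NULL functions can be sketched as below, using Python's built-in sqlite3 module (SQLite supports both IFNULL and COALESCE, as MySQL does). The `contacts` table and its values are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (name TEXT, email TEXT, phone TEXT);
    INSERT INTO contacts VALUES
        ('Ann', 'ann@example.com', NULL),
        ('Bob', NULL, '555-0100'),
        ('Cy',  NULL, NULL);
""")

# IFNULL takes exactly two arguments: the value and a single fallback
# COALESCE takes any number and returns the first one that is not NULL
null_rows = conn.execute("""
    SELECT name,
           IFNULL(email, 'n/a') AS email_or_na,
           COALESCE(email, phone, 'no contact') AS best_contact
    FROM contacts
    ORDER BY name
""").fetchall()
print(null_rows)
```

For Bob, IFNULL can only fall back to the hard coded 'n/a', while COALESCE moves on to the phone number before reaching its final default.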
Results Preview
NEW MESSAGE
November 25, 2024
Hello,
Sugar Shack and The Other Factory just added two new
products that don’t have divisions assigned to them.
For simplicity's sake, could you update those NULL values to
have a value of “Other”?
Here’s an extra challenge for you – instead of updating them to
“Other”, could you update them to be the same division as the
most common division within their respective factories?
Thanks!
Now that we’ve introduced a variety of advanced SQL concepts, we’ll use this section to go
over common data analysis applications that utilize the techniques we’ve learned
These are fully duplicate rows
These could potentially be:
1. Two employees with the same name, in different regions
2. The same employee who transferred regions
This employee seems to have gotten a raise
• Use a combination of GROUP BY, COUNT, and HAVING
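A sketch of the GROUP BY, COUNT, and HAVING combination for finding fully duplicate rows, followed by one way to exclude them (SELECT DISTINCT, which is an assumption here rather than something the slides name). It uses Python's built-in sqlite3 module and a made-up `employees` table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, region TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ann', 'East', 50000),
        ('Ann', 'East', 50000),
        ('Bob', 'West', 60000);
""")

# Group by every column, count the rows per group,
# and keep only groups that appear more than once
duplicates = conn.execute("""
    SELECT name, region, salary, COUNT(*) AS row_count
    FROM employees
    GROUP BY name, region, salary
    HAVING COUNT(*) > 1
""").fetchall()

# DISTINCT returns each fully identical row only once
deduplicated = conn.execute("""
    SELECT DISTINCT name, region, salary
    FROM employees
    ORDER BY name
""").fetchall()

print(duplicates)
print(deduplicated)
```

Grouping by fewer columns (for example only `name`) would flag partial duplicates such as the same employee listed in two regions.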
Results Preview
NEW MESSAGE
December 2, 2024
Good morning!
We’ve learned that there’s a student who’s showing up
multiple times in our student records.
Can you generate a report of the students and their emails,
and exclude the duplicate student record?
Thank you!
Stu
Min / max value filtering allows you to filter data based on the lowest or highest
values within each group
EXAMPLE Return the most recent sales amount for each sales rep
Only the date is returned, but we want the sales amount as well
There are two approaches you can take to include the sales amount:
• Use a GROUP BY with a JOIN
• Use a window function
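Both approaches can be sketched side by side as below. The `sales` table and its rows are hypothetical, ROW_NUMBER() is one possible choice of window function, and the SQL is run through Python's built-in sqlite3 module (SQLite 3.25+).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (rep TEXT, sale_date TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('Ann', '2024-01-05', 100), ('Ann', '2024-02-10', 150),
        ('Bob', '2024-01-20', 200), ('Bob', '2024-03-01', 50);
""")

# Approach 1: GROUP BY finds each rep's latest date, JOIN brings back the amount
group_join_rows = conn.execute("""
    SELECT s.rep, s.sale_date, s.amount
    FROM sales s
    INNER JOIN (
        SELECT rep, MAX(sale_date) AS max_date
        FROM sales
        GROUP BY rep
    ) m ON s.rep = m.rep AND s.sale_date = m.max_date
    ORDER BY s.rep
""").fetchall()

# Approach 2: number each rep's sales from newest to oldest, keep row 1
window_rows = conn.execute("""
    WITH ranked AS (
        SELECT rep, sale_date, amount,
               ROW_NUMBER() OVER (PARTITION BY rep ORDER BY sale_date DESC) AS rn
        FROM sales
    )
    SELECT rep, sale_date, amount
    FROM ranked
    WHERE rn = 1
    ORDER BY rep
""").fetchall()

print(group_join_rows)
print(window_rows)
```

The two approaches can differ on ties: the JOIN version returns every row that shares the maximum date, while ROW_NUMBER() keeps exactly one row per rep.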
Results Preview
NEW MESSAGE
December 3, 2024
Hi again,
Can you create a report of each student with their highest
grade for the semester, as well as which class it was in?
Thanks!
Stu
ORDER BY in CTE is not needed and can be omitted
Pivoting lets you transform rows into columns to summarize your data
• This can be achieved using CASE statements
• PIVOT is available in some RDBMS’s like SQL Server and Oracle
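A sketch of pivoting with CASE statements: each CASE picks out the rows for one output column, and the aggregate ignores the NULLs produced for all other rows. The `grades` table is invented, and the SQL is run through Python's built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE grades (department TEXT, grade_level INTEGER, grade INTEGER);
    INSERT INTO grades VALUES
        ('Math', 9, 80), ('Math', 9, 90), ('Math', 10, 70),
        ('Science', 9, 60), ('Science', 10, 100);
""")

# One row per department; grade levels (row values) become columns
pivot_rows = conn.execute("""
    SELECT department,
           AVG(CASE WHEN grade_level = 9  THEN grade END) AS avg_grade_9,
           AVG(CASE WHEN grade_level = 10 THEN grade END) AS avg_grade_10
    FROM grades
    GROUP BY department
    ORDER BY department
""").fetchall()
print(pivot_rows)
```

A CASE without an ELSE returns NULL for non-matching rows, and AVG skips NULLs, so each column averages only its own grade level.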
Results Preview
NEW MESSAGE
December 4, 2024
Hello,
Can you help us create a summary table that shows the
average grade for each department and grade level?
Thanks for all your help this week!
Stu
Use SUM() as a window function to calculate the cumulative sum
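A sketch of two rolling calculations: a cumulative sum and a moving average. A 3-month window is used here to keep the data small; the window length, the `monthly_sales` table, and its values are assumptions. The SQL runs through Python's built-in sqlite3 module (SQLite 3.25+).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE monthly_sales (month TEXT, sales INTEGER);
    INSERT INTO monthly_sales VALUES
        ('2024-01', 100), ('2024-02', 200), ('2024-03', 300), ('2024-04', 400);
""")

# SUM() OVER (ORDER BY ...) adds up all rows so far (cumulative sum)
# ROWS BETWEEN 2 PRECEDING AND CURRENT ROW limits AVG() to the last 3 months
rolling_rows = conn.execute("""
    SELECT month, sales,
           SUM(sales) OVER (ORDER BY month) AS cumulative_sales,
           AVG(sales) OVER (ORDER BY month
                            ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg_3m
    FROM monthly_sales
    ORDER BY month
""").fetchall()
print(rolling_rows)
```

For the first rows, fewer than three months are available, so the moving average is taken over whatever rows exist in the frame; changing `2 PRECEDING` to `5 PRECEDING` would give a six-month window.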
Results Preview
NEW MESSAGE
December 5, 2024
Hi again,
Can you help us generate a report that shows the total sales
for each month, as well as the cumulative sum of sales and the
six-month moving average of sales?
Imputing values means replacing NULL values in the data with other values
We’ll cover four different approaches on how to do this in SQL:
1. Hard coded value (integer)
2. Average of a column (subquery)
3. Prior row’s value (window function)
4. Smoothed value (two window functions)
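The four approaches can be sketched in a single query as below. The specific functions (COALESCE, LAG, LEAD) are one plausible way to implement each approach rather than the course's exact solution, and the `readings` table is made up. The SQL runs through Python's built-in sqlite3 module (SQLite 3.25+).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE readings (day INTEGER, value REAL);
    INSERT INTO readings VALUES (1, 10.0), (2, NULL), (3, 20.0);
""")

imputed_rows = conn.execute("""
    SELECT day,
           -- 1. Hard coded value
           COALESCE(value, 0) AS hard_coded,
           -- 2. Average of the column (AVG ignores NULLs)
           COALESCE(value, (SELECT AVG(value) FROM readings)) AS column_avg,
           -- 3. Prior row's value
           COALESCE(value, LAG(value) OVER (ORDER BY day)) AS prior_value,
           -- 4. Smoothed value: average of the prior and next rows
           COALESCE(value, (LAG(value) OVER (ORDER BY day)
                            + LEAD(value) OVER (ORDER BY day)) / 2) AS smoothed
    FROM readings
    ORDER BY day
""").fetchall()
print(imputed_rows)
```

Approaches 3 and 4 only fill a gap when the neighboring rows are not NULL themselves, so runs of consecutive missing values would need extra handling.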
In this final demo, we’ll be writing a single query that contains techniques learned in every section of this course! It
includes a JOIN, UNION, subquery, recursive CTE, window function, numeric function and NULL function.
Min / max value filtering allows you to filter data within each group
• This can be accomplished with a combination of GROUP BY and JOIN, or with a window function
There are many options for imputing NULL values, or filling in missing data
• Options include using hard coded values, column aggregations, relative row values and more
THE SITUATION
You’ve just been hired as a Data Analyst Intern for Major League Baseball (MLB), which has recently gotten access to a large amount of historical player data
THE ASSIGNMENT
You have access to decades’ worth of data including player statistics like schools attended, salaries, teams played for, height and weight, and more
Your task is to use advanced SQL querying techniques to track how player statistics have changed over time and across different teams in the league