Chapter 3
INTRODUCTION TO SNOWFLAKE
Palak Raina
Senior Data Engineer
JOINS
INNER JOIN
OUTER JOINS
LEFT OUTER JOIN or LEFT JOIN
CROSS JOINS
SELF JOINS
NATURAL JOIN
LATERAL JOIN
Pizza dataset
Similarities with PostgreSQL - INNER JOIN
SELECT
pt.name AS pizza_type_name,
pt.category,
pt.ingredients,
p.size,
p.price
FROM
pizza_type AS pt
INNER JOIN -- Using INNER JOIN
pizzas AS p
ON
pt.pizza_type_id = p.pizza_type_id
NATURAL JOIN
Syntax:
SELECT ...
FROM <table_one>
[ NATURAL [ { LEFT | RIGHT | FULL } [ OUTER ] ] ]
JOIN <table_two>
[ ... ]
NATURAL JOIN
Without NATURAL JOIN
SELECT *
FROM pizzas AS p
JOIN pizza_type AS t
ON t.pizza_type_id = p.pizza_type_id
With NATURAL JOIN
SELECT *
FROM pizzas AS p
NATURAL JOIN pizza_type AS t
NATURAL JOIN
NOT ALLOWED: ON clause with NATURAL JOIN
SELECT *
FROM pizzas AS p
NATURAL JOIN pizza_type AS t
ON t.pizza_type_id = p.pizza_type_id
NATURAL JOIN
ALLOWED
WHERE clause
SELECT *
FROM pizzas AS p
NATURAL JOIN pizza_type AS t
WHERE pizza_type_id = 'bbq_ckn'
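The NATURAL JOIN behaviour described above can be tried locally. The sketch below is only an approximation: it uses Python's built-in sqlite3 module as a stand-in for Snowflake (an assumption for local testing), and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for Snowflake here (an assumption for local testing).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pizza_type (pizza_type_id TEXT, name TEXT, category TEXT);
    CREATE TABLE pizzas (pizza_id TEXT, pizza_type_id TEXT, size TEXT, price REAL);
    -- Made-up sample rows
    INSERT INTO pizza_type VALUES
        ('bbq_ckn', 'Barbecue Chicken', 'Chicken'),
        ('hawaiian', 'Hawaiian', 'Classic');
    INSERT INTO pizzas VALUES
        ('bbq_ckn_s', 'bbq_ckn', 'S', 12.75),
        ('bbq_ckn_l', 'bbq_ckn', 'L', 20.75),
        ('hawaiian_m', 'hawaiian', 'M', 13.25);
""")

# Explicit join condition on the shared column.
explicit = conn.execute("""
    SELECT p.pizza_id, t.name
    FROM pizzas AS p
    JOIN pizza_type AS t
      ON t.pizza_type_id = p.pizza_type_id
    ORDER BY p.pizza_id
""").fetchall()

# NATURAL JOIN infers the condition from the shared column name.
natural = conn.execute("""
    SELECT pizza_id, name
    FROM pizzas AS p
    NATURAL JOIN pizza_type AS t
    ORDER BY pizza_id
""").fetchall()

# A WHERE clause is allowed, and the shared column needs no table prefix.
filtered = conn.execute("""
    SELECT pizza_id
    FROM pizzas AS p
    NATURAL JOIN pizza_type AS t
    WHERE pizza_type_id = 'bbq_ckn'
    ORDER BY pizza_id
""").fetchall()

# SELECT * on a NATURAL JOIN lists the shared column only once.
cursor = conn.execute("SELECT * FROM pizzas NATURAL JOIN pizza_type")
column_names = [description[0] for description in cursor.description]

print(explicit == natural)
print(filtered)
print(column_names)
```

Both join forms return the same rows; the natural form simply omits the ON clause and de-duplicates the shared column in the output.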
LATERAL JOIN
Syntax:
SELECT ...
FROM <left_hand_expression>,
LATERAL (<right_hand_expression>)
LATERAL JOIN with a subquery
SELECT
p.pizza_id,
lat.name,
lat.category
FROM pizzas AS p,
LATERAL -- Keyword LATERAL
( SELECT *
FROM pizza_type AS t
-- Referencing outer query column: p.pizza_type_id
WHERE p.pizza_type_id = t.pizza_type_id
) AS lat
Why LATERAL JOIN?
SELECT
*
FROM orders AS o,
LATERAL (
-- Subquery calculating total_spent
SELECT
SUM(p.price * od.quantity) AS total_spent
FROM order_details AS od
JOIN pizzas AS p
ON od.pizza_id = p.pizza_id
WHERE o.order_id = od.order_id
) AS t
ORDER BY o.order_id
Subquerying and Common Table Expressions
Subquerying
Nested queries
Used in FROM, WHERE, HAVING, or SELECT clauses
Example:
SELECT column1
FROM table1
WHERE column1 = (SELECT column2 FROM table2 WHERE condition)
Correlated subquery
References columns from the outer query
SELECT pt.name,
pz.price,
pt.category
FROM pizzas AS pz
JOIN pizza_type AS pt
ON pz.pizza_type_id = pt.pizza_type_id
WHERE pz.price < (
-- Finds the highest price for each pizza type
SELECT MAX(p2.price) -- Max price
FROM pizzas AS p2
WHERE -- Correlated: uses outer query column
p2.pizza_type_id = pz.pizza_type_id
)
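To see what the correlated subquery returns, here is a minimal sketch that runs the same query shape locally. It assumes Python's built-in sqlite3 module as a stand-in for Snowflake, and the sample rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite stands in for Snowflake here (an assumption for local testing).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pizza_type (pizza_type_id TEXT, name TEXT, category TEXT);
    CREATE TABLE pizzas (pizza_id TEXT, pizza_type_id TEXT, size TEXT, price REAL);
    -- Made-up sample rows
    INSERT INTO pizza_type VALUES
        ('bbq_ckn', 'Barbecue Chicken', 'Chicken'),
        ('hawaiian', 'Hawaiian', 'Classic');
    INSERT INTO pizzas VALUES
        ('bbq_ckn_s', 'bbq_ckn', 'S', 12.75),
        ('bbq_ckn_m', 'bbq_ckn', 'M', 16.75),
        ('bbq_ckn_l', 'bbq_ckn', 'L', 20.75),
        ('hawaiian_s', 'hawaiian', 'S', 10.5),
        ('hawaiian_m', 'hawaiian', 'M', 13.25);
""")

# The inner query is re-evaluated per outer row: it looks up the maximum
# price among pizzas of the same type as the current outer row (pz).
rows = conn.execute("""
    SELECT pt.name, pz.price, pt.category
    FROM pizzas AS pz
    JOIN pizza_type AS pt
      ON pz.pizza_type_id = pt.pizza_type_id
    WHERE pz.price < (
        SELECT MAX(p2.price)
        FROM pizzas AS p2
        WHERE p2.pizza_type_id = pz.pizza_type_id
    )
    ORDER BY pz.price
""").fetchall()

for row in rows:
    print(row)
```

Each pizza type drops its most expensive size, because that row's price is not strictly below the maximum for its own type.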
Uncorrelated subquery
No reference to outer or main query
SELECT order_id
FROM order_details AS od
WHERE pizza_id = (
-- Uncorrelated: standalone subquery
-- Does not reference outer query columns
SELECT pizza_id
FROM pizzas
ORDER BY price DESC
LIMIT 1
)
Correlated subquery limitations
Can't use LIMIT with a correlated subquery
SELECT pt.name,
pz.price,
pt.category
FROM pizzas AS pz
JOIN pizza_type AS pt
ON pz.pizza_type_id = pt.pizza_type_id
WHERE pz.price < (
SELECT p2.price -- Get price
FROM pizzas AS p2
WHERE p2.pizza_type_id = pz.pizza_type_id -- Correlated to outer query
ORDER BY p2.price DESC -- Order with max price first
LIMIT 1 -- LIMIT 1 fetches only the top record, i.e., the max price
)
Correlated subquery limitations
[Figure: result of running the correlated subquery with LIMIT]
Using LIMIT in uncorrelated subquery
SELECT pt.name,
pz.price,
pt.category
FROM pizzas AS pz
JOIN pizza_type AS pt
ON pz.pizza_type_id = pt.pizza_type_id
WHERE pz.price < (
SELECT p2.price
FROM pizzas AS p2 -- No WHERE clause, so no reference to the outer query
ORDER BY p2.price DESC
-- Can use limit here
LIMIT 1
)
Common Table Expressions
Basic Syntax:
-- WITH keyword
WITH cte1 AS ( -- CTE name
SELECT col_1, col_2
FROM table1
)
...
SELECT ...
FROM cte1 -- Query CTE
;
Common Table Expressions
WITH max_price AS ( -- CTE:max_price
SELECT pizza_type_id,
MAX(price) AS max_price
FROM pizzas
GROUP BY pizza_type_id
)
-- Main query
SELECT pt.name,
pz.price,
pt.category
FROM pizzas AS pz
JOIN pizza_type AS pt ON pz.pizza_type_id = pt.pizza_type_id
JOIN max_price AS mp -- Joining with CTE:max_price
ON pt.pizza_type_id = mp.pizza_type_id
WHERE pz.price < mp.max_price -- Compare the price with max_price CTE column
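As a rough check that the CTE version answers the same question as the correlated subquery, the sketch below runs both forms side by side. It assumes Python's built-in sqlite3 module as a local stand-in for Snowflake, with invented sample rows.

```python
import sqlite3

# In-memory SQLite stands in for Snowflake here (an assumption for local testing).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pizza_type (pizza_type_id TEXT, name TEXT, category TEXT);
    CREATE TABLE pizzas (pizza_id TEXT, pizza_type_id TEXT, size TEXT, price REAL);
    -- Made-up sample rows
    INSERT INTO pizza_type VALUES
        ('bbq_ckn', 'Barbecue Chicken', 'Chicken'),
        ('hawaiian', 'Hawaiian', 'Classic');
    INSERT INTO pizzas VALUES
        ('bbq_ckn_s', 'bbq_ckn', 'S', 12.75),
        ('bbq_ckn_m', 'bbq_ckn', 'M', 16.75),
        ('bbq_ckn_l', 'bbq_ckn', 'L', 20.75),
        ('hawaiian_s', 'hawaiian', 'S', 10.5),
        ('hawaiian_m', 'hawaiian', 'M', 13.25);
""")

# CTE form: compute the maximum price per pizza type once, then join to it.
cte_rows = conn.execute("""
    WITH max_price AS (
        SELECT pizza_type_id,
               MAX(price) AS max_price
        FROM pizzas
        GROUP BY pizza_type_id
    )
    SELECT pt.name, pz.price, pt.category
    FROM pizzas AS pz
    JOIN pizza_type AS pt ON pz.pizza_type_id = pt.pizza_type_id
    JOIN max_price AS mp ON pt.pizza_type_id = mp.pizza_type_id
    WHERE pz.price < mp.max_price
    ORDER BY pz.price, pt.name
""").fetchall()

# Correlated subquery form of the same question, for comparison.
subquery_rows = conn.execute("""
    SELECT pt.name, pz.price, pt.category
    FROM pizzas AS pz
    JOIN pizza_type AS pt ON pz.pizza_type_id = pt.pizza_type_id
    WHERE pz.price < (
        SELECT MAX(p2.price)
        FROM pizzas AS p2
        WHERE p2.pizza_type_id = pz.pizza_type_id
    )
    ORDER BY pz.price, pt.name
""").fetchall()

print(cte_rows == subquery_rows)
print([price for _, price, _ in cte_rows])
```

The CTE names the intermediate result (max_price), which keeps the main query readable and lets the same result be joined again if needed.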
Multiple CTEs
-- Define multiple CTEs separated by commas
WITH cte1 AS (
SELECT ...
FROM ...
),
cte2 AS (
SELECT ...
FROM ...
)
-- Main query combining both CTEs
SELECT ...
FROM cte1
JOIN cte2 ON ...
WHERE ...
Why Use CTEs?
Managing complex operations
Readable
Modular
Reusable
Snowflake Query Optimization
What's Snowflake Query Optimization?
Transforming queries into more efficient ones
Snowflake's Cloud Services Layer
Why Optimize Queries in Snowflake?
Achieve faster results
Cost efficiency
Shorter query times consume fewer resources, such as CPU and memory.
Common query problems
Exploding Joins: Be cautious!
Incorrect:
SELECT *
FROM order_details AS od
JOIN pizzas AS p -- Missing ON condition leading to exploding joins
Common query problems
Exploding Joins: Be cautious!
Correct:
SELECT *
FROM order_details AS od
JOIN pizzas AS p
ON od.pizza_id = p.pizza_id
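The effect of a missing join condition is easy to measure by counting rows. The sketch below assumes Python's built-in sqlite3 module as a local stand-in for Snowflake, with a handful of invented rows.

```python
import sqlite3

# In-memory SQLite stands in for Snowflake here (an assumption for local testing).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pizzas (pizza_id TEXT, price REAL);
    CREATE TABLE order_details (order_id INTEGER, pizza_id TEXT, quantity INTEGER);
    -- Made-up sample rows: 3 pizzas, 4 order lines
    INSERT INTO pizzas VALUES
        ('bbq_ckn_s', 12.75), ('bbq_ckn_l', 20.75), ('hawaiian_m', 13.25);
    INSERT INTO order_details VALUES
        (1, 'hawaiian_m', 1), (2, 'bbq_ckn_s', 2),
        (2, 'bbq_ckn_l', 1), (3, 'hawaiian_m', 1);
""")

# No ON condition: every order line is paired with every pizza.
exploded = conn.execute("""
    SELECT COUNT(*)
    FROM order_details AS od
    JOIN pizzas AS p
""").fetchone()[0]

# With the ON condition: each order line matches exactly one pizza.
matched = conn.execute("""
    SELECT COUNT(*)
    FROM order_details AS od
    JOIN pizzas AS p
      ON od.pizza_id = p.pizza_id
""").fetchone()[0]

print(exploded, matched)
```

With 4 order lines and 3 pizzas, the unconditioned join yields 4 x 3 = 12 rows instead of 4. On real tables the row count grows with the product of both table sizes.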
Common query problems
UNION or UNION ALL: Know the difference.
UNION removes duplicates, which slows down the query; UNION ALL keeps every row.
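The difference between the two operators shows up directly in the row counts. The sketch below assumes Python's built-in sqlite3 module as a local stand-in for Snowflake; the two tables (january_orders, february_orders) are hypothetical and exist only for this illustration.

```python
import sqlite3

# In-memory SQLite stands in for Snowflake here (an assumption for local testing).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables with made-up rows
    CREATE TABLE january_orders (pizza_id TEXT);
    CREATE TABLE february_orders (pizza_id TEXT);
    INSERT INTO january_orders VALUES ('bbq_ckn_s'), ('hawaiian_m'), ('hawaiian_m');
    INSERT INTO february_orders VALUES ('hawaiian_m'), ('bbq_ckn_l');
""")

# UNION: combines both inputs and removes duplicate rows (extra work).
union_rows = conn.execute("""
    SELECT pizza_id FROM january_orders
    UNION
    SELECT pizza_id FROM february_orders
    ORDER BY pizza_id
""").fetchall()

# UNION ALL: simply appends the inputs, keeping duplicates.
union_all_rows = conn.execute("""
    SELECT pizza_id FROM january_orders
    UNION ALL
    SELECT pizza_id FROM february_orders
    ORDER BY pizza_id
""").fetchall()

print(len(union_rows), len(union_all_rows))
```

If duplicates are impossible or acceptable, UNION ALL avoids the de-duplication step entirely.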
How to optimize queries?
Without TOP
SELECT *
FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF100.ORDERS
With TOP
SELECT TOP 10 *
FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF100.ORDERS
How to optimize queries?
Without LIMIT
SELECT *
FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF100.ORDERS
With LIMIT
SELECT *
FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF100.ORDERS
LIMIT 10
How to optimize queries?
Avoid SELECT *
SELECT *
FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF100.ORDERS
Instead, select only the columns you need
SELECT
o_orderdate,
o_orderstatus
FROM
SNOWFLAKE_SAMPLE_DATA.TPCH_SF100.ORDERS
How to optimize queries?
Filter Early
Use the WHERE clause early on.
Without early filtering
SELECT orders.order_id,
orders.order_date,
pizza_type.name,
pizzas.pizza_size
FROM orders
JOIN order_details
ON orders.order_id = order_details.order_id
JOIN pizzas
ON order_details.pizza_id = pizzas.pizza_id
JOIN pizza_type
ON pizzas.pizza_type_id = pizza_type.pizza_type_id
WHERE orders.order_date = '2015-01-01'; -- Filtering after JOIN
With early filtering
WITH filtered_orders AS (
SELECT *
FROM orders
WHERE order_date = '2015-01-01' -- Filtering in CTE before JOIN
)
SELECT filtered_orders.order_id,
filtered_orders.order_date,
pizza_type.name,
pizzas.pizza_size
FROM filtered_orders -- Joining with CTE
JOIN order_details
ON filtered_orders.order_id = order_details.order_id
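JOIN pizzas
ON order_details.pizza_id = pizzas.pizza_id
JOIN pizza_type
ON pizzas.pizza_type_id = pizza_type.pizza_type_id;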
Query history
Query History
snowflake.account_usage.query_history
SELECT
query_text,
start_time,
end_time,
execution_time
FROM
snowflake.account_usage.query_history
WHERE query_text ILIKE '%order_details%'
Query history
Spot slow or frequently running queries
SELECT
query_text,
start_time,
end_time,
execution_time
FROM
snowflake.account_usage.query_history
WHERE
execution_time > 1000 -- execution_time is in milliseconds
Handling semi-structured data
Structured versus semi-structured
Example of structured data
Example of semi-structured data
Introducing JSON
JavaScript Object Notation
Common use cases: Web APIs, Mobile Apps, Config files.
JSON in Snowflake
Native JSON support
Flexible for evolving schemas
Comparisons:
How Snowflake stores JSON data
VARIANT supports OBJECT and ARRAY data types
OBJECT: { "key": "value"}
Semi-structured data functions
PARSE_JSON
Syntax: PARSE_JSON( <expr> )
<expr>: JSON data in string format.
PARSE_JSON
Example:
SELECT PARSE_JSON(
-- JSON text enclosed in single quotes
'{
"cust_id": 1,
"cust_name": "cust1",
"cust_age": 40,
"cust_email": "cust1***@gmail.com"
}'
) AS customer_info_json
OBJECT_CONSTRUCT
Syntax: OBJECT_CONSTRUCT( [<key1>, <value1> [, <keyN>, <valueN> ...]] )
SELECT OBJECT_CONSTRUCT(
'cust_id', 1,
'cust_name', 'cust1',
'cust_age', 40,
'cust_email', 'cust1***@gmail.com'
)
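The two functions above reach the same kind of object by different routes: PARSE_JSON starts from JSON text, while OBJECT_CONSTRUCT starts from alternating keys and values. The Python sketch below mirrors that idea outside Snowflake. The helper object_construct is a hypothetical analogue written for this illustration, not a Snowflake or Python library function.

```python
import json

# Analogue of PARSE_JSON: JSON text (a string) is parsed into a structured value.
json_text = '{"cust_id": 1, "cust_name": "cust1", "cust_age": 40}'
parsed = json.loads(json_text)


def object_construct(*args):
    """Hypothetical analogue of OBJECT_CONSTRUCT: build an object from
    alternating key, value arguments (key1, value1, key2, value2, ...)."""
    if len(args) % 2 != 0:
        raise ValueError("expected an even number of arguments: key1, value1, ...")
    # Even positions are keys, odd positions are the matching values.
    return dict(zip(args[0::2], args[1::2]))


constructed = object_construct(
    "cust_id", 1,
    "cust_name", "cust1",
    "cust_age", 40,
)

print(parsed == constructed)
print(constructed["cust_name"])
```

Parsing suits data that already arrives as JSON text, while constructing suits values that already exist as separate columns or literals.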
Querying JSON data in Snowflake
Simple JSON Data
SELECT
customer_info:cust_age, -- using colon to access data from column
customer_info:cust_name,
customer_info:cust_email
FROM
cust_info_json_data;
Querying nested JSON Data in Snowflake
Example of nested object
Colon: :
Dot: .
Querying nested JSON using colon/dot notations
Accessing values using colon notation
<column>:<level1_element>:<level2_element>:<level3_element>
SELECT
customer_info:address:street AS street_name
FROM
cust_info_json_data
Accessing values using dot notation
<column>:<level1_element>.<level2_element>.<level3_element>
SELECT
customer_info:address.street AS street_name
FROM
cust_info_json_data
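Both notations describe the same walk through the nested object, one level per separator. The Python sketch below imitates that traversal outside Snowflake. The helper get_path and the sample document are hypothetical and exist only for this illustration.

```python
import json

# Made-up sample document with one nested object.
customer_info = json.loads(
    '{"cust_name": "cust1", "address": {"street": "St-1", "city": "City-1"}}'
)


def get_path(value, path):
    """Hypothetical helper: follow a colon- or dot-separated path through
    nested objects, returning None when any level is missing."""
    # Treat ':' and '.' as equivalent level separators.
    for key in path.replace(":", ".").split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


street_colon = get_path(customer_info, "address:street")
street_dot = get_path(customer_info, "address.street")
missing = get_path(customer_info, "address.zip")

print(street_colon, street_dot, missing)
```

Returning None for a missing key mirrors how Snowflake yields NULL when a path does not exist in the JSON value.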
Wrap-up
Journey
Chapter 1: Architecture, Competitors, and Snowflake SQL
Chapter 2: Snowflake SQL and key concepts
Connecting to Snowflake
WEB UI
Chapter 2: Snowflake SQL and key concepts
Data Definition Language (DDL): CREATE, ALTER, DROP, RENAME, COMMENT
Database structures and DML commands: SHOW, DESCRIBE, INSERT, UPDATE, MERGE, COPY
Chapter 2: Snowflake SQL and key concepts
Data types: VARCHAR, NUMERIC, INT, DATE, TIME, TIMESTAMP
STRING functions: CONCAT, UPPER, LOWER
DATE & TIME functions: CURRENT_DATE, CURRENT_TIME, EXTRACT
GROUP BY ALL
Chapter 3: Advanced Snowflake SQL Concepts
JOINS
NATURAL JOIN
LATERAL JOIN
Subquerying
CTEs
Chapter 3: Advanced Snowflake SQL Concepts
Snowflake Query Optimization
PARSE_JSON, OBJECT_CONSTRUCT
Is this all?
Much more to unfold.
Not addressed
Setting context
Roles, Users
Window functions
Query profiling
Materialized Views
Clustering
...
Useful resources
Snowflake documentation: https://docs.snowflake.com/
Snowflake forums: https://community.snowflake.com/s/forum
This is just the beginning!