
SNOWFLAKE QUERY OPTIMIZATION

Imagine you're a chef in a big kitchen preparing a delicious meal. You have a
recipe that involves using different ingredients and cooking techniques. Now,
let's say you have a list of tasks to complete to make the meal: chopping
vegetables, marinating meat, boiling pasta, and so on.

In the world of databases, like Snowflake, when you run a query (which is like
asking the database a question), it's like giving the kitchen chef a set of cooking
instructions. However, just like the chef can find more efficient ways to
complete cooking tasks to save time and effort, databases can also find smarter
ways to execute queries.

Query optimization in Snowflake is like the chef trying to cook the meal in the
fastest and most efficient way. The database looks at your query and figures
out the best path to retrieve and combine the data you want. It considers
things like which parts of the data to fetch first, how to sort and filter them,
and whether it can use special techniques to make things quicker.

For instance, if your query involves looking at data from different tables, the
database might choose to combine the tables in a particular way to avoid
unnecessary work. It's like the chef deciding to chop all the vegetables at once
instead of doing it separately for each dish.

The goal of query optimization is to minimize the time it takes for your query to
give you the results you want. Just like the chef aiming to serve a delicious
meal quickly, Snowflake's query optimization helps you get your data faster and
more efficiently, making your work smoother and more enjoyable.

Query optimization is the process of finding the most efficient techniques to
improve query performance, based on the rational use of system resources and
performance metrics.

The purpose of query tuning is to decrease the response time of a query,
prevent the excessive consumption of resources, and identify poor query
performance.
WHY QUERY OPTIMIZATION?

Query Optimization is crucial in Snowflake (and any database system) for
several important reasons:

Performance Improvement: Query optimization in Snowflake is all about
making your queries run faster. When you're working with big datasets,
complicated combinations of data, and lots of calculations, poorly written
queries can be really slow. But when you optimize your queries, they speed up.
This means you'll get the answers you're looking for super quickly, which makes
your data analysis work much smoother and faster.

Cost Efficiency: Snowflake, like many cloud-based data warehouses, charges
based on the amount of data processed. Poorly optimized queries can consume
more resources and therefore lead to higher costs. By optimizing your queries,
you can reduce the amount of data that needs to be processed, ultimately
saving on your operational expenses.

Reduce query run time: Some queries have an SLA that they need to meet;
missing those targets can delay the delivery of data and negatively impact the
business. Query optimization can improve the performance of queries by
finding the optimal way to access and manipulate the data.

User Experience: If you're sharing data insights or reports with others, the
speed at which they receive the results matters. Optimized queries mean
quicker response times for your end-users, leading to a better overall
experience and higher user satisfaction.
Scalability: As your data grows, the performance of unoptimized queries can
degrade significantly. Query optimization ensures that your queries continue to
run efficiently as your dataset expands, maintaining consistent performance
levels.

Complex Analytics: Data analysts often deal with complex queries involving
joins, aggregations, and filtering. Optimizing these intricate queries becomes
essential to ensure that the results are accurate and generated in a timely
manner.

Resource Utilization: Snowflake, being a cloud-based data warehouse,
allocates resources dynamically to handle queries. Well-optimized queries
utilize resources effectively, preventing unnecessary overloads and ensuring
that the system remains responsive for all users.

Decision-Making: Timely access to accurate data is crucial for making informed
decisions. Slow queries can delay decision-making processes, potentially
impacting business operations. Optimized queries provide quick access to the
required data for confident decision-making.
ARCHITECTURE OF SNOWFLAKE

Before going deep into the techniques to optimize queries, it helps to first
understand the architecture of Snowflake.

Snowflake's architecture is a hybrid of traditional shared-disk and shared-nothing
database architectures. Similar to shared-disk architectures, Snowflake
uses a central data repository for persisted data that is accessible from all
compute nodes in the platform. But similar to shared-nothing architectures,
Snowflake processes queries using MPP (massively parallel processing)
compute clusters where each node in the cluster stores a portion of the entire
data set locally. This approach offers the data management simplicity of a
shared-disk architecture, but with the performance and scale-out benefits of a
shared-nothing architecture.

In simple words:

 Snowflake is like a mix of a big shared storage room (shared-disk) and
lots of individual workers with their own tools (shared-nothing).
 It keeps all the data in one main place that everyone can use easily, just
like a library.
 But when you need to do things with the data, it spreads the work
among many helpers, each with their own piece of data. This makes
things quicker and more efficient, like each helper has their own tools
handy.
 This combination makes Snowflake great at handling data in a simple
way while also being really fast and good at handling lots of work at the
same time.
It consists of three layers:

 Database Storage
 Query Processing
 Cloud Services
Database Storage Layer:
This is where the raw data is stored.
Snowflake stores data in a structured and compressed format, using a
combination of columnar storage and a unique indexing scheme that improves
query performance.
The data is divided into micro-partitions, which are small, immutable units of
data that allow for efficient storage and querying.

Query Processing / Compute Layer:
The compute layer is responsible for processing and querying the data stored in
the data storage layer.
Snowflake employs a virtual warehouse concept, which are clusters of
compute resources that can be scaled up or down as needed.
This decoupling of storage and compute allows for better resource utilization
and the ability to independently scale processing power and storage capacity.

Cloud Services Layer:
This layer includes various services that manage the metadata, access control,
and query optimization.
The services layer handles tasks such as query parsing, optimization, and
execution.
It also manages user authentication, data access permissions, and other
administrative functions.
Snowflake's services layer plays a critical role in ensuring the security,
performance, and manageability of the system.
QUERY OPTIMIZATION TECHNIQUES IN SNOWFLAKE

1. SNOWFLAKE SEARCH OPTIMIZATION SERVICE (SOS)

Search Optimization Service aims to enhance the performance of queries under
specific conditions, such as when the table is frequently queried on columns
other than the primary cluster key, and when analytical queries involve
extensive filtering predicates.

Frequently Queried Columns: The columns being frequently queried are not
the primary cluster key columns. In a database, the primary cluster key (often
referred to as the primary key) is used for indexing and organizing data. When
queries predominantly use other columns for filtering and searching, it can
impact the efficiency of query execution.

Analytical Queries with Extensive Predicates: Analytical queries involve
aggregations, complex calculations, and often require filtering based on various
conditions. Extensive predicates refer to a large number of filtering conditions
applied to these analytical queries.

Both of these cases are handled by SOS.


REAL-TIME USE CASES OF SEARCH OPTIMIZATION SERVICE

 Snowflake supports semi-structured data types like VARIANT and
OBJECT, which can store JSON, XML, or other unstructured data. You
can optimize queries that involve searching and filtering within these
semi-structured fields. This could be useful for analyzing log data,
event records, or any data containing free-form text.

 Business users need fast response times for critical dashboards with
highly selective filters.

 Data scientists who are exploring large data volumes and looking for
specific subsets of data.

 Substring and regular expression searches.

 When a Snowflake instance is connected to real-time data sources,
optimizing queries to search and retrieve recent data updates quickly
is important for up-to-date analytics.

NOTE:

Search optimization is most effective for queries that involve
searching for text within a large number of rows. If your
queries only involve a small number of rows, or if the columns
being searched are not text-based, search optimization may
not provide significant performance benefits.
HOW TO ENABLE SEARCH OPTIMIZATION?

To enable search optimization, turn it on for specific columns (or fields
within semi-structured columns) or for the entire table.

For example:

-- Adding search optimization for the entire table
alter table <table_name>
add search optimization;

Here, I have a table "demodatabase.db_source.script_src_date_day_lookup" and
will apply search optimization to it:

alter table demodatabase.db_source.script_src_date_day_lookup
add search optimization;
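
Search optimization can also be scoped to specific columns and search methods
(equality, substring) rather than the whole table. A minimal sketch; the column
name day_name is hypothetical, for illustration only:

```sql
-- Enable search optimization only for equality and substring predicates
-- on one column (column name is hypothetical).
alter table demodatabase.db_source.script_src_date_day_lookup
add search optimization on equality(day_name), substring(day_name);
```

Scoping the service this way limits its maintenance cost to the columns your
filters actually touch.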
2. MINIMIZE DATA MOVEMENT

Minimizing data movement is a fundamental principle in optimizing query
performance in Snowflake and other data warehousing systems. It involves
reducing the amount of data that needs to be transferred and processed during
query execution. This is crucial for improving query speed, resource utilization,
and overall efficiency.

HOW TO ACHIEVE THIS?

 Using partition pruning:

Partition pruning is a technique used in Snowflake to improve query
performance by reducing the amount of data that needs to be scanned
when querying large tables that are partitioned.
It involves dividing a table into smaller, more manageable parts called
partitions, based on a specific column or set of columns.

When a query includes a WHERE clause that filters on a partition key
column, Snowflake can use partition pruning to eliminate partitions that
do not contain any relevant data, reducing the amount of data that needs
to be scanned during query execution. This results in faster query
processing and better resource utilization.

Suppose you have a large table of sales data that is partitioned by date,
with each partition containing sales data for a specific date range. If you
run a query that filters on a specific date range, Snowflake can use
partition pruning to eliminate all the partitions that do not contain any
sales data for that date range, reducing the amount of data that needs to
be scanned.
HOW TO IMPLEMENT PARTITION PRUNING?

For example, if you have a table partitioned on the "date" column, you
can use the following query to only scan the partitions for the month
of January:

SELECT *
FROM <TABLE_NAME>
WHERE date >= '2020-01-01' AND date <= '2020-01-31';

 Using appropriate sorting:

Sorting queries are a common type of database query where the
results are arranged in a specific order based on one or more columns.

When a query requires sorting, you can use the ORDER BY clause to specify
the sort order. This can help Snowflake minimize data movement by
sorting the data on each node before sending it to the query execution
node.

HOW TO IMPLEMENT SORTING?

For example, if you need to sort a table by the "name" column, you can
use the following query:

SELECT *
FROM <TABLE_NAME>
ORDER BY name;
3. USE APPROPRIATE DATA TYPES

Choosing the right data type for your database columns is a critical decision that
affects storage efficiency, query performance, and data integrity. Each
database system provides a range of data types designed to
accommodate different types of data and optimize storage and
processing.
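
As a sketch of what this looks like in practice (the table and column names
below are illustrative, not from any real schema):

```sql
-- Precise types instead of generic ones: smaller storage and
-- better pruning than storing everything as VARCHAR or FLOAT.
CREATE TABLE orders_typed (
    order_id   NUMBER(38,0),   -- integer key, not a VARCHAR
    order_date DATE,           -- a real DATE, not a '2020-01-01' string
    amount     NUMBER(10,2)    -- fixed-point for money, not FLOAT
);
```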

4. USE MATERIALIZED VIEWS

A materialized view in a database is a pre-computed and pre-stored result
set that represents the outcome of a specific query, often involving
aggregations, calculations, or joins. Materialized views are used to
enhance query performance by reducing the need to perform complex
calculations or aggregations every time a query is executed.

This can be achieved by using appropriate aggregation levels:

For example, if you have a sales table with millions of rows, you can
create a materialized view that aggregates the data at the monthly level.

CREATE MATERIALIZED VIEW monthly AS
SELECT DATE_TRUNC('MONTH', date) AS month,
       SUM(sales_amount) AS total_sales
FROM sales_table
GROUP BY 1;
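
Once the view exists, reports can read the pre-aggregated rows directly instead
of re-scanning the detail table; a sketch (the date filter is illustrative):

```sql
-- Reads a handful of pre-computed rows rather than millions of detail rows
SELECT month, total_sales
FROM monthly
WHERE month >= '2023-01-01';
```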
5. USE CLUSTERING KEYS

Clustering keys in a database determine the physical organization of data
on disk. When a table is defined with a clustering key, the data is
organized and stored on disk based on the values in the clustering key
column(s). This physical organization affects how the data is accessed and
retrieved during query execution.

 Choosing the appropriate clustering key: The clustering key
should be chosen based on the most frequently used filter
conditions in your queries. For example, if you frequently query a
sales table by date range, you can use the date column as a
clustering key.

 Using multi-column clustering keys: If your queries frequently
filter on multiple columns, you can use a multi-column clustering
key to group the data together more efficiently.

HOW TO IMPLEMENT THIS?

For example, if you frequently query a sales table by date range and product
category, you can use the following clustering key:

-- Creating a table clustered on two columns
CREATE TABLE sales_table (
    date DATE,
    product_category VARCHAR,
    sales_amount FLOAT
)
CLUSTER BY (date, product_category);
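
If the table already exists, the same clustering key can be set (or changed)
afterwards with ALTER TABLE; a sketch using the same illustrative table:

```sql
-- Set or change the clustering key on an existing table
ALTER TABLE sales_table CLUSTER BY (date, product_category);
```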


6. MAXIMIZE CACHE USAGE

Maximizing cache usage is a crucial strategy for optimizing query
performance in Snowflake. The Snowflake cache is a feature that involves
storing frequently accessed data in memory, allowing for faster retrieval
compared to accessing data from disk. By taking advantage of caching,
you can significantly improve query execution times and enhance overall
system performance.

HOW TO IMPLEMENT CACHING?

ALTER SESSION SET USE_CACHED_RESULT = TRUE;
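
With this setting on, an exact re-run of a query against unchanged data can be
answered from the result cache; a sketch using the illustrative sales_table:

```sql
-- First execution computes the result on the warehouse.
SELECT COUNT(*) FROM sales_table;

-- An identical re-run (same query text, unchanged data) can be
-- served from the result cache without using warehouse compute.
SELECT COUNT(*) FROM sales_table;
```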


7. OPTIMIZE BULK DATA LOADING INTO TABLES USING THE COPY
COMMAND

For bulk data loads, use the COPY command rather than row-by-row
INSERT statements. By using the COPY command, you can load large
datasets quickly and efficiently into Snowflake tables, enabling faster
data processing and analysis.
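
A minimal sketch of a bulk load with COPY, assuming a hypothetical stage named
@sales_stage holding CSV files:

```sql
-- Bulk-load staged CSV files into the target table in one command
COPY INTO sales_table
FROM @sales_stage/sales/
FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);
```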

8. USE SEPARATE WAREHOUSES FOR QUERIES, DATA LOADING, AND
LARGE FILES

Giving teams such as Finance, Data Science, Data Loading, and Marketing
their own warehouses keeps their workloads from competing for compute.
This helps each team run fast and responsive queries, which ultimately
improves performance, scalability, and cost efficiency.
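
A sketch of how such a split might be defined (warehouse names and sizes are
illustrative):

```sql
-- Dedicated warehouses per workload, each suspending when idle
CREATE WAREHOUSE IF NOT EXISTS wh_data_loading
  WAREHOUSE_SIZE = 'MEDIUM' AUTO_SUSPEND = 60 AUTO_RESUME = TRUE;

CREATE WAREHOUSE IF NOT EXISTS wh_reporting
  WAREHOUSE_SIZE = 'SMALL' AUTO_SUSPEND = 60 AUTO_RESUME = TRUE;
```

AUTO_SUSPEND keeps an idle warehouse from accruing cost, which matters once
each workload has its own compute.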
9. USE A LIMIT OR TOP CLAUSE

Using a TOP or LIMIT clause can improve the performance of
SELECT queries, especially when dealing with large datasets. By limiting
the number of rows returned by the query, you can reduce the amount
of data that needs to be processed and transmitted, leading to faster
query execution times and improved response times. This is particularly
useful when you're interested in only a subset of the data for analysis or
preview purposes.

The LIMIT or TOP clause also improves queries with an ORDER BY clause.
For example, the first query below returns in just over two minutes on an
X-SMALL warehouse, whereas the second query, which returns the top
ten entries, is twelve times faster, taking just six seconds to complete.

This query takes 2m 5s to execute:

select *
from SNOWFLAKE_SAMPLE_DATA.TPCH_SF100.ORDERS
order by o_totalprice desc;

Whereas with TOP, the execution time drops to 6s (the same applies when
using LIMIT):

select top 10 *
from SNOWFLAKE_SAMPLE_DATA.TPCH_SF100.ORDERS
order by o_totalprice desc;
10. BREAKING DOWN COMPLEX JOIN OPERATIONS

JOIN operations enable you to combine data from multiple related tables
into a single result set. Instead of retrieving data from each table
separately and then manually combining them, JOINs allow you to get
the desired data in a single query.
This significantly reduces the amount of data transferred over the
network and minimizes the need for additional processing on the client
side, making querying more reliable.

But what if we have a complex join operation?

When a query chains many joins together, the plan becomes too complex,
which directly impacts query performance.

So instead of proceeding with this approach, we should break down the
process and write the intermediate data to a table, turning one complex
query into several simple ones.

With this approach, instead of executing the statement as a single query,
break the problem down into multiple queries and write the results to
transient tables.

This eliminates the high execution time.


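The approach above can be sketched as follows (table and column names are
illustrative):

```sql
-- Step 1: materialize one join's result in a transient table
CREATE TRANSIENT TABLE tmp_customer_orders AS
SELECT c.customer_id, o.order_id, o.order_total
FROM customers c
JOIN orders o ON o.customer_id = c.customer_id;

-- Step 2: run the remaining logic against the smaller intermediate table
SELECT t.customer_id, SUM(t.order_total) AS total_spend
FROM tmp_customer_orders t
GROUP BY t.customer_id;
```

Transient tables skip Fail-safe storage, so they are a cheap place to park
intermediate results that can be rebuilt at any time.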
11. USE QUERY PROFILING

Query profiling, also known as query performance profiling, is the process of
analyzing and understanding the execution of a database query in order to
identify performance bottlenecks, optimize query execution times, and
improve overall database performance.
It involves capturing detailed information about how a query is processed by
the database management system, including the resources consumed and the
steps taken during execution.
Query profiling helps database administrators, developers, and analysts
identify areas where queries can be optimized for better performance.

HOW TO DO QUERY PROFILING?

Once you run a query, open the query profile view and look for the
following red flags:

1. Analyze the query profile to identify the slowest stages of the query
plan. Look for stages with high “elapsed time” or “execution time”
values. These stages are likely to be the bottlenecks in your query.

2. Once you have identified the slowest stages, you can try to optimize
them. Here are some tips:
o If a stage involves a large number of rows, consider adding a
filter to reduce the number of rows processed.
o If a stage involves a large amount of data movement, consider
using a more efficient join strategy or partitioning the data.
o If a stage involves a complex computation, consider simplifying
the computation or using a more efficient algorithm.
o Look for long-running or high-cost operations in the query
plan. These are likely areas where the query is spending a lot of
time or scanning a large amount of data.
o Examine the number of rows processed by each step in the
query plan. This can help you identify areas where the query is
performing unnecessary processing.
o Look for opportunities to minimize data movement by using
appropriate sorting and partitioning.
o Review the query code to identify areas for optimization, such
as using appropriate data types, and minimizing the use of
subqueries and joins.

3. After you have made changes to your query, rerun it to see if the
changes have improved the query performance.
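
Beyond the query profile UI, recent slow queries can also be found in SQL via
the ACCOUNT_USAGE views; a sketch:

```sql
-- Ten slowest queries of the last day (latency converted to seconds)
SELECT query_id,
       total_elapsed_time / 1000 AS elapsed_seconds,
       query_text
FROM snowflake.account_usage.query_history
WHERE start_time > DATEADD('day', -1, CURRENT_TIMESTAMP())
ORDER BY total_elapsed_time DESC
LIMIT 10;
```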
12. AVOID USING SELECT *

Avoid using SELECT * when you only require data from a few columns.
SELECT * needlessly reads the entire table, so when you only need a
few columns, specify the column names instead.

Example: We need sales_profit by year from the sales table.

With the code below, the whole sales table is read:

Select * from sales;

With this code, only the specified columns are read, which is more
optimized:

Select sales_year, sales_profit
From sales;
CONCLUSION:

In conclusion, the query optimization techniques in Snowflake
empower you to design and structure your data, queries, and
indexing strategies to efficiently handle search-like operations.

By choosing the right strategies and aligning them with your
specific use cases, you can achieve substantial improvements in
query performance, leading to enhanced data analysis and
insights.

Keep Learning 😊
