Cache Cheatsheet

FUNDAMENTALS

CACHE

Sebastian Flak
Snowflake has three types of cache:

VIRTUAL WAREHOUSE
QUERY RESULT
METADATA

Each serves a different purpose.

Virtual Warehouse cache
Also known as main or compute cache.
When queries are run, Snowflake uses the compute
resources of virtual warehouses to perform the
operations.
Each virtual warehouse has its own locally attached
cache for data it accesses.
If subsequent queries can be served by the data in the
warehouse's cache, they will be executed faster
because they avoid accessing data storage.
The warehouse cache is particularly effective during
sessions where multiple queries are run that access the
same data sets.
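
One way to see how much a warehouse benefits from its cache is to look at the query history. The sketch below assumes your role can read the SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY view (which can lag behind real time); the 7-day window is an arbitrary choice for illustration.

```sql
-- Average share of scanned data that came from the warehouse cache,
-- per warehouse, over the last 7 days (window chosen for illustration).
-- PERCENTAGE_SCANNED_FROM_CACHE is a fraction between 0.0 and 1.0.
SELECT warehouse_name,
       COUNT(*)                           AS query_count,
       SUM(bytes_scanned)                 AS total_bytes_scanned,
       AVG(percentage_scanned_from_cache) AS avg_fraction_from_cache
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP())
  AND bytes_scanned > 0            -- skip queries that scanned no table data
GROUP BY warehouse_name
ORDER BY avg_fraction_from_cache;
```

A value close to 1.0 suggests that most reads were served from the warehouse cache; a value close to 0.0 suggests the cache is rarely warm when queries arrive.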

[Figure: query execution with no cache vs. with cache]
If the virtual warehouse is suspended, the cache
is lost!
Find a balance: suspending the warehouse too early
throws away the cache, while keeping it running
longer costs credits.
Monitor your usage and then decide when to
suspend it.
You can set the auto-suspend timeout (in seconds) with:

ALTER WAREHOUSE COMPUTE_WH SET AUTO_SUSPEND = 60;
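
A minimal sketch of inspecting and adjusting the suspend behaviour, assuming the warehouse is named COMPUTE_WH as in the statement above. The 300-second value is only an illustration, not a recommendation.

```sql
-- Check the current settings; the output includes an auto_suspend column.
SHOW WAREHOUSES LIKE 'COMPUTE_WH';

-- Keep the warehouse (and its cache) alive for 5 idle minutes
-- instead of 1. The value is in seconds and is illustrative only.
ALTER WAREHOUSE COMPUTE_WH SET AUTO_SUSPEND = 300;

-- Suspending manually drops the cache; the first queries after a
-- resume read from storage again.
ALTER WAREHOUSE COMPUTE_WH SUSPEND;
ALTER WAREHOUSE COMPUTE_WH RESUME IF SUSPENDED;
```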

Query Result cache
Snowflake automatically caches the result of
every query for 24 hours.
If the same query is executed again and the
underlying data has not changed, Snowflake
retrieves the result from the cache rather than
re-executing the query.
It is an "all or nothing" mechanism: a query either
reuses the whole cached result or does not use the
cache at all. There is no partial benefit.
Works perfectly for repeated queries, such as
those often run in business intelligence tools.
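
A small illustration of result reuse. The table `sales` and its columns are hypothetical names used only for this sketch. Reuse requires the second statement to match the first one exactly.

```sql
-- First run: executed on the warehouse, result is cached.
SELECT region, SUM(amount) AS total_amount
FROM sales            -- hypothetical table
GROUP BY region;

-- Second run with identical text, while the data in `sales` is unchanged:
-- eligible to be answered from the Query Result cache.
SELECT region, SUM(amount) AS total_amount
FROM sales
GROUP BY region;

-- A different query text (here, an added filter) is treated as a new
-- query and gets no benefit from the cached result above.
SELECT region, SUM(amount) AS total_amount
FROM sales
WHERE region = 'EMEA'
GROUP BY region;
```

To check whether reuse happened, open the Query Profile of the second run and look for a result-reuse step instead of a table scan.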

[Figure: Query Result cache, before vs. after]

This cache won't work if the query uses functions
that must be evaluated on every run, e.g.
CURRENT_TIMESTAMP()
If you want to turn it off for testing (or for any
other reason), run:

ALTER SESSION SET USE_CACHED_RESULT = FALSE;

This cache is shared among users, as long as the role
running the query has the required privileges on the
underlying tables.
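
A sketch of a typical benchmarking session with the result cache switched off and back on. The table `orders` and its column are hypothetical names used for illustration.

```sql
-- Disable result reuse for this session only, so timings reflect
-- real execution rather than a cached result.
ALTER SESSION SET USE_CACHED_RESULT = FALSE;

-- Confirm the current value of the parameter.
SHOW PARAMETERS LIKE 'USE_CACHED_RESULT' IN SESSION;

-- ... run the queries you want to time here ...

-- Re-enable result reuse when done.
ALTER SESSION SET USE_CACHED_RESULT = TRUE;

-- Not eligible for result reuse even with the cache enabled, because
-- CURRENT_TIMESTAMP() has to be evaluated on every run.
SELECT order_id, CURRENT_TIMESTAMP() AS fetched_at
FROM orders;          -- hypothetical table
```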

Metadata cache

Snowflake maintains metadata about all objects
in its service, including micro-partitions.

It helps optimize query planning and execution
by keeping this data readily accessible, which
reduces the time it takes to start processing
the actual query.

You have no control over this cache.

[Figure: total number of rows example]

Works perfectly when you:

Want the total number of rows in a
table: COUNT(*)

Look for MIN or MAX values

Need the number of DISTINCT values
in a column
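
The cases above correspond to queries like the following. The table `orders` and its columns are hypothetical. Whether a given query is actually answered from metadata is best confirmed in its Query Profile rather than assumed.

```sql
-- Total number of rows in a table.
SELECT COUNT(*) FROM orders;                  -- hypothetical table

-- Smallest and largest value of a column.
SELECT MIN(order_date), MAX(order_date) FROM orders;

-- Number of distinct values in a column (listed in the text above;
-- check the Query Profile to see how Snowflake served it).
SELECT COUNT(DISTINCT customer_id) FROM orders;
```

When a query is served from metadata, the Query Profile shows a metadata-based result step instead of a table scan.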
