Snowflake SnowPro Core Cheat Sheet
by cabanasj486 via cheatography.com/192752/cs/40080/

Exam
Average salary: $90,501
Duration: 120 minutes
Exam Guide: https://learn.snowflake.com/en/certifications/snowpro-core/

Architecture
Hybrid of traditional shared-disk and shared-nothing database architectures.
Shared-nothing architectures: Each node in the virtual warehouse cluster stores a portion of the entire data set locally.
Pricing
On-Demand: Fixed rate for the consumed services.
Pre-paid: Cheaper, but requires a commitment to Snowflake.

Micro-partitions
All data in Snowflake tables is automatically divided into micro-partitions: contiguous units of storage between 50 and 500 MB of uncompressed data, organized in a columnar way. They are immutable, meaning they cannot be changed once created.

Pruning process
Technique to analyze the smallest number of micro-partitions needed to solve a query. It retrieves all the necessary data to produce the result without looking at all the micro-partitions, saving a lot of time. You can find a real example here.

Data Integration
Snowflake Connector for Kafka: Reads data from Apache Kafka topics and loads the data into a Snowflake table.
Third-Party Data Integration Tools: You can see the list at the following link.
The most important part of this section is Snowpipe. You should use it for small volumes of frequent data, loaded continuously (micro-batches). It's serverless, which means that it doesn't use Virtual Warehouses. It can detect new files by automating Snowpipe using cloud messaging, or by calling the Snowpipe REST endpoints.
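As a sketch of how Snowpipe is wired up (object and stage names here are hypothetical), a pipe wraps a COPY INTO statement that Snowflake runs whenever it is notified of new files:

-- Hypothetical names; AUTO_INGEST = TRUE assumes an external stage
-- with cloud-event notifications (e.g., S3 events) configured.
CREATE OR REPLACE PIPE mypipe
  AUTO_INGEST = TRUE
AS
  COPY INTO mytable
  FROM @my_ext_stage
  FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);

Without AUTO_INGEST, the same pipe can be triggered by calling the Snowpipe REST endpoints.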
Load Data
COPY INTO: Load data from any stage into an existing table. 64 days of load metadata.
Bulk Load: Loading batches of data from files already available in any stage into Snowflake tables.
Continuous Load: Load small volumes of data (micro-batches) and incrementally make them available for analysis.

Bulk Load
Some important considerations:
1) You cannot Load/Unload files from your Local Drive.
2) Using the Snowflake UI, you can only load files up to 50 MB.
3) Organizing input data by granular path can improve load performance.
4) FORCE = TRUE copies the files again, ignoring the 64 days of load metadata.
5) PURGE = TRUE removes the data files from the stage after loading.
6) If there is any error, you can specify different options: ABORT_STATEMENT, CONTINUE, SKIP_FILE, SKIP_FILE_num, SKIP_FILE_num%.
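A minimal bulk-load sketch exercising the options above (stage, path, and table names are hypothetical):

-- Hypothetical names; ON_ERROR, FORCE, and PURGE are the options listed above.
COPY INTO mytable
  FROM @my_stage/2024/01/           -- a granular path narrows the file scan (point 3)
  FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1)
  ON_ERROR = 'SKIP_FILE'            -- or ABORT_STATEMENT, CONTINUE, SKIP_FILE_num, ...
  FORCE = TRUE                      -- reload files already in the 64-day metadata (point 4)
  PURGE = TRUE;                     -- remove staged files after a successful load (point 5)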
Continuous Load
Snowpipe: Loading data when the files are available in any (internal/external) stage. 14 days of load metadata.

Default Roles
ACCOUNTADMIN: Top-level role.
SECURITYADMIN: Manage users and roles.
SYSADMIN: Create warehouses and databases (and other objects).
USERADMIN: User and role management.
PUBLIC: Automatically granted to every user and role.
CUSTOM: Create your own roles and assign the privileges that you want.
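A short sketch of the custom-role workflow (role and user names are hypothetical); custom roles are usually granted back to SYSADMIN so the role hierarchy stays connected:

USE ROLE USERADMIN;                    -- the role dedicated to user/role management
CREATE ROLE IF NOT EXISTS analyst;     -- a CUSTOM role
GRANT ROLE analyst TO USER alice;      -- hypothetical user
GRANT ROLE analyst TO ROLE SYSADMIN;   -- keep the hierarchy under SYSADMIN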
Cache Strategies
Metadata Cache: Objects information & statistics.
Warehouse Cache: SSD storage attached to a Warehouse. The cached information is lost when the Warehouse is suspended.
Query Result Cache: Stores the results of our queries for 24 hours. If we perform the same query and the data hasn't changed, it returns the same result without using the Warehouse.
You can find a complete example of how to use the different cache strategies in the following link.
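One way to observe the Query Result Cache (a sketch; the table name is hypothetical) is to run the same query twice, then turn cached results off to force Warehouse execution:

SELECT COUNT(*) FROM mytable;                 -- first run: executed on the Warehouse
SELECT COUNT(*) FROM mytable;                 -- second run: served from the Result Cache
ALTER SESSION SET USE_CACHED_RESULT = FALSE;  -- disable the Result Cache for testing
SELECT COUNT(*) FROM mytable;                 -- executed on the Warehouse again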
Access Control
Discretionary Access Control (DAC): Each object has an owner who can, in turn, grant access to that object.
Role-Based Access Control (RBAC): Access privileges are assigned to roles, which are, in turn, given to users.

Access Management in Snowflake
User: Person or program.
Role: Entity to which we grant privileges.
Securable Object: Entity to which we can grant access.
Privilege: Defined level of access to an object.
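Putting those four concepts together, a typical RBAC chain looks like this sketch (all names hypothetical): privileges on securable objects go to a role, and the role goes to a user:

GRANT USAGE ON DATABASE mydb TO ROLE analyst;             -- privilege on a securable object
GRANT USAGE ON SCHEMA mydb.public TO ROLE analyst;
GRANT SELECT ON TABLE mydb.public.sales TO ROLE analyst;
GRANT ROLE analyst TO USER alice;                         -- the user inherits the privileges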
Other Concepts
Partner Connect: Technology & Solution partners.
Compliance: HITRUST / HIPAA, ISO/IEC 27001, FedRAMP Moderate, PCI-DSS, etc.
Data Marketplace: For providers to buy or sell their datasets. Free, Personalized, and Paid Listings.
Column Level Security: Dynamic Data Masking & External Tokenization.

Data Sharing
Types of Consumers: Full account (existing Snowflake account) and Reader Account (share data with someone without a Snowflake account).
Shared data is instantaneous for consumers, as no actual data is copied or transferred between accounts. For this reason, shared data is always up-to-date, and consumers don't pay for storage.
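A provider-side sketch of creating a share (all names hypothetical); the consumer then creates a database from the share:

CREATE SHARE my_share;
GRANT USAGE ON DATABASE mydb TO SHARE my_share;
GRANT USAGE ON SCHEMA mydb.public TO SHARE my_share;
GRANT SELECT ON TABLE mydb.public.sales TO SHARE my_share;
ALTER SHARE my_share ADD ACCOUNTS = xy12345;       -- hypothetical consumer account
-- Consumer side (run in the other account):
-- CREATE DATABASE shared_db FROM SHARE provider_account.my_share;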
Streams
Definition: Snowflake objects that record data manipulation language (DML) changes made to tables and views, including INSERTs, UPDATEs, and DELETEs, as well as metadata about each change.
Storage: They don't contain table data; they only store offsets.
Types: Standard, Append Only, and Insert Only.
Columns: METADATA$ACTION, METADATA$ISUPDATE, METADATA$ROW_ID.
SYSTEM$STREAM_HAS_DATA: Function that indicates whether a stream contains change data capture (CDC) records.
You can see an example of how streams work in the following link.
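A minimal stream sketch (table and stream names hypothetical): the stream records DML changes, and reading it inside a DML statement consumes it (advances the offset):

CREATE OR REPLACE TABLE mytable (id INT, val STRING);  -- hypothetical source table
CREATE OR REPLACE TABLE mytable_history LIKE mytable;  -- target for consumed changes
CREATE OR REPLACE STREAM my_stream ON TABLE mytable;   -- Standard stream
INSERT INTO mytable VALUES (1, 'a');                   -- DML recorded by the stream
SELECT SYSTEM$STREAM_HAS_DATA('MY_STREAM');            -- TRUE while CDC records are pending
SELECT * FROM my_stream;                               -- rows plus METADATA$ACTION,
                                                       --   METADATA$ISUPDATE, METADATA$ROW_ID
INSERT INTO mytable_history                            -- reading a stream inside DML
  SELECT id, val FROM my_stream;                       --   consumes it (advances the offset)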
Sequences
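Sequences generate unique numbers across sessions, typically used for surrogate keys. A minimal sketch (names hypothetical):

CREATE OR REPLACE SEQUENCE my_seq START = 1 INCREMENT = 1;
SELECT my_seq.NEXTVAL;                 -- next unique value
CREATE OR REPLACE TABLE orders (
  id INT DEFAULT my_seq.NEXTVAL,       -- auto-numbered column
  item STRING
);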
Cloud Providers
Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP).

Connect to Snowflake
Web Interface
SnowSQL (CLI Client)
ODBC
JDBC
SDK for Node, Python, Kafka, Go, and more!

Snowflake Objects
Account: Must be unique.
Warehouse: Virtual Machine to execute queries. The compute part.
Database: Logical collection of Schemas.
Schema: Logical collection of Objects. The Public schema and the Information_Schema are created when creating a Database.

Types of tables
Permanent
Transient
Temporary
External

Types of views
Regular
Materialized
Secure
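The table and view types map directly onto CREATE syntax; a sketch with hypothetical names (materialized views require Enterprise edition):

CREATE TABLE t_perm (id INT);               -- Permanent: full Time Travel + Fail-Safe
CREATE TRANSIENT TABLE t_trans (id INT);    -- Transient: no Fail-Safe
CREATE TEMPORARY TABLE t_temp (id INT);     -- Temporary: dropped at end of session
CREATE MATERIALIZED VIEW mv AS SELECT id FROM t_perm;  -- precomputed results
CREATE SECURE VIEW sv AS SELECT id FROM t_perm;        -- hides the definition; required for sharing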
Stages
Named External Stage
Named Internal Stage
User Internal Stage (@~)
Table Internal Stage (@%)
Storage Integrations enable users to avoid supplying credentials when creating stages or when loading or unloading data. A storage integration is an object that stores a generated identity and access management (IAM) entity for your external cloud storage.
METADATA$FILE_ROW_NUMBER: Row number for each record in the staged data file.
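A sketch of the four stage flavors and the staged-file metadata column (all names hypothetical; the external stage assumes a storage integration already exists):

CREATE STAGE my_int_stage;                   -- Named Internal Stage
CREATE STAGE my_ext_stage                    -- Named External Stage
  URL = 's3://mybucket/data/'
  STORAGE_INTEGRATION = my_s3_integration;   -- no credentials in the DDL
LIST @~;                                     -- User Internal Stage
LIST @%mytable;                              -- Table Internal Stage
SELECT METADATA$FILE_ROW_NUMBER, $1, $2      -- query staged files directly
  FROM @my_int_stage/file.csv;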
Time Travel
Use Cases: Access historical data at any point within a defined period. Useful to restore tables.
Objects that we can restore: Databases, Schemas, and Tables.
Retention Period: 1 day by default, with a maximum of 90 days (Enterprise edition).
Ways to restore: By offset, query statement ID, or timestamp.
Example: UNDROP TABLE mytable;
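The three restore methods and UNDROP in one sketch (table names and the statement ID are hypothetical):

SELECT * FROM mytable AT (OFFSET => -60*5);             -- state 5 minutes ago
SELECT * FROM mytable AT (TIMESTAMP => '2024-01-01 00:00:00'::TIMESTAMP_TZ);
SELECT * FROM mytable
  BEFORE (STATEMENT => '01a2b3c4-0000-0000-0000-000000000000');  -- hypothetical query ID
CREATE TABLE mytable_restored CLONE mytable AT (OFFSET => -3600);  -- restore a copy
UNDROP TABLE mytable;                                   -- bring back a dropped table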
Fail-Safe
Use Cases: Ensures historical data is protected in the event of a system failure or other catastrophic event.
Retention Period: NON-CONFIGURABLE 7-day period.
Can you recover the data yourself? No, you cannot recover this data alone; you MUST ask Snowflake support.
Note: Fail-Safe requires additional storage, which will be reflected in your monthly storage charges.

Stored Procedures & Functions
Stored Procedures: Extend Snowflake SQL by combining it with JavaScript.
User-Defined Functions (UDFs): Perform operations that are not available through Snowflake's built-in, system-defined functions. Languages: SQL, JavaScript, Java, and Python. A UDF returns a single row for each input row.
User-Defined Table Functions (UDTFs): They can return multiple rows for each input row (the only difference from UDFs).
External Functions: They call code that is executed outside Snowflake.
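A sketch contrasting a scalar SQL UDF (one result per input row) with a SQL UDTF (a table of rows); all names, and the orders table, are hypothetical:

CREATE OR REPLACE FUNCTION area_of_circle(radius FLOAT)  -- scalar UDF
  RETURNS FLOAT
  AS 'PI() * radius * radius';

CREATE OR REPLACE FUNCTION orders_for(cust_id INT)       -- UDTF: returns a table
  RETURNS TABLE (order_id INT, amount FLOAT)
  AS 'SELECT order_id, amount FROM orders WHERE customer_id = cust_id';

SELECT area_of_circle(2.0);
SELECT * FROM TABLE(orders_for(42));                     -- UDTFs are queried via TABLE()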
Zero-Copy Cloning
Use Cases: Create a snapshot of any table, schema, or database.
Cost: FREE, it doesn't consume storage. It does NOT duplicate data; it duplicates the metadata of the micro-partitions.
Other considerations: Privileges are not cloned. Data history is not cloned.
Note: When you modify some cloned data, it will consume storage because Snowflake has to recreate the micro-partitions, which will cost money.
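Cloning is one statement per object; a sketch with hypothetical names:

CREATE TABLE mytable_dev CLONE mytable;      -- metadata-only copy, no storage cost
CREATE SCHEMA myschema_dev CLONE myschema;
CREATE DATABASE mydb_dev CLONE mydb;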
Tasks
Definition: Schedulable scripts that are run inside your Snowflake environment.
When they run: Tasks run on a schedule.
Execution: They execute a single SQL statement, including a call to a Stored Procedure.
Duration: Maximum duration of 60 minutes by default.
Tree of tasks: Each task can have a maximum of 100 child tasks. A tree of tasks can have a maximum of 1000 tasks, including the root one.
Task History: Query the history of task usage within a specified date range.
Serverless Tasks: Compute resources are automatically scaled up or down by Snowflake as required for each workload.
Note: Snowflake ensures only one instance of a task with a schedule is executed at a given time. If a task is still running when the next scheduled execution time occurs, that scheduled time is skipped.
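A sketch of a two-task tree on a schedule (warehouse, table, and procedure names are hypothetical). Tasks are created suspended, so they must be resumed, children first:

CREATE OR REPLACE TASK root_task
  WAREHOUSE = my_wh                  -- omit WAREHOUSE for a Serverless Task
  SCHEDULE = '5 MINUTE'
AS
  INSERT INTO staging SELECT * FROM raw_events;

CREATE OR REPLACE TASK child_task
  WAREHOUSE = my_wh
  AFTER root_task                    -- runs when root_task finishes
AS
  CALL my_stored_procedure();

ALTER TASK child_task RESUME;        -- children first, then the root
ALTER TASK root_task RESUME;

SELECT * FROM TABLE(INFORMATION_SCHEMA.TASK_HISTORY());  -- Task History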
Warehouses
A Virtual Warehouse is a cluster of computing resources in Snowflake that provides CPU, memory, and temporary storage to perform queries and DML operations. While a warehouse is running, it consumes Snowflake credits. It utilizes per-second billing (with a 60-second minimum each time the warehouse starts).
Size: Impacts the amount of time required to execute queries.
Multi-Cluster Warehouses: Scale compute resources to manage query concurrency.
Multi-Cluster Warehouse Modes: Maximized & Auto-scale.
Scaling: Scale up/down to increase performance. Scale out/in to improve concurrency for users/queries.
Scaling policies: Standard & Economy.
Auto Suspend & Auto Resume: Enabled by default.
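A multi-cluster warehouse sketch tying these knobs together (the name is hypothetical):

CREATE WAREHOUSE my_wh
  WAREHOUSE_SIZE = 'MEDIUM'          -- size: affects query execution time
  MIN_CLUSTER_COUNT = 1              -- min < max gives Auto-scale mode
  MAX_CLUSTER_COUNT = 3              -- (min = max would be Maximized mode)
  SCALING_POLICY = 'ECONOMY'         -- or 'STANDARD'
  AUTO_SUSPEND = 300                 -- seconds of inactivity before suspending
  AUTO_RESUME = TRUE;                -- wake up on the next query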
Transactions
ACID. A sequence of SQL statements that are committed or rolled back as a unit. Things we need to know for the exam:
1) Snowflake aborts a transaction after 4 hours if we do not abort it ourselves with SYSTEM$ABORT_TRANSACTION.
2) Each transaction has an independent scope.
3) Snowflake does not support Nested Transactions.
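A minimal explicit transaction, plus the abort function from point 1 (the table and transaction ID are hypothetical):

BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;                                          -- or ROLLBACK; both statements act as one unit
SELECT SYSTEM$ABORT_TRANSACTION(1530042000000);  -- abort a running transaction by its ID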
Other Commands
PUT: UPLOAD files from a local directory/folder on a client machine into internal stages.
GET: DOWNLOAD files from a Snowflake internal stage into a directory/folder on a client machine.
Example: GET @my_int_stage file:///tmp/data/;
Note: You cannot use either of these commands from the Snowflake Web UI.
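The PUT counterpart to the GET example above, run from SnowSQL or another client (paths and names hypothetical):

PUT file:///tmp/data/mydata.csv @my_int_stage;   -- upload into a named internal stage
PUT file:///tmp/data/mydata.csv @%mytable;       -- or into a table stage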
Resource Monitors
Credit Quota: Snowflake credits allocated to the monitor for the specified frequency interval.
Monitor Level: Monitor the credit usage for your entire Account or individual Warehouses.
Schedule: When the monitor is going to start monitoring.
Actions: What to do when the threshold is reached. Notify (send notification), Notify & Suspend (suspend warehouse), or Notify & Suspend Immediately (kill query).
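A monitor sketch with the three action types (name and quota are hypothetical; creating resource monitors requires ACCOUNTADMIN):

CREATE RESOURCE MONITOR my_monitor WITH
  CREDIT_QUOTA = 100                       -- credits per frequency interval
  FREQUENCY = MONTHLY
  START_TIMESTAMP = IMMEDIATELY            -- Schedule: start monitoring now
  TRIGGERS
    ON 75 PERCENT DO NOTIFY                -- Notify
    ON 90 PERCENT DO SUSPEND               -- Notify & Suspend (running queries finish)
    ON 100 PERCENT DO SUSPEND_IMMEDIATE;   -- Notify & Suspend Immediately (kill queries)
ALTER WAREHOUSE my_wh SET RESOURCE_MONITOR = my_monitor;  -- warehouse-level monitor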