Automate Your SQL Execution With Snowflake Tasks
Tired of manually running the same SQL scripts day after day? Struggling to
orchestrate complex ELT workflows that ingest, transform, and load data on schedule?
Automating repetitive and routine data tasks is crucial for efficiently managing modern data
pipelines. Snowflake Tasks provides a powerful automation framework to simplify
scheduling and managing Snowflake SQL execution, enabling you to build automated,
continuous ETL processes that run like clockwork.
In this article, we will walk through how to create, schedule, monitor, and manage
Snowflake Tasks for automating Snowflake SQL execution. We will cover key concepts like
Snowflake task types, scheduling mechanisms, task dependencies, and more.
Snowflake tasks allow you to automate the execution of Snowflake SQL statements, stored
procedures, and UDFs on a scheduled basis. They are useful for automating repetitive data
management and ELT processes in Snowflake. Snowflake tasks provide a framework for
scheduling multi-step data transformations, loading data incrementally, maintaining data
pipelines, and ensuring downstream data availability for analytics and applications.
Snowflake tasks are decoupled from specific users, so they continue to run even if the
user who created them is no longer available. They can also run serverless, in which case
Snowflake provisions only the compute resources needed to run the task and releases them
when the task finishes, which can help reduce costs.
Snowflake tasks allow you to automate Snowflake SQL statements, stored procedures, data
load operations, and more.
Only one SQL statement (or stored procedure call) is allowed per Snowflake task.
Snowflake tasks support dependencies, so you can chain together a sequence of operations.
Snowflake tasks can be monitored in real-time as they execute. You can view Snowflake
tasks history, status, and results within Snowflake.
Permissions on Snowflake tasks allow you to control who can create, modify, run, or view
them.
Snowflake handles all task dispatching, parallelism, queuing, and retry handling.
Serverless Snowflake tasks auto-scale across Snowflake-managed compute resources.
Common use cases for Snowflake tasks include ELT processes/pipelines, refreshing
materialized views, scheduling queries to update dashboards, and orchestrating multi-step
workflows.
Snowflake Tasks provide a simple way to schedule and automate the execution of SQL
statements, stored procedures, and UDFs. Tasks remove the need to rely on external
schedulers or third-party workflow tools/engines.
CREATE TASK my_task
  WAREHOUSE = MY_WH
  SCHEDULE = '60 MINUTE'  -- schedule value is illustrative
AS
-- SQL statement
Step 4: Make sure to specify the virtual warehouse to use for running the task. This should
have sufficient resources for the Snowflake Tasks operation.
With serverless Snowflake tasks, Snowflake automatically manages the compute resources
required. You don't have to specify a warehouse.
CREATE TASK my_serverless_task
  USER_TASK_MANAGED_INITIAL_WAREHOUSE_SIZE = 'XSMALL'
  SCHEDULE = '60 MINUTE'  -- schedule value is illustrative
AS
-- SQL statement
With user managed Snowflake tasks, you specify an existing virtual warehouse when
creating the task. This allows you full control over the compute resources.
For example:
CREATE TASK my_task
  WAREHOUSE = my_wh
  SCHEDULE = '60 MINUTE'  -- schedule value is illustrative
AS
-- SQL statement
1) NON-CRON notation (Interval-based scheduling)
NON-CRON notation allows interval-based scheduling. You specify a fixed time interval at
which the task should run.
The downside is that you cannot specify a particular runtime. The task will run at an interval
relative to its start time.
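For example, the following task (name and body are illustrative) runs every 15 minutes:

```sql
CREATE TASK my_interval_task
  WAREHOUSE = my_wh
  SCHEDULE = '15 MINUTE'  -- interval in minutes, relative to the task's start time
AS
SELECT CURRENT_TIMESTAMP();
```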
2) CRON notation (Time-based scheduling)
CRON notation provides powerful, time-based scheduling. You can specify a particular time
for the task to be executed.
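A cron schedule in Snowflake takes the general form shown in the comment below; the concrete value (illustrative) runs daily at 9:00 AM UTC:

```sql
-- General form: SCHEDULE = 'USING CRON * * * * * <time_zone>'
SCHEDULE = 'USING CRON 0 9 * * * UTC'
```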
The asterisks represent minute, hour, day of month, month and day of week respectively.
Here are some useful websites that can help you better understand cron scheduling and
generate cron expressions:
Crontab.guru: Interactive cron expression generator and explainer, which allows you to
visually build cron schedules.
EasyCron: Online cron expression generator with predefined cron schedule examples.
CronMaker: A simple cron generator with an interactive interface.
CronTab-generator: Helps you build cron expressions online along with examples.
Step-by-Step Process of Managing Snowflake Tasks
Once Snowflake tasks are created, you can control the state of a Snowflake task using
the ALTER TASK command.
But first, to check the status of all your Snowflake tasks, you can use the SHOW TASKS
command.
SHOW TASKS;
This returns metadata for each task, including its name, state, schedule, warehouse, and
owner; per-run details such as start time, end time, and failures are available through the
TASK_HISTORY() function.
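To change a task's state (task name illustrative), suspend or resume it with ALTER TASK; note that newly created tasks start out suspended and must be resumed before they run:

```sql
ALTER TASK my_task RESUME;   -- start running on schedule
ALTER TASK my_task SUSPEND;  -- pause the schedule
```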
Snowflake Tasks Trees allow you to create dependencies between tasks to automatically
execute them in a sequential workflow. The key components of a Snowflake Tasks Tree are
a root parent task, child tasks, task dependencies, a single task owner, and an overall tree
structure.
The root task in Snowflake sits at the pinnacle of the tree. It operates on a defined schedule,
which can be set using either CRON or interval-based notation. This root task initiates the
entire workflow, running autonomously based on its schedule. Every other task in the tree is
either directly or indirectly dependent on this root task.
Child tasks stem from the root or other parent tasks. Unlike the root, child tasks don't
require their own schedule. Instead, they're set to run "after" a designated parent task has
finished. This configuration establishes the task dependencies that structure the tree.
The dependencies control the order of execution — each child task will only start after its
defined parent completes successfully, which enables automatically cascading and
orchestrating tasks into a larger sequential process flow.
Every task within the tree must have a common task owner—a role equipped with the
necessary privileges. Also, all Snowflake tasks should be located within the same database
and schema.
From a structural standpoint, the tree resembles an inverted tree hierarchy, with the root at
the top and child tasks branching out and forming layers beneath.
[Diagram: Snowflake tasks tree]
Snowflake Task Tree Limitations: a single task tree can contain at most 1,000 tasks in total
(including the root task), and an individual task can have at most 100 predecessor tasks and
100 child tasks.
Here is a sample Snowflake task tree that runs a simple ETL pipeline:
CREATE TASK load_raw_data
  WAREHOUSE = COMPUTE_WH
  SCHEDULE = '60 MINUTE'
AS
COPY INTO raw_data FROM @s3_stage;

-- Target table names below are illustrative
CREATE TASK transform_data
  WAREHOUSE = COMPUTE_WH
  AFTER load_raw_data
AS
INSERT INTO transformed_data SELECT * FROM raw_data;

CREATE TASK load_analytics
  WAREHOUSE = COMPUTE_WH
  AFTER transform_data
AS
INSERT INTO analytics_data SELECT * FROM transformed_data;
This will load raw data from the stage on schedule, run the transformation once the load
completes, and then run the final step once the transformation completes.
The EXECUTE TASK command allows manually triggering a one-time run of a Snowflake
task outside of its defined schedule. This is useful for ad-hoc testing or execution of a task.
Step 1: Use the EXECUTE TASK command and specify the name of the task:
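For example, assuming a task named generate_report exists:

```sql
EXECUTE TASK generate_report;
```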
This will immediately run this task to generate the report without waiting for its defined
schedule.
Snowflake tasks offer a powerful way to implement continuous ELT pipelines by combining
them with Snowflake table streams. The Snowflake streams can capture real-time changes to
source tables, while the tasks process the change data incrementally.
Specifically, you can create a table stream on a source table to buffer INSERT, UPDATE,
and DELETE operations. A task can then be defined to poll the stream on a scheduled
interval using SYSTEM$STREAM_HAS_DATA(). This function checks if the stream has
any new change data.
If there is new data, the task will run a query to extract the changed rows from the stream.
For example, it can insert only the new INSERT rows into a separate audit table. If the
stream has no new data, then the scheduled task will simply skip the current run.
In this way, the stream acts as a change data capture buffer, while the task handles the
incremental processing. Together they provide an efficient way to build scalable ELT
pipelines that react to real-time changes in the source system. The task polling model
ensures that load on the source is minimized by only querying for new changes at defined
intervals.
TL;DR: Snowflake streams and Snowflake tasks provide an efficient solution for
continuous data integration workflows, addressing the challenge of managing and
responding to ongoing data changes.
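For example, an append-only stream (source table name illustrative) buffers the INSERT rows on a source table:

```sql
CREATE STREAM my_insert_stream ON TABLE my_source_table APPEND_ONLY = TRUE;
```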
This will create a stream that will buffer all INSERT operations on the table.
-- Task and target table names are illustrative
CREATE TASK process_inserts
  WAREHOUSE = my_wh
  SCHEDULE = '5 MINUTE'
WHEN
  SYSTEM$STREAM_HAS_DATA('my_insert_stream')
AS
INSERT INTO audit_table SELECT * FROM my_insert_stream;
This will allow the task to detect if there are any new inserts in the stream.
Step 4: If new data is available, query the stream to retrieve the new INSERT rows.
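A query along these lines (table names illustrative) consumes the new rows from the stream:

```sql
INSERT INTO audit_table
SELECT * FROM my_insert_stream;
```

Note that consuming the stream in a DML statement advances its offset, so each change is processed only once.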
This will insert all the new rows from the stream into the audit table.
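Step 5: Resume the task (name illustrative) so it starts polling the stream on schedule:

```sql
ALTER TASK process_inserts RESUME;
```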
This will start the task and begin processing insert data from the stream.
Snowflake tasks provide a convenient way to schedule the execution of stored procedures
automatically. By defining a task that calls a procedure on a timed schedule, you can set up
regular and recurring Snowflake SQL execution logic encapsulated within procedures. This
avoids having to manually call the procedures each time.
Step 1: Create a stored procedure containing the Snowflake SQL logic that needs to run on
a schedule:
CREATE OR REPLACE PROCEDURE my_stored_proc()
  RETURNS STRING
  LANGUAGE javascript
AS
$$
// Procedure logic
return 'done';
$$;
Step 2: Create a task (name illustrative) that calls the procedure on a schedule:
CREATE TASK my_proc_task
  WAREHOUSE = my_wh
  SCHEDULE = '60 MINUTE'
AS
CALL my_stored_proc();
The schedule interval defines how often the task will execute. Here it is set to hourly.
Step 3: Verify the task was created successfully by checking the task list:
SHOW TASKS;
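Step 4: Resume the task (name illustrative) so the schedule takes effect, since newly created tasks are suspended by default:

```sql
ALTER TASK my_proc_task RESUME;
```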
Step 5: Finally, review Task History to inspect the execution details and history of the
Snowflake tasks by using TASK_HISTORY() function.
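For example, to inspect the ten most recent task runs:

```sql
SELECT *
FROM TABLE(INFORMATION_SCHEMA.TASK_HISTORY())
ORDER BY SCHEDULED_TIME DESC
LIMIT 10;
```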
Conclusion
Snowflake Tasks enable robust native scheduling and automation of SQL statements, stored
procedures, and orchestrated pipelines within Snowflake. Key features include configurable
scheduling, dependency management, incremental workflows via Snowflake table streams,
monitoring, and troubleshooting. With simple commands to manage tasks, Snowflake Tasks
provide powerful workflow automation without external tools. In this article, we provided
an in-depth overview of Snowflake Tasks and a comprehensive guide to creating, scheduling,
monitoring, and managing them.
FAQs
What types of SQL statements can Snowflake Tasks execute?
Snowflake tasks can execute INSERT, UPDATE, MERGE, DELETE statements as well as
call stored procedures. They allow most DML operations useful for ETL or data
manipulation.
Can a Snowflake task contain multiple SQL statements?
No, a Snowflake task can only contain one SQL statement or call to a stored procedure. For
multi-statement workflows, create a stored procedure and invoke it from the task.
Is there a limit on the number of Snowflake tasks you can create?
There is no hard limit on the number of tasks per account; it depends on warehouse sizes.
There are recommended limits of around 200 tasks per virtual warehouse.
Can Snowflake tasks run code in languages other than SQL?
Snowflake tasks can only execute SQL statements and stored procedures directly; logic in
other languages such as Python or Java must either be wrapped in a stored procedure or
orchestrated by an external scheduler.
What is a DAG in the context of Snowflake Tasks?
Directed Acyclic Graph (DAG) is a series of tasks with a single root task and additional
tasks organized by their dependencies. DAGs flow in one direction, ensuring tasks later in
the series don't trigger earlier tasks.
What happens when the root task of a DAG is suspended?
When the root task of a DAG is suspended, you can still resume or suspend any child tasks.
If a DAG runs with suspended child tasks, those tasks are ignored during the run.
How does Snowflake handle task versioning?
When a task is first resumed or manually executed, an initial version is set. After a
Snowflake task is suspended and modified, a new version is set upon resumption or manual
execution.
Can session parameters be set for Snowflake tasks?
Yes, session parameters can be set for the session in which a task runs using the ALTER
TASK command. However, tasks do not support account or user parameters.
What does the EXECUTE TASK command do?
The EXECUTE TASK command manually triggers a single run of a scheduled task, which is
useful for testing Snowflake tasks before enabling them in production.
How can you view Snowflake task history?
Task history can be viewed using SQL or Snowsight. Roles with specific privileges, like
ACCOUNTADMIN, can use the TASK_HISTORY() function to view task history.
How are Snowflake task costs billed?
Costs vary based on the compute resource source. User-managed warehouses are billed
based on warehouse usage, while Snowflake-managed resources are billed based on actual
compute resource usage.
How does Snowflake handle task scheduling and Daylight Saving Time?
The cron expression in a task definition supports specifying a time zone, and tasks run
according to the local time for that zone. Special care is needed for time zones that
recognize daylight saving time to avoid unexpected task executions.
How does Snowflake ensure that Snowflake tasks are executed on schedule?
Snowflake ensures only one instance of a task with a schedule is executed at a given time. If
a task is still running when the next scheduled execution time occurs, that scheduled time is
skipped.
Pramit Marattha