Deploy Workloads with Databricks Workflows
Module 05
Introduction to Workflows
Building and Monitoring Workflow Jobs
DE 5.1 - Scheduling Tasks with the Jobs UI
DE 5.2L - Jobs Lab
Orchestration of Dependent Jobs
● Jobs running on schedule, containing dependent tasks/steps
Machine Learning Tasks
● Run MLflow notebook task in a job
Arbitrary Code, External API Calls, Custom Tasks
● Run tasks in a job which can contain JAR file, Spark Submit, Python script, SQL task, dbt
Data Ingestion and Transformation
● ETL jobs, support for batch and streaming, built-in data quality constraints, monitoring & logging
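A scheduled job with dependent tasks can be sketched as a plain Python dictionary. This is a minimal illustration, assuming the payload shape of the Databricks Jobs API 2.1 (`schedule.quartz_cron_expression`, `tasks[].task_key`, `tasks[].depends_on`); check the field names against the current API reference before use. The job name, notebook paths, and script path are hypothetical, and compute settings are left out for brevity.

```python
import json


def make_job_spec(name, cron, timezone="UTC"):
    """Build a job definition with a schedule and three dependent tasks.

    Field names follow the Jobs API 2.1 as assumed here; all paths are
    hypothetical placeholders.
    """
    return {
        "name": name,
        # Quartz cron: seconds minutes hours day-of-month month day-of-week
        "schedule": {
            "quartz_cron_expression": cron,
            "timezone_id": timezone,
        },
        "tasks": [
            {
                "task_key": "ingest",
                "notebook_task": {"notebook_path": "/Repos/demo/ingest"},
            },
            {
                "task_key": "train",
                # Runs only after "ingest" succeeds
                "depends_on": [{"task_key": "ingest"}],
                "notebook_task": {"notebook_path": "/Repos/demo/train_mlflow"},
            },
            {
                "task_key": "report",
                # Runs only after "train" succeeds
                "depends_on": [{"task_key": "train"}],
                "spark_python_task": {"python_file": "dbfs:/demo/report.py"},
            },
        ],
    }


# Daily at 06:00 in the given time zone
spec = make_job_spec("demo_nightly_job", "0 0 6 * * ?")
print(json.dumps(spec, indent=2))
```

The same dictionary mixes a notebook task and a Python script task, which mirrors the "arbitrary code" use case: each task declares its own type, while `depends_on` alone determines execution order.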
Sequence
● Data transformation/processing/cleaning
● Bronze/silver/gold tables
Funnel
● Multiple data sources
● Data collection
Fan-out, star pattern
● Single data source
● Data ingestion and distribution
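The three patterns differ only in how tasks depend on one another. The sketch below expresses each one as a list of tasks with a `depends_on` field (the shape assumed from the Jobs API 2.1) and uses the standard library's `graphlib` (Python 3.9+) to derive a valid run order. The helper names and task keys are hypothetical.

```python
from graphlib import TopologicalSorter


def _task(key, upstream=()):
    """One task entry; `upstream` lists the task keys it waits for."""
    return {"task_key": key, "depends_on": [{"task_key": u} for u in upstream]}


def sequence(keys):
    """Sequence: each task depends on the one before it."""
    tasks, previous = [], None
    for key in keys:
        tasks.append(_task(key, [previous] if previous else []))
        previous = key
    return tasks


def funnel(sources, sink):
    """Funnel: several independent sources feed one downstream task."""
    return [_task(s) for s in sources] + [_task(sink, sources)]


def fan_out(source, branches):
    """Fan-out (star): one source feeds several independent tasks."""
    return [_task(source)] + [_task(b, [source]) for b in branches]


def run_order(tasks):
    """Return one valid execution order that respects every dependency."""
    graph = {
        t["task_key"]: {d["task_key"] for d in t["depends_on"]} for t in tasks
    }
    return list(TopologicalSorter(graph).static_order())


medallion = sequence(["bronze", "silver", "gold"])
collect = funnel(["orders_api", "clicks_stream"], "collect")
distribute = fan_out("ingest", ["to_sales", "to_finance"])

print(run_order(medallion))
print(run_order(collect))
print(run_order(distribute))
```

In the funnel and fan-out cases the independent tasks have no required order among themselves, so a scheduler is free to run them in parallel; only the position of the shared sink or source is fixed.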
ML feature extraction (e.g. MLflow)
[Diagram labels: Tasks, Job run]
©2023 Databricks Inc. — All rights reserved
Monitoring and Debugging
Repair a Failed Job Run
● Use the Runs tab to view completed or active runs for the job
● Use the Tasks tab to modify or add tasks to the job
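Repairing a failed job run can also be done programmatically rather than through the UI. The sketch below only builds the HTTP request and does not send it. It assumes the Jobs API 2.1 endpoint `POST /api/2.1/jobs/runs/repair` with `run_id` and `rerun_tasks` fields; verify both against the current API reference. The workspace URL, token, run ID, and task keys are hypothetical.

```python
import json
from urllib.request import Request


def build_repair_request(host, token, run_id, failed_task_keys):
    """Build (but do not send) a request that re-runs selected tasks of a run.

    Endpoint path and field names are assumptions based on the Jobs API 2.1.
    """
    payload = {
        "run_id": run_id,
        # Only the listed tasks are re-run; successful tasks are left alone
        "rerun_tasks": list(failed_task_keys),
    }
    return Request(
        url=f"{host.rstrip('/')}/api/2.1/jobs/runs/repair",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_repair_request(
    host="https://example.cloud.databricks.com",  # hypothetical workspace URL
    token="<personal-access-token>",              # placeholder, not a real token
    run_id=12345,                                 # hypothetical run ID
    failed_task_keys=["silver"],
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would submit it. Re-running only the failed tasks avoids repeating work that already succeeded in the original run.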