Azure Data Engineering Course Content Day Wise.
Day-1
*****
Day-2
*****
Day-3
*****
Authentication methods.
Account key.
SAS (Shared Access Signature).
Service principal.
Managed identity.
Understanding of ACL (Access Control List).
Built-in roles, custom roles.
ADF (Azure Data Factory) - walkthrough of all the options in Data Factory.
What is integration runtime? Different types and uses.
What linked services are and how to create them using various methods.
What a dataset is and how to create one in different ways.
Different types of activities available in ADF.
Practical exercises on simple copy activities: database to database, database to Azure Data Lake
Storage (ADLS), and ADLS to ADLS.
Day-8
*****
Deep dive into copy activity: understanding all the tabs and options available, and pipeline optimization.
Day-9
*****
Day-10
*****
Explore the ForEach, If Condition, Switch, Until, Execute Pipeline, Validation, Filter, Set Variable,
Append Variable, and Delete activities.
Day-11
*****
Work with activities such as the Stored Procedure, Lookup, and Get Metadata activities.
Day-12
******
Day-13
*******
Explore triggers in ADF, including schedule triggers, tumbling window triggers, and event-based triggers.
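To make the tumbling idea concrete, here is a minimal plain-Python sketch of fixed-size, non-overlapping, back-to-back time windows, which is the kind of interval such a trigger fires for. This is not ADF code; the function name tumbling_windows and the sample dates are hypothetical.

```python
from datetime import datetime, timedelta

def tumbling_windows(start, end, size):
    """Yield (window_start, window_end) pairs that cover [start, end) back to back."""
    current = start
    while current + size <= end:
        yield current, current + size
        current += size  # the next window starts exactly where this one ended

# Hypothetical example: three hours split into one-hour windows.
windows = list(
    tumbling_windows(
        datetime(2024, 1, 1, 0),
        datetime(2024, 1, 1, 3),
        timedelta(hours=1),
    )
)

for w_start, w_end in windows:
    print(w_start.isoformat(), "->", w_end.isoformat())
```

Each window would typically drive one pipeline run that processes only the data falling inside that interval.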
Day-14
******
Understand full load and incremental data loading, and the various methods to achieve each.
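One common way to load incrementally is a high-watermark on a last-modified column. The plain-Python sketch below contrasts it with a full load. The row layout, the updated_at column, and the function names are assumptions made for illustration, not part of any ADF API.

```python
from datetime import datetime

# Hypothetical source rows; in practice these would come from a database table.
source_rows = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 5)},
    {"id": 3, "updated_at": datetime(2024, 1, 9)},
]

def full_load(rows):
    """Full load: copy every row, regardless of when it last changed."""
    return list(rows)

def incremental_load(rows, last_watermark):
    """Incremental load: copy only rows changed after the stored watermark."""
    changed = [r for r in rows if r["updated_at"] > last_watermark]
    # Advance the watermark to the newest change seen; keep it if nothing changed.
    new_watermark = max((r["updated_at"] for r in changed), default=last_watermark)
    return changed, new_watermark

changed, watermark = incremental_load(source_rows, datetime(2024, 1, 3))
print([r["id"] for r in changed], watermark)
```

The new watermark is stored after each run so that the next run picks up only rows modified since then.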
Day-15
******
Day-16
******
Day-17
******
Understand DBFS (Databricks File System) and mounting, including different mounting methods
(using account key, SAS, service principal).
Understand dbutils (Databricks Utilities).
Day-18
******
Introduction to Python.
Understand Python data types theoretically: string, int, list, tuple, set, dictionary.
Conditions and loops: if condition, while loop, for loop.
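A short sketch of the data types and control flow listed above. All variable names and values are illustrative only.

```python
name = "adls"              # str: text
count = 3                  # int: whole number
zones = ["raw", "clean"]   # list: ordered and mutable
point = (1, 2)             # tuple: ordered and immutable
tags = {"a", "b", "a"}     # set: keeps unique elements only
config = {"env": "dev"}    # dict: key-value pairs

# if condition
if count > 2:
    size = "large"
else:
    size = "small"

# for loop: visit each element of a list
upper_zones = []
for zone in zones:
    upper_zones.append(zone.upper())

# while loop: repeat until the condition becomes false
countdown = []
n = count
while n > 0:
    countdown.append(n)
    n -= 1

print(size, upper_zones, countdown, len(tags))
```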
Day-19
******
Lists, list-related methods, list comprehensions.
Functions: How to create functions, parameterization of functions.
Lambda functions.
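The topics above can be sketched in a few lines. The names scale and add are hypothetical and exist only for this example.

```python
numbers = [3, 1, 2]
numbers.append(4)   # list method: add one element at the end
numbers.sort()      # list method: sort in place

# List comprehension: squares of the even numbers only.
squares_of_even = [n * n for n in numbers if n % 2 == 0]

def scale(values, factor=2):
    """Return a new list with every value multiplied by factor (default 2)."""
    return [v * factor for v in values]

# Lambda: a small anonymous function bound to a name for demonstration.
add = lambda a, b: a + b

print(numbers, squares_of_even, scale(numbers), scale(numbers, factor=10), add(2, 3))
```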
Day-20
*******
How to pass a function as a parameter to another function.
Python built-in and standard-library functions: map, filter, functools.reduce, and so on.
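A minimal sketch of both ideas. Note that in Python 3, map and filter are built-ins, while reduce is imported from functools. The helper names apply_twice and increment are hypothetical.

```python
from functools import reduce

def apply_twice(func, value):
    """Call func on value, then call it again on the result."""
    return func(func(value))

def increment(x):
    return x + 1

values = [1, 2, 3, 4]

doubled = list(map(lambda x: x * 2, values))         # transform every element
evens = list(filter(lambda x: x % 2 == 0, values))   # keep elements that pass a test
total = reduce(lambda acc, x: acc + x, values, 0)    # fold all elements into one value

print(apply_twice(increment, 5), doubled, evens, total)
```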
Day-21
******
Tuple, set, and dictionary methods.
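A few commonly used methods of each type, as an illustrative sketch with made-up values.

```python
# Tuple methods
point = (1, 2, 2, 3)
twos = point.count(2)           # how many times 2 appears
position_of_3 = point.index(3)  # index of the first 3

# Set methods
a = {1, 2, 3}
b = {3, 4}
union = a.union(b)
common = a.intersection(b)
only_in_a = a.difference(b)

# Dictionary methods
settings = {"env": "dev"}
settings.update({"region": "eastus"})     # add or overwrite keys
env = settings.get("env")                 # read a key
owner = settings.get("owner", "unknown")  # fall back when the key is missing
keys = list(settings.keys())

print(twos, position_of_3, union, common, only_in_a, env, owner, keys)
```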
Day-21
******
Explore serialization and deserialization.
In-depth understanding of different big data file formats: Parquet, Avro, ORC, CSV, JSON, Delta.
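Serialization turns an in-memory object into a storable or transmittable form, and deserialization reverses it. The sketch below uses JSON from the Python standard library as the simplest case; the sample record is made up.

```python
import json

record = {"id": 1, "name": "sample", "tags": ["a", "b"]}

text = json.dumps(record)     # serialize: Python dict -> JSON string
restored = json.loads(text)   # deserialize: JSON string -> Python dict

print(type(text).__name__, restored == record)
```

Binary columnar formats such as Parquet follow the same round-trip idea but need a separate library to read and write.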
Day-22
******
Learn how to read different file formats using PySpark.
Write data to different file formats.
Explore options for each file format.
Day-26
******
Understand RDD (Resilient Distributed Dataset) and a few important RDD functions.
Day-29
******
What is MapReduce.
Brief understanding of HDFS architecture.
Why Hive came into the picture.
Unity Catalog.
What a metastore and a catalog are.
Managed table vs. external table.
In-depth understanding of Spark architecture, covering lazy evaluation, fault tolerance, DAG
(Directed Acyclic Graph), lineage, checkpointing.
Wide and narrow transformations.
Types of clusters and modes of clusters in Databricks.
What autoscaling and jobs are.
Catalyst optimizer.
Day-33
*******
Practical understanding of concepts like cache, persist, broadcast, accumulator, and df.explain.
Day-34
******
Spark job debugging.
Medallion architecture.
Workflows.
Day-35
******
Delta Live Tables and Unity Catalog.
Day-36
******
Data modelling at a high level: conceptual, logical, and physical data models; facts and dimensions; star
and snowflake schemas; normalization and denormalization.
Day-37
******
Data flows in ADF
Day-38
******
Day-39
******
Git configuration in Databricks and ADF.
Creating branches, understanding the main branch, feature/common branches, developer branches.
Day-40
********
Resume building and discussion of important questions.