Azure Data Engineer

The document discusses various Azure data and analytics services including Azure Data Factory, Azure Databricks, PySpark, Python and SQL Server. It provides an overview and examples of key concepts and tasks for extracting, transforming and loading data using these services.

Uploaded by

T Pavann
Copyright
© All Rights Reserved

Basics:


⮚ On-premises vs. cloud
⮚ Azure cloud, region, zone
⮚ Portal creation
⮚ Hierarchy
⮚ Subscriptions
⮚ Resource groups
⮚ Virtual network(vnet)
⮚ Storage accounts (LRS, GRS, ZRS, GZRS)
⮚ Key vaults
1. Azure data factory (Extract Transform Load)
⮚ Azure data factory service creation
⮚ Integration runtimes – the compute that moves data from source to destination
✔ Azure IR
✔ Self-hosted IR
✔ SSIS IR
⮚ Linked services – connection information (such as a connection string) for a data store
⮚ Datasets – named references to the data that activities read or write
⮚ Pipelines
⮚ Activities
✔ Lookup activity
✔ Get metadata activity.
✔ Filter activity
✔ If Condition activity
✔ ForEach activity
✔ Copy activity
✔ Stored procedure activity

Scenario 1: Filter activity in ADF (dynamic copy).

Scenario 2: Get file names from a folder dynamically.

Scenario 3: Copy activity behavior.

Scenario 4: Validate copied data between source and sink in ADF.

Scenario 5: Load data from an on-premises SQL Server to Azure SQL DB in Azure Data Factory.

Scenario 6: Copy data from on-premises to Azure SQL DW with PolyBase and with Bulk Insert.

Scenario 7: Copy data from an on-premises file system to ADLS Gen2 (install the self-hosted IR).

Scenario 8: Copy data from Salesforce to ADLS Gen2 using ForEach and Copy activities in ADF V2.

⮚ Azure SQL
✔ Introduction
✔ Creating first SQL database deployment
✔ Copying data from Azure SQL to Azure Blob storage.
✔ Copy multiple tables in bulk with Lookup and ForEach activities in ADF.
✔ Use the ForEach activity to copy multiple tables.
✔ Incremental load and delta load from SQL to Blob storage.
⮚ Triggers
✔ Event based trigger.
✔ Scheduled trigger
✔ Tumbling window trigger
⮚ Azure data flows
⮚ Slowly changing dimensions type1 and type2
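
The incremental (delta) load topic listed above is often built around a high-watermark: only rows modified after the last recorded timestamp are copied, and the watermark is then advanced. The following is a minimal sketch of that idea in plain Python, not an ADF pipeline. The row layout, the `modified` column, and the `incremental_load` helper are all hypothetical names chosen for illustration.

```python
from datetime import datetime

# Hypothetical source rows; in ADF these would come from a SQL table.
SOURCE_ROWS = [
    {"id": 1, "name": "alpha", "modified": datetime(2024, 1, 1, 9, 0)},
    {"id": 2, "name": "beta", "modified": datetime(2024, 1, 2, 9, 0)},
    {"id": 3, "name": "gamma", "modified": datetime(2024, 1, 3, 9, 0)},
]


def incremental_load(rows, last_watermark):
    """Return the rows changed after last_watermark and the new watermark."""
    delta = [r for r in rows if r["modified"] > last_watermark]
    # If nothing changed, the watermark stays where it was.
    new_watermark = max((r["modified"] for r in delta), default=last_watermark)
    return delta, new_watermark


# Assume the previous run finished at noon on 1 January 2024.
watermark = datetime(2024, 1, 1, 12, 0)
delta, watermark = incremental_load(SOURCE_ROWS, watermark)
print([r["id"] for r in delta])  # [2, 3]
```

Running the same function again with the updated watermark returns an empty delta, which is the property that makes repeated scheduled runs safe.
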

2. Azure Databricks


⮚ Introduction
⮚ Databricks creation
⮚ Workspace, data management, compute management
⮚ Types of clusters
⮚ Notebooks
⮚ Widgets
⮚ DBFS
⮚ Reading and writing files between Blob storage and Azure Databricks
∙ Excel, Parquet, XML, JSON
⮚ Reading and writing data between Azure SQL Database and Azure Databricks
3. PySpark
⮚ Spark and PySpark introduction
⮚ RDD
⮚ DataFrames
⮚ Joins and types.
⮚ Functions
⮚ Delta tables
⮚ SCD type 1 and type 2
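
The difference between SCD type 1 and type 2 listed above can be sketched without a cluster. The example below uses plain Python lists of dicts rather than PySpark or Delta tables, so it only illustrates the logic: type 1 overwrites the attribute, type 2 closes the current row and appends a new version. The column names (`key`, `start_date`, `end_date`, `is_current`) and both helper functions are assumptions made for this sketch.

```python
from datetime import date


def scd_type1(dim, key, new_attrs):
    """Type 1: overwrite attributes in place; no history is kept."""
    for row in dim:
        if row["key"] == key:
            row.update(new_attrs)
            return
    dim.append({"key": key, **new_attrs})


def scd_type2(dim, key, new_attrs, effective):
    """Type 2: expire the current row and append a new current version."""
    for row in dim:
        if row["key"] == key and row["is_current"]:
            if all(row.get(k) == v for k, v in new_attrs.items()):
                return  # nothing changed, keep the current version
            row["is_current"] = False
            row["end_date"] = effective
    dim.append({"key": key, **new_attrs, "start_date": effective,
                "end_date": None, "is_current": True})


# Type 1: the old city is lost.
t1 = [{"key": 1, "city": "Pune"}]
scd_type1(t1, 1, {"city": "Delhi"})

# Type 2: the old city is kept as an expired row.
t2 = [{"key": 1, "city": "Pune", "start_date": date(2023, 1, 1),
       "end_date": None, "is_current": True}]
scd_type2(t2, 1, {"city": "Delhi"}, date(2024, 1, 1))
print(len(t1), len(t2))  # 1 2
```

In a Databricks setting the same logic is usually expressed as a merge against a Delta table; that variant is not shown here.
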
4. Python
✔ Introduction
✔ Syntax
✔ Comments and variables, data types
✔ Operators, lists, tuple, sets, Dictionaries
✔ If else, while, for loop
✔ Functions
✔ Lambda
✔ Arrays
✔ Classes/Objects, Inheritance, Polymorphism
✔ Modules
✔ Exception handling
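
Several of the Python topics above (classes, inheritance, polymorphism, lambda, exception handling) fit into one short illustration. The class names and the `safe_divide` helper are hypothetical and exist only for this sketch.

```python
class Shape:
    """Base class; subclasses are expected to override area()."""

    def __init__(self, name):
        self.name = name

    def area(self):
        raise NotImplementedError("subclasses must implement area()")


class Rectangle(Shape):
    def __init__(self, width, height):
        super().__init__("rectangle")
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


class Square(Rectangle):
    """Inherits area() from Rectangle."""

    def __init__(self, side):
        super().__init__(side, side)
        self.name = "square"


shapes = [Rectangle(2, 3), Square(4)]
# Polymorphism: the same call works on every Shape subclass.
areas = [s.area() for s in shapes]
# Lambda: an inline function used as the sort key.
largest = max(shapes, key=lambda s: s.area())


def safe_divide(a, b):
    """Exception handling: return None instead of raising on division by zero."""
    try:
        return a / b
    except ZeroDivisionError:
        return None


print(areas, largest.name, safe_divide(1, 0))  # [6, 16] square None
```
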
5. SQL Server
✔ Introduction and syntax
✔ Create database, Drop database.
✔ Create table, drop, alter.
✔ Constraints, not null, unique, primary key, foreign key
✔ SELECT, SELECT DISTINCT, WHERE, AND / OR / NOT, ORDER BY, INSERT INTO, NULL values.
✔ UPDATE, DELETE, SELECT TOP, MIN and MAX, COUNT, AVG, SUM, LIKE, IN
✔ BETWEEN, aliases, joins, HAVING, stored procedures.
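
A few of the SQL topics above (constraints, joins, aliases, aggregates, HAVING, BETWEEN, ORDER BY) can be tried out together. The sketch below uses Python's built-in `sqlite3` module with an in-memory database, so the dialect is SQLite rather than SQL Server: `SELECT TOP n` would be written with `LIMIT n` there, and stored procedures are not available. Table names, column names, and data are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE orders (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    amount      REAL NOT NULL
);
INSERT INTO customers (id, name) VALUES (1, 'Asha'), (2, 'Ravi'), (3, 'Meena');
INSERT INTO orders (customer_id, amount) VALUES (1, 100.0), (1, 50.0), (2, 30.0);
""")

# Inner join with aliases, aggregates, and a HAVING filter on the group total.
rows = conn.execute("""
SELECT c.name AS customer, COUNT(o.id) AS order_count, SUM(o.amount) AS total
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.id
GROUP BY c.name
HAVING SUM(o.amount) BETWEEN 40 AND 200
ORDER BY total DESC
""").fetchall()

print(rows)  # [('Asha', 2, 150.0)]
```

Meena has no orders and drops out at the inner join; Ravi's total of 30 falls outside the BETWEEN range and is removed by HAVING.
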
