
This document outlines the steps to decompress files using Copy Activity in Azure Data Factory, detailing both single and multiple file decompression processes. It includes instructions for setting up Azure storage containers, creating datasets, configuring compression settings, and executing pipelines. The final outcome is the successful extraction of files into a specified output container in .csv format.

Steps to Decompress Files Using Copy Activity in Azure Data Factory

Praveen Patel | Azure Data Engineer


Steps to Decompress a Single File
Step 1
First, open the Azure Portal and go to your storage account to open the ADLS Gen2 account. Create two containers: container1 to hold the uploaded zip files and containeroutput to store the unzipped files.



Step 2
Go to container1 and upload zip files from your computer. I uploaded two files, companyhierarchy.csv.gz and samplefolder.zip; first I will show how to decompress the companyhierarchy.csv.gz file. If you click on the first file to open it, you see unreadable data.
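The unreadable preview is simply the raw gzip byte stream. As a quick illustration (using made-up CSV content standing in for companyhierarchy.csv.gz), Python's standard gzip module shows both the binary header the browser displays and the text that decompression recovers:

```python
import gzip

# Simulated CSV content, compressed the same way as companyhierarchy.csv.gz.
csv_text = "id,name,parent_id\n1,HQ,\n2,Sales,1\n"
compressed = gzip.compress(csv_text.encode("utf-8"))

# The raw bytes are what the storage browser shows: a binary gzip stream
# starting with the magic number 0x1f 0x8b, not readable as text.
print(compressed[:2].hex())  # -> 1f8b

# Decompressing recovers the original delimited text.
restored = gzip.decompress(compressed).decode("utf-8")
print(restored == csv_text)  # -> True
```

This is the same transformation the copy activity will perform for us once the compression type is configured on the source dataset.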



Step 3
Go back to the Azure Portal, open Azure Data Factory, and go to the Author tab. Click the '+' icon to create a new pipeline and add a Copy activity.



Step 4
Go to Source and create a new source dataset for ADLS Gen2. Click Continue and select DelimitedText as the format.



Step 5
Give the new dataset a name, select the linked service, then click the Browse button, go to container1, select the zip file, and proceed with OK.



Step 6
Once the new source dataset is created, click Open, go to Connection, and set the compression type to gzip (.gz) and the compression level to Fastest.
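One point worth noting: the compression level only affects how data is written. When the copy activity reads and decompresses a .gz source, any level setting yields the same decompressed output. A small sketch with Python's standard gzip module (the sample payload is made up) illustrates this:

```python
import gzip

# A repetitive, CSV-like payload.
data = b"column_a,column_b\n" * 1000

# "Fastest" corresponds to a low compression level (1 here); a higher
# level (9) trades speed for size. The level matters when compressing...
fast = gzip.compress(data, compresslevel=1)
best = gzip.compress(data, compresslevel=9)

# ...but decompression recovers identical data regardless of level.
print(gzip.decompress(fast) == data)  # -> True
print(gzip.decompress(best) == data)  # -> True

# Level 9 typically produces the smaller file for repetitive data.
print(len(best), len(fast))
```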



Step 7
Go to Sink and similarly create a new sink dataset for ADLS Gen2, selecting DelimitedText as the format. Give the new sink dataset a name, click the Browse button, select the containeroutput container to store the unzipped file, and click OK. Once the dataset is created, open it and leave everything in the Connection tab unchanged.



Step 8
Now execute the pipeline in Debug mode. Once the pipeline runs successfully, go back to the ADLS storage account.



Step 9
Go to the containeroutput container; you will see the unzipped file in .csv format. When you click on the file, you see tabular data.
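What the copy activity did can be sketched locally: decompress the gzip source and treat the result as delimited text. The following uses Python's standard library, with hypothetical file contents standing in for companyhierarchy.csv.gz:

```python
import csv
import gzip
import io

# Hypothetical source content, mirroring the uploaded .csv.gz file.
source_gz = gzip.compress(b"id,name\n1,HQ\n2,Sales\n")

# What the copy activity effectively does: decompress on read and write
# the plain delimited text to the sink container.
sink_csv = gzip.decompress(source_gz).decode("utf-8")

# The sink file now parses as tabular data, as seen in the portal preview.
rows = list(csv.reader(io.StringIO(sink_csv)))
print(rows[0])  # -> ['id', 'name']
print(rows[1])  # -> ['1', 'HQ']
```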



Simple Steps to Decompress All Files Together

Earlier you saw how to decompress a single file; now you will see how to decompress multiple files together. Go back to the ADLS Gen2 account: container1 contains a Zipped Files folder holding four zip files, and next you will learn to decompress all of them.



Step 1
Go back to Azure Data Factory and reuse the same pipeline. Go to Source, open the source dataset, go to Connection, and click Browse to select the Zipped Files folder; do not select any individual zip file. Set the compression type to ZipDeflate (.zip) and the compression level to Fastest.



Step 2
Go to the copy activity's Source tab and set the file path type to Wildcard file path. In the wildcard file name, enter *.zip to pick up all the zip files; leave the wildcard folder path empty and keep the sink path as containeroutput.
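The *.zip wildcard selects every archive in the folder, and each one is expanded into its member files on copy. A local sketch of the same select-and-extract logic, using Python's standard library and made-up file names:

```python
import fnmatch
import io
import zipfile

# Hypothetical folder listing, like the Zipped Files folder in container1.
listing = ["file1.zip", "file2.zip", "notes.txt", "file3.zip", "file4.zip"]

# The *.zip wildcard matches only the zip archives, as in the ADF source.
matched = fnmatch.filter(listing, "*.zip")
print(matched)  # -> ['file1.zip', 'file2.zip', 'file3.zip', 'file4.zip']

def extract_members(zip_bytes):
    """Expand one archive into {member name: member bytes}."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}

# Round-trip check with an in-memory archive.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("data.csv", "a,b\n1,2\n")
members = extract_members(buf.getvalue())
print(list(members))  # -> ['data.csv']
```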



Step 3
Now execute the pipeline in Debug mode and wait 5 to 10 seconds until it completes successfully.



Step 4
Once the pipeline runs successfully, go back to the ADLS storage account and open the containeroutput container; you will see that all the files decompressed successfully.



Praveen Patel
Azure Data Engineer

Follow me to get more content like this.
