1 Vishal F Script


Hi, good morning/good evening.

Thanks for giving me this opportunity to introduce myself.


I am Vishal Ghonsikar. I am originally from Latur, Maharashtra, and I
completed my graduation from Pune University.
I have a total of 4.1 years of experience working as a Software Engineer.
I am currently working as an Azure Data Engineer at White Horse
Technologies, Bangalore,
where I specialize in designing and optimizing data pipelines using
Azure technologies like Data Factory, Databricks, and Snowflake, and big
data technologies like PySpark. I also have hands-on experience in
SQL and Python. In my latest project, I implemented scalable
data storage solutions using Azure Data Lake Storage Gen2, and
structured data for analytics using the Medallion architecture across
Bronze, Silver, and Gold layers in Databricks. Additionally, I have
optimized data storage and query performance by refining Gold layer
data and loading it into Snowflake, ensuring data consistency.
This is my short introduction.

Project 1
Client: KGL LOGISTICS
Description: KGL Logistics is a prominent logistics company headquartered in Kuwait,
specializing in integrated supply chain management services across the
Middle East. With over 40 years of experience, the company has
established itself as a key player in the logistics sector, offering a
comprehensive range of services tailored to meet diverse business needs.

Our client is aiming to "combine multiple smaller shipments into a
single, larger shipment to reduce shipping costs and improve
transportation efficiency." The goals are:

- Reduce shipping costs.
- Improve inventory management.
- Reduce carbon footprint.
- Simplify logistics.

My current project flow:

" WHITE HORSE TECHOLOGIES,Bangalore., I am working on an


LOGISTICS project where We follow Medallion Architecture which has
layers like bronze silver, and gold layer.
First of all, We get that data in CSV format only. we get data from the
client side in the ADLS raw container directly, which is our
landing container Then we copy that as data into our bronze
layer container by using copy activity in ADF. Then We
mount the ADLS with Databricks, to run that file in our
Databricks environment. In Databricks notebook in the
bronze layer we generally clean and validate data such as
data type checks, column renaming, and selecting specific
columns, handling null values, filling null values, removing
duplicates, segregating bad records like special characters.
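As a rough sketch, a Bronze layer notebook for this step could look something like the PySpark code below. The storage account, container, file, and column names (kglraw, shipments.csv, shipment_id, and so on) are placeholders I am assuming for illustration, not the actual project values.

# Bronze layer Databricks notebook (illustrative sketch, assumed names).
from pyspark.sql import functions as F

# One-time mount of the ADLS Gen2 bronze container into DBFS.
configs = {
    "fs.azure.account.auth.type": "OAuth",
    # remaining service principal / OAuth settings omitted for brevity
}
dbutils.fs.mount(
    source="abfss://bronze@kglraw.dfs.core.windows.net/",
    mount_point="/mnt/bronze",
    extra_configs=configs,
)

# Read the raw CSV copied in by the ADF Copy activity.
raw_df = spark.read.option("header", True).csv("/mnt/bronze/shipments.csv")

# Basic cleaning and validation: select columns, rename, cast types,
# fill and drop nulls, remove duplicates.
clean_df = (
    raw_df.select("shipment_id", "carrier_id", "origin", "destination",
                  "weight_kg", "ship_date")
          .withColumnRenamed("weight_kg", "weight")
          .withColumn("weight", F.col("weight").cast("double"))
          .withColumn("ship_date", F.to_date("ship_date", "yyyy-MM-dd"))
          .fillna({"weight": 0.0})
          .dropna(subset=["shipment_id"])
          .dropDuplicates(["shipment_id"])
)

# Segregate bad records, for example IDs containing special characters.
bad_df = clean_df.filter(~F.col("shipment_id").rlike("^[A-Za-z0-9_-]+$"))
good_df = clean_df.subtract(bad_df)

# The cleaned records are then written to the Silver layer container
# in Parquet format (the next step in the flow).
good_df.write.mode("overwrite").parquet("/mnt/silver/shipments")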

Then we store the cleaned data in the Silver layer container in
Parquet format. In the Silver layer, we do some transformations like
aggregating the data, using joins, filtering, etc., as per the
requirements of the client. If everything is fine, then we load the
transformed data into the Gold layer container in Parquet format.
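A minimal sketch of what such Silver layer transformations might look like follows. The carriers lookup data, the join key carrier_id, and the aggregation logic are assumptions for illustration, not the actual client requirements.

from pyspark.sql import functions as F

# Read the cleaned data written to the Silver layer container.
shipments = spark.read.parquet("/mnt/silver/shipments")
carriers = spark.read.parquet("/mnt/silver/carriers")  # hypothetical lookup data

# Example transformations: filter, join, and aggregate.
summary_df = (
    shipments.filter(F.col("weight") > 0)
             .join(carriers, on="carrier_id", how="left")
             .groupBy("carrier_id", "destination")
             .agg(
                 F.sum("weight").alias("total_weight"),
                 F.count("shipment_id").alias("shipment_count"),
             )
)

# Load the transformed data into the Gold layer container in Parquet format.
summary_df.write.mode("overwrite").parquet("/mnt/gold/shipment_summary")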

This Gold layer is where we make the final changes by converting the
validated data into fact and dimension tables, and we load the data
into ADLS in Parquet format.
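For example, a simple split into a dimension and a fact table in the Gold layer could be sketched like this; the table names dim_carrier and fact_shipments and the column lists are assumptions for illustration, not the project's actual model.

from pyspark.sql import functions as F

# Validated data sitting in the Gold layer container (illustrative path).
gold_df = spark.read.parquet("/mnt/gold/shipments_validated")

# Dimension table: one row per carrier, with a surrogate key.
dim_carrier = (
    gold_df.select("carrier_id", "carrier_name")
           .dropDuplicates(["carrier_id"])
           .withColumn("carrier_key", F.monotonically_increasing_id())
)

# Fact table: measures plus the foreign key to the dimension.
fact_shipments = (
    gold_df.join(dim_carrier.select("carrier_id", "carrier_key"), on="carrier_id")
           .select("shipment_id", "carrier_key", "ship_date", "weight", "shipping_cost")
)

# Write both tables to ADLS in Parquet format.
dim_carrier.write.mode("overwrite").parquet("/mnt/gold/dim_carrier")
fact_shipments.write.mode("overwrite").parquet("/mnt/gold/fact_shipments")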

So, our upstream data is in CSV format, and the downstream data is in
Parquet format.

After writing the final data to the Gold layer, we use Azure Event Grid
to create an event notification for that file and send it to the
handler; in our case, the handler is Snowflake. Whenever Snowflake gets
a notification, Snowpipe runs, and with the help of Snowpipe the Gold
layer data is loaded into Snowflake.
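On the Snowflake side, the setup could look roughly like the sketch below. The stage, integration, database, and table names are placeholders I am assuming; in practice this DDL is usually run once in a Snowflake worksheet, and it is wrapped in Python here only to keep the examples in one language.

import snowflake.connector

# Connect to Snowflake (credentials and object names are placeholders).
conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="my_password",
    warehouse="COMPUTE_WH", database="LOGISTICS_DB", schema="GOLD",
)
cur = conn.cursor()

# External stage pointing at the Gold layer container in ADLS.
cur.execute("""
    CREATE STAGE IF NOT EXISTS gold_stage
      URL = 'azure://kglraw.blob.core.windows.net/gold/'
      STORAGE_INTEGRATION = azure_storage_int
""")

# Snowpipe with AUTO_INGEST: Event Grid notifications trigger the load
# of newly arrived Parquet files into the target table.
cur.execute("""
    CREATE PIPE IF NOT EXISTS gold_pipe
      AUTO_INGEST = TRUE
      INTEGRATION = 'AZURE_NOTIFICATION_INT'
    AS
      COPY INTO fact_shipments
      FROM @gold_stage
      FILE_FORMAT = (TYPE = PARQUET)
      MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
""")

cur.close()
conn.close()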

Sources of data
Data is generated from different applications like:
1. Warehouse management systems (WMS)
2. Enterprise resource planning (ERP): SAP ERP, Oracle ERP Cloud
3. Transportation management systems (TMS): SAP Transportation Management
4. Customer feedback systems
5. Environmental impact software: Enablon, Sphera

This process involves migrating data from on-premises to the cloud
using Data Migration Service. An ingestion team handles this process,
so I wasn't directly involved in it.

Types of data
Generally, data is ingested in the form of flat files, and most of
those files are in CSV format.
1. Consolidation Hubs data file
2. Load Planning data file
3. Transport mode optimization data file
4. Scheduling and Coordination data file
5. Freight Documentation data file
6. Cost Management data file
7. Carrier Performance Evaluation data file
8. Environmental impact monitoring data file
