
Prashant Kumar

ETL/ELT Developer
Mob: +91-8076656908/9891822705
Email: [email protected]

Total Experience Summary and Accomplishments


▪ 2.2 years of experience as an ETL/ELT developer with hands-on experience on AWS and Snowflake.
▪ End-to-end project experience on AWS, Snowflake and Informatica PowerCenter.
▪ Experience developing complete ETL/ELT pipelines: pulling data from sources, applying transformations, and loading the data into databases.
▪ Experienced in data integration projects including data migration (e.g., cloud migration from Teradata to Snowflake).
▪ Experienced in migrating projects from on-premises to the cloud (e.g., converting Informatica mappings to ELT).
▪ Experience migrating existing projects from one AWS account to a new AWS account.
▪ Expertise in SQL for designing and optimizing warehousing and reporting solutions.
▪ Proven ability to work successfully across multiple requirements, designs, and development approaches.
▪ Worked on PII data with encryption in transit and data masking at rest.
▪ Extensive first-hand experience coordinating with CXO- and XVP-level executives for requirement gathering and mockup presentations.
▪ Extensively worked on project planning and project estimation.
▪ Created ER and data models to derive data insights in data integration projects.
▪ Experienced in the overall life cycle of data warehousing projects, from requirement gathering to production deployment.
▪ Hands-on experience with data warehousing of Media and Entertainment domain data.
▪ Experience working on complex projects, analyzing the full data set and providing analysis to clients for their decision-making process.
▪ Experienced in automating daily manual tasks using the resources available at hand.
▪ Worked with source teams to restructure sources to meet changing client data requirements.
▪ Client engagement experience, from requirement gathering to presenting solutions to various problems.
▪ Experience using ticketing tools such as ServiceNow to resolve issues in the Enterprise Data Warehouse.
▪ A strong team player, collaborating well with the team to solve problems and actively incorporating input from various sources by working with others on a global basis.

Qualification
B.Tech (Computer Science)

Work history
InfoCepts Technologies, July 2019 - Present (2.2 years in ETL Development & DW Design)

Work Summary
▪ ETL/ELT Developer
▪ Snowflake Development
▪ Talend Development
▪ AWS Solutions Development
▪ First-hand experience in data analysis and mockup presentations for CXOs and XVPs
▪ Project plan development

Certification
• AWS Certified Cloud Practitioner
• Snowflake SnowPro Course Completion Certificate
• Microsoft Azure certified in Data Fundamentals
• MicroStrategy certified to work as a Data Analyst
• Microsoft certified in Software Fundamentals

Awards
• 2 Gem of the Month awards at the organization level
• 3 Synergy Team awards
• 5 Spot awards
• 2 client appreciations for zero-bug, no-downtime releases
Technical Skills

Cloud Platforms AWS, Azure, Snowflake


AWS Services ECR, ECS, Fargate, EC2, ECS tasks, S3, RDS (Aurora, PostgreSQL, SQL Server), Secrets Manager,
Lambda, Redshift, CloudWatch, CloudTrail

ETL Tools Talend Big Data, Informatica PowerCenter

Databases Snowflake, MySQL, SQL Server, Oracle

Analytical Languages SQL

Scripting Languages Unix shell scripting, Python (both at beginner level)

Job Automation Control-M, AWS scheduled tasks via cluster

Security Implementations Data masking, data encryption, role-based access

Data Modeling Data profiling, understanding the data, creating data models (LDM & PDM), deriving
insights from the data for the business/client

Project Profiles
Client Name: America's largest Media and Entertainment conglomerate
Project Name: Cloud Cost Monitor
Role: ETL Pipeline Developer
Technology Stack: Talend, Snowflake

Deliverable:
▪ Create a warehouse through which all cloud costs for the media house can be tracked at the most granular level possible.

Approach:
▪ Integrate the billing data for the complete media house, which spans four different cloud providers (AWS, Azure, Snowflake and Databricks), into one single data warehouse. The requirement given to us was a one-liner from the VP of Data Management: "I want something by which I can track all my cloud cost at the most granular level possible."
Complexity:
▪ The major challenge was that we had to study the billing data for the four different cloud providers and stitch it together in a data warehouse in such a way that it not only tells a story but is also available at the most granular level possible.
▪ Since the AWS data is driven by tags, and tags depend on usage, the structure of the AWS billing file kept changing every day for the current and previous month. To overcome this, we built logic to dynamically create the STG-layer tables every day, re-creating them during the load according to the number of columns in the file (see the sketch below).
▪ This was also the first time we handled this volume of data (200 million rows per data load).
▪ We were also asked to build the solution on Talend, a completely new tool for us with zero prior experience; in the end we not only delivered the project in 6 months, but were also told that the insights we extracted were very helpful for the VP and upper-management level.
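The dynamic STG-table logic was implemented in Talend, but the idea can be illustrated with a minimal Python sketch; the file, table and column names below are hypothetical, not the project's actual objects:

# Illustrative sketch only: read the header of that day's billing file and build a
# CREATE OR REPLACE TABLE statement so the STG table always matches the file layout.
import csv

def build_stg_ddl(billing_file: str, table_name: str = "STG_AWS_BILLING") -> str:
    with open(billing_file, newline="", encoding="utf-8") as f:
        header = next(csv.reader(f))
    # Land every column as VARCHAR in the staging layer; typing happens downstream.
    columns = ", ".join(f'"{col.strip().upper().replace(" ", "_")}" VARCHAR' for col in header)
    return f"CREATE OR REPLACE TABLE {table_name} ({columns});"

if __name__ == "__main__":
    print(build_stg_ddl("aws_billing_current_month.csv"))  # hypothetical file name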

End result:
▪ The solution we created is not only loved by the client's VPs and CXOs but was also awarded the Data Breakthrough "Cross Infrastructure Analytics Solution of the Year" award!
Client Name: America's largest Media and Entertainment conglomerate
Project Name: Talend to AWS Migration
Role: ETL Pipeline Developer
Technology Stack: Talend, AWS, Snowflake, Python

Deliverable:
▪ Migrate the complete Talend ETL to an ETL framework built on AWS services while, in parallel, migrating the database from Snowflake hosted on Azure to Snowflake hosted on AWS, within an SOW of 60 days.

Approach:
▪ We decided to migrate the ETL and the database simultaneously, with one ETL resource and one offshore architect.

Complexity:
▪ The key challenge was that we again had to come up with an approach to create files dynamically, this time in Python, and also to pull the Snowflake data for the 6 different instances directly from the metadata and run the ETL on top of that (see the sketch below).
▪ We were also told that, during this migration project, we had to move away from the existing Snowflake instance on Azure and migrate the complete database for all three environments, with all the historical data, to a different Snowflake instance hosted on AWS.
▪ During this time the team was hit by COVID and the complete offshore team was shut down for a week.
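A hypothetical sketch of the metadata-driven pull described above, using the Snowflake Python connector; the account names, role, warehouse and credential handling are placeholders rather than the project's real configuration:

# Loop over the Snowflake accounts and read table metadata from ACCOUNT_USAGE,
# which the migration ETL can then iterate over. All identifiers are illustrative.
import os
import snowflake.connector

ACCOUNTS = ["org-account1", "org-account2"]  # the real project covered 6 instances

def list_tables(account: str):
    conn = snowflake.connector.connect(
        account=account,
        user=os.environ["SF_USER"],
        password=os.environ["SF_PASSWORD"],
        role="SYSADMIN",
        warehouse="MIGRATION_WH",
    )
    try:
        cur = conn.cursor()
        cur.execute(
            "SELECT table_catalog, table_schema, table_name "
            "FROM snowflake.account_usage.tables WHERE deleted IS NULL"
        )
        return cur.fetchall()
    finally:
        conn.close()

if __name__ == "__main__":
    for acct in ACCOUNTS:
        for db, schema, table in list_tables(acct):
            print(f"{acct}: {db}.{schema}.{table}")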

End result:
▪ We managed to deliver the project not just on time but 5 days ahead of our committed date, with a transition so smooth that the users didn't even notice that anything had changed in the backend.

Client Name: America's largest Media and Entertainment conglomerate
Project Name: Covid Analytics: Return to Office
Role: ETL Pipeline Developer
Technology Stack: AWS, Snowflake, Python

Deliverable:
▪ Provide insights into how the employees are doing and how the client can plan the Return to Work for all their employees without a single incident of non-compliance.

Complexity:
▪ Integrating data from 10+ different sources at different levels, including 3rd-party vendors, ERP metadata, test results pulled from vendors, and various non-conventional sources.

Approach:
Under Covid Analytics, we designed a buffet of projects:
▪ Badge Analysis: shows the badge swipe data for all of their employees across 36+ countries. This data is PII, encrypted using PGP encryption, and fully secured on Snowflake's Business Critical edition.
▪ Wave Data: using the badge data above, the client decided to open offices in a phased manner, one wave at a time and for very specific people, while also tracking where these people are swiping in. This data is PII, encrypted using PGP encryption, and fully secured on Snowflake's Business Critical edition.
▪ Test Results: for all incoming employees, we receive test-result data from different vendors such as BioIQ, Genetworks and CVS, which tracks the COVID status of the employees arriving in the waves as well as of those scheduled to come. This data is PII and HIPAA, encrypted using PGP encryption, and fully secured on Snowflake's Business Critical edition.
▪ Health Pass: here we again screen the incoming employees with the help of a vendor called Clear Me, which issues passes at the time the employee enters the premises, based on symptoms and health declaration.
▪ Safety Compliance: the module where we tracked any non-compliance by employees on the premises, such as not wearing masks or not maintaining social distancing.
▪ Space Planning: now that COVID is almost over and most of the US is returning to the office, this space-planning data is stitched with all the data sets above and a report is created out of it that shows how the space in each and every building is utilized.

End Result:
▪ In a nutshell, we provided analysis for each and every employee: when an employee is scheduled to come into the office in the next wave, the executives and CXO-level people have access to that employee's COVID tests and health status; once the employee comes to the office, they can see where the employee swiped in, what their health status was at the time they entered the premises, whether they are following the protocols, and how and where the space is being utilized.
▪ This solution is the only basis on which the client is now kicking off their return to office for their 727k employees across 36 different countries.

Client Name: America's largest Media and Entertainment conglomerate
Project Name: Legacy System Decommissioning Analytics
Role: ETL Pipeline Developer
Technology Stack: AWS, Snowflake, Python

Deliverable:
▪ Provide insights to the CIOs of the different BUs on how their legacy systems are doing.
▪ What is the usage? Who are the users? What is the variance with respect to the previous month, and why was there a dip or spike in usage?

Complexity:
▪ The challenge here was that we had to connect to 8 different legacy/cloud systems in order to extract the execution and user data from their metadata. These ranged across:
• Informatica hosted on Oracle
• Tableau using PostgreSQL
• Power BI using Splunk
• Snowflake using the Cloud Cost Monitor
• Databricks using the Cloud Cost Monitor
• BusinessObjects hosted on Oracle
• MSTR hosted on Oracle
• Talend hosted on Azure with metadata on SQL Server

Approach:
▪ We created a data warehouse for the Operational Metrics review which contains the data at the lowest grain possible for review by upper management.

End Result:
▪ As of now, this report is used by the CIO of each BU of the media house to analyze their legacy system usage and operational metrics.
▪ This report is also used by the VP and CEO to decide on the decommissioning of the legacy systems.
Client Name: America's largest Media and Entertainment conglomerate
Project Name: YouTube Data Analytics
Role: ETL Pipeline Developer
Technology Stack: AWS, Snowflake, Python

Deliverable:
▪ Provide insights into the YouTube data for the client's two YouTube channels, ranging from a single metric that tells how their YouTube presence is performing in terms of revenue down to details as granular as per-video analysis.

Complexity:
▪ Integrating data from the different YouTube files, which are accessible via API, to provide everything from a single revenue-performance metric down to per-video analysis.
▪ The major challenge we faced here was the volume of data, which was huge: after 1 year the current DB size is 1.5 TB of data stored in compressed format in Snowflake, with more than 8 billion rows in the main fact table.
▪ Here we created the insights for the client's Director of Content Creation so he can track how the videos are performing, how many times each video is viewed, what the average drop-out time for each video is, and other insights out of billions of rows.

Approach:
▪ We used a Python script, scheduled every day as a Fargate cluster task pointing to a supporting Docker image, to download all the API files and push the files retrieved from the API to our S3 buckets (see the sketch below).
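A simplified sketch of that daily task, assuming a plain HTTP download and an S3 upload with boto3; the report URLs, bucket name and key prefix are placeholders, not the project's real configuration:

# Fetch the report files exposed by the API and land them in the raw S3 bucket,
# partitioned by load date. Endpoints and bucket names below are hypothetical.
import datetime

import boto3
import requests

BUCKET = "media-youtube-analytics-raw"
REPORT_URLS = ["https://example.com/reports/video_metrics.csv"]

def run() -> None:
    s3 = boto3.client("s3")
    today = datetime.date.today().isoformat()
    for url in REPORT_URLS:
        filename = url.rsplit("/", 1)[-1]
        local_path = f"/tmp/{filename}"
        resp = requests.get(url, timeout=60)
        resp.raise_for_status()
        with open(local_path, "wb") as f:
            f.write(resp.content)
        s3.upload_file(local_path, BUCKET, f"youtube/raw/{today}/{filename}")

if __name__ == "__main__":
    run()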

End Result:
▪ We created the complete ETL pipeline, and the reports based on this warehouse are now used by the client's Director of Social Engagement to take the necessary actions and analyze the YouTube performance of their two channels.
