Pavani
Senior Data Engineer
TECHNICAL SKILLS:
Big Data Ecosystem: HDFS, MapReduce, HBase, Pig, Hive, Sqoop, Kafka, Spark, Flume, Cassandra, Impala, Oozie, Zookeeper, Amazon Web Services (AWS), EMR
Cloud Technologies: AWS, Azure, Google Cloud Platform (GCP)
IDEs: IntelliJ, Eclipse, Spyder, Jupyter
Operating Systems: Linux, Unix, Windows 8, Windows 7, Windows Server 2008/2003
Programming Languages: Python, Scala, Linux shell scripts, JavaScript, PL/SQL, Java, Pig Latin, HiveQL
Databases: Oracle, MySQL, DB2, MS SQL Server, MongoDB, HBase
Web Dev. Technologies: HTML, XML, JSON, CSS, jQuery, JavaScript
Java Technologies: Core Java, Servlets, JSP, JDBC, Java Beans, J2EE
Business Tools: Tableau, Power BI
PROFESSIONAL EXPERIENCE
Role: Senior AWS Data Engineer
Client: Fiserv, Brookfield, WI Jan 2023 to Present
Responsibilities:
Implemented a 'serverless' architecture using API Gateway, Lambda, and DynamoDB, and deployed AWS Lambda code from Amazon S3 buckets.
Created a Lambda deployment function and configured it to receive events from the S3 bucket.
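Below is a minimal, illustrative sketch (not the project code) of an S3-triggered Lambda handler of this kind; the bucket and object handling are placeholders.

import json
import urllib.parse
import boto3

s3 = boto3.client("s3")

def lambda_handler(event, context):
    # Each record describes one object-created event from the configured S3 bucket.
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        head = s3.head_object(Bucket=bucket, Key=key)
        print(f"Received {key} from {bucket} ({head['ContentLength']} bytes)")
    return {"statusCode": 200, "body": json.dumps("processed")}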
Designed the data models used in data-intensive AWS Lambda applications that perform complex analysis, producing analytical reports for end-to-end traceability, lineage, and the definition of key business elements from Aurora.
Writing code that optimizes the performance of AWS services used by application teams and providing code-level application security for clients (IAM roles, credentials, encryption, etc.).
Using SonarQube for continuous inspection of code quality and to perform automatic reviews of
code to detect bugs. Managing AWS infrastructure and automation with CLI and API.
Creating AWS Lambda functions using Python for deployment management in AWS; designed, investigated, and implemented public-facing websites on Amazon Web Services and integrated them with other application infrastructure.
Creating different AWS Lambda functions and API Gateways so that data submitted through API Gateway is passed to and processed by the corresponding Lambda function.
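As an illustration only, a minimal Lambda handler for an API Gateway proxy integration; the DynamoDB table name "items" is a hypothetical placeholder.

import json
import boto3

table = boto3.resource("dynamodb").Table("items")  # placeholder table name

def lambda_handler(event, context):
    # API Gateway proxy integrations deliver the request body as a JSON string.
    payload = json.loads(event.get("body") or "{}")
    table.put_item(Item=payload)  # persist the submitted record
    return {
        "statusCode": 201,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"stored": payload}),
    }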
Responsible for building CloudFormation templates for SNS, SQS, Elasticsearch, DynamoDB, Lambda, EC2, VPC, RDS, S3, IAM, and CloudWatch services and integrating them with Service Catalog.
Performed regular monitoring activities on Unix/Linux servers, such as log verification, server CPU usage, memory checks, load checks, and disk space verification, to ensure application availability and performance using CloudWatch and AWS X-Ray. Implemented the AWS X-Ray service inside Confidential, which allows development teams to visually detect node and edge latency distribution directly from the service map.
Designed and developed ETL processes in AWS Glue to migrate campaign data from external sources such as S3 (ORC, Parquet, and text files) into AWS Redshift.
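A rough sketch of such a Glue job, assuming a Glue catalog connection named "redshift-connection"; the bucket paths, database, and table names are placeholders.

import sys
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read campaign files (Parquet in this sketch) from S3 into a DynamicFrame.
campaigns = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": ["s3://my-bucket/campaigns/"]},
    format="parquet",
)

# Load the frame into a Redshift staging table through the catalog connection.
glue_context.write_dynamic_frame.from_jdbc_conf(
    frame=campaigns,
    catalog_connection="redshift-connection",
    connection_options={"dbtable": "staging.campaigns", "database": "analytics"},
    redshift_tmp_dir="s3://my-bucket/tmp/",
)
job.commit()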
Automated Datadog dashboards along with the stack through Terraform scripts.
Developed file cleaners using Python libraries.
Experience in building Snowpipe, with in-depth knowledge of data sharing in Snowflake and of database, schema, and table structures.
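For illustration, a sketch of defining a Snowpipe over an external stage through the Snowflake Python connector; the account, stage, and table identifiers are placeholders.

import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="etl_user", password="***",
    warehouse="ETL_WH", database="ANALYTICS", schema="RAW",
)
# Auto-ingest pipes pick up files as they land in the external stage.
conn.cursor().execute("""
    CREATE PIPE IF NOT EXISTS raw.campaign_pipe
      AUTO_INGEST = TRUE
      AS COPY INTO raw.campaigns
         FROM @raw.campaign_stage
         FILE_FORMAT = (TYPE = 'PARQUET')
""")
conn.close()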
Explored DAGs, their dependencies, and their logs using Airflow pipelines for automation.
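A minimal sketch of such an Airflow DAG, assuming Airflow 2.x; the task callables and schedule are illustrative.

from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pulling source data")

def load():
    print("loading into the warehouse")

with DAG(
    dag_id="daily_ingest",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> load_task  # load runs only after extract succeeds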
Designed and implemented a fully operational, production-grade, large-scale data solution on Snowflake.
Utilized Python libraries such as Boto3 and NumPy for AWS work.
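A small illustrative example of that combination; the bucket and prefix are placeholders.

import boto3
import numpy as np

s3 = boto3.client("s3")
objects = s3.list_objects_v2(Bucket="my-bucket", Prefix="exports/").get("Contents", [])
sizes = np.array([obj["Size"] for obj in objects])
if sizes.size:
    print(f"{sizes.size} objects, mean size {sizes.mean():.0f} bytes")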
Used Amazon EMR for MapReduce jobs and tested them locally using Jenkins.
Performed data extraction, aggregation, and consolidation of Adobe data within AWS Glue using PySpark.
Create external tables with partitions using Hive, AWS Athena and Redshift.
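A sketch of the partitioned external-table pattern, issued here through Spark SQL; the same DDL style applies in Hive and, with minor differences, Athena. Paths and columns are placeholders.

from pyspark.sql import SparkSession

spark = SparkSession.builder.enableHiveSupport().getOrCreate()
spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS analytics.events (
        event_id STRING,
        payload  STRING
    )
    PARTITIONED BY (event_date STRING)
    STORED AS PARQUET
    LOCATION 's3://my-bucket/events/'
""")
# Register a partition whose files already exist in storage.
spark.sql("ALTER TABLE analytics.events ADD IF NOT EXISTS PARTITION (event_date='2023-01-01')")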
Developed the PySpark code for AWS Glue jobs and for EMR.
Installed and configured Splunk clustered search heads, indexers, deployment servers, and deployers; designed and implemented Splunk-based best-practice solutions.
Designed and developed ETL jobs to extract data from the Salesforce replica and load it into the data mart in Redshift.
Responsible for designing logical and physical data models for various data sources on Confidential Redshift.
Experienced with event-driven and scheduled AWS Lambda functions to trigger various AWS
resources.
Integrated Lambda with SQS and DynamoDB using Step Functions to iterate through a list of messages and update their status in a DynamoDB table.
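For illustration, a minimal Lambda that walks a batch of SQS messages and records their status in DynamoDB; the table and attribute names are placeholders.

import json
import boto3

table = boto3.resource("dynamodb").Table("message_status")  # placeholder table

def lambda_handler(event, context):
    records = event.get("Records", [])  # one record per SQS message in the batch
    for record in records:
        message = json.loads(record["body"])
        table.update_item(
            Key={"message_id": message["id"]},
            UpdateExpression="SET #s = :status",
            ExpressionAttributeNames={"#s": "status"},
            ExpressionAttributeValues={":status": "PROCESSED"},
        )
    return {"processed": len(records)}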
Technologies: Python, Power BI, AWS Glue, Athena, SSRS, SSIS, AWS S3, AWS Redshift, AWS EMR, AWS RDS, DynamoDB, SQL, Tableau, Distributed Computing, Snowflake, Spark, Kafka, MongoDB, Hadoop, Linux Command Line, Data structures, PySpark, Oozie, HDFS, MapReduce, Cloudera, HBase, Hive, Pig, Docker.
Technologies: PL/SQL, Python, Azure Data Factory, Azure Blob Storage, Azure Table Storage, Azure SQL Server, Apache Hive, Apache Spark, MDM, Netezza, Teradata, Oracle 12c, SQL Server, Teradata SQL Assistant, Teradata Vantage, Microsoft Word/Excel, Flask, Snowflake, DynamoDB, Athena, Lambda, MongoDB, Pig, Sqoop, Tableau, Power BI, UNIX, Docker, Kubernetes.
Role: Data Engineer
Client: Honeywell, India Apr 2017 to Dec 2020
Responsibilities:
Experienced in building and architecting multiple data pipelines and end-to-end ETL and ELT processes for data ingestion and transformation in AWS, Spark, and PySpark.
Leveraged cloud and GPU computing technologies on AWS for automated machine learning and analytics pipelines.
Participated in all phases of data mining: data collection, data cleaning, model development, validation, and visualization; performed gap analysis and provided feedback to the business team to improve software delivery.
Performed data mining with large datasets of structured and unstructured data, data acquisition, data validation, predictive modeling, and data visualization on provider, member, claims, and service fund data.
Involved in developing RESTful APIs (microservices) using the Python Flask framework, packaged in Docker and deployed to Kubernetes using Jenkins pipelines.
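As an illustration, a minimal Flask microservice of the kind packaged in Docker and deployed to Kubernetes; the /claims resource and in-memory store are hypothetical stand-ins.

from flask import Flask, jsonify, request

app = Flask(__name__)
claims = {}  # placeholder for a real backing store

@app.route("/claims/<claim_id>", methods=["GET"])
def get_claim(claim_id):
    return jsonify(claims.get(claim_id, {})), 200

@app.route("/claims", methods=["POST"])
def create_claim():
    payload = request.get_json(force=True)
    claims[payload["id"]] = payload
    return jsonify(payload), 201

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)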
Created reusable REST APIs that exposed data blended from a variety of data sources, reliably gathering requirements directly from the business.
Worked on the development of a data warehouse, a business intelligence architecture that involves data integration and the conversion of data from multiple sources and platforms.
Responsible for full data loads from production to AWS Redshift staging environment and
worked on migrating EDW to AWS using EMR and various other technologies.
Experience in Creating, Scheduling, and Debugging Spark jobs using Python. Performed Data
Analysis, Data Migration, Transformation, Integration, Data Import, and Data Export through
Python.
Gathered and processed raw data at scale (including writing scripts, web scraping, calling
APIs, writing SQL queries, and writing applications).
Creating reusable Python scripts to ensure data integrity between the source
(Teradata/Oracle) and target system.
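A hedged sketch of such a reconciliation script; the connections are generic DB-API objects and the table name is a placeholder.

def row_count(connection, table):
    cursor = connection.cursor()
    cursor.execute(f"SELECT COUNT(*) FROM {table}")
    return cursor.fetchone()[0]

def reconcile(source_conn, target_conn, table):
    # Compare row counts between the source (Teradata/Oracle) and the target system.
    source_rows = row_count(source_conn, table)
    target_rows = row_count(target_conn, table)
    if source_rows != target_rows:
        raise ValueError(f"{table}: source={source_rows} target={target_rows}")
    print(f"{table}: counts match ({source_rows})")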
Migrated on-premise database structure to Confidential Redshift data warehouse.
Created data pipelines for different events to load the data from DynamoDB to AWS S3 bucket
and then into HDFS and delivered high success metrics.
Implemented authoring, scheduling, and monitoring of data pipelines using Scala and Spark.
Developed and designed a system to collect data from multiple platforms using Kafka and then process it using Spark.
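A minimal sketch of that Kafka-to-Spark path using Structured Streaming; brokers, topic, and output paths are placeholders.

from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("kafka-ingest").getOrCreate()

# Subscribe to the topic and keep the message payload as a string column.
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092")
    .option("subscribe", "events")
    .load()
    .select(col("value").cast("string").alias("payload"))
)

# Land the stream in the data lake as Parquet with checkpointing.
query = (
    events.writeStream.format("parquet")
    .option("path", "s3://my-bucket/landing/events/")
    .option("checkpointLocation", "s3://my-bucket/checkpoints/events/")
    .start()
)
query.awaitTermination()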
Created Spark Streaming modules to stream data into the data lake, worked with different data feeds such as JSON, CSV, and XML, and implemented the data lake concept.
Technologies: Python, Power BI, AWS Glue, Athena, SSIS, AWS S3, AWS Redshift, AWS EMR, AWS RDS,
DynamoDB, SQL, AWS Lambda, Scala, Spark
Role: Hadoop Developer
Client: Soft labs Group, India Aug 2015 to Mar 2017
Responsibilities:
Worked closely with the business to transform business requirements into technical requirements as part of design reviews and daily project scrums, and wrote custom MapReduce programs with custom input formats.
Created Sqoop jobs with incremental load to populate Hive External tables.
Involved in the development of real time streaming applications using PySpark, Kafka on distributed
Hadoop Cluster.
Worked on Partitioning, Bucketing, Join Optimizations, and query optimizations in Hive.
Designed and developed Hadoop MapReduce programs and algorithms for analysis of cloud scale
classified data stored in Cassandra.
Optimized the Hive tables using optimization techniques like partitioning and bucketing to provide
better performance with HiveQL queries.
Evaluated data import-export capabilities, data analysis performance of Apache Hadoop framework.
Involved in installation of HDP Hadoop, configuration of the cluster and the eco system components
like Sqoop, Pig, Hive, HBase and Oozie.
Created reports for BI team using Sqoop to export data into HDFS and Hive.
Worked extensively with Sqoop for importing and exporting the data from HDFS to Relational
Database system and vice versa.
Created RDD’s in Spark technology.
Extracted data from data warehouse (Tera Data) on the spark RDD’s.
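A rough sketch of that extraction over JDBC; the Teradata host, credentials, and table are placeholders and assume the Teradata JDBC driver is on the Spark classpath.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("teradata-extract").getOrCreate()

sales = (
    spark.read.format("jdbc")
    .option("url", "jdbc:teradata://td-host/DATABASE=edw")
    .option("driver", "com.teradata.jdbc.TeraDriver")
    .option("dbtable", "edw.sales")
    .option("user", "etl_user")
    .option("password", "***")
    .load()
)
sales_rdd = sales.rdd  # drop down to the RDD API where needed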
Experience with Spark using Scala and Python.
Worked on stateful transformations in Spark Streaming.
Worked on Batch processing and Real-time data processing and Spark Streaming using Lambda
architecture.
Worked on Spark SQL UDFs and Hive UDFs.
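For illustration, a minimal Python UDF registered for Spark SQL; the masking rule and the transactions table are hypothetical.

from pyspark.sql import SparkSession
from pyspark.sql.types import StringType

spark = SparkSession.builder.getOrCreate()

def mask_account(account_number):
    # Keep only the last four characters of the account number.
    return "****" + account_number[-4:] if account_number else None

spark.udf.register("mask_account", mask_account, StringType())
spark.sql("SELECT mask_account(account_number) AS masked FROM transactions").show()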
Technologies: Spark, Kafka, Hadoop, Linux Command Line, Data structures, PySpark, Oozie, HDFS,
MapReduce, Cloudera, HBase, Hive, Pig, Docker