Resume 2

PROFESSIONAL SUMMARY:

 9 years of demonstrated experience in the IT industry with expert-level skills in the Big Data Hadoop ecosystem, Apache Spark, PySpark, Scala, Python, Kafka, Data Warehousing, Data Pipelines, Business Intelligence, Snowflake, and Data Analytics.
 Proficient in Azure – Data Lake, Amazon Web Services (AWS) – EC2, S3, EMR, ETL, Informatica, Google Cloud Platform (GCP), Glue, Presto, and Databricks.
 Expert-level knowledge of Hadoop Distributed File System (HDFS) architecture and YARN.
 Experienced in Hive partitioning, bucketing, and code optimization through set parameters, performing different types of joins on Hive tables, and implementing Hive SerDes such as Avro and JSON.
 Extensively utilized Hive for processing and analyzing logs, joining large tables, and batch jobs, and HiveQL for ad-hoc interactive queries to summarize and analyze large data sets.
 Experienced in importing and exporting data from different databases like SQL Server, Oracle,
Teradata and Netezza.
 Extensive experience in leveraging data serialization formats like Avro and Protocol Buffers, and columnar formats like RCFile, ORC, and Parquet.
 Experienced in Oozie workflows and job controllers for job automation – Shell, Hive, Sqoop, and email notifications.
 Experience in ZooKeeper configuration to provide cluster coordination services.
 Extensive experience in creating RDDs and Datasets in Spark from the local file system and HDFS.
 Hands-on experience in writing different RDD (Resilient Distributed Datasets) transformations and actions using Scala.
 Experience in analyzing data using R, SQL, Microsoft Excel, Hive, PySpark, and Spark SQL for data mining, data cleansing, and machine learning.
 Experience in Extraction, Transformation and Loading (ETL) of data from multiple sources like Flat
files and Databases.
 Built data pipelines in Airflow on GCP for ETL-related jobs using different Airflow operators.
 Created DataFrames and performed analysis using Spark SQL, and used RDD and DataFrame APIs to access a variety of data sources using Scala, PySpark, Pandas, and Python (see the sketch after this summary).
 Excellent knowledge on Spark core architecture.
 Created EMR transient and long-running clusters in AWS for data processing (ETL) and log analysis.
 Deployed various Hadoop applications in EMR - Hadoop, Hive, Spark, HBase, Hue, HCatalog, Glue,
Oozie and Presto etc. based on the needs.
 Experience in integrating Hive with AWS S3 to read and write data from and to S3, and created partitions in Hive.
 Extensively worked on ETL/data pipelines to transform data and load it from AWS S3 to Snowflake or vice versa.
 Extensively utilized EMRFS (Elastic Map Reduce File System) for reading and writing SerDe data
between HDFS and EMRFS.
 Experience in using Presto on EMR to query different types of data sources, including RDBMS and NoSQL databases.
 Experience in creating ad-hoc reporting, development of data visualizations using enterprise
reporting tools like Tableau, Power BI, Business Objects and Alteryx.
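
Illustrative sketch (for context): a minimal PySpark example of the RDD and DataFrame work described in this summary. The HDFS paths, file layout, and column names are hypothetical placeholders rather than details from any specific engagement.

# Minimal PySpark sketch of RDD transformations/actions and Spark SQL analysis.
# The HDFS paths and column names are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rdd-df-example").getOrCreate()

# RDD transformations and an action on a plain text file in HDFS
lines = spark.sparkContext.textFile("hdfs:///data/raw/events.txt")
event_counts = (lines.map(lambda line: line.split(",")[0])      # transformation
                     .map(lambda event_type: (event_type, 1))
                     .reduceByKey(lambda a, b: a + b))
print(event_counts.take(10))                                     # action

# DataFrame / Spark SQL analysis over a CSV extract of the same data
df = spark.read.option("header", True).csv("hdfs:///data/raw/events.csv")
df.createOrReplaceTempView("events")
summary = spark.sql("""
    SELECT event_type, COUNT(*) AS cnt
    FROM events
    GROUP BY event_type
    ORDER BY cnt DESC
""")
summary.show()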

TECHNICAL SKILLS:
 Big Data Tools : Hadoop, Hive, Apache Spark, PySpark, HBase, Kafka, Pig, MapReduce, Zookeeper and Flume.
 Cloud Technologies : Azure (Databricks, Azure Data Factory, Azure Data Lake, Azure Pipelines, Azure Functions, Blob Storage), AWS (EC2, S3 Bucket, Amazon Redshift, Lambda, IAM, Kinesis), Snowflake, GCP (BigQuery, Cloud SQL, Cloud Storage, Cloud SDK, Cloud APIs, and other tools like Dataflow, Dataproc, Dataprep, Data Studio).
 ETL Tools : SSIS, DBT, Informatica.
 Relational Databases : MS SQL Server, MySQL, Oracle, PostgreSQL, Netezza.
 NoSQL Databases : Cassandra, MongoDB, HBase.
 Programming Languages : Python, R, Scala, JSON, HTML.
 Scripting : Python, Shell scripting.
 IDEs : PyCharm, Jupyter Notebook.
 Build Tools : Apache Maven, SBT, Jenkins, Bitbucket.
 Version Control : Git, SVN.
 CI/CD : Jenkins, Azure.
 Machine Learning : Linear Regression, Logistic Regression, Decision Tree, SVM, KNN, K-Means.
 Packages : NumPy, Pandas, Matplotlib, Scikit-learn, Seaborn, PySpark.
 Reporting Tools : Tableau, Power BI, SSRS.
 Operating Systems : Windows, Linux, macOS.

WORK EXPERIENCE

Fannie Mae, Plano, Texas || Azure Data Engineer Aug 2023 – Till Date

Project Description: In this project, the client application provides the customer experience by connecting with customers in real time in various ways. The objective is to serve customers more efficiently by conducting user research and usability studies to understand how customers interact with and utilize the application and client services, and by building a highly scalable, highly available, and high-performance platform.

Responsibilities:
 Develop standardized Azure Data Factory pipelines for ingesting diverse data sources into Azure Data
Lake Storage (ADLS).
 Establish metadata framework within Azure Data Factory for improved data management and
organization.
 Parameterize linked services and datasets in Azure Data Factory to enhance pipeline flexibility and
reusability.
 Utilize Databricks notebooks with PySpark to register and transform raw data into structured formats (see the sketch after this list).
 Write Spark SQL transformations in Databricks notebooks to facilitate data movement across different layers in ADLS and databases.
 Implement automation workflows using Azure Logic Apps and Automation Runbooks for efficient task
management.
 Contribute to SQL Server database development and optimization.
 Perform performance tuning and query optimizations to enhance database efficiency.
 Involved in Snowflake data warehouse migration, managing External Stages, Tables, Stored Procedures,
and Views.
 Utilized SnowSQL & Snowpark for Data transformations within Snowflake Data warehouse.
 Employed Snowpipe for real-time data ingestion from the Data Lake into the Snowflake Data Warehouse.
 Provide hands-on training and mentorship to onboard new team members, fostering knowledge
sharing.
 Collaborate closely with users to identify and resolve issues, ensuring enhanced user experience.
 Experienced in writing real-time processing and core jobs using Spark Streaming with Kafka as a data
pipeline system.
 Migration of on-premises data (SQL Server / MongoDB) to Azure Data Lake Store (ADLS) using Azure
Data Factory (ADF V1/V2).
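
Illustrative sketch (for context): a brief example of the kind of Databricks notebook cell described above, registering raw data and moving it into a structured curated layer in ADLS with Spark SQL. The storage account, container, column names, and Delta output format are assumptions for illustration only.

# Sketch of a Databricks-style PySpark cell: raw ADLS data -> curated ADLS layer.
# Storage account, container, and column names are hypothetical placeholders.
from pyspark.sql import SparkSession

# In a Databricks notebook `spark` is already provided; this makes the sketch standalone.
spark = SparkSession.builder.getOrCreate()

raw_path = "abfss://raw@storageaccount.dfs.core.windows.net/customers/"
curated_path = "abfss://curated@storageaccount.dfs.core.windows.net/customers/"

raw_df = spark.read.format("json").load(raw_path)
raw_df.createOrReplaceTempView("raw_customers")

# Spark SQL transformation into a structured, de-duplicated shape
curated_df = spark.sql("""
    SELECT DISTINCT customer_id,
           trim(customer_name)  AS customer_name,
           to_date(signup_ts)   AS signup_date,
           current_timestamp()  AS load_ts
    FROM raw_customers
    WHERE customer_id IS NOT NULL
""")

# Write to the curated layer (Delta is the typical format on Databricks)
curated_df.write.format("delta").mode("overwrite").save(curated_path)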

Environment: Azure Data Factory, Azure Databricks, Azure SQL, Synapse, Data Lake, Snowflake Data Warehouse, Kafka, MS SQL Server, SSRS, SQL Server Integration Services (SSIS), Microsoft Visual Studio, SQL Server Management Studio, Jenkins, PL/SQL, T-SQL, Spark ecosystem, PySpark, Big Data, Agile methodology, Agile SAFe, Kanban, Scrum.

Change Healthcare, Lombard, IL || Data Engineer Dec 2022 – July 2023

Project Description: The project mainly focuses on expanding and optimizing the data and data pipeline architecture, as well as building and maintaining data workflows and designing the optimal ETL pipelines and infrastructure required for extraction, transformation, and loading of data from a wide variety of data sources. As a Data Engineer, involved in maintaining large volumes of data and designing and developing predictive data models for business users according to the requirements.
Responsibilities:
 Handled importing of data from various data sources, performed data control checks using PySpark and
loaded data into HDFS.
 Developed Python scripts and UDFs using both DataFrames/SQL/Datasets and RDD/Kafka in Spark 1.6 for data aggregation, queries, and writing data back into the OLTP system through Sqoop.
 Experienced in handling large datasets using partitions, PySpark in-memory capabilities, broadcasts in PySpark, and effective and efficient joins and transformations during the ingestion process itself.
 Involved in writing live real-time processing and core jobs using Spark Streaming with Kafka as a data pipeline system.
 Implemented robust error handling and retry mechanisms within Lambda functions, ensuring fault
tolerance and reliability of serverless applications.
 Developed and maintained IAM policies and security configurations to enforce compliance with industry
standards (such as PCI DSS, HIPAA, or GDPR) and internal security policies, ensuring data confidentiality
and integrity.
 Used Spark API over Cloudera Hadoop YARN to perform analytics on data in HDFS.
 Built a streaming pipeline that uses PySpark to read data from Kafka, transform it, and write it to HDFS (see the sketch after this list).
 Designed Data Marts by following Star Schema and Snowflake Schema Methodology, using Data Modeling
tools.
 Worked on Snowflake database on queries and writing Stored Procedures for normalization.
 Worked with Snowflake’s stored procedures, used procedures with corresponding DDL statements, used
JavaScript API to easily wrap and execute numerous SQL queries.
 Involved in performing unit testing and integration testing.
 I have substantial work experience in the software project development life cycle, utilizing the core principles of Agile methodologies.
 I have experience collaborating closely with offshore development and production support teams, with responsibilities that include gathering daily status updates and conveying them to senior leadership.
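
Illustrative sketch (for context): a minimal PySpark Structured Streaming example of the Kafka-to-HDFS pipeline described above. The broker address, topic name, record schema, and output paths are hypothetical, and the Kafka source assumes the spark-sql-kafka connector is available on the cluster.

# Minimal sketch: read from Kafka, parse JSON records, write Parquet files to HDFS.
# Broker, topic, schema, and paths are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("kafka-to-hdfs").getOrCreate()

schema = StructType([
    StructField("claim_id", StringType()),
    StructField("status",   StringType()),
    StructField("amount",   DoubleType()),
])

raw = (spark.readStream
            .format("kafka")
            .option("kafka.bootstrap.servers", "broker1:9092")
            .option("subscribe", "claims")
            .load())

# Kafka delivers bytes; cast the value to a string, parse the JSON, drop bad records
parsed = (raw.selectExpr("CAST(value AS STRING) AS json")
             .select(F.from_json("json", schema).alias("c"))
             .select("c.*")
             .filter(F.col("claim_id").isNotNull()))

query = (parsed.writeStream
               .format("parquet")
               .option("path", "hdfs:///data/claims/")
               .option("checkpointLocation", "hdfs:///checkpoints/claims/")
               .outputMode("append")
               .start())
query.awaitTermination()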

Environment: AWS Glue, Snowflake, HDFS, Hive, Kafka, Spark 1.8, Linux, Python 2, SQL Server Database, Jira,
Service Now, Confluence, Agile methodologies (SCRUM Framework), AWS (EC2, S3, EMR, Lambda, Step
Function).

RYAN, Hyderabad, INDIA || GCP Data Engineer Oct 2018 – July 2021

Project Description: The project mainly centers on optimizing data management, processing, and analysis to improve the efficiency and accuracy of tax-related operations. It aims to enhance the company’s services, data infrastructure, and capabilities in order to streamline tax-related processes, ensure compliance with regulatory requirements, and provide better insights for clients.
Responsibilities:
 Developed multiple data pipelines using cloud services and worked on MapReduce for data distribution to reduce the data load.
 Hands-on experience with Google Cloud services such as Cloud Storage, Dataflow, Cloud Composer, BigQuery, Cloud Functions, Cloud Pub/Sub, and Dataproc.
 Experience in IBM Console for monitoring of streaming jobs.
 Experience in testing the data through streaming jobs for Events and Outages.
 Writing Python scripts to load data from BigQuery to BigQuery using Dataflow and Composer (see the DAG sketch after this list).
 Experience in data validation and analysis for Prod defects.
 Running cron jobs to move the Omega data to GCP and checking the logs in Omega.
 Experienced in testing GCP jobs and egress jobs once the migration is done.
 Experience in writing and creating Hive tables in Omega and data validation.
 Working experience with Support team and took the responsibility for the issues in production.
 Experience in solving priority issues and participating in SOC calls whenever there are production issues.
 Handling all priority incidents created by the end users and providing the solution on time via Service
Now.
 Experience in creating Teradata scripts and PySpark scripts to load data from Teradata tables to Hadoop.
 Responsible for L2 support for environment and application related issues.
 Handling Change Requests and Service Requests.
 Troubleshooting production issues under client-defined SLAs.
 Responsible for Google Production Support for environments and application related issues.
 Service Now applications implemented: Incidents, Change Requests, Service Requests, Configuration,
Dashboards.
 Working with business teams and other teams to solve critical issues.
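
Illustrative sketch (for context): a minimal Cloud Composer (Airflow) DAG for the BigQuery-to-BigQuery loads described above. The project, dataset, table names, and schedule are hypothetical, and the operator is assumed to come from the apache-airflow-providers-google package.

# Sketch of an Airflow DAG that aggregates a raw BigQuery table into a reporting table.
# Project, dataset, table names, and schedule are hypothetical placeholders.
from datetime import datetime
from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

with DAG(
    dag_id="bq_to_bq_daily_load",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:

    load_daily_totals = BigQueryInsertJobOperator(
        task_id="load_daily_totals",
        configuration={
            "query": {
                "query": """
                    SELECT event_date, client_id, SUM(amount) AS total_amount
                    FROM `my-project.raw_dataset.events`
                    GROUP BY event_date, client_id
                """,
                "destinationTable": {
                    "projectId": "my-project",
                    "datasetId": "reporting_dataset",
                    "tableId": "daily_totals",
                },
                "writeDisposition": "WRITE_TRUNCATE",
                "useLegacySql": False,
            }
        },
    )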

Environment: GCP, Cloud SQL, BigQuery, Cloud Dataproc, GCS, Cloud Composer, Hadoop, Hive, MapReduce, Teradata, SAS, Spark, Python, SQL Server, Service Now, Confluence, IBM Console.

Ceequence Technologies, Hyderabad, India || Jr. Data Engineer Aug 2014 – Sep 2018

Project Description: The primary goal of the project is the collection, integration, and analysis of data from different sources, resulting in greater insights and more effective support for decision making. The work involved creating a Corporate Data Warehouse and migrating data from the OLTP systems to the Corporate Data Warehouse. SSIS was used as the ETL tool for extracting data from various sources running on Oracle, DB2, and MS SQL Server databases, and Power BI was used to generate reports covering weekly, monthly, quarterly, and annual historic information.

Responsibilities:
 Created action filters, parameters, and calculated sets for preparing dashboards and worksheets using
Power BI.
 Developed Snowflake views to load and unload data from and to an AWS S3 bucket, as well as
transferring the code to production.
 Developed visualizations and dashboards using Power BI.
 Performing ETL testing activities such as running the jobs, extracting the data from the database using the necessary queries, transforming it, and loading it into the data warehouse servers.
 Created dashboards for analyzing POS data using Power BI.
 Converting Hive/SQL queries into Spark transformations using Spark DataFrames, Scala, and Python.
 Running Spark SQL operations on JSON, converting the data into a tabular structure with data frames,
and storing and writing the data to Hive and HDFS.
 Developing shell scripts for data ingestion and validation with different parameters, as well as writing custom shell scripts to invoke Spark jobs.
 Tuned performance of Informatica mappings and sessions for improving the process and making it
efficient after eliminating bottlenecks.
 Worked on complex SQL queries and Cassandra procedures and converted them to ETL tasks.
 Worked with PowerShell and UNIX scripts for file transfer, emailing and other file related tasks.
 Created a risk-based machine learning model (logistic regression, random forest, SVM, etc.) to predict which customers are more likely to be delinquent based on historical performance data and rank-order them (see the sketch after this list).
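
Illustrative sketch (for context): a minimal scikit-learn example of the delinquency risk model described above. The input file, feature names, and label column are hypothetical placeholders.

# Sketch of a logistic regression delinquency model with rank-ordered risk scores.
# File name, feature columns, and label column are hypothetical placeholders.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

data = pd.read_csv("historical_performance.csv")
features = ["utilization", "payment_history_score", "months_on_book"]
X_train, X_test, y_train, y_test = train_test_split(
    data[features], data["delinquent"], test_size=0.2, random_state=42)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Score customers and rank-order them by predicted delinquency risk
scores = model.predict_proba(X_test)[:, 1]
print("AUC:", roc_auc_score(y_test, scores))
ranked = X_test.assign(risk_score=scores).sort_values("risk_score", ascending=False)
print(ranked.head())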

Environment: MS-SQL Server, SQL Server Integration Services (SSIS), Import and Export Data wizard,
TFS, SQL Server Reporting Services, Power BI, SQL Server Analysis Services (SSAS), SQL Profiler, Python
3.0, SSIS, Spark.
