
Ajay
Sr. Data Engineer
Phone: (614) 653-6384
Email: [email protected]

PROFESSIONAL SUMMARY:

 8+ years of professional IT experience in database design, performance tuning, and optimization of
core Oracle Database 10g, 11g, and 12c, Data Warehousing, ETL (Extract, Transform, and Load),
and Data Analytics.
 4 years of extensive experience in Informatica Power Center 10.x/9.x/8.x, responsible for
supporting enterprise-level ETL architecture.
 Hands-on experience in building DevOps pipelines for CI/CD.
 Around 1 year of experience in Hadoop development of enterprise-level solutions utilizing Hadoop
utilities such as PySpark, MapReduce, Sqoop, Pig, Hive, HBase, Oozie, Flume, etc. Worked on
proofs of concept with Kafka and Storm.
 Good knowledge of Oracle GoldenGate on Oracle 9i/10g/11g/12c; worked with DB-Ops teams
on tasks such as software installations, migrations, database capacity planning,
automated backup implementation, and performance tuning on Linux/Unix and Windows platforms.
 Utilized Kubernetes and Docker as the runtime environment for the CI/CD system to build, test,
and deploy.
 Experience in implementing data warehouse solutions in Confidential Redshift.
 Worked on various projects to migrate data from on-premises databases to Confidential Redshift,
RDS, and S3.
 Experience in design, testing, implementation, maintenance, and control of the organization's
physical, relational, and object-oriented databases across multiple platforms and computing
environments.
 Hands-on experience in Amazon Web Services (AWS) provisioning and good knowledge of AWS
services such as Elastic Compute Cloud (EC2), Lambda, Simple Storage Service (S3), Auto Scaling, AWS
Glue, AWS Batch, DynamoDB, IAM, Virtual Private Cloud (VPC), Route53, CloudWatch, AWS CLI,
CloudFormation, ELB (Elastic Load Balancing), RDS, SNS, SQS, and EBS.
 Experience in optimizing and tuning SQL queries using trace files, TKPROF, and EXPLAIN PLAN (a
minimal sketch follows this summary).
 Extensive experience in developing stored procedures, functions, views, triggers, and complex
SQL queries using SQL Server, T-SQL, and Oracle PL/SQL.
 Experience with Oracle-supplied packages, dynamic SQL, and PL/SQL tables.
 Designed and implemented database solutions in Azure SQL Data Warehouse and Azure SQL Database.
 Experience loading data into Oracle tables using SQL*Loader.
 Exposure to NoSQL databases such as MongoDB, DynamoDB, and Cassandra.
 Expertise in extracting, transforming, and loading data from heterogeneous systems such as flat
files, Excel, Oracle, Teradata, and MS SQL Server.
 Strong experience in Extraction, Transformation and Loading (ETL) data from various sources into
Data Warehouses and Data Marts using Informatica Power Center (Repository Manager,
Designer, Workflow Manager, Workflow Monitor, Metadata Manager), Power Exchange, Power
Connect as ETL tool on Oracle, DB2 and SQL Server Databases.
 Expertise with Scala, SQL, Linux script, Python, Spark and Big Data toolset.
 Experience building and optimizing AWS data pipelines, architectures and data sets.
 Extensively used Informatica client tools: Source Analyzer, Warehouse Designer, Mapping
Designer, Mapplet Designer, ETL Transformations, Informatica Repository Manager,
Informatica Server Manager, Workflow Manager, and Workflow Monitor.
 Extensive experience as a Hadoop, Snowflake, cloud, and Spark engineer and Big Data analyst.
 Extensive knowledge of Data Modeling, Data Conversions, Data Integration and Data Migration
with specialization in Informatica Power Center.
 Good knowledge of data warehouses such as Oracle, Snowflake, and Teradata.
 Extensive experience in developing UNIX shell scripts, Perl, Windows batch scripts, and PowerShell
to automate ETL processes.
 Good exposure to Talend’s Data Integration, ESB, MDM, and Big Data tools.
 Hands on experience in various open-source Apache Technologies such as Hadoop, Avro, ORC,
Parquet, Spark, HBase, Drill, Presto, Talend, Airflow, Flume, Ambari, Kafka, Oozie, Zookeeper,
Camel, etc.
 Good Understanding of Hadoop Architecture and various components such as HDFS, Job Tracker,
Task Tracker, Name Node, Data Node, MapReduce and ELT concepts.
 Experience in Elasticsearch and MDM solutions.
 Worked on message-oriented architectures with RabbitMQ and Kafka as message broker options.
 Well-versed in version control tools such as SVN, GIT, Bitbucket, etc.
 Experience building CI/CD pipelines for web applications and hosting them on AWS,
using Jenkins and shell scripting.
 Strong experience in design and development of Business Intelligence solutions using Data
Modeling, Dimension Modeling, ETL Processes, Data Integration, OLAP.
 Experience in resolving on-going maintenance issues and bug fixes, monitoring Informatica
sessions as well as performance tuning of mappings and sessions.
 Experience in developing custom UDFs for Pig and Hive to incorporate methods and functionality of
Python/Java into Pig Latin and HQL (HiveQL).
 Experience in using logging frameworks such as Log4J and SLF4J and monitoring metrics using
Splunk, Zipkin, and Grafana.
 Active involvement in designing and developing real-time projects/enterprise applications, starting
from the requirements analysis/design stages and through the whole Software Development Life
Cycle.
 Worked closely with Dev and QA teams to review pre- and post-processed data and ensure data
accuracy and integrity.
 Experience in various cloud vendors like AWS, GCP and Azure.
 Strong skills in algorithms, data structures, Object oriented design, Design patterns,
documentation, and QA/testing.
 Experience in various agile methodologies like Test Driven Development (TDD), SCRUM.
 Strong communication, leadership, organizational, analytical, interpersonal, and
problem-solving skills.
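
The SQL tuning bullet above references EXPLAIN PLAN and TKPROF; the following is a minimal, hypothetical sketch of driving that workflow from Python with cx_Oracle. The connection details and the ORDERS table are illustrative assumptions, not taken from any project listed here.

# Hypothetical sketch: fetch an Oracle execution plan from Python via cx_Oracle.
import cx_Oracle

conn = cx_Oracle.connect(user="etl_user", password="***", dsn="dbhost/ORCLPDB1")
cur = conn.cursor()

# Populate PLAN_TABLE for the statement under review (table name is illustrative).
cur.execute("""
    EXPLAIN PLAN FOR
    SELECT o.order_id, o.order_total
    FROM   orders o
    WHERE  o.order_date >= TRUNC(SYSDATE) - 7
""")

# Read the formatted plan back through DBMS_XPLAN.
cur.execute("SELECT plan_table_output FROM TABLE(DBMS_XPLAN.DISPLAY())")
for (line,) in cur:
    print(line)

cur.close()
conn.close()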

TECHNICAL SKILLS:

Languages               SQL, PL/SQL, Python, Scala, PySpark
Databases               Oracle 9i/10g/11g/12c, SQL Server 2000/2005, MS SQL, MongoDB, MySQL, Cassandra, DynamoDB, PostgreSQL
Big Data Technologies   Hadoop, Hive, Pig, Azure, GCP, Flume, Sqoop, Spark, MapReduce, Oozie
ETL Tools               Informatica Power Center 10.x/9.x/8.x, SQL*Loader, Talend
Development IDEs        Eclipse, Visual Studio Code, JetBrains, Toad, SQL Developer
Logging & Monitoring    Splunk, CloudWatch, Log4J, SLF4J, Zipkin, Grafana
Operating Systems       UNIX, Linux, Ubuntu, Windows XP/2000/Vista
Unix & Linux            Unix grid computing and shell scripting
Cloud Technologies      AWS (Lambda, EC2, S3, SNS, CloudWatch, CloudFormation, RDS, VPC, Auto Scaling, IAM, AWS Glue, AWS Batch, AWS DMS, CodeBuild, CodeDeploy, etc.)
Version Control Tools   CVS, SVN, GitHub, and Bitbucket
Test Frameworks         JUnit, Mockito
Other Tools             PuTTY, WinSCP, VMware, Git Bash, Control-M
Data Modeling           Star schema and snowflake schema

PROFESSIONAL EXPERIENCE:

Wells Fargo, St. Louis, MO Jun 2020 – Present


Azure Data Engineer

Responsibilities:
 Analyze, design, and build modern data solutions using Azure PaaS services to support data
visualization. Understand the current production state of the application and determine the impact of
new implementations on existing business processes.
 Extract, transform, and load data from source systems to Azure data storage services using a
combination of Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics). Ingest
data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW)
and process the data in Azure Databricks.
 Created pipelines in ADF using Linked Services, Datasets, and Pipelines to extract, transform, and
load data to and from different sources such as Azure SQL, Blob Storage, Azure SQL Data Warehouse,
and the write-back tool.
 Created internal and external stages for data loads.
 Created stored procedures, functions, and views used in ETL operations to extract data from
source systems.
 Loaded data from flat files into the Snowflake database and developed the framework and
database structures in Snowflake (a minimal load sketch follows this list).
 Developed SQL queries using SnowSQL.
 Responsible for estimating the cluster size, monitoring, and troubleshooting of the Spark
Databricks cluster.
 Hands-on experience with Snowflake utilities, SnowSQL, Snowpipe, and Big Data modeling
techniques using Python/Java.
 Compared self-hosted Hadoop with GCP's Dataproc, and explored Bigtable (managed
HBase) use cases and performance evaluation.
 Built ETL pipelines in and out of the data warehouse using a combination of Python and
Snowflake's SnowSQL; wrote SQL queries against Snowflake.
 Experience in building and architecting multiple data pipelines and end-to-end ETL and ELT
processes for data ingestion and transformation in GCP.
 Developed a data warehouse model in Snowflake for over 100 datasets using WhereScape.
 Heavily involved in testing Snowflake to understand the best possible way to use the cloud resources.
 Developed ETL workflows using NiFi to load data into Hive and Teradata.
 Wrote UDFs in Scala and PySpark to meet specific business requirements.
 Developed JSON scripts for deploying pipelines in Azure Data Factory (ADF) that process the
data using the SQL activity.
 Worked on a project that migrated an existing Hadoop workload to BigQuery as part of migrating
the entire project to GCP.
 Hands-on experience developing SQL scripts for automation purposes.
 Created Build and Release for multiple projects (modules) in production environment using Visual
Studio Team Services (VSTS).
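
As referenced in the Snowflake bullets above, the following is a minimal sketch of loading a flat file through an internal stage with the snowflake-connector-python package. The account, warehouse, stage, file, and table names are illustrative assumptions, not the project's actual objects.

# Hypothetical sketch: stage a flat file and COPY it into a Snowflake table.
import snowflake.connector

conn = snowflake.connector.connect(
    user="etl_user",
    password="***",
    account="xy12345.us-east-1",
    warehouse="LOAD_WH",
    database="ANALYTICS",
    schema="STAGING",
)
cur = conn.cursor()

# Internal stage plus a reusable CSV file format.
cur.execute("CREATE STAGE IF NOT EXISTS customer_stage")
cur.execute("""
    CREATE FILE FORMAT IF NOT EXISTS csv_fmt
    TYPE = CSV FIELD_DELIMITER = ',' SKIP_HEADER = 1
""")

# Upload the flattened file from the local machine, then load it into the table.
cur.execute("PUT file:///data/exports/customers.csv @customer_stage AUTO_COMPRESS=TRUE")
cur.execute("""
    COPY INTO CUSTOMERS
    FROM @customer_stage
    FILE_FORMAT = (FORMAT_NAME = csv_fmt)
    ON_ERROR = 'ABORT_STATEMENT'
""")

cur.close()
conn.close()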

Environment: Azure Cloud, Azure Data Factory (ADF v2), Azure Function Apps, Azure Data Lake, Blob
Storage, SQL Server, Teradata utilities, Windows Remote Desktop, UNIX shell scripting, Azure PowerShell,
Databricks, Python, Erwin data modeling tool, Azure Cosmos DB, Azure Stream Analytics, Azure Event
Hub, Azure Machine Learning.

The Hartford, NY Mar 2019 – May 2020


AWS Data Engineer

Responsibilities:
 Designed and set up an enterprise data lake to support various use cases including
analytics, processing, storage, and reporting of voluminous, rapidly changing data.
 Responsible for maintaining quality reference data in source systems by performing operations such
as cleaning and transformation and ensuring integrity in a relational environment, working closely
with the stakeholders and solution architect.
 Designed and developed a security framework to provide fine-grained access to objects in AWS S3
using AWS Lambda and DynamoDB.
 Set up Kerberos authentication principals to establish secure network
communication on the cluster and tested HDFS, Hive, Pig, and MapReduce cluster access for new
users.
 Performed end-to-end architecture and implementation assessment of various AWS services such as
Amazon EMR, Redshift, and S3.
 Implemented machine learning algorithms in Python to predict the quantity a user might
want to order for a specific item so suggestions can be made automatically, using Kinesis Data
Firehose and an S3 data lake.
 Used AWS EMR to transform and move large amounts of data into and out of other AWS data
stores and databases, such as Amazon Simple Storage Service (Amazon S3) and Amazon
DynamoDB.
 Used the Spark SQL interface for Scala and Python, which automatically converts RDDs of case
classes to schema RDDs.
 Imported data from different sources such as HDFS/HBase into Spark RDDs and performed
computations using PySpark to generate the output response.
 Created Lambda functions with Boto3 to deregister unused AMIs in all application regions to
reduce EC2 costs (see the sketch after this list).
 Migrated data into the RV data pipeline using Databricks, Spark SQL, and Scala.
 Imported and exported databases using SQL Server Integration Services (SSIS) and Data
Transformation Services (DTS packages).
 Coded Teradata BTEQ scripts to load and transform data and fix defects such as SCD Type 2 date
chaining and duplicate cleanup.
 Developed reusable framework to be leveraged for future migrations that automates ETL from
RDBMS systems to the Data Lake utilizing Spark Data Sources and Hive data objects.
 Conducted Data blending, Data preparation using Alteryx and SQL for Tableau consumption and
publishing data sources to Tableau server.
 Developed Kibana dashboards based on Logstash data and integrated different source and
target systems into Elasticsearch for near-real-time log analysis and monitoring of end-to-end
transactions.
 Implemented AWS Step Functions to automate and orchestrate the Amazon SageMaker related
tasks such as publishing data to S3, training ML model and deploying it for prediction.
 Integrated Apache Airflow with AWS to monitor multi-stage ML workflows with the tasks running
on Amazon SageMaker.
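
As noted in the Lambda/Boto3 bullet above, here is a minimal sketch of the AMI-cleanup idea. The region list, retention window, and 'keep' tag convention are illustrative assumptions rather than the project's actual policy.

# Hypothetical sketch: Lambda handler that deregisters old, untagged AMIs.
import datetime
import boto3

REGIONS = ["us-east-1", "us-west-2"]   # assumed application regions
CUTOFF_DAYS = 90                       # assumed retention window


def lambda_handler(event, context):
    cutoff = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(days=CUTOFF_DAYS)
    removed = []
    for region in REGIONS:
        ec2 = boto3.client("ec2", region_name=region)
        # Only AMIs owned by this account are candidates for cleanup.
        for image in ec2.describe_images(Owners=["self"])["Images"]:
            created = datetime.datetime.fromisoformat(
                image["CreationDate"].replace("Z", "+00:00")
            )
            tags = {t["Key"]: t["Value"] for t in image.get("Tags", [])}
            if created < cutoff and tags.get("keep") != "true":
                ec2.deregister_image(ImageId=image["ImageId"])
                removed.append(image["ImageId"])
    return {"deregistered": removed}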

Environment: AWS EMR, S3, RDS, Redshift, Lambda, Boto3, DynamoDB, Amazon SageMaker, Apache
Spark, HBase, Apache Kafka, Hive, Sqoop, MapReduce, Snowflake, Apache Pig, Python, SSRS, Tableau.

CareFirst, Baltimore, MD Dec 2017 – Feb 2019
Big Data Engineer/Hadoop Developer

Responsibilities:
 Involved in data acquisition, data pre-processing, and data exploration for a
telecommunication project in Scala.
 Installed and configured Hive, Pig, Sqoop, and Oozie on the Hadoop cluster; set up and
benchmarked Hadoop clusters for internal use.
 Developed and implemented data acquisition jobs in Scala using
Sqoop, Hive, and Pig, optimizing MR jobs to use HDFS efficiently through various
compression mechanisms with the help of Oozie workflows.
 In the preprocessing phase of data extraction, used Spark to remove missing data and
transform the data to create new features.
 In the data exploration stage, used Hive to derive important insights about the processed data in
HDFS.
 Handled importing of data from various data sources, performed transformations using Hive and
MapReduce for loading data into HDFS and extracted the data from MySQL into HDFS using
Sqoop.
 Used UDFs to implement business logic in Hadoop, using Hive to read, write, and query
Hadoop data in HBase (a minimal Hive streaming UDF sketch follows this list).
 Used Cloudera Manager for continuous monitoring and management of the Hadoop cluster, working
with application teams to install operating system and Hadoop updates, patches, and version
upgrades as required.
 Developed data pipelines using Sqoop, Pig and Hive to ingest customer member data, clinical,
biometrics, lab and claims data into HDFS to perform data analytics.
 Experience in designing and developing POCs in Spark using Scala to compare the performance of
Spark with Hive and SQL/Oracle.
 Used the Oozie workflow engine to run multiple Hive and Pig scripts, with Kafka for
real-time processing of data, loading log file data directly into HDFS to navigate through data sets
in HDFS storage.
 Worked with different Oozie actions to design workflows, including Sqoop, Pig, Hive,
shell, and Java actions.
 Analyzed substantial data sets to determine the optimal way to aggregate and report on
them.
 Loaded and transformed large sets of structured, semi-structured, and unstructured data using
Pig; used Sqoop to load and export data between MySQL, HDFS, and NoSQL
databases on a regular basis; and designed and developed Pig scripts to process data in batch for
trend analysis.
 Developed Sqoop scripts to handle change data capture, processing incremental records
between newly arrived and existing data in RDBMS tables.
 Loaded the aggregated data from the Hadoop environment into Oracle using Sqoop for reporting on
the dashboard.
 Created Hive base scripts for analyzing requirements and processing data, designing the cluster
to handle huge amounts of data and cross-examining data loaded via Hive and MapReduce jobs.
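
As referenced in the UDF bullet above, one common way to push Python business logic into HiveQL is Hive's TRANSFORM/streaming interface. The sketch below is illustrative only; the claims table, column layout, and banding rules are assumptions.

# Hypothetical sketch: Python script invoked from HiveQL via TRANSFORM.
# Example HiveQL (assumed table/column names):
#   ADD FILE claim_band.py;
#   SELECT TRANSFORM (member_id, claim_amount)
#       USING 'python claim_band.py'
#       AS (member_id, claim_band)
#   FROM claims;
import sys


def band(amount):
    # Bucket a claim amount into a coarse band used for trend reporting.
    if amount < 100:
        return "LOW"
    if amount < 1000:
        return "MEDIUM"
    return "HIGH"


# Hive streams rows to stdin as tab-separated text and reads results from stdout.
for line in sys.stdin:
    member_id, claim_amount = line.rstrip("\n").split("\t")
    try:
        print("%s\t%s" % (member_id, band(float(claim_amount))))
    except ValueError:
        # Hive streams NULLs as the literal \N; pass them through unbanded.
        print("%s\t%s" % (member_id, "UNKNOWN"))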

Environment: Cloudera Distribution with Hadoop (CDH4), Cloudera Manager, MapReduce, Hive, Pig,
Sqoop, Oozie, HDFS, Hue, Apache HBase, MySQL, NoSQL, SQL, Eclipse, CentOS, Windows, Linux.

Radiant Technologies, Hyderabad, India Jul 2013 – Aug 2017
SQL Developer/Hadoop Developer

Responsibilities:
 Involved in gathering requirements from the business users and producing technical and functional
specifications for the reporting system. Studied and understood the business scenarios of the
existing systems, understanding the data and validating it by applying the current business
rules.
 Wrote batch programs, SQL queries, stored procedures, functions, and triggers using Oracle DB.
 Performed unit testing and functionality testing of the financial module based on test scenarios
and validated the data.
 Wrote queries in PL/SQL and performed performance tuning.
 Automated the mailing process using an SMTP mail class.
 Responsible for the analysis, troubleshooting, and resolution of technical issues.
 Understood process and core functionality changes and developed queries accordingly.
 Created database objects such as tables, views, procedures, and packages using Oracle tools such as
PL/SQL Developer.
 Creating or modifying existing DB objects (Procedures, Packages, Triggers, Views and Tables, etc.)
on Business Enhancement projects and keeping the current business running by participating in
maintenance of DB.
 Developed views to facilitate easy interface implementation and enforce security on critical
customer information.
 Preparing unit test cases and reviewing the test cases prepared by teammates.
 Imported and exported data into HDFS and Hive using Sqoop.
 Experienced in defining job flows.
 Gained good experience with NoSQL and search stores such as Solr and HBase.
 Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally
as MapReduce jobs.
 Developed a custom filesystem plug-in for Hadoop so it can access files on the data platform.
 This plug-in allows Hadoop MapReduce programs, HBase, Pig, and Hive to work unmodified and
access files directly.
 Designed and implemented a MapReduce-based large-scale parallel relation-learning system (a
minimal Hadoop Streaming sketch follows this list).
 Extracted feeds from social media sites such as Facebook and Twitter using Python scripts.
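
As referenced in the MapReduce bullet above, the sketch below illustrates the map/shuffle/reduce pattern with Hadoop Streaming in Python. It is a generic pair-counting example under assumed input (tab-separated entity pairs), not the relation-learning system itself.

# Hypothetical sketch: one Python file acting as mapper or reducer for Hadoop Streaming.
# Example invocation (paths and jar location are assumptions):
#   hadoop jar hadoop-streaming.jar \
#     -input /data/pairs -output /data/pair_counts \
#     -mapper "python pair_count.py map" -reducer "python pair_count.py reduce" \
#     -file pair_count.py
import sys


def run_mapper():
    # Each input line is assumed to hold two related entities separated by a tab.
    for line in sys.stdin:
        parts = line.rstrip("\n").split("\t")
        if len(parts) >= 2:
            print("%s|%s\t1" % (parts[0], parts[1]))


def run_reducer():
    # Hadoop sorts mapper output by key, so equal keys arrive contiguously.
    current_key, count = None, 0
    for line in sys.stdin:
        key, value = line.rstrip("\n").split("\t")
        if key != current_key:
            if current_key is not None:
                print("%s\t%d" % (current_key, count))
            current_key, count = key, 0
        count += int(value)
    if current_key is not None:
        print("%s\t%d" % (current_key, count))


if __name__ == "__main__":
    run_mapper() if sys.argv[1] == "map" else run_reducer()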

Environment: Oracle 11g, UNIX, PuTTY, SQL Developer.

Education:

 Bachelor’s in Computer Science Engineering, graduated 2013.
