Vijay Kanth - Azure Data Engineer

Uploaded by

maulika5497sjps

Vijaya Kanth

AZURE DATA ENGINEER


TX, USA | (551) 310-1471 | [email protected] | www.linkedin.com/in/avkred
SUMMARY
• Azure Data Engineer with 5 years of progressive experience in designing, developing, and maintaining robust data
pipelines to integrate diverse structured and unstructured data sources.
• Demonstrated expertise in programming languages such as Python, SQL, and R, along with proficiency in essential
packages including NumPy, Pandas, and SciPy.
• Experienced in leveraging cloud platforms such as Microsoft Azure (Data Factory, Synapse Analytics, Data Lake
Storage, Azure SQL) for data processing, storage, and orchestration, with a focus on Azure Databricks for
end-to-end data processing capabilities.
• Adept at developing resilient data models and schemas using technologies like Apache Hive, Apache Parquet, and
Snowflake, facilitating efficient data storage, retrieval, and analysis.
• Proficient in visualization tools including Tableau, Power BI, and Advanced Excel, enabling insightful data
visualization and analysis for informed decision-making.
• Committed to continuous learning and staying updated with emerging technologies and industry best practices in
data engineering, ensuring the adoption of innovative approaches for solving complex data challenges.
SKILLS
Methodologies: SDLC, Agile, Waterfall, Medallion Architecture
Programming Languages: Python, SQL, R
Packages: NumPy, Pandas, Matplotlib, SciPy
Visualization Tools: Tableau, Power BI, Advanced Excel (Pivot Tables, VLOOKUP), QuickSight
IDEs: Visual Studio Code, PyCharm, Jupyter Notebook, IntelliJ
Databases: MySQL, PostgreSQL, MongoDB, SQL Server
Data Engineering Concepts: Apache Spark, Apache Hadoop, Apache Kafka, Apache Beam, ETL/ELT, PySpark
Cloud Platforms: Azure (Databricks, Blob Storage, Load Balancer, Synapse Analytics, Data Lake
Storage)
Other Technical Skills: Data Lake, SSIS, SSRS, SSAS, Docker, Kubernetes, Jenkins, Terraform, Informatica,
Talend, Snowflake, Google BigQuery, Data Quality and Governance, Machine
Learning Algorithms, Big Data, Advanced Analytics, Statistical Methods, Data Mining,
Data Visualization, Data Warehousing, Data Transformation
Version Control Tools: Git, GitHub
Operating Systems: Windows, Linux, macOS
EXPERIENCE
Azure Data Engineer | UnitedHealth Group, TX Jan 2023 – Present
• Developed and maintained scalable Azure Data Factory pipelines using Apache Spark and Python to efficiently
process and transform large volumes of patient health records and claims data, ensuring seamless data integration
and preparation.
• Streamlined and automated data processing workflows, reducing processing times by 50% and ensuring timely and
accurate reporting, which contributed to more efficient operational management.
• Developed and managed data integration processes to consolidate data from SQL Server and various other sources
into a unified pipeline, enabling comprehensive and coherent data analysis.
• Leveraged PySpark within Azure Databricks to perform complex data transformations and aggregations, ensuring
efficient processing and accurate data preparation.
• Employed Hive as a data warehousing solution to store and manage structured healthcare data, supporting complex
analytical queries and reporting needs.
• Managed infrastructure lifecycle using Terraform to ensure the automated creation, updating, and versioning of
Azure resources, facilitating smooth deployments and maintaining the integrity of healthcare data pipelines.
• Optimized real-time data access and analytical performance by leveraging Azure Cosmos DB's change feed feature
for seamless synchronization, while implementing advanced SQL indexing and query optimization techniques to
enhance the performance of complex queries on large-scale healthcare datasets.
• Designed and managed data pipelines within Azure Synapse Analytics to automate ETL processes, streamlining the
extraction, transformation, and loading of healthcare data for further analysis, resulting in a 30% reduction in ETL
processing time and a 25% improvement in data accuracy and consistency.
• Utilized R for advanced statistical analysis and data visualization, creating detailed plots and charts to uncover
trends and patterns in patient health records and claims data.
• Implemented Azure Data Lake Storage to provide scalable, secure, and cost-effective storage solutions for large
volumes of structured and unstructured healthcare data, enabling seamless integration with Azure Data Factory
pipelines for efficient data processing and analysis.
• Developed CI/CD pipelines using Azure DevOps to automate the deployment and testing of Azure Data Factory
pipelines, ensuring consistent and reliable delivery of healthcare data processing workflows across environments
(development, staging, and production).

Azure Data Engineer | Capgemini, India Aug 2018 – Dec 2021


• Designed and implemented efficient data models in Snowflake, enhancing query performance and retrieval
efficiency by 25%, and supporting complex analytical queries.
• Integrated Apache Kafka for real-time data streaming, achieving a latency of under 5 seconds, which facilitated
timely data processing and decision-making.
• Conducted comprehensive performance tuning of ETL processes, resulting in a 40% reduction in processing time
and optimizing overall pipeline efficiency.
• Implemented Azure Data Lake Storage (ADLS) for scalable and secure data storage, enabling efficient handling
of large volumes of structured and unstructured data, and reducing storage costs by 20%.
• Developed automated monitoring and maintenance processes using Azure Monitor to track model
performance metrics and detect model drift, ensuring adherence to data governance policies and triggering retraining
workflows as needed for optimal accuracy and compliance.
• Developed interactive and visually compelling dashboards in Power BI, leading to a 20% improvement in
data-driven decision-making by providing actionable insights.
• Developed Python and Scala scripts to automate routine data analysis tasks and model retraining processes,
utilizing PySpark for distributed data processing, which enhanced workflow efficiency, scalability, and consistency
in handling large datasets.
• Developed complex aggregation pipelines in MongoDB to perform data transformations and aggregations, enabling
advanced analytics and insightful reporting on large datasets.
• Configured access policies in Azure Key Vault to manage permissions for various users and applications, ensuring
that only authorized entities could access or modify sensitive information.
• Utilized Kubernetes for resource optimization, ensuring that ETL processes and data streaming services operated
with optimal resource allocation (CPU, memory, etc.), reducing infrastructure costs by dynamically scaling
resources based on workload demands.
• Developed data mapping and integration workflows within SSIS to facilitate seamless data movement between
heterogeneous data sources and destinations, including databases, flat files, and cloud storage.
• Configured Azure Pipelines for automated release management, streamlining the deployment process across
multiple environments and ensuring consistent application delivery.
• Configured and optimized Linux-based servers running data pipelines, with shell scripts used to automate
deployment processes, update dependencies, and ensure servers remained secure and up to date with the latest patches.

EDUCATION
Master of Science in Information Science: University of North Texas, USA
Bachelor of Technology in Civil Engineering: Annamacharya Institute of Technology and Sciences, India
CERTIFICATION
• Azure DP-203
• Databricks Data Engineer Associate
