Databricks Developer Resume
Skills
• Python (environment management, testing, project flow, dev) ~ 6 years
• Apache Spark (deployment, resource optimization, config optimization, application efficiency monitoring, dev) ~ 4 years
• Hadoop ecosystem (on-premises Hadoop and EMR, HDFS management, YARN, Sqoop, Hive, Impala) ~ 2 years
• Databricks (dev, admin, integration, migration) ~ 4 years
• SQL (SQL Server, PostgreSQL) ~ 6 years
• Scala and Java for Data Engineering
• Microsoft Azure
• AWS
• Docker, git, bash, Jira, Linux
• MS Office, Power BI
• English – C1, German – A2
Experience
APRIL 2021 – present
Contractor
Advised on Data Lake maintenance and expansion as Lead Data Engineer (banking sector, EU):
• Developed Apache Spark / Airflow / AWS processes, code, and architecture for the Data Lake.
• Built an analytical platform on Databricks on AWS, addressing scalability, cloud data security, and IaC automation.
• Reduced prod AWS EMR processing costs by 25% and decreased downtime by 37%.
Environment: AWS (S3, IAM, Lambda, EC2, RDS, DynamoDB, Kinesis, Glue, EMR), Databricks
Tools: Airflow, Terraform, Python, Scala, bash, git, Docker, GitHub, GitHub Actions, Apache Spark
Other: Scrum methodology
Built a data processing framework for FHIR-compliant data (medical sector, US):
• Developed a FHIR–Azure–Databricks integration framework, including an automated Cucumber / pytest-bdd test framework.
• Troubleshot Delta Live Tables jobs.
Environment: Azure, Databricks
Tools: Python, git, Azure Repos, Azure Pipelines, Apache Spark
Other: Scrum methodology
Designed Apache Airflow architecture for an MFT business case (energy sector, PL).
Cost to Serve and SCM network optimization (retail sector)
Cloudera Hadoop cluster administration:
• Configured nodes / roles, installed / updated software.
• Monitored performance and troubleshot issues.
• Prepared and maintained a working environment for Data Scientists (JupyterHub, Cloudera Data Science Workbench, MLflow, RStudio Server, etc.).
• Completed Cloudera Administrator Training for Apache Hadoop.
Environment: on-premises
Tools: Linux, Ansible, Hadoop, Apache Spark, Hive, Impala, Kafka, Nifi, Flume
Education
OCTOBER 2018 –
I hereby give consent for my personal data included in my application to be processed for the purposes of the recruitment process under the Regulation (EU) 2016/679
of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free
movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation).