
Bikash Jha
Data Engineer

Leverkusen, Germany 51381
+49 (0) 157 50325815
[email protected]
linkedin.com/in/bikash-jha/

Profile

I am a driven data engineer with 7+ years of experience working with big datasets. I have experience and knowledge across a diverse set of disciplines, technologies, and tools, including data science, cloud, data pipelines, and data architecture.

I am looking for a career opportunity to apply my skills and experience to challenging projects, and I am eager to build robust databases that lay the groundwork for game-changing business insights, making extensive use of current IT techniques to contribute productively to the company's growth while growing professionally.

Skills: Data Structures, Algorithms, Serverless, Data Pipelines
Technologies: Python 2/3, Golang, Spark, Spark Streaming, Elastic, Kubernetes, Docker, Airflow
Cloud: AWS, GCP
Databases: MySQL, MongoDB, DynamoDB, PostgreSQL, HBase
Big Data: Kafka, MapR/Cloudera, Hive, Zookeeper, Oozie, HBase
Monitoring: Grafana, Loki, Prometheus, Stackdriver Logging

Experience
Planet Labs, Berlin, Germany
Senior Data Engineer
OCT 2021 - CURRENT
● Geospatial Infrastructure & Data Platform Management (Planetary Variables):
○ Orchestrated the design and management of a comprehensive infrastructure to ingest and process geospatial data from a
constellation of 250+ satellites, including the integration of machine learning models to detect and analyze changes in forest
cover using GeoDiff.
○ Created specialized data structures to accommodate both vector (e.g., feature collection, multi-polygons, points) and raster
(e.g., satellite imagery, elevation models) geospatial datasets, enhancing processing capabilities for detailed deforestation
analysis and environmental monitoring.
● Real-Time Processing & Advanced Geospatial Operations (Planetary Variables):
○ Leveraged PubSub/PubSub-Lite and Apache Spark for real-time geospatial data streaming and on-the-fly analytics,
incorporating machine learning algorithms to identify and respond to rapid environmental changes indicative of
deforestation.
○ Conducted spatial join operations, proximity analysis, and advanced analytics, contextualizing incoming satellite data with
historical deforestation patterns through GeoDiff analysis, enabling timely detection and response strategies.
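As an illustration of the spatial-join step described above, here is a minimal geopandas sketch; the bucket paths, file names, and columns are hypothetical, and reading gs:// paths assumes gcsfs is installed:

    import geopandas as gpd

    # Hypothetical inputs: incoming observation footprints and historical
    # deforestation polygons, both stored as GeoParquet on GCS.
    tiles = gpd.read_parquet("gs://hypothetical-bucket/incoming/footprints.parquet")
    history = gpd.read_parquet("gs://hypothetical-bucket/reference/deforested.parquet")

    # Tag each incoming footprint with any historical deforestation polygon it
    # intersects; footprints with no match keep NaN in the joined columns.
    joined = gpd.sjoin(tiles, history, how="left", predicate="intersects")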
● Data Storage, Schema Design & Spatial Indexing:
○ Designed schemas in BigQuery and Bigtable, incorporating spatial indices for optimized query performance on petabytes of
geospatial data, including layers specific to deforestation tracking and monitoring.
○ Implemented a Data Lake (GCS) storing GeoParquet-formatted data tailored to machine learning models focused on detecting deforestation activities, ensuring efficient data management and accessibility for deep learning applications (sketched below).
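A minimal sketch of the GeoParquet write path, assuming geopandas with gcsfs installed and a hypothetical bucket:

    import geopandas as gpd
    from shapely.geometry import Point

    # Hypothetical sample: two monitoring plots with WGS84 coordinates.
    gdf = gpd.GeoDataFrame(
        {"plot_id": [1, 2]},
        geometry=[Point(13.40, 52.52), Point(13.52, 52.60)],
        crs="EPSG:4326",
    )
    # geopandas serializes GeoDataFrames to GeoParquet via to_parquet();
    # writing to gs:// requires gcsfs.
    gdf.to_parquet("gs://hypothetical-bucket/forest/plots.parquet")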
● Geospatial Query Development & Spatial Analysis:
○ Formulated complex SQL and spatial SQL queries in BigQuery and Bigtable to extract, transform, and analyze geospatial data, utilizing machine learning outputs to monitor deforestation and land-use changes (see the sketch below).
○ Employed advanced geospatial techniques such as geostatistical analyses, temporal and spatial trend detection, and
business modeling to provide deep insights for Conservation Service Managers (CSMs), enhancing decision-making in
forest conservation efforts and enabling predictive analysis of environmental impact.
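To give a flavor of such queries, here is a sketch using BigQuery's GIS functions through the Python client; the project, dataset, and column names are hypothetical:

    from google.cloud import bigquery

    client = bigquery.Client()
    sql = """
    SELECT plot_id,
           ST_AREA(geom) / 1e6 AS area_km2
    FROM `hypothetical-project.forest.plots`
    WHERE ST_INTERSECTS(
        geom,
        -- hypothetical area of interest
        ST_GEOGFROMTEXT('POLYGON((13.0 52.3, 13.8 52.3, 13.8 52.7, 13.0 52.7, 13.0 52.3))'))
    """
    for row in client.query(sql).result():
        print(row.plot_id, row.area_km2)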
● Pipeline Orchestration, Backfill & Spatial ETL Operations:
○ Deployed Apache Airflow on a GKE cluster, integrating geospatial ETL DAGs with gcsfuse and PubSub (a minimal DAG sketch follows this section).
○ Managed backfill operations with Airflow, ensuring the spatial integrity and accuracy of geospatial datasets over time.
○ Deployed Spark on GKE to run Spark jobs on Kubernetes pods with autoscaling enabled.
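A minimal Airflow 2 DAG sketch of this orchestration pattern; the DAG id, schedule, and task bodies are placeholders rather than the production pipelines:

    from datetime import datetime
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def ingest(**context):
        print("placeholder: pull new scene notifications from Pub/Sub")

    def transform(**context):
        print("placeholder: run the spatial ETL over gcsfuse-mounted data")

    with DAG(
        dag_id="geospatial_etl",           # hypothetical name
        start_date=datetime(2022, 1, 1),
        schedule_interval="@hourly",
        catchup=True,                      # lets Airflow manage backfills
    ) as dag:
        ingest_task = PythonOperator(task_id="ingest", python_callable=ingest)
        transform_task = PythonOperator(task_id="transform", python_callable=transform)
        ingest_task >> transform_task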
● Reporting & Backend Integration:
○ Golang-Based Reporting System: Implemented a Golang and GORM-based backend for dynamic reporting.
○ Real-Time Transaction Processing and Data Integration: Employed Golang to process and integrate millions of transactional events in real time, ensuring instant calculations and seamless data flow from PubSub into PostgreSQL and Redis.
○ Developed a Golang library for geometry validation, supporting shapes such as multipolygons, features, and feature collections (sketched below).
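The validation library itself is written in Go; as an illustration only, here is a minimal Python/shapely sketch of the same idea:

    from shapely.geometry import shape

    def validate_feature_collection(fc: dict) -> list:
        """Collect error strings for a GeoJSON FeatureCollection dict."""
        errors = []
        for i, feat in enumerate(fc.get("features", [])):
            try:
                geom = shape(feat["geometry"])  # raises on malformed geometry
            except Exception as exc:
                errors.append(f"feature {i}: unparseable geometry ({exc})")
                continue
            if not geom.is_valid:
                errors.append(f"feature {i}: invalid {geom.geom_type}")
        return errors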
● Legacy Support, Enhancements & Monitoring:
○ Maintained and enhanced legacy batch pipelines on Apache Beam.
○ Set up Grafana and built metrics dashboards (LogQL, Stackdriver, Prometheus).
○ Integrated Slack alerting with Grafana and Airflow.
○ Set up GitLab CI/CD and Terraform.

Tech Stack: Deep Learning, GCP, BigQuery, Bigtable, Dask, Spark, PubSub, PubSub-Lite, Kubernetes, Python, Golang, Airflow, QGIS, Mapbox, GitLab CI, Terraform, Postgres, Redis

Homelike Internet GmbH, Cologne, Germany


Senior Data Engineer
OCT 2021 - CURRENT
○ Managed 6 different ETL processes transferring data across MongoDB, BigQuery, Salesforce, and several marketing channels
○ Designed a serverless framework for real-time consumption of user-tracking data using Kinesis
○ Implemented a dead-letter queue for the Kinesis data stream (Kinesis to Lambda to SQS; see the sketch after this list)
○ Conceptualised a new microservices architecture using:
○ Elastic Kubernetes Service (EKS) cluster on AWS
○ ELK stack for monitoring
○ Airflow on Kubernetes with git-sync
○ Kafka on K8s
○ Spark on K8s to run PySpark jobs
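A minimal sketch of the dead-letter-queue pattern referenced above, written as a Python Lambda handler; the environment variable and process() are hypothetical:

    import base64
    import json
    import os
    import boto3

    sqs = boto3.client("sqs")
    DLQ_URL = os.environ["DLQ_URL"]  # hypothetical SQS queue URL

    def process(payload):
        print(payload)  # placeholder for the real record handling

    def handler(event, context):
        for record in event["Records"]:
            raw = record["kinesis"]["data"]  # base64-encoded by Kinesis
            try:
                process(json.loads(base64.b64decode(raw)))
            except Exception:
                # park the poison record instead of blocking the shard
                sqs.send_message(QueueUrl=DLQ_URL, MessageBody=raw)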

Tech Stack: GCP, BigQuery, AWS, Lambda, Kinesis, SQS, CloudWatch, Kubernetes, Python

Aurigo Software Technologies, Bangalore, India


Senior Data Engineer
JAN 2021 - SEP 2021
Company Description: Helps state agencies, cities, counties, water authorities, airports, and facility owners plan, build, and maintain capital assets, infrastructure, and facilities by combining data engineering, democratized data science, and data orchestration (link)
Project: Serverless / Managed AWS services
○ Implemented Amazon Kinesis Firehose to collect real-time streaming data.
○ Designed AWS S3 as the data lake to store all raw data.
○ Set up AWS Athena to analyze data in Amazon S3 using standard SQL (see the sketch after this list).
○ Provisioned Kubernetes on an AWS EKS cluster using CloudFormation templates.
○ Configured the EKS cluster autoscaler and set up Amazon ECS for container orchestration.
○ Developed AWS Lambda functions.
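A sketch of kicking off an Athena query over the S3 data lake with boto3; the database, table, and bucket names are hypothetical:

    import boto3

    athena = boto3.client("athena")
    resp = athena.start_query_execution(
        QueryString="SELECT event_type, COUNT(*) AS n "
                    "FROM raw_events GROUP BY event_type",
        QueryExecutionContext={"Database": "datalake"},
        ResultConfiguration={
            "OutputLocation": "s3://hypothetical-bucket/athena-results/"
        },
    )
    # Poll get_query_execution with this id to wait for completion.
    print(resp["QueryExecutionId"])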
Project: Microservices Components on Kubernetes.
○ Wrote connector microservices to fetch data from different sources.
○ PySpark/pandas DataFrame/Boto3 connectors: Twitter, S3 buckets, DynamoDB, filesystem.
○ Built machine learning models as microservices.
○ DAG engine microservice: integration engine for all microservices (ML + connectors).
○ Airflow microservice: programmatically author, schedule, and monitor workflows via the built-in Airflow user interface.
○ Attached EFS volume mounts and AWS RDS instances to Airflow Kubernetes pods.
○ Ran Fluentd microservices and the Elastic stack to fetch and store logs from Airflow.
○ Implemented the spark-on-k8s operator to run Spark jobs on Kubernetes (see the sketch after this list).
○ ML models: BERT, NLP, PyTorch, Hugging Face libraries.
○ Monitored Kubernetes cluster health using the K8s Dashboard, Prometheus, and Grafana.
○ Explored the GCP platform for future POCs and a Kafka microservice.
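A minimal sketch of submitting a SparkApplication custom resource to the spark-on-k8s operator with the official Kubernetes Python client; the image, namespace, and resource sizing are hypothetical:

    from kubernetes import client, config

    config.load_kube_config()
    spark_app = {
        "apiVersion": "sparkoperator.k8s.io/v1beta2",
        "kind": "SparkApplication",
        "metadata": {"name": "etl-job", "namespace": "spark"},
        "spec": {
            "type": "Python",
            "mode": "cluster",
            "image": "registry.example.com/spark-py:3.1.1",  # hypothetical image
            "mainApplicationFile": "local:///opt/app/job.py",
            "sparkVersion": "3.1.1",
            "driver": {"cores": 1, "memory": "2g", "serviceAccount": "spark"},
            "executor": {"cores": 2, "instances": 2, "memory": "4g"},
        },
    }
    client.CustomObjectsApi().create_namespaced_custom_object(
        group="sparkoperator.k8s.io", version="v1beta2",
        namespace="spark", plural="sparkapplications", body=spark_app,
    )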

Tech Stack: Linux, AWS, Kubernetes, Docker, Python, Airflow, DynamoDB

LTI (Larsen & Toubro Infotech), Pune, India


Senior Product Engineer
JAN 2020 - DEC 2020
Company Description: LTI Mosaic Decisions Platform
Responsibilities:
○ Refactored existing code and evolved a new architecture to fit the existing product.
○ Wrote connector code using Spark/PySpark/pandas DataFrames for Azure, S3, NoSQL stores, etc. (see the sketch after this list).
○ Set up Kubernetes on an Amazon EKS cluster and implemented a key vault (HashiCorp).
○ Wrote our own Kubernetes API to submit Spark jobs on K8s, along with KubeSpawner APIs.
○ Enabled horizontal autoscaling of Kubernetes pods and optimized existing Spark jobs.
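A minimal sketch of the connector pattern for pulling an S3 object into a pandas DataFrame; the bucket and key are hypothetical:

    from io import BytesIO

    import boto3
    import pandas as pd

    s3 = boto3.client("s3")
    obj = s3.get_object(Bucket="hypothetical-bucket", Key="exports/data.csv")
    df = pd.read_csv(BytesIO(obj["Body"].read()))
    print(df.head())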

Tech Stack: Python, Amazon S3, Azure Blob, Cosmos DB, MongoDB, Kafka, Kubernetes, PySpark

AMDOCS, Pune, India


Software Developer
AUG 2016 - DEC 2019
Company Description: A platform that implements business logic and allows the marketer to inject machine-learning logic, run on a big data system, into any decision that needs to be taken within the experience (email/message text, etc.)

Project: Auto-ML Product Recommendation


○ Set up and wrote code for Kafka streaming consumers/producers and offset management (listener balancer)
○ Cleaned and prepared data for model creation using PySpark
○ Implemented ML models (Random Forest, Linear Regression) based on demographic data and supervised learning to predict potential customers (see the sketch after this list)
○ Analyzed target-customer feedback and provided insights to the marketing team
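A minimal PySpark sketch of the Random Forest training step; the input path, feature columns, and label are hypothetical:

    from pyspark.sql import SparkSession
    from pyspark.ml.feature import VectorAssembler
    from pyspark.ml.classification import RandomForestClassifier

    spark = SparkSession.builder.appName("reco-train").getOrCreate()
    df = spark.read.parquet("s3a://hypothetical-bucket/customers/")

    # Assemble hypothetical demographic columns into one feature vector.
    assembler = VectorAssembler(
        inputCols=["age", "income", "tenure_months"], outputCol="features"
    )
    train = assembler.transform(df)

    rf = RandomForestClassifier(
        labelCol="converted", featuresCol="features", numTrees=100
    )
    model = rf.fit(train)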

Tech Stack: PySpark, Python, MapR, Kafka, Hive, Random Forest


Education
University of Kalyani, West Bengal
Bachelor of Technology in Information Technology, 2016

Awards: Employee of the Month 2017; Employee of the Quarter Q2 '18; on-site opportunities in Mexico (AT&T) and Manila (PLDT).

Other Interesting Facts


Languages: English (Professional), Hindi (Native), Bengali (C1)
Interests: contributing to open-source platforms, volleyball, cricket

References
Available upon request
