Machine Learning Engineer with 5 years of experience helping companies bring ML projects to production through robust MLOps practices.
ML, NLP, GenAI, PyTorch, AWS Bedrock, HuggingFace, ONNX
W&B Weave, SkyPilot, DVC, BentoML, ClearML, MLflow
Docker, GitLab CI, Kubernetes, Helm, Argo CD, Terraform
Lambda, Step Functions, Batch, EKS, ECS, SageMaker, S3, Bedrock
Python expert, Django, Celery, FastAPI, uv
Streamlit, Grafana, Kibana, Tableau
Temporal, Dagster, Airflow, Hadoop, Spark, Snowflake
I'm a Machine Learning Engineer with a strong mathematical background. My path to Data Science began when I realized the vast potential of applying mathematical concepts to real business challenges.
I sharpened my skills at Ubisoft, where I learned to manage end-to-end ML projects, diving deep into Infrastructure, Data Engineering and MLOps.
Then at GitGuardian, I fine-tuned Transformer models for code security and built the company's MLOps stack from scratch.
After these experiences, I wanted to try the freelance path to continue growing in ownership and expand my skills. I joined Sanofi to work on GenAI challenges: building scalable Unstructured Data parsing pipelines, benchmarking Vision-Language Models, and making unstructured knowledge accessible at scale.
In my free time, I'm passionate about football and data analytics. I built The Scouting Arena, a platform that helps football fans discover and scout players using advanced analytics.
If you'd like to talk, reach out on LinkedIn or at [email protected].
- Developed an Unstructured Data Pipeline (OCR+VLM with Docling, AWS Textract, Bedrock VLMs, metadata generation with LLMs, chunking, vectorization), processing millions of PDFs for Sanofi teams.
- Developed an internal benchmarking framework with DVC & Weave to compare open-source OCR libraries and VLMs for unstructured data extraction.
- Defined MLOps best practices for GenAI teams at Sanofi.
- Developed AI agents for an employee companion use case with LangGraph.
- Built the MLOps stack from scratch: GitLab CI, SkyPilot, DVC pipelines, Dagster jobs, Streamlit, ONNX Runtime, BentoML, Helm, Argo CD.
- Fine-tuned and integrated NLP models (CodeBERTa) into the Secrets Detection Engine, reducing false positives by 5x (Django, Celery, Kubernetes, AWS).
- Developed PoCs on automatic remediation for leaked secrets (OpenAI API, AST parsers and code formatting).
- End-to-end Fraud Detection project on e-commerce transactions (Ubi Connect and Steam). Led research tasks (feature engineering, semi-supervised learning) and put in place MLOps best practices (DVC, remote jobs on K8s, ClearML, model inference on AWS). The project saved 5% of net sales, about 4 million euros per year, compared to the previous fraud detection product.
- Time series forecasting of Acquisition, Retention, Monetization, and Ubisoft server vCPU usage. Trained and deployed Generalized Additive Models to improve forecasts.