Fun Stories From Iterating Engineering Platform

- The engineering team at a company developed a platform to enable scalability as the team and services grew from 230 engineers supporting 1000 services to 400+ engineers supporting 2500+ services.
- The initial platform provided developer control, integration and delivery, and monitoring planes. It utilized tools like GitLab CI, GitHub Actions, Kubernetes, Prometheus and more.
- Over time, the platform was evaluated through surveys, tickets, interviews and monitoring to validate solutions, discover problems and measure adoption as the needs of engineers varied across teams.
- Lessons learned included catering to various workflows across teams and making the platform open to third parties to provide addons through an API while focusing on usability, collaboration and enabling task

Uploaded by Tom
Fun Stories from Iterating Engineering Platform
Adityo Pratomo & Joshua Bezaleel Abednego
@kotakmakan @joshuabezaleel
How did we start?

Initial problem

How to enable engineering scalability?

A system that enables product engineers to correctly operate their services with minimal manual intervention

Ref: https://gigamonkeys.com/flowers/
State of the platform
ref: mckinsey, dev platform ref architecture

Developer Control Plane: UI Portal, App Config, Components
Integration & Delivery Plane: CI Pipeline, Image Registry, CD Pipeline, Platform Orchestrator, GitLab CI Runner, GitHub Action Runner, ArgoCD
Resource Plane: Google Kubernetes Engine, Elastic Kubernetes Service, Istio Service Mesh, Postgres & Redis, Kafka Cluster
Monitoring & Logging Plane: ELK Log, Prometheus, VictoriaMetrics, Cost Insight, Infra AlertManager
Security Plane: Teleport, App Secret, Access
State of the platform
- Engineers (230 → 400++)
- Clusters (30 → 55++)
- Services (1000 → ~2500)
- Team (30 → 50++)
- Deployments/week (3000 → 6000++)

Diagram: four groups of Product Engineers above one Infrastructure Team
Evaluating the platform
Capturing different needs

Methods: survey, ticket analysis, user interview, event monitoring
Goals: validating solution, discovering problems, measuring adoption

Modelling developer workflow

1. Discover: gathering the information required to start writing code
2. Write: translating requirements into code
3. Test: testing the recently written code to ensure that it fits the requirements
4. Integrate: building and deploying the code to integrate it with the wider system
5. Verify: testing the recently integrated code to ensure its functional correctness

A platform plays a huge part in the Discover, Integrate and Verify steps. As a product, we need to constantly evaluate its contribution to these phases.
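
One way to evaluate the platform's contribution to these phases is to record how long a change spends in each step and look at the share that falls into Discover, Integrate and Verify. The sketch below is only an illustration of that idea: the `PhaseTiming` type, the `platform_share` function and the minute values are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    DISCOVER = "discover"
    WRITE = "write"
    TEST = "test"
    INTEGRATE = "integrate"
    VERIFY = "verify"


# Phases where, according to the slides, the platform plays a huge part.
PLATFORM_PHASES = {Phase.DISCOVER, Phase.INTEGRATE, Phase.VERIFY}


@dataclass
class PhaseTiming:
    phase: Phase
    minutes: float  # hypothetical measurement for a single change


def platform_share(timings: list[PhaseTiming]) -> float:
    """Return the fraction of total time spent in platform-influenced phases."""
    total = sum(t.minutes for t in timings)
    if total == 0:
        return 0.0
    platform = sum(t.minutes for t in timings if t.phase in PLATFORM_PHASES)
    return platform / total


# Made-up timings for one change, for illustration only.
sample = [
    PhaseTiming(Phase.DISCOVER, 30),
    PhaseTiming(Phase.WRITE, 120),
    PhaseTiming(Phase.TEST, 30),
    PhaseTiming(Phase.INTEGRATE, 40),
    PhaseTiming(Phase.VERIFY, 20),
]
print(f"platform-influenced share: {platform_share(sample):.1%}")
```

A falling share over time would suggest engineers are getting back to the Write step sooner, although how the timings are collected is left open here.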
One metric to rule them all

scalability is a function of efficiency

● The sooner we can bring engineers back to writing mode, the better
● Any manual intervention adds to the inefficiency of the system and the workflow
● We track the number of tickets, which indicates how much self-service capability we are providing
● We also track customer satisfaction on a yearly basis
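
The ticket metric can be made comparable across periods by normalising it. The sketch below divides ticket volume by engineer count, which is an assumption made for this example; the slides only say that the number of tickets is tracked. The engineer counts (230 and 400) come from the slides, while the ticket counts are hypothetical.

```python
def tickets_per_engineer(ticket_count: int, engineer_count: int) -> float:
    """Tickets raised to the infra team per product engineer in a period."""
    if engineer_count <= 0:
        raise ValueError("engineer_count must be positive")
    return ticket_count / engineer_count


# Engineer counts are from the slides; ticket counts are made up.
before = tickets_per_engineer(ticket_count=920, engineer_count=230)
after = tickets_per_engineer(ticket_count=1200, engineer_count=400)

# A lower ratio means fewer manual interventions per engineer,
# i.e. more of the work is handled through self-service.
self_service_improved = after < before
print(before, after, self_service_improved)
```

Under this reading, total tickets can grow while the platform still becomes more self-service, as long as tickets grow more slowly than the engineering organisation.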
Lessons we’ve learned
Catering to various workflows

Diagram: several Teams, each divided into Sub-teams, shown alongside IT GRC, Security and Platform engineering
Making an open platform

Actors: Third-party Providers, Product Engineers, Infra Engineers
Components: UI Portal, Addons Marketplace, OSB API, CI, Platform Orchestrator
Configuration files: deployment.yaml, gateway.yaml, metrics.yaml, optimize-cost.yaml
Outcomes: Redis provisioned, Database provisioned, Disk resized, DB partitioned automatically, Logging TPS configured, Domain exposed

ref: https://github.com/openservicebrokerapi
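
To make the OSB API box more concrete, here is a minimal sketch of what a provisioning call to a broker could look like under the Open Service Broker API, which defines `PUT /v2/service_instances/{instance_id}` with a `service_id` and `plan_id` in the body and an `X-Broker-API-Version` header. The broker URL, the IDs and the `memory_mb` parameter are hypothetical, the request is only built and never sent, and nothing here describes how the presented platform actually calls its brokers. A real broker will usually also require authentication and may expect further fields.

```python
import json
import urllib.request


def build_provision_request(broker_url, instance_id, service_id, plan_id, parameters=None):
    """Build (but do not send) an OSB provisioning request."""
    body = {
        "service_id": service_id,
        "plan_id": plan_id,
        "parameters": parameters or {},
    }
    # accepts_incomplete=true signals that asynchronous provisioning is acceptable.
    url = (
        f"{broker_url.rstrip('/')}/v2/service_instances/{instance_id}"
        "?accepts_incomplete=true"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "X-Broker-API-Version": "2.17",
            "Content-Type": "application/json",
        },
    )


# Hypothetical broker and identifiers, for illustration only.
req = build_provision_request(
    "https://broker.example.internal/",
    instance_id="inst-1",
    service_id="redis-service-id",
    plan_id="small-plan-id",
    parameters={"memory_mb": 256},
)
print(req.get_method(), req.full_url)
```

Because every addon speaks the same small API, a third-party provider could plug a new service into the marketplace without changes to the portal itself.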
Platform engineering design principles

- Provides best default: easy for engineers to correctly operate services
- Open for collaboration: work closely with users, contributors and stakeholders to ensure consistent value delivery
- Only abstracts what you need: expose easily understood configurations, following the engineers' mental model
- Enables task delegation: enables a chain of delegation that translates into actionables that can be efficiently completed
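
The first and third principles can be sketched together: the engineer states only what they care about, and the platform fills in defaults and expands the result into the full resource. In the example below, `ServiceConfig`, its fields and the default values are all hypothetical; only the overall shape of a Kubernetes `apps/v1` Deployment is standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceConfig:
    # What the engineer must provide.
    name: str
    image: str
    # Platform-provided defaults (hypothetical values); override only when needed.
    replicas: int = 2
    cpu: str = "250m"
    memory: str = "256Mi"


def to_deployment(cfg: ServiceConfig) -> dict:
    """Expand the small config into a Kubernetes Deployment manifest."""
    labels = {"app": cfg.name}
    container = {
        "name": cfg.name,
        "image": cfg.image,
        "resources": {"requests": {"cpu": cfg.cpu, "memory": cfg.memory}},
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": cfg.name, "labels": labels},
        "spec": {
            "replicas": cfg.replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }


# Two fields are enough to get a complete manifest.
manifest = to_deployment(ServiceConfig(name="checkout", image="registry.example/checkout:1.0"))
print(manifest["spec"]["replicas"])
```

The engineer-facing surface stays at two required fields, while the defaults remain visible and overridable rather than hidden.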
Where did we get lucky?

Gopay.sh Engineering Platform
- Stakeholder buy-in
- Infra team product thinking
- Collaborative engineering culture
Your engineering platform will grow together with the company. Constantly evolving it to cover various use cases, in order to keep engineers productive and the infra reliable, is key to keeping the platform valuable to the business.
Thank You!