Repl333 2
Beginner to advanced
Giving some Context
Pre-requisites
1. Basic Coding (loops, if else, variables)
2. Node.js

Good to haves
1. Docker / Containerization
2. Kubernetes
3. AWS ASGs
Basic
1. Backend communication
2. Docker / Containerization
3. Isolated environments
4. Remote code execution
5. repl.it system design/architecture

Advanced
1. Kubernetes
2. Pseudo Terminals
3. Nix
Before we start - Disclaimer
We’ll be taking 3 approaches to solve this problem.

Callout - You can use an external service/codebase that does a lot of this for you. If you’re a startup, you probably want to pick one of these services and not build this from scratch:
https://github.com/coder/code-server
(GitHub Codespaces)
Beginner friendly
Great way to build intuition on how you can build something like this
DO NOT use it in production/in an interview
Downsides
1. Insecure Remote code execution
2. Single server setup, doesn’t autoscale
3. Port conflicts between two users (every user is sharing resources on the same server)
4. Terminal is extremely rudimentary
5. Very ugly package management
Why is building repl.it hard?
(Diagram: repl.it server → Execution service → AWS S3)
Execution service
Step 0.2 - Bringing all languages you support (Node, Go, Rust) to this machine
Install node
Install rust
Install Golang
…
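One way to pin down this "install everything" step is as a base image. A minimal sketch of such an image - the base distro, versions and install commands here are all illustrative assumptions, not repl.it's actual setup:

```dockerfile
# Illustrative base image with every supported runtime baked in.
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y curl ca-certificates build-essential
# Node
RUN curl -fsSL https://deb.nodesource.com/setup_20.x | bash - && apt-get install -y nodejs
# Rust
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
ENV PATH="/root/.cargo/bin:${PATH}"
# Go
RUN curl -fsSL https://go.dev/dl/go1.22.0.linux-amd64.tar.gz | tar -C /usr/local -xz
ENV PATH="/usr/local/go/bin:${PATH}"
```

Baking runtimes into one image keeps repl startup fast at the cost of a fat image; tools like Nix (from the advanced list) handle per-repl packages more surgically.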
Step 1 - Initialising the repl
Copy over the base image to s3://images/{id}
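The base-image copy can be sketched by shelling out to the AWS CLI; the bucket names and per-language prefix layout are assumptions for illustration:

```typescript
import { execFile } from "node:child_process";

// Build the AWS CLI arguments for copying a language's base image prefix to
// this repl's own prefix, s3://images/{id}. Bucket layout is an assumption.
function copyArgs(replId: string, language: string): string[] {
  return [
    "s3", "cp",
    `s3://base-images/${language}/`, // pre-built image from step 0.2
    `s3://images/${replId}/`,        // this repl's own copy
    "--recursive",
  ];
}

// Kick off the copy; resolves once every object has been duplicated.
function initRepl(replId: string, language: string): Promise<void> {
  return new Promise((resolve, reject) => {
    execFile("aws", copyArgs(replId, language), (err) =>
      err ? reject(err) : resolve());
  });
}
```

In production you would more likely use the S3 SDK and copy object-by-object, since S3 has no atomic "copy this prefix" operation.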
Step 2 - Taking the user to the edit screen
Step 3 - Initialise a ws connection
Step 4 - Bring the user's code to the VM (pull it down from S3)
Step 5 - Let the user edit files

Callouts -
1. Debounce these saves
2. You can mount a directory to S3 as well, although you need to make sure node_modules don’t reach S3
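The debounce callout can be sketched as a small per-file saver; the 500 ms window and the onFlush hook (e.g. a disk write or S3 upload) are illustrative assumptions:

```typescript
type Flush = (path: string, content: string) => void;

// Coalesce rapid keystroke-driven saves so only the latest content of each
// file is flushed once the user pauses typing.
function makeDebouncedSaver(onFlush: Flush, delayMs = 500) {
  const timers = new Map<string, ReturnType<typeof setTimeout>>();
  const pending = new Map<string, string>();
  return {
    save(path: string, content: string) {
      pending.set(path, content);           // remember only the newest version
      const existing = timers.get(path);
      if (existing) clearTimeout(existing); // reset the window on every edit
      timers.set(path, setTimeout(() => {
        onFlush(path, pending.get(path)!);  // e.g. write to disk / upload to S3
        pending.delete(path);
        timers.delete(path);
      }, delayMs));
    },
  };
}
```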
Logic to add and delete files also remains the same!
Validation of files (file format, size) is something you should take into consideration.
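That validation can be sketched as a guard run before any write reaches disk or S3; the extension allow-list and the 1 MiB cap are illustrative assumptions:

```typescript
import { extname } from "node:path";

// Reject writes before they touch disk or S3.
const ALLOWED = new Set([".js", ".ts", ".json", ".md", ".go", ".rs"]);
const MAX_BYTES = 1024 * 1024; // 1 MiB per file

function validateWrite(path: string, content: string): string | null {
  if (path.includes("..")) return "path traversal not allowed";
  if (!ALLOWED.has(extname(path))) return "file type not allowed";
  if (Buffer.byteLength(content, "utf8") > MAX_BYTES) return "file too large";
  return null; // null means the write is acceptable
}
```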
Step 6 - Running/Executing the code
(Diagram: npm run dev serving on port 3000, streaming logs back to the browser)
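Running the code and streaming logs can be sketched with child_process; `send` stands in for a socket.io emit, and the working-directory path is an assumption:

```typescript
import { spawn } from "node:child_process";

// Run a command inside the repl's working directory and stream its output
// back to the browser chunk by chunk.
function runCommand(
  cmd: string,
  args: string[],
  cwd: string,
  send: (chunk: string) => void,
) {
  const child = spawn(cmd, args, { cwd });
  child.stdout.on("data", (d: Buffer) => send(d.toString()));
  child.stderr.on("data", (d: Buffer) => send(d.toString()));
  child.on("close", (code) => send(`process exited with code ${code}\n`));
  return child; // keep the handle so we can kill it on disconnect
}

// The "Run" button then maps to something like:
// runCommand("npm", ["run", "dev"], "/repls/abc123",
//   (chunk) => socket.emit("output", chunk));
```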
Disconnects - Clean up resources
1. Wait for a bit before removing the folder
2. Flush to S3
3. Stop any lingering process
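The three cleanup steps can be sketched as a grace-period teardown; all three hooks (flushToS3, removeFolder, killAll) are assumed, not a real API:

```typescript
// Wait out a grace period (the user may just be refreshing the page), then
// flush to S3, drop the local folder and stop lingering processes.
function scheduleCleanup(opts: {
  graceMs: number;
  flushToS3: () => Promise<void>;
  removeFolder: () => void;
  killAll: () => void;
}) {
  const timer = setTimeout(async () => {
    await opts.flushToS3();  // flush the latest files first
    opts.removeFolder();     // then drop the local working copy
    opts.killAll();          // and stop any lingering process
  }, opts.graceMs);
  return () => clearTimeout(timer); // call on reconnect to cancel teardown
}
```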
Disclaimers
1. I’m using Node.js. Keep it simple
2. I’m using socket.io. Keep it simple
3. I’ll be writing code in TS, but nothing too strict. Keep it simple
4. I will not be adding any extra fluff that’s not needed for this tutorial (eslint, prettier).
5. No monorepos - code repetition
Should you create a Zig-based, well-linted, 100% tested, CI/CD-implemented system?
Maybe
https://github.com/xtermjs/xterm.js
https://github.com/replit/ruspty
Introducing PTY
Old approach
Stream logs
New approach
(Diagram: the user's process runs under a PTY; its output streams to the browser and keystrokes stream back)
Part 2 | The good solution
(Diagram: Browser 1 → Server 1, Browser 2 → Server 2, Browser 3 → Server 3; each server is 2 CPU / 10 GB)
Cloud specific Autoscaling
(Diagram: Browser 4 arrives and autoscaling adds Server 4; still one 2 CPU / 10 GB server per user)
Cloud specific Autoscaling
Upsides
1. Easy to do
2. Provides you a way to securely run code
3. Autoscales
4. No port conflicts

Downsides
1. Bootup time (not a huge problem)
2. Over provisioned servers
3. Not cloud agnostic
Kubernetes
(Diagram: containers on one machine, each with its own isolated filesystem and network; the same React/Node containers run unchanged on Mac, Windows or Ubuntu)
Container Orchestration - Kubernetes
1. Nodes
2. Pods
3. Services
4. Ingress
(Diagram: a Service exposes a pod on a stable IP, e.g. 131.44.11.22)
A service can also load balance across pods.
Approach #1 vs Approach #2
(Diagram: approach #1 exposes each repl via its own Service IP - 44.2.11.3, 1.33.14.1; approach #2 routes by hostname - pod1.repl.it - to the Service)
Ingress
(Diagram: an Ingress Controller routes pod1.repl.it and pod2.repl.it each to its own Ingress → Service → Pod)
The pods can exist on different nodes as well. As long as they are in the same cluster, the Ingress controller should be able to route traffic.

Given all this information, can you guess the final architecture?
Step 1 - Start a k8s cluster, set some autoscaling policies on the nodes
(Diagrams: each repl gets a route - pod1.repl.it, pod2.repl.it - through an Ingress, a Service and a Pod)
Step 4 - As people start repls, start a pod, service and ingress for them
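The per-repl objects created in this step can be sketched as Kubernetes manifests; the names, image, port and host below are illustrative assumptions:

```yaml
# Sketch of the objects the orchestrator creates per repl.
apiVersion: v1
kind: Pod
metadata:
  name: repl-abc123
  labels: { repl: abc123 }
spec:
  containers:
    - name: runner
      image: runner:latest        # your runner image (assumed name)
      ports: [{ containerPort: 3000 }]
---
apiVersion: v1
kind: Service
metadata:
  name: repl-abc123
spec:
  selector: { repl: abc123 }
  ports: [{ port: 3000, targetPort: 3000 }]
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: repl-abc123
spec:
  rules:
    - host: pod1.repl.it          # per-repl hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service: { name: repl-abc123, port: { number: 3000 } }
```

In practice the orchestrator would create these programmatically via the Kubernetes API rather than applying YAML by hand.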
Step 5 - As people leave repls, stop the respective pod, service and ingress
My proposed solution
We have 3 services, and a k8s cluster
(Diagram: the Runner and the ws server, each on 32 CPU / 100 GB machines)
1. Simple HTTP API
Step 1 - Initialising the repl
Copy over the base image to s3://images/{id}
(Diagram: the browser opens a ws connection to the ws server)
What happens after the user starts the repl?
We need to start an independent runner for them
While it starts, the user sees the loading screen
3. Orchestrator
Step 3 - Tell the orchestrator to start a pod (over http or ws)
The orchestrator tells the runner pod to pull the code from S3, then hands the browser a runner_addr and a token. The browser connects to the runner over a Websocket through the Ingress.

Callout -
1. Caching is super helpful here
2. You can maintain a warm pool of pods that you can auto assign immediately
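The warm-pool callout can be sketched as a small claim-and-refill structure; `startPod` (returning a pod id) and the pool size are assumed hooks:

```typescript
// Keep a few pods pre-booted so assigning one to a new repl is instant
// instead of paying pod boot time on every start.
function makeWarmPool(startPod: () => string, poolSize: number) {
  const idle: string[] = [];
  const refill = () => {
    while (idle.length < poolSize) idle.push(startPod());
  };
  refill(); // pre-boot the initial pool
  return {
    claim(): string {
      const pod = idle.pop() ?? startPod(); // fall back to a cold start
      refill();                             // top the pool back up
      return pod;
    },
  };
}
```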
Step 4 - Let the user edit a file, send over diffs over the ws layer
(Diagram: browser → Websocket → Runner, with the runner flushing to S3)
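Sending diffs rather than whole files can be sketched with a minimal edit operation; this shape is an illustrative assumption - a production editor would use OT or a CRDT:

```typescript
// A single edit sent from the browser over the ws layer.
interface Edit {
  start: number;       // character offset where the edit begins
  deleteCount: number; // characters removed at that offset
  insert: string;      // replacement text
}

// The runner applies each edit to its in-memory copy of the file, then the
// debounced saver flushes the result to disk/S3.
function applyEdit(content: string, edit: Edit): string {
  return (
    content.slice(0, edit.start) +
    edit.insert +
    content.slice(edit.start + edit.deleteCount)
  );
}
```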
Step 5 - Terminal access
The browser sends startSession over the Websocket; keystrokes/commands are relayed to the runner.
Step 6 - Accessing a process
(Diagram: the browser reaches the user's process on port 3000 through the Ingress)
Step 7 - Destroying the pod
If the process has 0 ws conns alive for ~5 minutes, it can kill itself.
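The self-destruct rule can be sketched as a connection counter with an idle timer; `shutdown` is an assumed hook - inside the pod it could simply be process.exit(0), letting the orchestrator garbage-collect the pod, service and ingress:

```typescript
// Exit once no ws connection has been alive for the idle window
// (~5 minutes per the slide; shortened here only for illustration).
function makeIdleKiller(idleMs: number, shutdown: () => void) {
  let conns = 0;
  let timer: ReturnType<typeof setTimeout> | undefined;
  const arm = () => { timer = setTimeout(shutdown, idleMs); };
  arm(); // idle from boot until the first connection arrives
  return {
    onConnect() {
      conns += 1;
      if (timer) clearTimeout(timer); // someone is here, stay alive
    },
    onDisconnect() {
      conns -= 1;
      if (conns === 0) arm(); // last user left, start the countdown
    },
  };
}
```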
Few callouts that might feel like good to haves
1. The user can inspect your ws codebase (not good). They can stop this process as well. Fixing this might involve starting a parent process which takes inputs from the user and forwards them to the container through a socket.
2. User can still reboot your machine (can do the same on replit)
A few advanced things that repl.it does