This repository has been archived by the owner on Sep 30, 2024. It is now read-only.

feat(executor): add option for additional mounts in docker runner #56434

Open

rdeusser wants to merge 1 commit into sourcegraph:main from rdeusser:main

rdeusser commented Sep 7, 2023

Description

We use the code intel feature on a lot of Go code, and each time a repo was indexed, the Go module cache had to be repopulated from scratch, wasting time and putting unnecessary pressure on our Artifactory instance. To solve this, I added the ability to specify semicolon-separated mounts through the environment variable EXECUTOR_DOCKER_ADDITIONAL_MOUNTS (e.g. export EXECUTOR_DOCKER_ADDITIONAL_MOUNTS='type=volume,source=gocache,target=/gocache;type=volume,source=gomodcache,target=/gomodcache'; the quotes keep the shell from treating the semicolon as a command separator). To fully solve the issue, we also had to add the environment variables GOCACHE and GOMODCACHE to the executor secrets under the Code Graph tab in the Sourcegraph UI, and add those same variables to the requested_envvars array in the Code Graph inference section.
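For readers who want a feel for how a semicolon-separated value like this could be turned into docker arguments, here is a minimal sketch. It is not the code in this PR: the helper name parseAdditionalMounts is hypothetical, and the sketch assumes each entry is passed through unchanged as a docker run --mount argument.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// parseAdditionalMounts is a hypothetical helper (not taken from the PR).
// It splits a semicolon-separated list of mount specs and returns the
// matching "--mount <spec>" argument pairs for `docker run`.
// Empty entries (e.g. from a trailing semicolon) are skipped.
func parseAdditionalMounts(raw string) []string {
	var args []string
	for _, spec := range strings.Split(raw, ";") {
		spec = strings.TrimSpace(spec)
		if spec == "" {
			continue
		}
		args = append(args, "--mount", spec)
	}
	return args
}

func main() {
	// Reads the variable described above; prints the derived arguments.
	raw := os.Getenv("EXECUTOR_DOCKER_ADDITIONAL_MOUNTS")
	fmt.Println(parseAdditionalMounts(raw))
}
```

The sketch does no validation of the individual specs; in this version a malformed entry would only surface when docker itself rejects it.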

Test plan

Tested in production environment.

Screenshot showing the volumes are referenced in the docker runner started by the executor:

Screenshot showing that the /gomodcache directory in the gomodcache volume is populated:

On one indexing job we saw an improvement of 92.4%.

cla-bot bot commented Sep 7, 2023

We require contributors to sign our Contributor License Agreement (CLA), and we don't have yours on file. In order for us to review and merge your code, please sign CLA to get yourself added.

Sourcegraph teammates should refer to Accepting contributions for guidance.

rdeusser force-pushed the main branch from 33aa5dd to 6a80cf0 Compare

September 7, 2023 19:40

rdeusser changed the title ~~feat(executor): add option for additional bind mounts in docker runner~~ WIP feat(executor): add option for additional bind mounts in docker runner

rdeusser force-pushed the main branch from 6a80cf0 to b114986 Compare

September 8, 2023 22:09

rdeusser changed the title ~~WIP feat(executor): add option for additional bind mounts in docker runner~~ WIP feat(executor): add option for additional mounts in docker runner

rdeusser force-pushed the main branch from b114986 to cc76915 Compare

September 11, 2023 15:41

rdeusser force-pushed the main branch from cc76915 to d21351c Compare

September 11, 2023 15:41

rdeusser force-pushed the main branch from d21351c to a743d4f Compare

September 11, 2023 15:42

rdeusser changed the title ~~WIP feat(executor): add option for additional mounts in docker runner~~ feat(executor): add option for additional mounts in docker runner

rdeusser force-pushed the main branch from a743d4f to 7a2e2c2 Compare

September 11, 2023 17:09

Strum355 requested a review from eseliger

September 21, 2023 14:22

rdeusser force-pushed the main branch from 7a2e2c2 to 60ef6a9 Compare

September 21, 2023 14:23

cla-bot bot added the cla-signed label

rdeusser force-pushed the main branch 3 times, most recently from 0e07128 to c1d4102 Compare

September 25, 2023 17:57

rdeusser force-pushed the main branch from c1d4102 to 1278546 Compare

September 28, 2023 16:13

varungandhi-src mentioned this pull request

executors: Inject Certs and Env Vars #53500

Open

rdeusser force-pushed the main branch from 1278546 to 1c5ab8e Compare

October 4, 2023 16:41

camdencheek added the executors label

Member

camdencheek commented Oct 11, 2023

Hi @rdeusser! Sorry about the delay on the review. Ownership of executors has shuffled around a bit, and this one got lost in the weeds. I've got this on my list for today 👍

camdencheek requested review from camdencheek and a team

October 11, 2023 14:59

Contributor

peterguy commented Oct 11, 2023

Adding a volume mount to Executors seems like a good way to solve persisting files, but because executors run in so many different environments, one of the main goals for executors is to decouple them from infrastructure. Adding volume mounts actually works against that.

Instead of modifying executors, there are a few options to solve this pain point. I'll list them in order of my preference:

1. Maintain a custom Docker image containing the required libraries and build artifacts, and use that as the base image. Executors support private Docker registries, and the overhead of maintaining such an image can be mitigated with scripting in the release process or CI chain.
2. Set up a remote GOCACHE. This relies on an experimental feature that runs a sub-command, which can do anything, including reading from a remote cache.
3. If maintaining a custom Docker image or setting up a remote cache is too onerous, another option is to pack the build artifacts into one archive file, store that file in a location accessible via HTTP(S), and add job commands that retrieve and unpack the archive. It's a lightweight, home-brewed alternative to the first two.
4. Deviating considerably from the other options: instead of auto-indexing, maybe you can set up your CI/CD process to do the indexing for you. See our GitHub workflow as an example. Not sure that's the direction you want to move, but it is another option.

I'm sure there are other options as well; these are what the team here came up with.

I hate to rain on your parade, @rdeusser - I like the engineering work you've done in this PR - but I recommend this PR be closed and an approach that works with the existing capabilities be taken instead.


feat(executor): add option for additional mounts in docker runner

0b703ce

rdeusser force-pushed the main branch from 740c507 to 0b703ce Compare

October 12, 2023 19:30

Author

rdeusser commented Oct 13, 2023

@peterguy In general, decoupling from the infrastructure seems like a great way to support different environments more easily, but the reality is that we still have to deal with that infrastructure. Virtually all of the supported environments have a concept of a volume, so adding volume mounts does not work against that goal.

1. This will not work at any significant scale. The Go module cache, after indexing most of our repos, is over 180GB. We have over 600 Go repositories.
2. The Go build cache isn't the issue. The issue is the Go module cache, which a remote GOCACHE doesn't cover. So this won't work for us either.
3. Similar to option 1, retrieving a 180GB archive on every indexing job is infeasible.
4. Having our CI do the indexing doesn't give us indexes for tags that have already been built. We pay for this feature, so suggesting that we not use it is unacceptable.

None of the solutions provided solve the issue for us. Even moving to the Kubernetes-based executor doesn't solve the issue because the PVCs are deleted after each job is run.

Additionally, the Go module cache isn't the only problem this solves for us. In order to clone repos or run go mod download on them, our certificates need to be present inside the container; otherwise the executor will fail when downloading dependencies. The other option is to maintain a custom scip-go image with our certificates in it. The solution for enterprise customers can't be for all of us to maintain custom scip-go/lsif-go images.

There are no approaches that I'm aware of that work with the existing capabilities, hence the purpose of the PR. The gist of what's needed here is that we need to provide a place to persistently store cached Go modules to the indexing environment prepared by the executor.

Contributor

peterguy commented Oct 19, 2023

Thanks for adding more detail @rdeusser!

I appreciate your patience; this PR, and the concept of persistent volumes in executors, has sparked quite a lengthy discussion around here. 😄

We're adding this PR to our backlog, but given our workload now, we can't give an estimate of when we'll be able to work on it.

To give you some insight into the discussions this PR has generated, here are some comments from various engineers involved in the discussion:

Storing state on the host path makes executors not stateless, and we would need to also build a way to clean up that state over time
This change would require quite a lot of testing because executors run on many different environments, so “just mount a directory from the host” is not as simple as it sounds
I don't believe it will work for Firecracker executors, nor k8s executors which feels weird to me if only one out of three runtimes support it.

Notice that much of the discussion is around supporting volume mounts in all of the deployment scenarios. Testing volume mounts in all of the installation options will take a non-trivial amount of effort; if you get a chance to test out volume mounts in Firecracker and K8s, update this PR with the result!

camdencheek removed their request for review

November 28, 2023 16:00

bahrmichael removed the request for review from a team

March 25, 2024 09:02

Labels

cla-signed executors