A GitLab CI/CD configuration that automates the deployment, management, and teardown of GPU-powered pods on RunPod.io for running Large Language Models (LLMs) with vLLM.
- Automated Deployment: Creates RunPod GPU pods via the RunPod API based on configurable parameters.
- LLM Ready: Configures pods to run LLMs using the vLLM framework.
- CI/CD Stages: Includes stages for deploying, listing, stopping, starting, and destroying pods directly from GitLab CI/CD.
- Configuration via Variables: Uses GitLab CI/CD variables for easy customization of pod specifications (GPU type, model, disk space, etc.) and secrets (API keys, tokens).
- Information Output: Provides detailed information about the created pod (cost, location, specs) during deployment.
- GitLab Account & Project: You need a GitLab instance and a project to host this `.gitlab-ci.yml` file.
- RunPod Account: An active account on RunPod.io.
- RunPod API Key: A full-rights API key from your RunPod account.
- Hugging Face Token: A token from Hugging Face (required if using gated models).
- SSH Keys (Optional): For direct SSH access to the pod, if needed.
- Add Variables to GitLab CI/CD: In your GitLab project, navigate to `Settings` -> `CI/CD` -> `Variables`. Add the following variables:
  - `RUNPOD_API_KEY`: Your RunPod API key (Masked and Protected recommended).
  - `HG_TOKEN`: Your Hugging Face token (Masked and Protected recommended).
  - `RUNPOD_SSH_PUBLIC_KEYS`: Your public SSH keys for pod access (if applicable; separate multiple keys with `\n`).
  - `RUNPOD_SSH_PRIVATE_KEY`: Your private SSH key for GitLab Runner to connect to the pod (Masked and Protected recommended).
- Configure `.gitlab-ci.yml`: Review the `variables` section in the provided `.gitlab-ci.yml` file. Adjust default values like `HG_LLM`, `RUNPOD_GPU_TYPE`, `RUNPOD_TEMPLATE`, etc., as needed for your use case. Ensure the specified template (`RUNPOD_TEMPLATE`) or image (`RUNPOD_IMAGE`) supports your target LLM and GPU requirements.
- Runner Tags: Ensure your GitLab Runner has the tags `gpu` and `runpod` (or whatever tags you specify in the `.gitlab-ci.yml`) to pick up these jobs.
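
As a rough illustration of the setup steps above, the relevant parts of a `.gitlab-ci.yml` might look like the sketch below. The variable names and runner tags come from this README; every value in angle brackets is a placeholder, and the use of a `default:` block for tags is an assumption about layout, not a copy of the provided file.

```yaml
# Sketch only: values in <...> are placeholders, not the project's real defaults.
variables:
  HG_LLM: "<org>/<model-name>"        # Hugging Face model to serve with vLLM
  RUNPOD_GPU_TYPE: "<gpu-type>"       # GPU type requested for the pod
  RUNPOD_TEMPLATE: "<template-id>"    # RunPod template; must support the model and GPU
  # Secrets such as RUNPOD_API_KEY and HG_TOKEN are deliberately absent here;
  # they belong in Settings -> CI/CD -> Variables, not in the repository.

default:
  tags:          # only runners that carry both tags will pick up the jobs
    - gpu
    - runpod
```

Values set under `variables:` act as defaults and can be overridden when a pipeline is started manually from the GitLab UI.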
- Deploy a Pod: Manually trigger the `deploy_pod` job in your GitLab CI/CD pipeline. Monitor the logs for pod creation details and the RunPod console for deployment status.
- List Pods: Manually trigger the `list_all_pods` job to see the status and details of existing pods.
- Stop/Start: Manually trigger the `stop_pod` or `start_pod` jobs as needed.
- Destroy a Pod: Manually trigger the `destroy_pod` job to terminate the pod and stop incurring charges. This is crucial for cost management.
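
To make the job layout concrete, the five jobs listed above could be declared roughly as follows. The job names come from this README; the stage names and the `echo` script bodies are assumptions standing in for the real RunPod API calls, which are not reproduced here.

```yaml
# Sketch only: stage names and script bodies are hypothetical placeholders.
stages:
  - deploy
  - manage
  - destroy

deploy_pod:
  stage: deploy
  when: manual        # runs only when triggered from the pipeline view
  script:
    - echo "placeholder for the RunPod API call that creates the pod"

list_all_pods:
  stage: manage
  when: manual
  script:
    - echo "placeholder for listing pods"

stop_pod:
  stage: manage
  when: manual
  script:
    - echo "placeholder for stopping the pod"

start_pod:
  stage: manage
  when: manual
  script:
    - echo "placeholder for starting the pod"

destroy_pod:
  stage: destroy
  when: manual
  script:
    - echo "placeholder for terminating the pod"
```

Keeping `destroy_pod` in its own final stage is one possible design choice: it places the teardown job at the end of the pipeline graph, where it is easy to find after a session.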
- Costs: Running GPU pods incurs costs on RunPod.io. Always destroy pods when not in use.
- Manual Jobs: All jobs are set to `when: manual` for explicit control. Trigger them as needed.
- Gated Models: Ensure you have access to the specified Hugging Face model and have set the `HG_TOKEN` variable if the model is gated.
- Security: Store sensitive information like API keys and tokens securely using GitLab CI/CD variables.
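
One way to catch a missing secret early is a small guard that fails the job before any API call is attempted. The sketch below is an assumption, not part of the provided file: `.require_secrets` is a hypothetical hidden job that other jobs could pull in with `extends`, and the `echo` in `deploy_pod` is a placeholder.

```yaml
# Sketch only: .require_secrets is a hypothetical helper, not part of the provided file.
.require_secrets:
  before_script:
    - |
      if [ -z "$RUNPOD_API_KEY" ]; then
        echo "RUNPOD_API_KEY is not set; add it under Settings -> CI/CD -> Variables"
        exit 1
      fi

deploy_pod:
  extends: .require_secrets
  when: manual
  script:
    - echo "placeholder for the RunPod API call that creates the pod"
```

Note that variables marked Protected are only passed to pipelines running on protected branches or tags, so a guard like this would also fail on an unprotected branch even when the variable exists.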
This project is licensed under the MIT License.