DevOps Interview Guide
1. What are the common day-to-day activities in DevOps when working in the cloud?
Daily activities often include managing CI/CD pipelines, monitoring system performance,
provisioning infrastructure with Infrastructure as Code (IaC) tools like Terraform, collaborating with
developers to troubleshoot build and deployment issues, implementing security controls, building
containers with Docker and orchestrating them with Kubernetes, and handling incident response and
root cause analysis.
2. How do you effectively use Kubernetes and Docker in your daily DevOps tasks?
Docker is used to containerize applications, ensuring they run consistently across environments.
Kubernetes orchestrates these containers, managing deployments, scaling, and ensuring high
availability. I use Kubernetes to manage rolling updates, maintain pod health, and automate scaling.
Additionally, I use ConfigMaps for configuration, Secrets for sensitive values, and NetworkPolicies to
restrict pod-to-pod traffic.
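To make the rolling-update behaviour concrete, here is a minimal sketch of how a Deployment's `maxSurge` and `maxUnavailable` settings bound the pod count during a rollout. It assumes the documented rounding rules (percentages round up for `maxSurge` and down for `maxUnavailable`, both defaulting to 25%); the function name `rolling_update_bounds` is hypothetical and only for illustration.

```python
import math


def rolling_update_bounds(replicas, max_surge="25%", max_unavailable="25%"):
    """Return (max_total_pods, min_available_pods) during a rolling update.

    Values may be absolute integers or percentage strings such as "25%".
    """
    def resolve(value, round_up):
        if isinstance(value, str) and value.endswith("%"):
            raw = replicas * int(value[:-1]) / 100
            # maxSurge percentages round up, maxUnavailable percentages round down.
            return math.ceil(raw) if round_up else math.floor(raw)
        return int(value)

    surge = resolve(max_surge, round_up=True)
    unavailable = resolve(max_unavailable, round_up=False)
    return replicas + surge, replicas - unavailable
```

With 10 replicas and the defaults, the rollout may run up to 13 pods at once and should keep at least 8 available.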
Exit codes are numerical codes returned by a process indicating its termination status. Common exit
codes include:
- 0: Successful execution.
- 1: General error.
- 137: Process killed by SIGKILL (128 + 9), often by the out-of-memory (OOM) killer after the
container exceeds its memory limit.
- 139: Segmentation fault (SIGSEGV, 128 + 11).
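Codes above 128 follow the shell convention of 128 plus the number of the signal that terminated the process. A small sketch of decoding them, assuming a Linux host for the signal names; `describe_exit_code` is a hypothetical helper, not part of any tool:

```python
import signal


def describe_exit_code(code):
    """Give a short human-readable reading of a process exit code."""
    if code == 0:
        return "success"
    if code > 128:
        # Convention: 128 + N means the process was terminated by signal N.
        signum = code - 128
        try:
            return f"killed by {signal.Signals(signum).name}"
        except ValueError:
            return f"killed by signal {signum}"
    return "application error"
```

For example, `describe_exit_code(137)` decodes to SIGKILL and `describe_exit_code(139)` to SIGSEGV.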
Steps include:
- Describe the pod using `kubectl describe pod <pod-name>` for error messages.
- Check node status with `kubectl get nodes` and `kubectl describe node <node-name>`.
- Verify resource constraints, affinity rules, and taints/tolerations.
- Review events and check for scheduling issues (e.g., insufficient resources).
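The resource check in the steps above can be sketched as simple arithmetic: a pod only schedules on a node if its requests fit into what the node has left. The sketch below parses a subset of Kubernetes quantity strings (CPU in cores or millicores, memory with binary suffixes); the helper names and the sample figures are assumptions for illustration, and the real scheduler considers far more than this.

```python
_MEMORY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def cpu_millicores(quantity):
    """Convert a CPU quantity such as "500m" or "2" to millicores."""
    quantity = str(quantity)
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def memory_bytes(quantity):
    """Convert a memory quantity such as "512Mi" or "4Gi" to bytes."""
    quantity = str(quantity)
    for suffix, factor in _MEMORY_UNITS.items():
        if quantity.endswith(suffix):
            return round(float(quantity[:-2]) * factor)
    return int(quantity)  # plain number of bytes


def insufficient_resources(request, allocatable, already_requested):
    """Return the resources for which the pod's request does not fit on the node."""
    parsers = {"cpu": cpu_millicores, "memory": memory_bytes}
    short = []
    for resource, parse in parsers.items():
        free = parse(allocatable[resource]) - parse(already_requested[resource])
        if parse(request[resource]) > free:
            short.append(resource)
    return short
```

A node with 2 CPUs allocatable and 1500m already requested cannot take a pod asking for 750m, even if memory is plentiful.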
8. Can you describe your experience with incident management and how you categorize
incident priority?
Yes, I have experience handling incidents using ITIL frameworks. Prioritization is based on:
- P1 (Critical): Major disruption with no workaround.
- P2 (High): Significant impact with a potential workaround.
- P3 (Medium): Limited impact, minor inconvenience.
- P4 (Low): Cosmetic or minor issue.
9. How do Prometheus and Grafana interact, and what is Prometheus' data source?
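Prometheus pulls (scrapes) metrics over HTTP from monitored targets, either from applications that
expose a metrics endpoint themselves or from exporters that translate a system's metrics into the
Prometheus format, and stores them in its own time-series database. Grafana connects to
Prometheus as a data source and runs PromQL queries against it to build dashboards.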
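What Prometheus scrapes is plain text in its exposition format. The sketch below renders one metric in that format using only the standard library; `render_metric` is a hypothetical helper, the metric name is an example, and label-value escaping is deliberately left out. In practice a client library would normally generate this output.

```python
def render_metric(name, help_text, metric_type, samples):
    """Render one metric family in the Prometheus text exposition format.

    samples is a list of (labels, value) pairs, where labels is a dict.
    Note: label values are not escaped in this simplified sketch.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"
```

An exporter would serve text like this on an HTTP endpoint (conventionally `/metrics`) for Prometheus to scrape.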
11. How do you enable internal communication between multiple AWS accounts?
Options include:
- AWS Transit Gateway for scalable, hub-and-spoke VPC connectivity.
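- VPC Peering for direct, one-to-one VPC communication (peering is not transitive).
- VPC sharing through AWS Resource Access Manager (RAM) within AWS Organizations, so several
accounts can place resources in centrally managed subnets.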
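Whichever option is chosen, address planning matters: VPC peering, for instance, cannot be established between VPCs whose CIDR blocks overlap. A minimal pre-check with the standard library, assuming a hypothetical mapping of VPC names to CIDR blocks:

```python
import ipaddress
from itertools import combinations


def overlapping_cidrs(vpcs):
    """Return pairs of VPC names whose CIDR blocks overlap.

    vpcs maps a VPC name to its CIDR block, e.g. {"prod": "10.0.0.0/16"}.
    """
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in vpcs.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(networks), 2)
        if networks[a].overlaps(networks[b])
    ]
```

An empty result means the address ranges can be routed between accounts without conflicts.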
12. What is the difference between IR (Incident Report) and SR (Service Request)?
13. Do you have experience with monitoring? What tools have you used?
Yes, I have extensive experience with tools like Prometheus, Grafana, ELK Stack, and AWS
CloudWatch for real-time monitoring, visualization, and alerting.
Setting up monitoring dashboards, defining alerts, analyzing logs, creating custom metrics, and
ensuring systems meet performance and availability SLAs.
dashboard?
The first step is to identify the resource causing the spike (CPU, memory). Depending on findings, I
would:
- Scale resources or pods.
- Optimize the application or load balancing.
- Redistribute workloads across nodes.
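For the pod-scaling option, the Kubernetes Horizontal Pod Autoscaler uses a simple proportional rule: desired replicas = ceil(current replicas × current metric / target metric), skipping changes when the ratio is close to 1. A sketch of that calculation, assuming the default 10% tolerance; the function name is hypothetical:

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Apply the HPA scaling formula to a single metric."""
    ratio = current_metric / target_metric
    # Within the tolerance band, the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)
```

With 4 pods averaging 90% CPU against a 60% target, this yields 6 pods.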
Yes, I have created custom monitors and alerts in Prometheus, CloudWatch, and other tools for
tracking key metrics and ensuring system reliability.
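A useful property of such alerts is that they fire only when a condition persists, which avoids paging on a single noisy sample. The sketch below is a simplified model of that idea, counting consecutive samples rather than elapsed time (Prometheus expresses this as a `for` duration on the alerting rule); the function name and the sample data are illustrative assumptions.

```python
def firing_indices(samples, threshold, for_count):
    """Return the sample indices at which the alert is firing.

    The alert fires once the value has exceeded the threshold for
    for_count consecutive samples, and stops when the value drops back.
    """
    firing = []
    streak = 0
    for index, value in enumerate(samples):
        streak = streak + 1 if value > threshold else 0
        if streak >= for_count:
            firing.append(index)
    return firing
```

A brief spike that recovers before the streak is reached never fires the alert.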
18. What is a sidecar container in Kubernetes, and what are its use cases?
A sidecar container runs alongside the main application container within the same pod, providing
supplementary functionality like logging, monitoring, or proxying. Use cases include service meshes,
log aggregators, and backup agents.
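As an illustration of the logging use case, the sketch below builds a Pod manifest in which the application and a log-tailing sidecar share an `emptyDir` volume. The pod name, the application image, and the log path are hypothetical placeholders; the manifest is built as a Python dict and can be dumped to JSON, which `kubectl apply -f` accepts.

```python
def pod_with_log_sidecar(name, app_image):
    """Build a Pod manifest with an app container and a log-tailing sidecar."""
    # Both containers mount the same volume, so the sidecar can read
    # the files the application writes.
    log_mount = {"name": "logs", "mountPath": "/var/log/app"}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "volumes": [{"name": "logs", "emptyDir": {}}],
            "containers": [
                {"name": "app", "image": app_image, "volumeMounts": [log_mount]},
                {
                    "name": "log-tailer",
                    "image": "busybox:1.36",
                    # Hypothetical log file written by the app container.
                    "args": ["/bin/sh", "-c", "tail -n+1 -F /var/log/app/app.log"],
                    "volumeMounts": [log_mount],
                },
            ],
        },
    }
```

Calling `json.dumps(pod_with_log_sidecar("web", "example/app:1.0"), indent=2)` produces a manifest that can be saved to a file and applied.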
19. Do you have experience with Infrastructure as Code (IaC) tools like Terraform?
Yes, I have experience using Terraform for defining and provisioning cloud infrastructure through
declarative code, enabling version-controlled, automated deployments.
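The declarative idea can be shown with a toy example: the desired state is compared with what actually exists, and the difference becomes a list of actions. This is a conceptual sketch only, not how Terraform is implemented (Terraform also tracks state and dependencies, and sometimes replaces resources rather than updating them); the resource names and attributes are made up.

```python
def plan(desired, actual):
    """Compare desired and actual resources and list the actions needed.

    Both arguments map a resource name to a dict of its attributes.
    """
    actions = []
    for name in sorted(desired.keys() | actual.keys()):
        if name not in actual:
            actions.append(("create", name))
        elif name not in desired:
            actions.append(("delete", name))
        elif desired[name] != actual[name]:
            actions.append(("update", name))
    return actions
```

Running the same plan again after the actions are applied yields an empty list, which is what makes declarative deployments repeatable.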
FROM nginx:latest
COPY ./index.html /usr/share/nginx/html/index.html
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
Testing in DevOps includes unit, integration, and end-to-end testing integrated into CI/CD pipelines.
Tools like JUnit (unit tests), Selenium (browser-based end-to-end tests), and JMeter (load and
performance tests) help maintain code quality, functionality, and performance throughout the
deployment process.