0% found this document useful (0 votes)
34 views

Kubernetes Error

The document discusses 25 common Kubernetes errors and provides troubleshooting steps and example commands to diagnose and resolve each issue. The errors covered include unable to connect to cluster, pods stuck in pending state, insufficient resources, image pull failures, crash loops, unauthorized access, configmaps not updating, unreachable services, unready nodes, pending PVCs, faulty volume mounts, restrictive pod security policies, incorrect service account permissions, non-matching node selectors, misconfigured ingress, unable to scale deployments, malfunctioning custom resources, pods stuck in terminating state, exceeded resource quotas, rolling updates not progressing, problematic node draining/cording, resource creation timeouts, pods stuck in container creating, invalid YAML syntax, and etcd

Uploaded by

satya
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
34 views

Kubernetes Error

The document discusses 25 common Kubernetes errors and provides troubleshooting steps and example commands to diagnose and resolve each issue. The errors covered include unable to connect to cluster, pods stuck in pending state, insufficient resources, image pull failures, crash loops, unauthorized access, configmaps not updating, unreachable services, unready nodes, pending PVCs, faulty volume mounts, restrictive pod security policies, incorrect service account permissions, non-matching node selectors, misconfigured ingress, unable to scale deployments, malfunctioning custom resources, pods stuck in terminating state, exceeded resource quotas, rolling updates not progressing, problematic node draining/cording, resource creation timeouts, pods stuck in container creating, invalid YAML syntax, and etcd

Uploaded by

satya
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 7

Troubleshooting in Kubernetes

By DevOps Shack
25 Examples With Commands

1. Error: Unable to connect to the cluster

o Troubleshooting:
▪ Check kubeconfig file for correct cluster information.
▪ Verify network connectivity to the cluster.
o Example Commands:

kubectl config view


kubectl cluster-info

2. Error: Pod stuck in Pending state

o Troubleshooting:
▪ Check events for the pod using kubectl describe pod.
▪ Inspect the pod's YAML for resource constraints or affinity
issues.
o Example Commands:

kubectl describe pod <pod-name>


kubectl get events --namespace <namespace>

3. Error: Insufficient resources to schedule pod

Troubleshooting:

▪ Check resource requests and limits in the pod specification.


▪ Verify node resources using kubectl describe node.
o Example Commands:

kubectl describe pod <pod-name>


kubectl describe node <node-name>

4. Error: ImagePullBackOff

o Troubleshooting:
▪ Verify the image name and availability.
▪ Check image pull credentials using kubectl describe pod.
o Example Commands:

kubectl describe pod <pod-name>


kubectl get pods --namespace <namespace> -
o=jsonpath='{.items[*].status.containerStatuses[*].state}'

5. Error: CrashLoopBackOff

o Troubleshooting:
▪ Check container logs for details on the crash.
▪ Inspect pod events using kubectl describe pod.
o Example Commands:

kubectl logs <pod-name> <container-name>


kubectl describe pod <pod-name>

6. Error: Unauthorized access

o Troubleshooting:
▪ Verify RBAC permissions for the user.
▪ Check kubeconfig for correct credentials.
o Example Commands:

kubectl auth can-i --list


kubectl config view

7. Error: ConfigMap not updating in the pod

o Troubleshooting:
▪ Check if the ConfigMap is updated.
▪ Verify that the pod is configured to use the latest version.
Example Commands:

kubectl get configmap <configmap-name> -o yaml


kubectl describe pod <pod-name>

8. Error: Service not reachable

o Troubleshooting:
▪ Check service endpoints using kubectl describe service.
▪ Verify network policies and firewall rules.
o Example Commands:

kubectl describe service <service-name>


kubectl get networkpolicies

9. Error: Node not ready

o Troubleshooting:
▪ Check node status with kubectl get nodes.
▪ Review kubelet logs on the node for issues.
o Example Commands:

kubectl get nodes


kubectl describe node <node-name>

10. Error: PersistentVolumeClaim (PVC) pending

o Troubleshooting:
▪ Verify available storage in the cluster.
▪ Check storage class and provisioner.
o Example Commands:

kubectl get pvc


kubectl describe storageclass

11. Error: VolumeMounts not working in pod

o Troubleshooting:
▪ Check pod's YAML for correct volume mounts.
▪ Verify if the volume exists and is accessible.
o Example Commands:

kubectl describe pod <pod-name>


kubectl get pv

12. Error: Pod Security Policies (PSP) blocking pod

o Troubleshooting:
▪ Check PSP rules and RBAC for the pod.
▪ Inspect pod events using kubectl describe pod.
o Example Commands:

kubectl get psp


kubectl describe pod <pod-name>

13. Error: ServiceAccount permissions

o Troubleshooting:
▪ Verify ServiceAccount permissions using kubectl auth can-i.
▪ Check RBAC roles and role bindings.
o Example Commands:

kubectl auth can-i --list --


as=system:serviceaccount:<namespace>:<serviceaccount-name>

kubectl get roles,rolebindings --namespace <namespace>

14. Error: NodeSelector not working

o Troubleshooting:
▪ Check pod's YAML for correct node selector.
▪ Verify that nodes have the required labels.
o Example Commands:

kubectl describe pod <pod-name>


kubectl get nodes --show-labels

15. Error: Ingress not routing traffic

o Troubleshooting:
▪ Check Ingress resource for correct backend services.
▪ Verify that the Ingress controller is running.
o Example Commands:

kubectl describe ingress <ingress-name>


kubectl get pods --namespace <ingress-controller-namespace>

16. Error: Unable to scale deployment

o Troubleshooting:
▪ Verify available resources in the cluster.
▪ Check replica count in the deployment specification.
o Example Commands:

kubectl get deployments


kubectl describe deployment <deployment-name>

17. Error: Custom Resource Definition (CRD) not creating resources

o Troubleshooting:
▪ Check CRD definition for correct syntax.
▪ Verify controller logs for errors.
o Example Commands:

kubectl get crd


kubectl describe crd <crd-name>

18. Error: Pod in Terminating state

o Troubleshooting:
▪ Check for stuck finalizers in pod metadata.
▪ Force delete pod using kubectl delete pod --grace-period=0.
o Example Commands:

kubectl get pods --all-namespaces --field-


selector=status.phase=Terminating
kubectl delete pod <pod-name> --grace-period=0 –force
19. Error: Resource quota exceeded

o Troubleshooting:
▪ Check resource quotas for the namespace.
▪ Verify resource usage in the namespace.
o Example Commands:

kubectl describe quota --namespace <namespace>


kubectl top pods --namespace <namespace>

20. Error: Rolling update stuck or not progressing

o Troubleshooting:
▪ Check rollout status using kubectl rollout status.
▪ Verify image versions in the deployment.
o Example Commands:

kubectl rollout status deployment <deployment-name>

kubectl set image deployment/<deployment-name> <container-


name>=<new-image>

21. Error: Node draining or cordoning

o Troubleshooting:
▪ Check node conditions and events.
▪ Use kubectl drain with caution.
o Example Commands:

kubectl get nodes


kubectl describe node <node-name>
kubectl drain <node-name> --ignore-daemonsets

22. Error: Resource creation timeout

o Troubleshooting:
▪ Check for issues with the API server.
▪ Verify network connectivity to the API server.
o Example Commands:

kubectl get events --sort-by='.metadata.creationTimestamp'


kubectl describe pod <pod-name>
23. Error: Pod stuck in ContainerCreating state

o Troubleshooting:
▪ Check container runtime logs on the node.
▪ Inspect kubelet logs for errors.
o Example Commands:

kubectl get pods


kubectl describe pod <pod-name>

24. Error: Invalid YAML syntax

o Troubleshooting:
▪ Validate YAML syntax using online tools or linters.
▪ Check for indentation and formatting issues.
o Example Commands:

kubectl apply -f <file.yaml> --dry-run=client

25. Error: etcd cluster issues

o Troubleshooting:
▪ Check etcd logs for errors.
▪ Verify etcd cluster health.
o Example Commands:

kubectl get events --all-namespaces --field-


selector=involvedObject.kind=Pod,involvedObject.name=etcd

kubectl exec -it etcd-pod-name --namespace kube-system -- sh


etcdctl member list
etcdctl cluster-health

Remember to replace placeholders like <pod-name>, <namespace>, <deployment-name>,


etc., with actual values specific to your environment. Additionally, exercise caution
when using force deletion or draining nodes to avoid potential data loss or service
disruption.

You might also like