Troubleshooting: a pod is not visible after starting it on K8s
Symptom
- After starting the pod, kubectl get pod does not show it
kubectl apply -f app.yaml
Troubleshooting
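A Kubernetes Deployment is a higher-level resource that uses other Kubernetes resources to create Pods. The extra layering exists because the Deployment kind adds functionality on top of those lower-level resources. This post walks you through inspecting your Deployment and finding information about it. The steps are the same for every Deployment.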
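The layering is visible in the resource names themselves: a ReplicaSet is named after its Deployment plus a hash suffix, and a Pod is named after its ReplicaSet plus a generated suffix. A minimal sketch of walking that chain upward is shown here. The helper name parent_name and the Pod suffix x7k2p are made up for illustration, and the "Controlled By" field in kubectl describe remains the authoritative way to find an owner.

```shell
# Hypothetical helper: strip the last "-<suffix>" from a resource name.
# Pod name -> ReplicaSet name -> Deployment name (naming convention only).
parent_name() {
  printf '%s\n' "${1%-*}"
}

# Example (the Pod suffix "x7k2p" is invented for illustration):
#   parent_name cost-attribution-grafana-bfdfddcbb-x7k2p
#   parent_name cost-attribution-grafana-bfdfddcbb
```

This only helps with orientation when reading names in kubectl output. It does not query the cluster.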
After you create a Deployment, you can confirm that it exists by running the following command:
$ kubectl get deployment
NAME                       DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
cost-attribution-grafana   1         1         1            1           2m18s
You can describe a Deployment to see what it did:
$ kubectl describe deploy cost-attribution-mk-agent
Name:                   cost-attribution-mk-agent
Namespace:              kubernetes-cost-attribution
CreationTimestamp:      Wed, 21 Nov 2018 12:30:47 -0800
Labels:                 app=cost-attribution-mk-agent
Annotations:            deployment.kubernetes.io/revision: 1
                        kubectl.kubernetes.io/last-applied-configuration:
                          {"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"name":"cost-attribution-mk-agent","namespace":"kubernetes-cost-a...
Selector:               app=cost-attribution-mk-agent
Replicas:               1 desired | 0 updated | 0 total | 0 available | 1 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:           app=cost-attribution-mk-agent
  Service Account:  cost-attribution-kube-state-metric
  Containers:
   mk-agent:
    Image:      gcr.io/managedkube/kubernetes-cost-attribution/agent:1.0
    Port:       9101/TCP
    Host Port:  0/TCP
    Limits:
      cpu:     500m
      memory:  500Mi
    Requests:
      cpu:        20m
      memory:     20Mi
    Liveness:     http-get http://:9101/metrics delay=5s timeout=5s period=10s #success=1 #failure=3
    Readiness:    http-get http://:9101/metrics delay=5s timeout=5s period=5s #success=1 #failure=3
    Environment:  <none>
    Mounts:       <none>
  Volumes:
   ubbagent-state:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
Conditions:
  Type            Status  Reason
  ----            ------  ------
  Progressing     True    NewReplicaSetCreated
  Available       False   MinimumReplicasUnavailable
  ReplicaFailure  True    FailedCreate
OldReplicaSets:   <none>
NewReplicaSet:    cost-attribution-mk-agent-6c78b8757f (0/1 replicas created)
Events:
  Type    Reason             Age    From                   Message
  ----    ------             ----   ----                   -------
  Normal  ScalingReplicaSet  2m27s  deployment-controller  Scaled up replica set cost-attribution-mk-agent-6c78b8757f to 1
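In the Events section, there is one event: a ReplicaSet was scaled up to 1. These event messages are critical for debugging your Deployment. Other failures can also show up here, and the message will describe (or at least hint at) the cause so that you can fix it.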
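Because describe output is long, it can help to pull out only the events that signal a problem. The following is a minimal sketch, not a definitive tool: warning_events is a hypothetical helper name, and it assumes the describe layout shown in this post, where an "Events:" line is followed by rows whose first column is the event type.

```shell
# Hypothetical helper: read `kubectl describe` output on stdin and print
# only the rows of the Events section whose Type column is "Warning".
warning_events() {
  awk '
    /^Events:/ { in_events = 1; next }   # start of the Events section
    in_events && $1 == "Warning"         # print Warning rows unchanged
  '
}

# Usage (needs access to a cluster):
#   kubectl describe deployment cost-attribution-mk-agent | warning_events
```

If the filter prints nothing for the Deployment, as it would for the output above, the failure is probably recorded one level down, on the ReplicaSet.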
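Even if the Deployment did create the ReplicaSet, that does not mean any Pods were created. The next step is to look at the ReplicaSet resource by running the following command: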
$ kubectl get replicaset
NAME                                 DESIRED   CURRENT   READY   AGE
cost-attribution-grafana-bfdfddcbb   1         1         1       2m33s
This shows the ReplicaSets you have in this namespace. From here, you can describe a ReplicaSet to see what it did:
$ kubectl describe replicaset cost-attribution-mk-agent-6c78b8757f
Name:           cost-attribution-mk-agent-6c78b8757f
Namespace:      kubernetes-cost-attribution
Selector:       app=cost-attribution-mk-agent,pod-template-hash=2734643139
Labels:         app=cost-attribution-mk-agent
                pod-template-hash=2734643139
Annotations:    deployment.kubernetes.io/desired-replicas: 1
                deployment.kubernetes.io/max-replicas: 2
                deployment.kubernetes.io/revision: 1
Controlled By:  Deployment/cost-attribution-mk-agent
Replicas:       0 current / 1 desired
Pods Status:    0 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:           app=cost-attribution-mk-agent
                    pod-template-hash=2734643139
  Service Account:  cost-attribution-kube-state-metric
  Containers:
   mk-agent:
    Image:      gcr.io/managedkube/kubernetes-cost-attribution/agent:1.0
    Port:       9101/TCP
    Host Port:  0/TCP
    Limits:
      cpu:     500m
      memory:  500Mi
    Requests:
      cpu:        20m
      memory:     20Mi
    Liveness:     http-get http://:9101/metrics delay=5s timeout=5s period=10s #success=1 #failure=3
    Readiness:    http-get http://:9101/metrics delay=5s timeout=5s period=5s #success=1 #failure=3
    Environment:  <none>
    Mounts:       <none>
  Volumes:
   ubbagent-state:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
Conditions:
  Type            Status  Reason
  ----            ------  ------
  ReplicaFailure  True    FailedCreate
Events:
  Type     Reason        Age                   From                   Message
  ----     ------        ----                  ----                   -------
  Warning  FailedCreate  76s (x15 over 2m38s)  replicaset-controller  Error creating: pods "cost-attribution-mk-agent-6c78b8757f-" is forbidden: error looking up service account kubernetes-cost-attribution/cost-attribution-kube-state-metric: serviceaccount "cost-attribution-kube-state-metric" not found
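In this particular case, the events report a FailedCreate. The specific cause is that the service account referenced by the Pod template was not found, so the ReplicaSet could not create any Pods. Your error may be different; this is only one example.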
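For this specific failure, the missing name can be read straight out of the event message and then checked against the cluster. The sketch below assumes the message wording shown in this post; missing_service_account and service_account_exists are hypothetical helper names, while kubectl get serviceaccount is the standard kubectl command.

```shell
# Hypothetical helper: read a FailedCreate message on stdin and print the
# service account reported as "not found" (prints nothing if no match).
missing_service_account() {
  sed -n 's/.*serviceaccount "\([^"]*\)" not found.*/\1/p'
}

# Hypothetical helper: succeed if service account $1 exists in namespace $2.
service_account_exists() {
  kubectl get serviceaccount "$1" -n "$2" >/dev/null 2>&1
}

# Usage (needs access to a cluster):
#   service_account_exists cost-attribution-kube-state-metric \
#     kubernetes-cost-attribution || echo "service account is missing"
```

If the account really is missing, one option is to apply the manifest that was supposed to define it; another is kubectl create serviceaccount with the same name and namespace. Which one is right depends on how the application was meant to be installed. Once the account exists, the ReplicaSet controller should retry and create the Pod without further action.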
Conclusion
This post traced a Deployment through one specific failure, but the steps are generic. They apply to any Deployment that is not behaving as expected or not creating the Pods you expect.
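The whole walk-through can be collected into one small function. This is a sketch under stated assumptions rather than a finished tool: trace_deployment is a hypothetical name, the namespace is passed explicitly with -n instead of relying on the current context, and the ReplicaSet step describes every ReplicaSet in the namespace rather than only the one owned by the Deployment.

```shell
# Hypothetical helper: run the inspection steps from this post in order.
# $1 = Deployment name, $2 = namespace
trace_deployment() {
  # 1. Overview of all three levels at once.
  kubectl get deployment,replicaset,pod -n "$2"
  # 2. Deployment details, conditions, and events.
  kubectl describe deployment "$1" -n "$2"
  # 3. ReplicaSet details; Pod creation errors are reported here.
  kubectl describe replicaset -n "$2"
  # 4. Warning events recorded in the namespace.
  kubectl get events -n "$2" --field-selector type=Warning
}

# Usage (needs access to a cluster):
#   trace_deployment cost-attribution-mk-agent kubernetes-cost-attribution
```

Passing the namespace explicitly also guards against a simple cause of the original symptom: running kubectl get pod against a different namespace than the one the manifest deployed into.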