This article is based on Kubernetes v1.21.9 running on the Linux operating system CentOS 7.4.
OS version | Docker version | Kubernetes (k8s) cluster version | CPU architecture |
---|---|---|---|
CentOS Linux release 7.4.1708 (Core) | Docker version 20.10.12 | v1.21.9 | x86_64 |
Kubernetes cluster layout: k8scloude1 is the master node; k8scloude2 and k8scloude3 are worker nodes.
Server | OS version | CPU architecture | Processes | Role |
---|---|---|---|---|
k8scloude1/192.168.110.130 | CentOS Linux release 7.4.1708 (Core) | x86_64 | docker,kube-apiserver,etcd,kube-scheduler,kube-controller-manager,kubelet,kube-proxy,coredns,calico | k8s master節點 |
k8scloude2/192.168.110.129 | CentOS Linux release 7.4.1708 (Core) | x86_64 | docker,kubelet,kube-proxy,calico | k8s worker節點 |
k8scloude3/192.168.110.128 | CentOS Linux release 7.4.1708 (Core) | x86_64 | docker,kubelet,kube-proxy,calico | k8s worker節點 |
In Kubernetes, keeping applications highly available and stable is essential. To that end, Kubernetes provides mechanisms that monitor the state of containers and automatically restart unhealthy containers or stop sending traffic to them. Two of these mechanisms are the livenessProbe and the readinessProbe.
This article introduces the livenessProbe and readinessProbe in Kubernetes and provides examples that show how to use them.
Using a livenessProbe or readinessProbe assumes you already have a working Kubernetes cluster. For installing and deploying a Kubernetes (k8s) cluster, see the blog post 《Centos7 安裝部署Kubernetes(k8s)叢集》: https://www.cnblogs.com/renshengdezheli/p/16686769.html.
Kubernetes supports three kinds of health checks: livenessProbe, readinessProbe, and startupProbe. These probes periodically check whether the service inside a container is healthy.
This article focuses on the livenessProbe and readinessProbe.
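For completeness, here is a minimal sketch of a startupProbe, which is not used in the rest of this article; the pod name, image, and port are illustrative assumptions only:

apiVersion: v1
kind: Pod
metadata:
  name: startup-demo
spec:
  containers:
  - name: web
    image: nginx
    startupProbe:
      httpGet:
        path: /
        port: 80
      # allow up to 30 * 10 = 300 seconds for the application to finish starting
      failureThreshold: 30
      periodSeconds: 10

While a startupProbe is failing, liveness and readiness probes are held back, which is useful for slow-starting applications.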
Create a directory for the yaml files and a namespace
[root@k8scloude1 ~]# mkdir probe
[root@k8scloude1 ~]# kubectl create ns probe
namespace/probe created
[root@k8scloude1 ~]# kubens probe
Context "kubernetes-admin@kubernetes" modified.
Active namespace is "probe".
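kubens is a helper from the third-party kubectx project; if it is not installed, the same namespace switch can be done with plain kubectl:

kubectl config set-context --current --namespace=probe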
There is no pod yet
[root@k8scloude1 ~]# cd probe/
[root@k8scloude1 probe]# pwd
/root/probe
[root@k8scloude1 probe]# kubectl get pod
No resources found in probe namespace.
First create an ordinary pod with no probe. The Pod is named liveness-exec and runs a single busybox container. The container executes the command given in args: touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 6000.
[root@k8scloude1 probe]# vim pod.yaml
[root@k8scloude1 probe]# cat pod.yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-exec
spec:
  #terminationGracePeriodSeconds: 0 means the container is killed immediately when it receives a termination signal, without a grace period to finish outstanding work.
  terminationGracePeriodSeconds: 0
  containers:
  - name: liveness
    image: busybox
    imagePullPolicy: IfNotPresent
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 6000
#Create this plain pod first (no probe yet)
[root@k8scloude1 probe]# kubectl apply -f pod.yaml
pod/liveness-exec created
Check the pod
[root@k8scloude1 probe]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
liveness-exec 1/1 Running 0 6s 10.244.112.176 k8scloude2 <none> <none>
Check the /tmp directory inside the pod
[root@k8scloude1 probe]# kubectl exec -it liveness-exec -- ls /tmp
30 seconds after the pod starts, /tmp/healthy is deleted, and the pod then keeps running for another 6000 seconds. The intent is that the pod counts as healthy while /tmp/healthy exists and as unhealthy once it is gone, but since no probe is configured yet, the pod simply stays in the Running state.
[root@k8scloude1 probe]# kubectl exec -it liveness-exec -- ls /tmp
[root@k8scloude1 probe]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
liveness-exec 1/1 Running 0 3m29s 10.244.112.176 k8scloude2 <none> <none>
Delete the pod and add a probe
[root@k8scloude1 probe]# kubectl delete -f pod.yaml
pod "liveness-exec" deleted
[root@k8scloude1 probe]# kubectl get pod -o wide
No resources found in probe namespace.
Create a pod with a livenessProbe
This creates a Pod named liveness-exec with a single busybox container. The container runs the command from args: touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600.
The Pod also defines a livenessProbe. The probe uses exec to check whether /tmp/healthy exists. If the file exists, Kubernetes considers the container healthy; otherwise, Kubernetes restarts the container.
The liveness probe starts 5 seconds after the container starts and runs every 5 seconds.
[root@k8scloude1 probe]# vim podprobe.yaml
#Now add a health check using the command (exec) method
[root@k8scloude1 probe]# cat podprobe.yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-exec
spec:
  terminationGracePeriodSeconds: 0
  containers:
  - name: liveness
    image: busybox
    imagePullPolicy: IfNotPresent
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      #do not probe during the first 5 seconds after the container starts
      initialDelaySeconds: 5
      #probe every 5 seconds
      periodSeconds: 5
[root@k8scloude1 probe]# kubectl apply -f podprobe.yaml
pod/liveness-exec created
Watch the /tmp directory inside the pod and the pod status
[root@k8scloude1 probe]# kubectl exec -it liveness-exec -- ls /tmp
healthy
[root@k8scloude1 probe]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
liveness-exec 1/1 Running 0 18s 10.244.112.177 k8scloude2 <none> <none>
[root@k8scloude1 probe]# kubectl exec -it liveness-exec -- ls /tmp
healthy
[root@k8scloude1 probe]# kubectl exec -it liveness-exec -- ls /tmp
[root@k8scloude1 probe]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
liveness-exec 1/1 Running 0 36s 10.244.112.177 k8scloude2 <none> <none>
[root@k8scloude1 probe]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
liveness-exec 1/1 Running 0 43s 10.244.112.177 k8scloude2 <none> <none>
[root@k8scloude1 probe]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
liveness-exec 1/1 Running 1 50s 10.244.112.177 k8scloude2 <none> <none>
With the probe in place, once /tmp/healthy no longer exists the livenessProbe fails and the kubelet restarts the pod. Without terminationGracePeriodSeconds: 0, the first restart usually shows up at around 75 seconds.
[root@k8scloude1 probe]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
liveness-exec 1/1 Running 3 2m58s 10.244.112.177 k8scloude2 <none> <none>
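The RESTARTS counter keeps climbing because every failed probe cycle triggers another restart. To see the probe failures behind the restarts, inspect the pod's events (command only; the exact event text depends on your cluster):

kubectl describe pod liveness-exec

Look in the Events section for Warning events with reason Unhealthy reporting that the liveness probe failed.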
Delete the pod
[root@k8scloude1 probe]# kubectl delete -f podprobe.yaml
pod "liveness-exec" deleted
[root@k8scloude1 probe]# kubectl get pod -o wide
No resources found in probe namespace.
Next, create a Pod named liveness-httpget with a single nginx container. The container defines an HTTP GET livenessProbe that checks whether Nginx's default page /index.html can be fetched successfully. If the check fails, Kubernetes considers the container unhealthy and restarts it.
The liveness probe starts 10 seconds after the container starts and runs every 10 seconds. failureThreshold means the probe may fail at most 3 consecutive times, successThreshold means a single success is enough to consider the container healthy again, and timeoutSeconds sets the probe request timeout to 10 seconds.
[root@k8scloude1 probe]# vim podprobehttpget.yaml
#The httpGet method
[root@k8scloude1 probe]# cat podprobehttpget.yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-httpget
spec:
  terminationGracePeriodSeconds: 0
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 3
      httpGet:
        path: /index.html
        port: 80
        scheme: HTTP
      #do not probe during the first 10 seconds after the container starts
      initialDelaySeconds: 10
      #probe every 10 seconds
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 10
[root@k8scloude1 probe]# kubectl apply -f podprobehttpget.yaml
pod/liveness-httpget created
[root@k8scloude1 probe]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
liveness-httpget 1/1 Running 0 6s 10.244.112.178 k8scloude2 <none> <none>
Check the /usr/share/nginx/html/index.html file
[root@k8scloude1 probe]# kubectl exec -it liveness-httpget -- ls /usr/share/nginx/html/index.html
/usr/share/nginx/html/index.html
[root@k8scloude1 probe]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
liveness-httpget 1/1 Running 0 2m3s 10.244.112.178 k8scloude2 <none> <none>
Delete the /usr/share/nginx/html/index.html file
[root@k8scloude1 probe]# kubectl exec -it liveness-httpget -- rm /usr/share/nginx/html/index.html
[root@k8scloude1 probe]# kubectl exec -it liveness-httpget -- ls /usr/share/nginx/html/index.html
ls: cannot access '/usr/share/nginx/html/index.html': No such file or directory
command terminated with exit code 2
Watch the pod status and the /usr/share/nginx/html/index.html file. The probe requests /index.html on port 80; once the request fails, the livenessProbe restarts the pod (note the RESTARTS count going from 0 to 1 below).
[root@k8scloude1 probe]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
liveness-httpget 1/1 Running 1 2m43s 10.244.112.178 k8scloude2 <none> <none>
[root@k8scloude1 probe]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
liveness-httpget 1/1 Running 1 2m46s 10.244.112.178 k8scloude2 <none> <none>
[root@k8scloude1 probe]# kubectl exec -it liveness-httpget -- ls /usr/share/nginx/html/index.html
/usr/share/nginx/html/index.html
#The probe requests /index.html on port 80; a failed probe means the file is gone, so the livenessProbe restarted the pod. After the restart the container is recreated from the nginx image, which is why index.html exists again.
[root@k8scloude1 probe]# kubectl exec -it liveness-httpget -- ls /usr/share/nginx/html/index.html
/usr/share/nginx/html/index.html
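To confirm that it was the liveness probe that restarted the container, filter the events for this pod (involvedObject.name is a standard kubectl event field selector):

kubectl get events --field-selector involvedObject.name=liveness-httpget

Look for a Warning event with reason Unhealthy reporting that the liveness probe failed.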
Delete the pod
[root@k8scloude1 probe]# kubectl delete -f podprobehttpget.yaml
pod "liveness-httpget" deleted
[root@k8scloude1 probe]# kubectl get pod -o wide
No resources found in probe namespace.
Create a Pod named liveness-tcpsocket with a single nginx container. The container defines a TCP socket livenessProbe that checks whether a connection to port 8080 can be established. If the connection fails, Kubernetes considers the container unhealthy and restarts it.
The liveness probe starts 10 seconds after the container starts and runs every 10 seconds. failureThreshold means the probe may fail at most 3 consecutive times, successThreshold means a single success is enough to consider the container healthy again, and timeoutSeconds sets the probe request timeout to 10 seconds.
[root@k8scloude1 probe]# vim podprobetcpsocket.yaml
#The tcpSocket method:
[root@k8scloude1 probe]# cat podprobetcpsocket.yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-tcpsocket
spec:
  terminationGracePeriodSeconds: 0
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 3
      tcpSocket:
        port: 8080
      #do not probe during the first 10 seconds after the container starts
      initialDelaySeconds: 10
      #probe every 10 seconds
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 10
[root@k8scloude1 probe]# kubectl apply -f podprobetcpsocket.yaml
pod/liveness-tcpsocket created
Watch the pod status. nginx listens on port 80 but the probe checks port 8080, so the probe is bound to fail and the livenessProbe keeps restarting the pod.
[root@k8scloude1 probe]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
liveness-tcpsocket 1/1 Running 0 10s 10.244.112.179 k8scloude2 <none> <none>
[root@k8scloude1 probe]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
liveness-tcpsocket 1/1 Running 1 55s 10.244.112.179 k8scloude2 <none> <none>
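For comparison, pointing the probe at the port nginx actually listens on would make it pass. A sketch of the corrected livenessProbe block (all other values unchanged):

    livenessProbe:
      failureThreshold: 3
      tcpSocket:
        #probe port 80, which nginx actually serves on
        port: 80
      initialDelaySeconds: 10
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 10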
Delete the pod
[root@k8scloude1 probe]# kubectl delete -f podprobetcpsocket.yaml
pod "liveness-tcpsocket" deleted
Next, add a readinessProbe
A readinessProbe does not restart anything; it only stops forwarding user requests to a pod that fails the probe. To demonstrate this, create three pods and a Service that forwards user requests to them.
TIP: in vim, you can check whether lines are aligned with :set cuc (cursor-column highlight) and turn it off with :set nocuc.
Create a pod whose readinessProbe checks the /tmp/healthy file: the pod is ready while /tmp/healthy exists and not ready once it is gone. The lifecycle postStart hook creates /tmp/healthy right after the container starts.
[root@k8scloude1 probe]# vim podreadinessprobecommand.yaml
[root@k8scloude1 probe]# cat podreadinessprobecommand.yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: readiness
  name: readiness-exec
spec:
  terminationGracePeriodSeconds: 0
  containers:
  - name: readiness
    image: nginx
    imagePullPolicy: IfNotPresent
    readinessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      #do not probe during the first 5 seconds after the container starts
      initialDelaySeconds: 5
      #probe every 5 seconds
      periodSeconds: 5
    lifecycle:
      postStart:
        exec:
          command: ["/bin/sh","-c","touch /tmp/healthy"]
Create three pods with different names
[root@k8scloude1 probe]# kubectl apply -f podreadinessprobecommand.yaml
pod/readiness-exec created
[root@k8scloude1 probe]# sed 's/readiness-exec/readiness-exec2/' podreadinessprobecommand.yaml | kubectl apply -f -
pod/readiness-exec2 created
[root@k8scloude1 probe]# sed 's/readiness-exec/readiness-exec3/' podreadinessprobecommand.yaml | kubectl apply -f -
pod/readiness-exec3 created
Check the pod labels
[root@k8scloude1 probe]# kubectl get pod -o wide --show-labels
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES LABELS
readiness-exec 1/1 Running 0 23s 10.244.112.182 k8scloude2 <none> <none> test=readiness
readiness-exec2 1/1 Running 0 15s 10.244.251.236 k8scloude3 <none> <none> test=readiness
readiness-exec3 0/1 Running 0 9s 10.244.112.183 k8scloude2 <none> <none> test=readiness
All three pods carry the same label
[root@k8scloude1 probe]# kubectl get pod -o wide --show-labels
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES LABELS
readiness-exec 1/1 Running 0 26s 10.244.112.182 k8scloude2 <none> <none> test=readiness
readiness-exec2 1/1 Running 0 18s 10.244.251.236 k8scloude3 <none> <none> test=readiness
readiness-exec3 1/1 Running 0 12s 10.244.112.183 k8scloude2 <none> <none> test=readiness
To tell the three pods apart, modify each nginx index page
[root@k8scloude1 probe]# kubectl exec -it readiness-exec -- sh -c "echo 111 > /usr/share/nginx/html/index.html"
[root@k8scloude1 probe]# kubectl exec -it readiness-exec2 -- sh -c "echo 222 > /usr/share/nginx/html/index.html"
[root@k8scloude1 probe]# kubectl exec -it readiness-exec3 -- sh -c "echo 333 > /usr/share/nginx/html/index.html"
Create a Service that forwards user requests to the three pods
[root@k8scloude1 probe]# kubectl expose --name=svc1 pod readiness-exec --port=80
service/svc1 exposed
The label test=readiness is carried by all three pods, so the Service selector test=readiness matches them all
[root@k8scloude1 probe]# kubectl get svc -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
svc1 ClusterIP 10.101.38.121 <none> 80/TCP 23s test=readiness
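kubectl expose copied the selector from the pod's labels. For reference, a declarative equivalent of svc1 would look roughly like this (a sketch; the ClusterIP is assigned by the cluster):

apiVersion: v1
kind: Service
metadata:
  name: svc1
spec:
  selector:
    test: readiness
  ports:
  - port: 80
    targetPort: 80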
[root@k8scloude1 probe]# kubectl get pod --show-labels
NAME READY STATUS RESTARTS AGE LABELS
readiness-exec 1/1 Running 0 7m14s test=readiness
readiness-exec2 1/1 Running 0 7m6s test=readiness
readiness-exec3 1/1 Running 0 7m test=readiness
Access the Service: user requests are forwarded to all three pods
[root@k8scloude1 probe]# while true ; do curl -s 10.101.38.121 ; sleep 1 ; done
333
111
333
222
111
......
Delete the probe file in pod readiness-exec2
[root@k8scloude1 probe]# kubectl exec -it readiness-exec2 -- rm /tmp/healthy
Because cat /tmp/healthy now fails, the READY column of readiness-exec2 drops to 0/1, but its STATUS stays Running and you can still exec into the pod. A readinessProbe only stops forwarding user requests to the failing pod; the pod itself is neither restarted nor deleted.
[root@k8scloude1 probe]# kubectl get pod --show-labels
NAME READY STATUS RESTARTS AGE LABELS
readiness-exec 1/1 Running 0 10m test=readiness
readiness-exec2 0/1 Running 0 10m test=readiness
readiness-exec3 1/1 Running 0 10m test=readiness
[root@k8scloude1 probe]# kubectl exec -it readiness-exec2 -- bash
root@readiness-exec2:/# exit
exit
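You can also see the effect on the Service side: the Endpoints object for svc1 should no longer list the pod IP of readiness-exec2 (output omitted here):

kubectl get endpoints svc1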
kubectl get ev (view events) shows the warning: "88s Warning Unhealthy pod/readiness-exec2 Readiness probe failed: cat: /tmp/healthy: No such file or directory"
[root@k8scloude1 probe]# kubectl get ev
LAST SEEN TYPE REASON OBJECT MESSAGE
......
32m Normal Pulled pod/readiness-exec2 Container image "nginx" already present on machine
32m Normal Created pod/readiness-exec2 Created container readiness
32m Normal Started pod/readiness-exec2 Started container readiness
15m Normal Killing pod/readiness-exec2 Stopping container readiness
13m Normal Scheduled pod/readiness-exec2 Successfully assigned probe/readiness-exec2 to k8scloude3
13m Normal Pulled pod/readiness-exec2 Container image "nginx" already present on machine
13m Normal Created pod/readiness-exec2 Created container readiness
13m Normal Started pod/readiness-exec2 Started container readiness
88s Warning Unhealthy pod/readiness-exec2 Readiness probe failed: cat: /tmp/healthy: No such file or directory
32m Normal Scheduled pod/readiness-exec3 Successfully assigned probe/readiness-exec3 to k8scloude3
32m Normal Pulled pod/readiness-exec3 Container image "nginx" already present on machine
32m Normal Created pod/readiness-exec3 Created container readiness
32m Normal Started pod/readiness-exec3 Started container readiness
15m Normal Killing pod/readiness-exec3 Stopping container readiness
13m Normal Scheduled pod/readiness-exec3 Successfully assigned probe/readiness-exec3 to k8scloude2
13m Normal Pulled pod/readiness-exec3 Container image "nginx" already present on machine
13m Normal Created pod/readiness-exec3 Created container readiness
13m Normal Started pod/readiness-exec3 Started container readiness
Access the Service again: user requests are now only forwarded to 111 and 333, which shows the readinessProbe is working.
[root@k8scloude1 probe]# while true ; do curl -s 10.101.38.121 ; sleep 1 ; done
111
333
333
333
111
......
By now you should know how to use a livenessProbe and a readinessProbe to monitor container health in Kubernetes. By periodically checking command exit codes, HTTP responses, and TCP connections, you can automatically restart unhealthy containers and keep traffic away from pods that are not ready, improving the availability and stability of your applications.