Scan Kubernetes for errors with KRAWL

2020-02-27 10:29:00

Use the KRAWL script to identify errors in Kubernetes pods and containers.

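When you run containers with Kubernetes, you often find that they pile up. This is by design. It is one of the advantages of containers: they are cheap to start whenever a new one is needed. You can use a front end such as OpenShift or OKD to manage pods and containers. These tools make it easy to visualize what you have set up, and they provide a rich set of commands for quick interactions.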

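If a container-management platform does not fit your requirements, you can also get this information using only the Kubernetes toolchain, but it takes a lot of commands to get a full overview of a complex environment. For that reason, I wrote KRAWL, a simple script that scans the pods and containers under the namespaces of a Kubernetes cluster and, if it finds any events, displays their output. It can also be used as a Kubernetes plugin for the same purpose. It is a quick and easy way to get a lot of useful information.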
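The plugin use mentioned above can be sketched as follows. This is a minimal sketch that assumes the standard kubectl plugin convention (any executable on PATH whose name starts with kubectl- can be invoked as a kubectl subcommand); the function name install_krawl_plugin and the target name kubectl-krawl are hypothetical and are not taken from the KRAWL repository, whose own plugin instructions take precedence.

```shell
#!/bin/bash
# Hypothetical helper (not part of KRAWL): copy the script into a directory
# on PATH under a kubectl-<name> file name so kubectl can discover it.
install_krawl_plugin() {
  local src=$1        # path to the krawl script
  local dest_dir=$2   # a directory that is on PATH, e.g. "$HOME/bin"

  if [ ! -f "$src" ]; then
    echo "source script not found: $src" >&2
    return 1
  fi
  if [ ! -d "$dest_dir" ]; then
    echo "destination directory not found: $dest_dir" >&2
    return 1
  fi

  # kubectl treats executables named kubectl-<name> as plugins;
  # the name "kubectl-krawl" is an assumption made for this example.
  cp "$src" "$dest_dir/kubectl-krawl" && chmod +x "$dest_dir/kubectl-krawl"
}

# Example (not run here):
#   install_krawl_plugin ./krawl "$HOME/bin"
#   kubectl krawl
```

Running kubectl plugin list afterwards should show the new entry if the destination directory is on PATH.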

Prerequisites

  • kubectl must be installed.
  • The cluster's kubeconfig must either be in its default location ($HOME/.kube/config) or be exported in an environment variable (KUBECONFIG=/path/to/kubeconfig).
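Both prerequisites can be checked before running the script. The following is a minimal sketch, not part of KRAWL: the helper names resolve_kubeconfig and check_prereqs are hypothetical, and the sketch assumes KUBECONFIG holds a single path rather than a colon-separated list.

```shell
#!/bin/bash
# Hypothetical helper: print the kubeconfig path that would be used.
# Assumption: KUBECONFIG, when set, is a single file path.
resolve_kubeconfig() {
  printf '%s\n' "${KUBECONFIG:-$HOME/.kube/config}"
}

# Hypothetical helper: succeed only if kubectl is on PATH and the
# resolved kubeconfig exists and is not empty.
check_prereqs() {
  local cfg
  cfg=$(resolve_kubeconfig)

  if ! command -v kubectl >/dev/null 2>&1; then
    echo "kubectl not found in PATH" >&2
    return 1
  fi
  if [ ! -s "$cfg" ]; then
    echo "kubeconfig missing or empty: $cfg" >&2
    return 1
  fi
  echo "prerequisites OK (kubeconfig: $cfg)"
}

# Example (not run here):
#   check_prereqs && ./krawl
```

According to the script's own usage text, a kubeconfig path can also be passed as an optional single argument, which the script then exports as KUBECONFIG.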

Usage

$ ./krawl


The script

#!/bin/bash
# AUTHOR: Abhishek Tamrakar
# EMAIL: [email protected]
# LICENSE: Copyright (C) 2018 Abhishek Tamrakar
#
#  Licensed under the Apache License, Version 2.0 (the "License");
#  you may not use this file except in compliance with the License.
#  You may obtain a copy of the License at
#
#       http://www.apache.org/licenses/LICENSE-2.0
#
#   Unless required by applicable law or agreed to in writing, software
#   distributed under the License is distributed on an "AS IS" BASIS,
#   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#   See the License for the specific language governing permissions and
#   limitations under the License.
##
#define the variables
KUBE_LOC=~/.kube/config
#define variables
KUBECTL=$(which kubectl)
GET=$(which egrep)
AWK=$(which awk)
red=$(tput setaf 1)
normal=$(tput sgr0)

# define functions

# wrapper for printing info messages
info()
{
  printf '\n\e[34m%s\e[m: %s\n' "INFO" "$@"
}

# cleanup when all done
cleanup()
{
  rm -f results.csv
}

# just check if the command we are about to call is available
checkcmd()
{
  #check if command exists
  local cmd=$1
  if [ -z "${!cmd}" ]
  then
    printf '\n\e[31m%s\e[m: %s\n' "ERROR"  "check if $1 is installed !!!"
    exit 1
  fi
}

get_namespaces()
{
  #get namespaces
  namespaces=( \
          $($KUBECTL get namespaces --ignore-not-found=true | \
          $AWK '/Active/ {print $1}' \
          ORS=" ") \
          )
#exit if namespaces are not found
if [ ${#namespaces[@]} -eq 0 ]
then
  printf '\n\e[31m%s\e[m: %s\n' "ERROR"  "No namespaces found!!"
  exit 1
fi
}

#get events for pods in errored state
get_pod_events()
{
  printf '\n'
  if [ ${#ERRORED[@]} -ne 0 ]
  then
      info "${#ERRORED[@]} errored pods found."
      for CULPRIT in ${ERRORED[@]}
      do
        info "POD: $CULPRIT"
        info
        $KUBECTL get events \
        --field-selector=involvedObject.name=$CULPRIT \
        -ocustom-columns=LASTSEEN:.lastTimestamp,REASON:.reason,MESSAGE:.message \
        --all-namespaces \
        --ignore-not-found=true
      done
  else
      info "0 pods with errored events found."
  fi
}

#define the logic
get_pod_errors()
{
  printf "%s %s %s\n" "NAMESPACE,POD_NAME,CONTAINER_NAME,ERRORS" > results.csv
  printf "%s %s %s\n" "---------,--------,--------------,------" >> results.csv
  for NAMESPACE in ${namespaces[@]}
  do
    while IFS=' ' read -r POD CONTAINERS
    do
      for CONTAINER in ${CONTAINERS//,/ }
      do
        COUNT=$($KUBECTL logs --since=1h --tail=20 $POD -c $CONTAINER -n $NAMESPACE 2>/dev/null| \
        $GET -c '^error|Error|ERROR|Warn|WARN')
        if [ $COUNT -gt 0 ]
        then
            STATE=("${STATE[@]}" "$NAMESPACE,$POD,$CONTAINER,$COUNT")
        else
        #catch pods in errored state
            ERRORED=($($KUBECTL get pods -n $NAMESPACE --no-headers=true | \
                awk '!/Running/ {print $1}' ORS=" ") \
                )
        fi
      done
    done< <($KUBECTL get pods -n $NAMESPACE --ignore-not-found=true -o=custom-columns=NAME:.metadata.name,CONTAINERS:.spec.containers[*].name --no-headers=true)
  done
  printf "%s\n" ${STATE[@]:-None} >> results.csv
  STATE=()
}

#define usage for seprate run
usage()
{
cat << EOF
  USAGE: "${0##*/} </path/to/kube-config>(optional)"
  This program is a free software under the terms of Apache 2.0 License.
  COPYRIGHT (C) 2018 Abhishek Tamrakar
EOF
exit 0
}

#check if basic commands are found
trap cleanup EXIT
checkcmd KUBECTL
#
#set the ground
if [ $# -lt 1 ]; then
  if [ ! -e ${KUBE_LOC} -a ! -s ${KUBE_LOC} ]
  then
    info "A readable kube config location is required!!"
    usage
  fi
elif [ $# -eq 1 ]
then
  export KUBECONFIG=$1
elif [ $# -gt 1 ]
then
  usage
fi
#play
get_namespaces
get_pod_errors

printf '\n%40s\n' 'KRAWL'
printf '%s\n' '---------------------------------------------------------------------------------'
printf '%s\n' '  Krawl is a command line utility to scan pods and prints name of errored pods   '
printf '%s\n\n' ' +and containers within. To use it as kubernetes plugin, please check their page '
printf '%s\n' '================================================================================='

cat results.csv | sed 's/,/,|/g'| column -s ',' -t
get_pod_events
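To see what the script counts as an error, the log-matching step can be tried in isolation. The sketch below reuses the script's pattern with grep -E (equivalent to egrep) on a made-up log sample; the function name count_log_errors and the sample lines are assumptions for illustration only. Because the ^ anchor binds only to the first alternative, a lowercase "error" counts only at the start of a line, while "Error", "ERROR", "Warn", and "WARN" match anywhere.

```shell
#!/bin/bash
# Hypothetical helper: count the lines on stdin that match the pattern
# KRAWL passes to egrep.
count_log_errors() {
  # grep -c prints the number of matching lines. It exits with status 1
  # when that number is 0, so "|| true" keeps the helper's status at 0.
  grep -Ec '^error|Error|ERROR|Warn|WARN' || true
}

# Made-up log lines, for illustration only
sample_log='INFO starting server
error: connection refused
WARN retrying in 5s
an error occurred
Error: timeout'

# Prints 3: lines 2, 3 and 5 match; line 4 does not, because its
# lowercase "error" is not at the start of the line.
printf '%s\n' "$sample_log" | count_log_errors
```

In the script, the input comes from kubectl logs --since=1h --tail=20 for each container, so only the last 20 log lines from the past hour are counted.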

This article was originally published as the README in KRAWL's GitHub repository and is reused with permission.