Prometheus Operator and kube-prometheus, Part 1: Introduction

Introduction

Prometheus Operator

Prometheus Operator manages Prometheus clusters on Kubernetes. The project aims to simplify and automate the setup of a Prometheus-based monitoring stack for Kubernetes clusters.

kube-prometheus

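The easiest way to get started is to deploy Prometheus Operator as part of kube-prometheus. kube-prometheus deploys Prometheus Operator and sets up a Prometheus instance named prometheus-k8s with default alerts and rules, along with the other components that Prometheus relies on, such as: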

Grafana
kube-state-metrics
prometheus adapter
node exporter
...

Prometheus Operator vs. kube-prometheus vs. community helm chart

Prometheus Operator

Prometheus Operator uses Kubernetes custom resources to simplify the deployment and configuration of Prometheus, Alertmanager, and related monitoring components.

kube-prometheus

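kube-prometheus provides an example configuration for a complete cluster monitoring stack based on Prometheus and Prometheus Operator. It includes deployments of multiple Prometheus and Alertmanager instances, metrics exporters for collecting node metrics (such as node_exporter), scrape target configuration that links Prometheus to various metrics endpoints, and example alerting rules that notify about potential problems in the cluster.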

helm chart

The prometheus-community/kube-prometheus-stack helm chart provides a feature set similar to kube-prometheus. The chart is maintained by the Prometheus community.

Prometheus Operator features

CRD

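A core feature of Prometheus Operator is that it watches the Kubernetes API server for changes to specific objects and ensures that the current Prometheus deployments match these objects. The Operator acts on the following custom resource definitions (CRDs):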

monitoring.coreos.com/v1:

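Prometheus: defines a desired Prometheus deployment.
Alertmanager: defines a desired Alertmanager deployment.
ThanosRuler: defines a desired Thanos Ruler deployment. When there are multiple Prometheus instances, Thanos Ruler can be used to manage alerting rules across them in one place.
ServiceMonitor: Prometheus Operator monitors resources through PodMonitor and ServiceMonitor. A ServiceMonitor covers anything in Kubernetes that is exposed through a Service, and it is the recommended first choice. It declaratively specifies how groups of Kubernetes Services should be monitored. The Operator automatically generates the Prometheus scrape configuration based on the current state of the objects in the API server.
PodMonitor: monitors Pods directly; ServiceMonitor remains the recommended first choice. A PodMonitor declaratively specifies how a group of Pods should be monitored. The Operator automatically generates the Prometheus scrape configuration based on the current state of the objects in the API server.
Probe: declaratively specifies how groups of Ingresses or static targets should be monitored. The Operator automatically generates the Prometheus scrape configuration based on the definition.
PrometheusRule: manages Prometheus alerting rules. It defines a desired set of Prometheus alerting and/or recording rules. The Operator generates a rule file that Prometheus instances can use.
AlertmanagerConfig: manages Alertmanager configuration, mainly who receives which alerts. It declaratively specifies subsections of the Alertmanager configuration, allowing alerts to be routed to custom receivers and inhibition rules to be set. (Unlike the other resources listed here, this CRD is served under monitoring.coreos.com/v1alpha1 rather than v1.)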

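Prometheus Operator automatically detects changes in the Kubernetes API server to any of the above objects and ensures that matching deployments and configurations are kept in sync.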
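
To make the ServiceMonitor and PrometheusRule descriptions more concrete, here is a minimal sketch of both resources. The names (example-app, the web port, the team: frontend label) and the alert expression are illustrative assumptions, not taken from kube-prometheus; only the apiVersion, kinds, and field names come from the Operator's API.

```yaml
# Hypothetical ServiceMonitor: scrape every Service labelled app=example-app.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app            # illustrative name
  labels:
    team: frontend             # a label that a Prometheus object can select on
spec:
  selector:
    matchLabels:
      app: example-app         # must match the labels on the Service
  endpoints:
    - port: web                # name of the Service port that exposes metrics
      path: /metrics
      interval: 30s
---
# Hypothetical PrometheusRule with a single alerting rule.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-app-rules      # illustrative name
  labels:
    team: frontend
spec:
  groups:
    - name: example-app.rules
      rules:
        - alert: ExampleAppDown
          # Assumes the Service is named example-app, so the job label matches.
          expr: up{job="example-app"} == 0
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: example-app target has been down for 5 minutes
```

Whether a given Prometheus instance picks these objects up depends on the label and namespace selectors configured in its Prometheus resource.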

Simplified deployment configuration

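Configure the fundamentals of Prometheus, such as version, persistence, retention policy, and replicas, from a native Kubernetes resource. A minimal persistent Prometheus deployment needs only the following YAML: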

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: persisted
spec:
  storage:
    volumeClaimTemplate:
      spec:
        storageClassName: ssd
        resources:
          requests:
            storage: 40Gi
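
The manifest above only sets storage. The other fundamentals mentioned (version, retention, replicas) are fields on the same resource. A sketch with illustrative values follows; the version tag, retention period, and replica count are assumptions chosen for the example, not recommendations.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: persisted
spec:
  version: v2.45.0             # illustrative Prometheus version
  replicas: 2                  # number of Prometheus pods to run
  retention: 15d               # how long to keep time series data
  storage:
    volumeClaimTemplate:
      spec:
        storageClassName: ssd  # must exist in the cluster
        resources:
          requests:
            storage: 40Gi
```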

Prometheus target configuration

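Monitoring target configurations are generated automatically from familiar Kubernetes label queries; there is no need to learn a Prometheus-specific configuration language.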
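
As a sketch of how label queries drive target configuration: a Prometheus resource picks up ServiceMonitor, PodMonitor, and PrometheusRule objects through label selectors. The team: frontend label, the resource name, and the service account name are hypothetical.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: frontend                        # illustrative name
spec:
  serviceAccountName: prometheus        # hypothetical; needs RBAC to discover targets
  serviceMonitorSelector:
    matchLabels:
      team: frontend                    # select ServiceMonitors carrying this label
  podMonitorSelector:
    matchLabels:
      team: frontend                    # select PodMonitors carrying this label
  ruleSelector:
    matchLabels:
      team: frontend                    # select PrometheusRules carrying this label
  serviceMonitorNamespaceSelector: {}   # empty selector: look in all namespaces
```

Changing which targets are scraped then becomes a matter of adding or removing labelled objects instead of editing a scrape configuration file.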

Adoption by major vendors

Which major vendors use Prometheus Operator or kube-prometheus?

Red Hat

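As the API group of Prometheus Operator (monitoring.coreos.com) suggests, the Operator was originally developed and open-sourced by CoreOS. CoreOS has since been acquired by Red Hat, so Red Hat OpenShift 4 relies on Prometheus Operator for its metrics solution. A typical architecture is shown in the following diagram:

[Figure: typical OpenShift 4 monitoring architecture; image not included]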

The diagram shows that both Prometheus and Alertmanager are managed through Prometheus Operator.

Rancher

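The rancher-monitoring stack in Rancher 2 and later is also built on kube-prometheus, with further improvements. The relationship diagram of a deployment made with the rancher-monitoring helm chart shows that quite a few components are installed: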

Grafana
Prometheus CRD
Prometheus Operator
Prometheus
Alertmanager
kube-state-metrics
prometheus adapter
node exporter
...

Why do I recommend Prometheus Operator or kube-prometheus over plain Prometheus?

The reasons are as follows:

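It is the choice of many major vendors;
It greatly reduces the complexity of configuring Prometheus;
A lot works out of the box:
1. Monitoring targets, such as Kubernetes components (coredns, kubelet, controller manager, apiserver, etcd, scheduler, kube proxy) and self-monitoring of the monitoring components (grafana, Alertmanager, prometheus, and so on);
2. Dashboards: 24 bundled dashboards that are very practical, covering dimensions such as cluster, components, network, storage, Node, and Pod;
3. Alerting rules: more than 100 bundled alerting rules covering all aspects of Kubernetes;
Many popular open source products also ship with Prometheus Operator support by default; Loki, for example, comes with a ServiceMonitor;
ServiceMonitor and the related resources actually offer more flexibility than adding Prometheus annotations; see the ServiceMonitor example after this list;
High-availability support, such as:
1. Multiple Prometheus shards
2. Multiple Alertmanager instances
3. ThanosRuler
RBAC: for example, three monitoring roles can be created by default (admin/edit/viewer), corresponding to monitoring administrators, maintainers, and read-only users.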
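
The high-availability options in the list above map to a few fields on the custom resources. The following is a rough sketch only; the resource names and the shard and replica counts are hypothetical values picked for illustration.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: ha-example   # illustrative name
spec:
  shards: 2          # scrape targets are split across 2 shards
  replicas: 2        # each shard runs 2 replicas, so 4 pods in total
---
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  name: ha-example   # illustrative name
spec:
  replicas: 3        # the Operator configures the replicas as one cluster
```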

Example illustrating the flexibility of ServiceMonitor:

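# Excerpt from a ServiceMonitor spec
spec:
  endpoints:
    - honorLabels: true            # on label conflicts, keep the labels from the scraped target
      params:
        _scheme:
          - https
      port: metrics                # name of the Service port to scrape
      # scrape through an HTTP proxy instead of connecting to the target directly
      proxyUrl: http://pushprox-k3s-server-proxy.cattle-monitoring-system.svc:8080
      relabelings:
        - sourceLabels:            # copy the scrape path into a metrics_path label
            - __metrics_path__
          targetLabel: metrics_path
  jobLabel: component              # take the job name from the Service's "component" label
  namespaceSelector:
    matchNames:
      - cattle-monitoring-system   # only look for Services in this namespace
  podTargetLabels:                 # copy these Pod labels onto the scraped metrics
    - component
    - pushprox-exporter
  selector:
    matchLabels:                   # Services must carry all of these labels
      component: k3s-server
      k8s-app: pushprox-k3s-server-client
      provider: kubernetes
      release: rancher-monitoring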
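
For comparison, the annotation-based approach usually looks like the sketch below. The prometheus.io/* annotations are a widely used convention rather than something Prometheus understands natively; they only take effect if the Prometheus scrape configuration contains matching relabeling rules. The Service name, labels, and port are hypothetical.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: example-app                 # illustrative name
  annotations:
    prometheus.io/scrape: "true"    # opt this Service in to scraping
    prometheus.io/port: "8080"      # port to scrape
    prometheus.io/path: /metrics    # metrics path
spec:
  selector:
    app: example-app
  ports:
    - name: web
      port: 8080
```

Per-endpoint settings such as proxyUrl, honorLabels, or relabelings, as used in the ServiceMonitor above, cannot be expressed with this small set of annotations alone.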
