prometheus-監控docker伺服器

1. prometheus-監控docker伺服器

cAdvisor（Container Advisor）：用於收集正在執行的容器資源使用和效能資訊。
專案地址：https://github.com/google/cadvisor

docker部署cAdvisor範例：

docker run -d \
--volume=/:/rootfs:ro \
--volume=/var/run:/var/run:ro \
--volume=/sys:/sys:ro \
--volume=/var/lib/docker/:/var/lib/docker:ro \
--volume=/dev/disk/:/dev/disk:ro \
--publish=8080:8080 \
--detach=true \
--name=cadvisor \
google/cadvisor:latest

案例：監控docker伺服器

監控執行命令

docker run -d \
--volume=/:/rootfs:ro \
--volume=/var/run:/var/run:ro \
--volume=/sys:/sys:ro \
--volume=/var/lib/docker/:/var/lib/docker:ro \
--volume=/dev/disk/:/dev/disk:ro \
--publish=8080:8080 \
--detach=true \
--name=cadvisor \
google/cadvisor:latest

執行建立監控docker服務

[root@VM-0-17-centos ~]# docker run -d \
> --volume=/:/rootfs:ro \
> --volume=/var/run:/var/run:ro \
> --volume=/sys:/sys:ro \
> --volume=/var/lib/docker/:/var/lib/docker:ro \
> --volume=/dev/disk/:/dev/disk:ro \
> --publish=8080:8080 \
> --detach=true \
> --name=cadvisor \
> google/cadvisor:latest
Unable to find image 'google/cadvisor:latest' locally
latest: Pulling from google/cadvisor
ff3a5c916c92: Pull complete 
44a45bb65cdf: Pull complete 
0bbe1a2fe2a6: Pull complete 
Digest: sha256:815386ebbe9a3490f38785ab11bda34ec8dacf4634af77b8912832d4f85dca04
Status: Downloaded newer image for google/cadvisor:latest
78d6d7db3b715f5800346cd592575a4b7be5e644e198dbf95160e64c3545fa53

進行資料存取http://ip:8080

設定prometheus新增服務

[root@prometheus ~]# cd /opt/monitor/
[root@prometheus monitor]# ll
total 23072
drwxr-xr-x 2 3434 3434       93 Jun  7 14:39 alertmanager
-rw-r--r-- 1 root root 23624308 May 11 04:11 alertmanager-0.22.0-rc.1.linux-amd64.tar.gz
drwxr-xr-x 8 root root      157 Jun  6 17:18 grafana
drwxr-xr-x 5 3434 3434      145 Jun  7 17:07 prometheus
[root@prometheus monitor]# cd prometheus/
[root@prometheus prometheus]# ll
total 167980
drwxr-xr-x 2 3434 3434       38 Mar 17 04:20 console_libraries
drwxr-xr-x 2 3434 3434      173 Mar 17 04:20 consoles
-rw-r--r-- 1 3434 3434    11357 Mar 17 04:20 LICENSE
-rw-r--r-- 1 3434 3434     3420 Mar 17 04:20 NOTICE
-rwxr-xr-x 1 3434 3434 91044140 Mar 17 02:10 prometheus
-rw-r--r-- 1 3434 3434     1043 Jun  7 17:07 prometheus.yml
-rwxr-xr-x 1 3434 3434 80944687 Mar 17 02:12 promtool
drwxr-xr-x 2 root root       22 Jun  7 14:43 rules
[root@prometheus prometheus]# vim prometheus.yml 
[root@prometheus prometheus]# cat prometheus.yml 
# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
       - 127.0.0.1:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "rules/*.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['127.0.0.1:9090']
  - job_name: 'linux server'
    static_configs:
      - targets: ['121.4.78.187:9100']
        labels:
          prod: 'web1'
  - job_name: 'docker server'
    static_configs:
      - targets: ['121.4.63.211:8080']
        labels:
          prod: 'web2'

重新載入prometheus服務

[root@prometheus prometheus]# /bin/systemctl restart prometheus
[root@prometheus prometheus]# /bin/systemctl status prometheus
● prometheus.service - prometheus
   Loaded: loaded (/usr/lib/systemd/system/prometheus.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2021-06-07 22:53:36 CST; 4s ago
 Main PID: 14647 (prometheus)
   CGroup: /system.slice/prometheus.service
           └─14647 /opt/monitor/prometheus/prometheus --config.file=/opt/monitor/prometheus/prometheus.yml

Jun 07 22:53:36 prometheus prometheus[14647]: level=info ts=2021-06-07T14:53:36.914Z caller=head.go:740 component=tsdb msg="WAL segment loaded" segment=1...egment=19
Jun 07 22:53:36 prometheus prometheus[14647]: level=info ts=2021-06-07T14:53:36.958Z caller=head.go:740 component=tsdb msg="WAL segment loaded" segment=1...egment=19
Jun 07 22:53:36 prometheus prometheus[14647]: level=info ts=2021-06-07T14:53:36.990Z caller=head.go:740 component=tsdb msg="WAL segment loaded" segment=1...egment=19
Jun 07 22:53:36 prometheus prometheus[14647]: level=info ts=2021-06-07T14:53:36.990Z caller=head.go:740 component=tsdb msg="WAL segment loaded" segment=1...egment=19
Jun 07 22:53:36 prometheus prometheus[14647]: level=info ts=2021-06-07T14:53:36.990Z caller=head.go:745 component=tsdb msg="WAL replay completed" checkpo....353439ms
Jun 07 22:53:36 prometheus prometheus[14647]: level=info ts=2021-06-07T14:53:36.993Z caller=main.go:799 fs_type=XFS_SUPER_MAGIC
Jun 07 22:53:36 prometheus prometheus[14647]: level=info ts=2021-06-07T14:53:36.993Z caller=main.go:802 msg="TSDB started"
Jun 07 22:53:36 prometheus prometheus[14647]: level=info ts=2021-06-07T14:53:36.993Z caller=main.go:928 msg="Loading configuration file" filename=/opt/mo...theus.yml
Jun 07 22:53:36 prometheus prometheus[14647]: level=info ts=2021-06-07T14:53:36.996Z caller=main.go:959 msg="Completed loading of configuration file" filename=/op…µs
Jun 07 22:53:36 prometheus prometheus[14647]: level=info ts=2021-06-07T14:53:36.996Z caller=main.go:751 msg="Server is ready to receive web requests."
Hint: Some lines were ellipsized, use -l to show in full.

瀏覽器驗證prometheus組態檔是否生成
使用grafana進行監控docker服務資料展示
- 匯入監控docker的儀表盤，ID為193
- 填寫名稱，選擇資料來源
- 發現已有監控資料了
監控docker服務新增一個導航欄

點選save dashboard儲存

發現有了導航點選資料沒有變化
我們需要修改圖表資訊

每張圖片新增如上資訊

發現修改之後，就有了變化了