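Alertmanager handles alerts sent by client applications such as the Prometheus server. It deduplicates and groups them, then routes them to the correct receiver integration, such as email, WeChat, or DingTalk. It also handles silencing, time-based muting (sending or withholding notifications during defined time windows), and inhibition of alerts.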
Alertmanager, the open-source alerting application designed for Prometheus, already offers a rich, flexible, and customizable set of alerting features.
JIRAlert is a Prometheus Alertmanager webhook receiver for JIRA.
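JIRAlert implements Alertmanager's webhook HTTP API and connects to one or more JIRA instances to create highly configurable JIRA issues. One issue is created per distinct group key, as defined by the group_by parameter in the route section of the Alertmanager configuration. By default the issue is not closed when the alert resolves, although this can be adjusted. The expectation is that a person will look at the issue, take any necessary action, and then close it. If no human interaction is needed, the condition probably should not have raised an alert in the first place. This behavior can, however, be changed by setting the auto_resolve section, which resolves the JIRA issue with the desired state.
If a corresponding JIRA issue already exists but has been resolved, it is reopened. A JIRA transition must exist between the resolved state and the reopened state, as named by reopen_state, otherwise reopening fails. Optionally, a "won't fix" resolution can be defined through wont_fix_resolution: JIRA issues with this resolution are not reopened by JIRAlert.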
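The three settings described above all live in jiralert.yml. The fragment below is a minimal sketch of how they might sit together under defaults. The state names 'Done' and "To Do" are assumptions for illustration and must match transitions that exist in your own JIRA workflow. The state key under auto_resolve follows the JIRAlert example configuration as I understand it, so check it against the version you deploy.

```yaml
# Fragment of jiralert.yml (illustrative sketch, not a complete config).
defaults:
  # Transition applied when a resolved issue has to be reopened.
  # "To Do" is an assumed workflow state; use one your workflow allows.
  reopen_state: "To Do"
  # Issues closed with this resolution are never reopened.
  wont_fix_resolution: "Won't Fix"
  # Optional: transition the issue to this state when the alert resolves.
  # Without this section the issue stays open until a person closes it.
  auto_resolve:
    state: 'Done'   # assumed state name
```

Leaving auto_resolve out keeps the default behavior of relying on a human to close the issue.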
Installing JIRAlert is fairly simple: it consists of a Deployment, a Secret (holding the JIRAlert configuration), and a Service. A typical example follows:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jiralert
spec:
  selector:
    matchLabels:
      app: jiralert
  template:
    metadata:
      labels:
        app: jiralert
    spec:
      containers:
        - name: jiralert
          image: quay.io/jiralert/jiralert-linux-amd64:latest
          imagePullPolicy: IfNotPresent
          args:
            - "--config=/jiralert-config/jiralert.yml"
            - "--log.level=debug"
            - "--listen-address=:9097"
          readinessProbe:
            tcpSocket:
              port: 9097
            initialDelaySeconds: 15
            periodSeconds: 15
            timeoutSeconds: 5
          livenessProbe:
            tcpSocket:
              port: 9097
            initialDelaySeconds: 15
            periodSeconds: 15
            timeoutSeconds: 5
          ports:
            # Matches --listen-address above.
            - containerPort: 9097
              name: metrics
          volumeMounts:
            - mountPath: /jiralert-config
              name: jiralert-config
              readOnly: true
      volumes:
        - name: jiralert-config
          secret:
            secretName: jiralert-config
---
apiVersion: v1
kind: Secret
type: Opaque
metadata:
  name: jiralert-config
stringData:
  jiralert.tmpl: |-
    {{ define "jira.summary" }}[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}] {{ .GroupLabels.SortedPairs.Values | join "," }}{{ end }}
    {{ define "jira.description" }}{{ range .Alerts.Firing }}Labels:
    {{ range .Labels.SortedPairs }} - {{ .Name }} = {{ .Value }}
    {{ end }}
    Annotations:
    {{ range .Annotations.SortedPairs }} - {{ .Name }} = {{ .Value }}
    {{ end }}
    Source: {{ .GeneratorURL }}
    {{ end }}
    CommonLabels:
    {{ range .CommonLabels.SortedPairs }} - {{ .Name }} = {{ .Value}}
    {{ end }}
    GroupLabels:
    {{ range .GroupLabels.SortedPairs }} - {{ .Name }} = {{ .Value}}
    {{ end }}
    {{ end }}
  jiralert.yml: |-
    # File containing template definitions. Required.
    template: jiralert.tmpl
    # Global defaults, applied to all receivers where not explicitly overridden. Optional.
    defaults:
      # API access fields.
      api_url: https://jira.example.com
      user: foo
      password: bar
      # The type of JIRA issue to create. Required.
      issue_type: Bug
      # Issue priority. Optional.
      priority: Major
      # Go template invocation for generating the summary. Required.
      summary: '{{ template "jira.summary" . }}'
      # Go template invocation for generating the description. Optional.
      description: '{{ template "jira.description" . }}'
      # State to transition into when reopening a closed issue. Required.
      reopen_state: "REOPENED"
      # Do not reopen issues with this resolution. Optional.
      wont_fix_resolution: "Won't Fix"
      # Amount of time after being closed that an issue should be reopened, after which, a new issue is created.
      # Optional (default: always reopen)
      # reopen_duration: 30d
    # Receiver definitions. At least one must be defined.
    # Receiver names must match the Alertmanager receiver names. Required.
    receivers:
      - name: 'jiralert'
        project: 'YOUR-JIRA-PROJECT'
---
apiVersion: v1
kind: Service
metadata:
  name: jiralert
spec:
  selector:
    app: jiralert
  ports:
    - port: 9097
      targetPort: 9097
The corresponding Alertmanager configuration:
...
receivers:
  - name: jiralert
    webhook_configs:
      - send_resolved: true
        url: http://jiralert:9097/alert
route:
  # Root receiver and other root-level settings omitted.
  routes:
    - receiver: jiralert
      matchers:
        - severity = critical
      continue: true
...
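Because JIRAlert creates one issue per distinct group key, the group_by setting on the Alertmanager route decides how many issues a burst of alerts produces. The fragment below is a minimal sketch of that relationship. The root receiver name default and the grouping labels alertname and cluster are assumptions chosen for illustration, not values from the setup above.

```yaml
# Fragment of alertmanager.yml (illustrative sketch).
route:
  receiver: default                 # hypothetical root receiver
  # Alerts sharing the same values for these labels form one group,
  # and therefore map to a single JIRA issue.
  group_by: [alertname, cluster]    # assumed labels
  routes:
    # Child routes inherit group_by from the parent unless they override it.
    - receiver: jiralert
      matchers:
        - severity = critical
      continue: true
receivers:
  - name: default
  - name: jiralert
    webhook_configs:
      - send_resolved: true
        url: http://jiralert:9097/alert
```

With this grouping, critical alerts that share the same alertname and cluster would land in the same issue, and the values of those group labels are what the jira.summary template renders into the issue title.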