Prometheus Alertmanager

Alerting is handled by the Prometheus Alertmanager. It groups alerts by predefined labels (e.g. "server instances") and routes them to receivers through various integrations, for example as JSON to a webhook or by email.

A list of the supported integrations is available in the Alertmanager documentation.

The alerting rules themselves are defined on the Prometheus side. They are placed in a separate rule file (for example alertrules.yml), which is referenced in the Prometheus configuration.

Example configuration

The following example rule fires when the memory available on a host monitored by the node exporter stays below the threshold of 10% for five minutes:

alertrules.yml
groups:
- name: Node_Exporter.SystemAlerts
  rules:
  - alert: HostOutOfMemory
    expr: node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes * 100 < 10
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Host out of memory (instance {{ $labels.instance }})"
      description: "Node memory is filling up (< 10% left)\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"
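
The inhibition rule in the Alertmanager configuration at the end of this page only has an effect if a critical alert with the same alertname exists. A minimal sketch of such a companion rule follows. The 5% threshold and the annotation texts are assumptions for illustration and are not part of the original setup. The rule is meant to be appended to the existing rules list of the same group, not placed in a second group of the same name:

```yaml
  # Hypothetical companion rule: same alert name, stricter threshold, higher severity.
  # Append below the existing HostOutOfMemory rule under "rules:".
  - alert: HostOutOfMemory
    expr: node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes * 100 < 5
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "Host almost out of memory (instance {{ $labels.instance }})"
      description: "Less than 5% memory available\n  VALUE = {{ $value }}"
```

While this critical alert fires for an instance, the warning alert for the same instance is suppressed by the inhibition rule.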

Adjust path

Prometheus must also be told where to reach the Alertmanager (by default localhost:9093) and which rule files to load:

prometheus.yml
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - localhost:9093
rule_files:
  - "alertrules.yml"
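
If the Alertmanager does not run on the same host as Prometheus, the target changes accordingly. The following sketch assumes a hypothetical host alertmanager.example.org that is reachable over HTTPS. The host name, scheme, and timeout are placeholders, not values from the original setup:

```yaml
alerting:
  alertmanagers:
  - scheme: https            # default is http
    timeout: 10s             # per-request timeout when pushing alerts
    static_configs:
    - targets:
      - alertmanager.example.org:9093   # hypothetical host, default Alertmanager port
```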

Alert processing

Alert processing is customized in the Alertmanager configuration. Receivers are configured there (an SMTP receiver in this example), and grouping and inhibition rules are defined.

Alerts can be grouped by certain labels (in the route element), and time intervals per alert group control how often notifications are sent. Inhibition rules suppress notifications for some alerts while others are firing, which makes it possible to build hierarchical alert structures (e.g. by severity).

alertmanager.yml
global:
  resolve_timeout: 5m
  smtp_smarthost: inubit.nemesys:25
  smtp_from: alertmanager@virtimo.de

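route:
  group_by: ['instance','alertname']   # one notification group per instance and alert name
  group_wait: 30s        # wait before sending the first notification for a new group
  group_interval: 5m     # wait before notifying about new alerts added to an existing group
  repeat_interval: 4h    # wait before re-sending a notification for alerts that are still firing
  receiver: 'smtp'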

receivers:
- name: 'smtp'
  email_configs:
  - to: alerting@virtimo.de

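inhibit_rules:
  # Suppress 'warning' alerts while a 'critical' alert with the same
  # alertname, dev and instance labels is firing.
  # Newer Alertmanager releases prefer source_matchers/target_matchers.
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']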
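
Routing to different receivers is done with child routes. As a sketch, the route and receivers sections could be extended so that critical alerts go to a webhook receiver while everything else still goes to the default SMTP receiver. The receiver name ops-webhook and the URL are hypothetical, and the matchers syntax assumes a reasonably recent Alertmanager release:

```yaml
route:
  receiver: 'smtp'                      # default receiver for all alerts
  group_by: ['instance', 'alertname']
  routes:
  - matchers:
    - severity="critical"               # only alerts labelled severity=critical
    receiver: 'ops-webhook'             # hypothetical receiver name
    repeat_interval: 1h                 # remind more often than the 4h default above

receivers:
- name: 'smtp'
  email_configs:
  - to: alerting@virtimo.de
- name: 'ops-webhook'
  webhook_configs:
  - url: 'http://localhost:5001/alerts' # placeholder endpoint
    send_resolved: true                 # also notify when the alert is resolved
```

Alertmanager delivers each alert group to the webhook URL as a JSON document in an HTTP POST request.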