Script exporter
The script exporter is a generic exporter that is only used to call local self-written scripts and return their success, runtime and, if applicable, response.
It can therefore be used for almost any purpose, for example to return information from files, or to query the BPC with curl, evaluate the result with jq and make the data available to VIMON.
Create/customize scripts
The scripts should be stored at vimon/config/script_exporter/scripts.
Two scripts that query and provide data from a BPC monitor are already provided here.
The output syntax of a script should always correspond to the Prometheus syntax (1..n lines):
metricname{label1="Foo", label2="Bar"} resultValue(number) unixtimestamp
The data is later referenced in Grafana by the metric name and can be filtered by the labels, aggregated or otherwise evaluated.
The result value is always a number (although truth values can also be mapped with 1 and 0).
The Unix timestamp is optional.
If it is omitted, Prometheus stores the value with the current time (the time of the scrape).
It makes sense to specify a timestamp if, for example, a value was valid at the beginning of the last hour but is determined later.
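As an illustration of this output format, the following is a minimal sketch of what a script in vimon/config/script_exporter/scripts could print. The helper emit_metric, the metric names demo_dir_exists and demo_hourly_value and their labels are hypothetical and not part of the provided scripts. The sketch assumes that the timestamp is given in milliseconds, as defined by the Prometheus text exposition format.

```shell
#!/bin/bash
# Sketch: print metric lines in the Prometheus text format.
# Usage: emit_metric NAME LABELS VALUE [TIMESTAMP_MS]
emit_metric() {
  local name=$1 labels=$2 value=$3 ts=${4:-}
  if [ -n "$ts" ]; then
    printf '%s{%s} %s %s\n' "$name" "$labels" "$value" "$ts"
  else
    printf '%s{%s} %s\n' "$name" "$labels" "$value"
  fi
}

# A truth value mapped to 1/0: does the (hypothetical) directory exist?
dir=/tmp
if [ -d "$dir" ]; then exists=1; else exists=0; fi
emit_metric demo_dir_exists "dir=\"$dir\"" "$exists"

# A value with an explicit timestamp: start of the current hour, in milliseconds.
now=$(date +%s)
hour_start_ms=$(( (now - now % 3600) * 1000 ))
emit_metric demo_hourly_value 'source="demo"' 42 "$hour_start_ms"
```

The first line is written without a timestamp, so Prometheus assigns the scrape time. The second line carries its own timestamp.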
Configuration
The configuration takes place via the file vimon/config/script_exporter/script_exporter.yml.
The scripts are stored here and given an alias.
Only scripts that are defined here can then be queried externally.
The query from Prometheus must then be sent to the URL /probe, and the parameter script must be passed with the alias defined on the exporter.
This could be done in Prometheus as follows:
- job_name: 'bpcProcessesProd'
  scrape_interval: 3600s
  metrics_path: /probe
  static_configs:
    - targets: ['dingeldangelhost.com:9469']
  params:
    script: ['bpc_client_processes_prod']
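To check by hand what Prometheus would receive from this job, the same request can be sent with curl. The following is only a sketch: it reuses the host, port and alias from the job above, the helper name probe_url is hypothetical, and plain HTTP is assumed.

```shell
#!/bin/bash
# Build the probe URL for an exporter target (host:port) and a script alias.
probe_url() {
  printf 'http://%s/probe?script=%s\n' "$1" "$2"
}

url=$(probe_url 'dingeldangelhost.com:9469' 'bpc_client_processes_prod')
echo "$url"

# Fetch the metrics manually (uncomment on a host that can reach the exporter):
# curl --silent --max-time 10 "$url"
```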
Example configuration
Example script for monitoring a cluster node status:
# CheckIS-Script
# Author tk@virtimo
# returns:
# 0 if the inubit Process Engine is running and active (maintenancemode is inactive)
# 1 if the script fails
# 2 if processing is prohibited by file
# 3 if retrieval via loadbalancer failed
# 4 if this is not the active site
# 5 if maintenancemode is active
# 6 if processing is disabled by heartbeat-connector
# 7 if process engine is down
# 8 if curl failed by any other return value
#
# test with eg "echo $?" after call of this script
#
###############################################################
if [ "$#" -ne 1 ]; then
  echo "Usage: $0 TIMEOUT" >&2
  exit 1
fi
EMERGENCY_STOP_FILE=/inubit/filebase/conf/keepalived_emergencyStop.active
IGNORE_SITE_FILE=/inubit/filebase/conf/keepalived_ignoreSite.active
LOCAL_LOCATION_FROM=/etc/haproxy/monitor-response-200.http
LB_STATUS_URL="https://inubit.int.viritmo.de:8443/status"
WSLISTENERURL="http://localhost:7000/ibis/ws/genericMain_wsc_heartbeatConnector?wsdl"
HTTPTIMEOUT=$1
# check for existence of "emergency stop" file - if we find it we exit immediately with "fault" status
if [ -f "$EMERGENCY_STOP_FILE" ]; then
  echo "processing prohibited - $EMERGENCY_STOP_FILE exists"
  exit 2
fi
# find out, if external LB directs external requests to this site or not
if [ ! -f "$IGNORE_SITE_FILE" ]; then
  MYLOC=`tail -1 "$LOCAL_LOCATION_FROM" | cut -d" " -f3`
  LBLOC=`curl --silent --connect-timeout 2 -k "$LB_STATUS_URL"`
  RETCODE=$?
  if [ "$RETCODE" -ne 0 ]; then
    # curl call has failed - exit with "fault" status
    echo "processing disabled - site-check via $LB_STATUS_URL failed"
    exit 3
  fi
  LBLOC=`echo "$LBLOC" | tail -1 | cut -d" " -f3`
  if [ "$MYLOC" != "$LBLOC" ]; then
    # wrong "site" - let's bail out right here
    echo "processing disabled - we ($MYLOC) are not the current master-site ($LBLOC)"
    exit 4
  fi
fi
# check status of local IS instance
HTTPSTATUS=`curl -s -o /dev/null -m $HTTPTIMEOUT -w "%{http_code}" $WSLISTENERURL`
RETCODE=$?
if [ "$HTTPSTATUS" == 200 ]; then
  echo "processing active"
  exit 0
elif [ "$HTTPSTATUS" == 503 ]; then
  echo "maintenancemode active"
  exit 5
elif [ "$HTTPSTATUS" == 404 ]; then
  echo "heartbeat-webservice missing/url wrong"
  exit 6
elif [ "$HTTPSTATUS" == 000 ]; then
  echo "processengine down"
  exit 7
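else
  # Reconstructed ending (the listing breaks off here): per the header comment,
  # any other curl result exits with 8. The message text is an assumption.
  echo "curl failed - http status $HTTPSTATUS, curl return code $RETCODE"
  exit 8
fi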
Example configuration of the exporter:
scripts:
  - name: fail
    script: curl google.foo
    # optional
    #timeout:
      # in seconds, 0 or negative means none
      #max_timeout: 5
      #enforced: false
  - name: fine
    script: curl google.com
    # optional
    #timeout:
      # in seconds, 0 or negative means none
      #max_timeout: 5
      #enforced: false
  - name: isstatus
    script: /inubit/filebase/scripts/keepalived_checkis.sh 5
  - name: activesiteint
    script: /inubit/filebase/scripts/getActiveSite.sh http://inubit.int.virtimo.de:8000/status
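Before a command line is entered as a script value, it can be useful to check its exit code and runtime by hand. The following sketch does this with a hypothetical helper named check_script. It runs the command line through sh -c, which is an assumption and may differ from how the exporter itself invokes the script.

```shell
#!/bin/bash
# Run a command line as it would be written in a "script:" entry and
# report its exit code and its runtime in whole seconds.
check_script() {
  local start end rc
  start=$(date +%s)
  rc=0
  sh -c "$1" >/dev/null 2>&1 || rc=$?
  end=$(date +%s)
  echo "exit_code=$rc duration_s=$(( end - start ))"
}

check_script 'true'    # a command line that succeeds
check_script 'exit 6'  # a command line that ends with exit code 6
```

On the target host, the argument could be one of the configured entries, for example '/inubit/filebase/scripts/keepalived_checkis.sh 5'.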
The exporter was built according to the current Prometheus guidelines, so it now delivers the data at /probe?script=<ScriptName> instead of /metrics.
The additional URL parameter output=ignore ensures that the actual output of the script is not included in the response; otherwise it would be returned as part of the response as well.
Therefore, the prometheus.yml must also be configured differently here.
Example:
- job_name: 'clusterActiveSite'
  metrics_path: /probe
  params:
    script: ['activesiteint', 'activesitetest', 'activesitedev']
    output: ['ignore']
  static_configs:
    - targets: ['inubitserver.virtimo.lan:9469']
- job_name: 'clusterIsState'
  metrics_path: /probe
  params:
    script: ['isstatus']
    output: ['ignore']
  static_configs:
    - targets: ['server1.virtimo.lan:9469', 'server2.virtimo.lan:9469']
The response then looks like this:
# virtimo script-exporter, author TK, modified to deliver status codes and format for grafana status map.
# HELP script_statusmap Script exit status as label status with value always 1 for grafana statusmap-plugin.
# TYPE script_statusmap gauge
script_statusmap{script="isstatus", status="6"} 1
# HELP script_returncode Script exit status (0 = ok, != 0 any exit code).
# TYPE script_returncode gauge
script_returncode{script="isstatus"} 6
# HELP script_success Script success status (1 = ok, 0 error).
# TYPE script_success gauge
script_success{script="isstatus"} 0
# HELP script_duration_seconds Script execution time, in seconds.
# TYPE script_duration_seconds gauge
script_duration_seconds{script="isstatus"} 0.025820
- The metric script_returncode returns the return code.
- script_success returns only 1 or 0, depending on whether the script succeeded.
- script_duration_seconds is the pure script runtime.
- script_statusmap always returns 1 and carries the return code in the label status. This special feature is required for the status map plugin in Grafana, see https://grafana.com/grafana/plugins/flant-statusmap-panel. It can be used to generate a cluster overview, for example.
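For a quick check outside Grafana, a single metric can be picked out of such a response on the command line. The following is a sketch only: the helper metric_value is hypothetical, and the sample input is copied from the response shown above.

```shell
#!/bin/bash
# Print the value of one metric for one script alias.
# Reads a /probe response from stdin; comment lines never match.
metric_value() {
  awk -v metric="$1" -v script="$2" '
    index($0, metric "{script=\"" script "\"") == 1 { print $NF }
  '
}

response='script_statusmap{script="isstatus", status="6"} 1
script_returncode{script="isstatus"} 6
script_success{script="isstatus"} 0
script_duration_seconds{script="isstatus"} 0.025820'

printf '%s\n' "$response" | metric_value script_returncode isstatus
printf '%s\n' "$response" | metric_value script_success isstatus
```

For the sample input, this prints the return code 6 and the success value 0. On a live system, the function could read the output of a curl call to /probe instead of the sample text.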