Here I try to present, in a way that I hope is easy to understand, some Icinga concepts (which may also apply to Nagios): active and passive checks, enabling and disabling freshness checks, soft and hard states, and the parameters used to configure them.
ACTIVE and PASSIVE services
There are two types of services:
- ACTIVE: Check initiated by Icinga itself and run at a configured interval.
- PASSIVE: Check initiated by an external application or system, which sends the result to Icinga for processing.
Normally, agents such as NSClient can be used to send passive check results from Windows and Linux. I use a more “manual” method so that I can run any check on almost any system and in almost any language, in a very versatile way (see this related Nagios post: Nagios – Using passive checks without agent).
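As an illustration of how an external system might hand a passive result to Icinga, here is a minimal Python sketch that builds the request body for the process-check-result action of the Icinga 2 REST API. This is not necessarily the “manual” method mentioned above. The URL, host name and service name are hypothetical, and the sketch assumes the API feature is enabled with a user allowed to run that action.

```python
import json

# Hypothetical endpoint; 5665 is the default Icinga 2 API port.
ICINGA_URL = "https://icinga.example.com:5665/v1/actions/process-check-result"


def build_passive_result(host, service, exit_status, output):
    """Build the JSON payload for one passive service check result."""
    # Plugin exit codes: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN
    if exit_status not in (0, 1, 2, 3):
        raise ValueError("exit_status must be 0, 1, 2 or 3")
    return {
        "type": "Service",
        "filter": f'host.name=="{host}" && service.name=="{service}"',
        "exit_status": exit_status,
        "plugin_output": output,
    }


# "web01" and "backup-job" are made-up names for illustration.
payload = build_passive_result("web01", "backup-job", 2, "Backup failed")
body = json.dumps(payload).encode("utf-8")
# Sending is left out on purpose: an authenticated HTTPS POST of `body`
# to ICINGA_URL with the header "Accept: application/json" would submit it.
```

The external system decides when to run the check and what the result is; Icinga only processes what it receives.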
SOFT and HARD states
When a check detects a problem, the related service or host is put into a soft or a hard state.
This avoids notifications for short-term problems (for example, a momentary CPU peak or a brief connectivity problem).
While the state is soft (it is not yet known whether the problem is temporary), NO notification is sent.
Once the state becomes hard (the problem is confirmed as not temporary), a notification is sent.
In order to make that distinction, Icinga uses these two parameters:
- max_check_attempts: Number of consecutive failed checks required to reach the hard state.
- retry_interval: Time between rechecks once the soft state is set. The state remains soft during the rechecks.
Every failed recheck increases the “check attempts” counter by 1.
When the counter reaches the value configured as max_check_attempts (for example, 3), the service/host enters the HARD state.
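The counting described above can be sketched as a small state machine. This is a simplified Python model written for illustration, not Icinga's actual implementation: the class and method names are made up, and recovery notifications and repeated notifications are ignored.

```python
from dataclasses import dataclass


@dataclass
class ServiceState:
    """Simplified model of the SOFT/HARD logic (illustrative only)."""
    max_check_attempts: int
    attempt: int = 1
    state: str = "OK"
    state_type: str = "HARD"

    def process(self, result: str) -> bool:
        """Apply one check result; return True if a problem notification would be sent."""
        if result == "OK":
            # Any OK result resets the counter.
            self.state, self.state_type, self.attempt = "OK", "HARD", 1
            return False
        if self.state != "OK" and self.state_type == "HARD":
            # Problem already confirmed; this model sends no further notification.
            self.state = result
            return False
        # First failure starts at attempt 1; each failed recheck adds 1.
        self.attempt = 1 if self.state == "OK" else self.attempt + 1
        self.state = result
        if self.attempt >= self.max_check_attempts:
            self.state_type = "HARD"
            return True
        self.state_type = "SOFT"
        return False


svc = ServiceState(max_check_attempts=3)
for result in ["CRITICAL", "CRITICAL", "CRITICAL"]:
    notify = svc.process(result)
    print(svc.attempt, svc.state_type, notify)
# prints:
# 1 SOFT False
# 2 SOFT False
# 3 HARD True
```

With max_check_attempts set to 1, the very first failure already satisfies the condition, so the SOFT state is skipped.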
FRESHNESS checks
These apply only to passive checks. Icinga can detect when too much time has elapsed without receiving a check result (the external system or application may have a problem and be unable to send its results).
In that case, the “UNKNOWN” state is set, the message “No Passive Check Result Received” is displayed and a notification is sent.
To enable freshness checks, the service must have passive and active checks enabled at the same time:
enable_active_checks = true
enable_passive_checks = true
After all, the freshness check is performed by Icinga itself, so it can be considered an active check.
The maximum time allowed without a passive result is configured with the check_interval parameter.
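The freshness rule can be modelled in a few lines. The following Python sketch is a simplification under the assumption that the only input is the time of the last passive result; the function name is hypothetical and the real scheduling inside Icinga is more involved.

```python
from datetime import datetime, timedelta


def freshness_result(last_result_time, now, check_interval):
    """Return (state, message) if the passive result is stale, else None."""
    if now - last_result_time > check_interval:
        # No passive result arrived within check_interval.
        return ("UNKNOWN", "No Passive Check Result Received")
    return None


last = datetime(2024, 1, 1, 12, 0, 0)
interval = timedelta(minutes=5)
print(freshness_result(last, last + timedelta(minutes=4), interval))
print(freshness_result(last, last + timedelta(minutes=6), interval))
```

The first call returns None because the last result is still fresh; the second returns the UNKNOWN state because more than 5 minutes have passed.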
Examples
These service templates show several ways to configure services and their parameters.
ACTIVE check – HARD
Active check every 2 minutes.
The first time a problem is detected, the HARD state is set (no SOFT state) and a notification is therefore sent.
template Service "active-service-HARD" {
  import "generic-service"
  enable_active_checks = true
  enable_passive_checks = false
  // send alert after 1st fail
  max_check_attempts = 1
  check_interval = 2m
}
ACTIVE Check – SOFT
Active check every minute.
When a check fails, the SOFT state is kept until the 3rd failure.
In SOFT state, rechecks run every 30 seconds.
template Service "active-service-SOFT" {
  import "generic-service"
  enable_active_checks = true
  enable_passive_checks = false
  // send alert after 3rd fail
  max_check_attempts = 3
  check_interval = 1m
  retry_interval = 30s
}
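As a quick worked calculation for this template, the delay between the first failed check and the HARD state (and hence the notification) can be estimated as follows. The Python helper is hypothetical and assumes every recheck fails and runs exactly on schedule.

```python
from datetime import timedelta


def time_to_hard_state(max_check_attempts, retry_interval):
    """Delay between the first failed check and the HARD state."""
    # The first failure is attempt 1; each remaining attempt
    # happens one retry_interval after the previous one.
    return (max_check_attempts - 1) * retry_interval


print(time_to_hard_state(3, timedelta(seconds=30)))  # 0:01:00
```

With max_check_attempts = 3 and retry_interval = 30s, the notification goes out about one minute after the first failure. With max_check_attempts = 1 the delay is zero.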
PASSIVE Check – FRESHNESS enabled
Passive check.
The first time a problem is detected, the HARD state is set (no SOFT state) and a notification is therefore sent.
The freshness check is enabled and configured to run every 5 minutes.
template Service "passive-service-FRESHNESS" {
  import "generic-service"
  check_command = "passive"
  enable_active_checks = true
  enable_passive_checks = true
  // send alert after first fail
  max_check_attempts = 1
  // freshness check
  check_interval = 5m
}
PASSIVE check – FRESHNESS disabled
Passive check.
The first time a problem is detected, the HARD state is set (no SOFT state) and a notification is therefore sent.
The freshness check is disabled.
template Service "passive-service-NO-FRESHNESS" {
  import "generic-service"
  check_command = "passive"
  enable_active_checks = false
  enable_passive_checks = true
  // send alert after first fail
  max_check_attempts = 1
}
Parameters summary table
PARAMETER | APPLIED TO... | USED FOR... |
---|---|---|
enable_active_checks | active | Enable/disable active checks |
enable_active_checks | passive | Enable/disable freshness checks |
enable_passive_checks | passive | Enable/disable passive checks |
max_check_attempts | both | Number of failed checks needed to reach HARD state (SOFT state until then) |
retry_interval | active | Time between checks in SOFT state |
check_interval | active | Time between checks (not in SOFT state) |
check_interval | passive | Time for freshness check (check for passive updates) |