How Nagios Works – Basics
Introduction
Nagios is a monitoring tool used to track server services, processes, disk usage, users, and system health. It helps administrators identify warnings and critical issues before they impact services or customers.
Prerequisites
- SSH access to the server
- Nagios installed
- Basic knowledge of Linux commands and configuration files
Implementation
- Nagios gives detailed information about the services running on the server. For example:
1. Number of processes
2. Number of users
3. Disk capacity
4. Other configured services
- We can configure Nagios to display the following:
1. Unknown alerts
2. Warning alerts
3. Critical alerts
- These alerts are generated based on the configuration defined in the Nagios setup file.
- In the event of a service failure, Nagios can alert administrators immediately, allowing remediation before outages affect users or business operations.
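As a rough illustration of where these alert levels come from: Nagios plugins conventionally report their result through the process exit code (0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN). The Python sketch below models that mapping. The names PluginState and state_from_exit_code are hypothetical, and treating any other exit code as UNKNOWN is a choice made for this sketch, not a statement about how Nagios itself handles such codes.

```python
from enum import IntEnum


class PluginState(IntEnum):
    """Conventional exit codes returned by Nagios plugins."""
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


def state_from_exit_code(code: int) -> PluginState:
    """Map a plugin exit code to a state.

    Assumption of this sketch: codes outside 0-3 are reported as UNKNOWN.
    """
    try:
        return PluginState(code)
    except ValueError:
        return PluginState.UNKNOWN


if __name__ == "__main__":
    for code in (0, 1, 2, 3):
        print(code, state_from_exit_code(code).name)
```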
To Check the Nagios Configuration File
Step 1:
SSH into the server.
Step 2:
The NRPE (Nagios Remote Plugin Executor) configuration file, which defines the checks that run on this server, is located at:
/usr/local/nagios/etc/nrpe.cfg
Open the file using:
vi /usr/local/nagios/etc/nrpe.cfg
Step 3:
The nrpe.cfg file defines each configured service check as a named command, giving the full path of the script that runs it and the arguments passed to that script.
Example configuration:
================================================
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 2,4,8 -c 4,6,8
command[check_disk]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 380 -c 430
command[check_smart_sda]=/usr/local/nagios/libexec/check_ide_smart -d /dev/sda -n
command[check_smart_sdb]=/usr/local/nagios/libexec/check_ide_smart -d /dev/sdb -n
command[check_var]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /dev/sda2
command[check_tmp]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /dev/sda5
command[check_root]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /dev/sda1
================================================
These are example services configured on the server. On each line, the name in square brackets is the command name, and the text after the equals sign is the location of the script followed by its arguments.
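To review these definitions without reading the file by hand, the command lines can be parsed with a few lines of Python. This is a minimal sketch, not part of Nagios or NRPE: the function name parse_nrpe_commands and the SAMPLE text are hypothetical, and the pattern only handles the simple command[name]=... form shown above.

```python
import re
import shlex
from typing import Dict, List

# Matches lines of the form: command[<name>]=<command line>
COMMAND_RE = re.compile(r"^command\[(?P<name>[^\]]+)\]=(?P<cmdline>.+)$")


def parse_nrpe_commands(text: str) -> Dict[str, List[str]]:
    """Return {command name: argument list} for each command[...] line."""
    commands: Dict[str, List[str]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = COMMAND_RE.match(line)
        if match:
            commands[match["name"]] = shlex.split(match["cmdline"])
    return commands


# Hypothetical sample taken from the example configuration above.
SAMPLE = """\
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_root]=/usr/local/nagios/libexec/check_disk -w 10% -c 5% -p /dev/sda1
"""

if __name__ == "__main__":
    for name, argv in parse_nrpe_commands(SAMPLE).items():
        print(name, "->", argv[0])
```

To run it against the real file, read /usr/local/nagios/etc/nrpe.cfg into a string and pass that to parse_nrpe_commands instead of SAMPLE.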
Note
- -w → Indicates warning level
- -c → Indicates critical level
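The -w and -c values can also be pulled out of a command line programmatically, for example to audit thresholds across many checks. The following is a sketch using Python's standard argparse module. The helper name extract_thresholds is hypothetical, and options other than -w and -c are simply ignored.

```python
import argparse
import shlex
from typing import Optional, Tuple


def extract_thresholds(cmdline: str) -> Tuple[Optional[str], Optional[str]]:
    """Return the (warning, critical) values from a plugin command line.

    Either value is None when the corresponding option is absent.
    """
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("-w", dest="warning")
    parser.add_argument("-c", dest="critical")
    # Drop the plugin path (first token); unknown options land in the extras.
    known, _extras = parser.parse_known_args(shlex.split(cmdline)[1:])
    return known.warning, known.critical


if __name__ == "__main__":
    line = "/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda"
    print(extract_thresholds(line))
```

The values are returned as strings because their format varies by plugin: a plain number for check_users, a percentage for check_disk, and a comma-separated triple for check_load.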
Example Explanation
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
This command tells Nagios to monitor the number of users logged in to the server.
- If more than 5 users are logged in, Nagios shows a warning.
- If more than 10 users are logged in, Nagios shows a critical alert.
The check_users command runs the script located at:
/usr/local/nagios/libexec/check_users
The script can be modified according to requirements.
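The threshold logic of the check_users example can be sketched in Python as follows. This is an illustration only, not the plugin's actual code. It assumes the "more than" wording used in the check_users help text, so a count exactly equal to a threshold does not raise that alert; confirm the behaviour with the --help output of the installed plugin. The function name evaluate_users is hypothetical.

```python
def evaluate_users(logged_in: int, warning: int = 5, critical: int = 10) -> str:
    """Classify a logged-in user count against warning/critical thresholds.

    Assumption: an alert fires only when the count exceeds the threshold.
    The critical threshold is checked first so it takes precedence.
    """
    if logged_in > critical:
        return "CRITICAL"
    if logged_in > warning:
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    for count in (3, 6, 11):
        print(count, evaluate_users(count))
```

The defaults mirror the -w 5 -c 10 values from the example; passing other values shows how changing the thresholds in nrpe.cfg would change the result.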
Conclusion
Nagios helps administrators monitor server health, detect issues early, and configure warning and critical thresholds for various services using the nrpe.cfg configuration file.
