Nagios + NRPE + PNP4Nagios Configuration Guide

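## Introduction

Nagios is a monitoring system that tracks system health and network information. It can monitor the state of Windows, Linux, and Unix hosts, network devices such as switches and routers, printers, and more. When a system or service enters an abnormal state, Nagios immediately alerts the operations staff by email or SMS, and it sends a recovery notification once the state returns to normal. Nagios runs on Linux/Unix platforms and offers an optional browser-based web interface where administrators can view network status, system problems, logs, and so on.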

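## Features

The services and states Nagios can monitor, and the features it provides, mainly include the following:

  1. Monitoring of network services (SMTP, POP3, HTTP, NNTP, PING, etc.);
  2. Monitoring of host resources (processor load, disk usage, etc.);
  3. A simple plugin design that lets users easily develop their own service checks;
  4. Parallelized service checks;
  5. The ability to define a network hierarchy using "parent" hosts to express the relationships between hosts, which is used to detect and distinguish hosts that are down from hosts that are unreachable;
  6. Notification of contacts when service or host problems occur and when they are resolved (via email, SMS, or a user-defined method);
  7. Event handlers that run when a service or host problem occurs, so that problems can be resolved proactively;
  8. Automatic log file rotation;
  9. Support for redundant monitoring of hosts;
  10. An optional web interface for viewing the current network status, notification and problem history, log files, etc.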

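## Server Environment

Note: choose a Nagios 3.x release. In practice, PNP4Nagios could not be made to work with Nagios 4.x. The question was raised on the PNP4Nagios community forum and had not been resolved at the time of writing.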

## Installing the Monitoring Server

### Install the base packages

```
yum install gd fontconfig-devel libjpeg-devel libpng-devel gd-devel perl-GD openssl-devel httpd php mailx postfix cpp gcc gcc-c++ libstdc++ glib2-devel libtool-ltdl-devel perl-devel -y
```

### Create the user and group

```
groupadd -g 6000 nagios
useradd -u 6000 -g nagios -c "Nagios Admin" -s /sbin/nologin nagios
```

### Compile and install Nagios

```
tar -zxvf nagios-3.5.1.tar.gz
cd nagios
./configure --prefix=/usr/local/nagios --with-nagios-user=nagios --with-nagios-group=nagios --with-command-user=nagios --enable-event-broker --enable-nanosleep --enable-embedded-perl --with-perlcache
make all
make install
make install-init
make install-commandmode
make install-webconf
make install-config
```

### Configure Apache

① Add PHP support to Apache

In /etc/httpd/conf/httpd.conf, find the following line

```
DirectoryIndex index.html index.html.var
```

and change it to

```
DirectoryIndex index.html index.php
```

then add the following line

```
AddType application/x-httpd-php .php
```

② Add web access control for Nagios

After Nagios has been compiled and installed, the file nagios.conf is present under /etc/httpd/conf.d, with the following content

```
# SAMPLE CONFIG SNIPPETS FOR APACHE WEB SERVER
# Last Modified: 11-26-2005
#
# This file contains examples of entries that need
# to be incorporated into your Apache web server
# configuration file. Customize the paths, etc. as
# needed to fit your system.

ScriptAlias /nagios/cgi-bin "/usr/local/nagios/sbin"

<Directory "/usr/local/nagios/sbin">
#  SSLRequireSSL
   Options ExecCGI
   AllowOverride None
   Order allow,deny
   Allow from all
#  Order deny,allow
#  Deny from all
#  Allow from 127.0.0.1
   AuthName "Nagios Access"
   AuthType Basic
   AuthUserFile /usr/local/nagios/etc/htpasswd.users
   Require valid-user
</Directory>

Alias /nagios "/usr/local/nagios/share"

<Directory "/usr/local/nagios/share">
#  SSLRequireSSL
   Options None
   AllowOverride None
   Order allow,deny
   Allow from all
#  Order deny,allow
#  Deny from all
#  Allow from 127.0.0.1
   AuthName "Nagios Access"
   AuthType Basic
   AuthUserFile /usr/local/nagios/etc/htpasswd.users
   Require valid-user
</Directory>
```
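The file /usr/local/nagios/etc/htpasswd.users stores the user names and passwords used to access Nagios from the web.

Create a user name and password with the following command

```
htpasswd -c /usr/local/nagios/etc/htpasswd.users nagios
```

③ Set permissions

```
chown nagios /usr/local/nagios/etc/htpasswd.users
```

Then restart Apache, and Nagios becomes reachable from the web page. At this point, however, Nagios is only an empty framework and cannot monitor anything yet. Its monitoring is carried out by plugins (the scripts under $NAGIOS/libexec).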

### Install the Nagios plugins (Nagios-Plugin)

```
tar zxvf nagios-plugins-1.4.14.tar.gz
cd nagios-plugins-1.4.14
./configure --with-nagios-user=nagios --with-nagios-group=nagios --with-command-user=nagios --prefix=/usr/local/nagios
make all
make install
chmod 755 /usr/local/nagios
```

Nagios-Plugin is now installed, and the directory /usr/local/nagios/libexec contains the various monitoring scripts, as shown below
![](http://easydone.qiniudn.com/nagios-02.png)
### Firewall configuration

① Disable SELinux

```
setenforce 0
```

`setenforce 0` only switches the running system to permissive mode. To keep SELinux off after a reboot, edit /etc/selinux/config and set the SELinux parameters as follows

```
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
#     targeted - Targeted processes are protected,
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted
```

② Edit /etc/sysconfig/iptables and add the following line to open port 80 for httpd

```
-A INPUT -m state --state NEW -m tcp -p tcp --dport 80 -j ACCEPT
```
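Where the rule sits in /etc/sysconfig/iptables matters: rules are evaluated in order, so the ACCEPT line has to come before any catch-all REJECT rule and before `COMMIT`. The following is a minimal sketch of adding the rule only once and in a safe position. The helper name `add_http_rule` is hypothetical, and the sketch assumes the common layout in which the INPUT chain ends with a `-A INPUT -j REJECT ...` line.

```shell
#!/usr/bin/env bash
# Sketch: insert the port-80 rule into an iptables rules file exactly once.
# add_http_rule is a hypothetical helper, not part of iptables or Nagios.

RULE='-A INPUT -m state --state NEW -m tcp -p tcp --dport 80 -j ACCEPT'

add_http_rule() {
    local file=$1 tmp status
    # Nothing to do if the exact rule line is already present.
    if grep -qxF -e "$RULE" "$file"; then
        return 0
    fi
    tmp=$(mktemp) || return 1
    # Print the rule just before the first catch-all REJECT or, failing
    # that, before COMMIT; append it if neither line exists.
    awk -v rule="$RULE" '
        !inserted && (/^-A INPUT -j REJECT/ || /^COMMIT/) { print rule; inserted = 1 }
        { print }
        END { if (!inserted) print rule }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

Running `add_http_rule /etc/sysconfig/iptables` as root and then restarting the iptables service would apply the change. Calling the function a second time leaves the file untouched.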

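### Nagios directories and configuration files

① Nagios directories and their purposes

![](http://easydone.qiniudn.com/nagios-03.png)

② Nagios configuration files and their purposes

![](http://easydone.qiniudn.com/nagios-04.png)

### Nagios web configuration

**cgi.cfg**

After completing the steps above, restart Nagios and Apache. The web interface can now be opened and browsed, but clicking on a specific menu item produces the following error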
![](http://easydone.qiniudn.com/nagios-05.png)

```
It appears as though you do not have permission to view information for any of the services you requested...
If you believe this is an error, check the HTTP server authentication requirements for accessing this CGI
and check the authorization options in your CGI configuration file.
```

To fix this, edit cgi.cfg and find the following setting

```
use_authentication=1
```

Change the 1 to 0. Note that this turns off authorization checks in the CGIs altogether, so it is only appropriate on a trusted network.
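An alternative that keeps authentication switched on is to authorize the web user created earlier. The permission error typically appears because the sample cgi.cfg grants access only to `nagiosadmin`, whereas the account created with htpasswd above is `nagios`. Under that assumption, the sketch below appends a user to every `authorized_for_*` line of a cgi.cfg file. `authorize_cgi_user` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: grant a web user the rights listed in cgi.cfg instead of
# disabling authentication. authorize_cgi_user is a hypothetical helper.

authorize_cgi_user() {
    local cfg=$1 user=$2 tmp status
    tmp=$(mktemp) || return 1
    awk -F= -v user="$user" '
        /^authorized_for_[a-z_]+=/ {
            # $2 holds the comma-separated user list after the "=".
            n = split($2, names, ",")
            found = 0
            for (i = 1; i <= n; i++) if (names[i] == user) found = 1
            # Append the user only if it is not listed yet.
            if (!found) $0 = $0 (n > 0 ? "," : "") user
        }
        { print }
    ' "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage (as root, then reload the page in the browser):
#   authorize_cgi_user /usr/local/nagios/etc/cgi.cfg nagios
```

Commented-out lines and unrelated options are left as they are, and running the function twice does not add the user a second time.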

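Opening Nagios in the browser now shows that some services on the server itself are already being monitored. The monitoring templates come from the default **templates.cfg**, the parameterized check commands from the default **commands.cfg**, and the local host and service definitions from the default **localhost.cfg**. Once an email address is filled in in **contacts.cfg**, alert emails are sent to it on warnings, failures, and recoveries. All of these files can be modified to suit actual needs.

The steps above get Nagios monitoring working, but only for the server itself. They do not yet cover monitoring of remote servers or graphing of the monitoring results. Remote servers are monitored through NRPE, and the graphs are produced by PNP4Nagios, with the actual drawing done by RRDTool.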
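The data that PNP4Nagios graphs comes from the performance data part of each plugin's output: everything after the `|` character, written as a space-separated list of `label=value[unit];warn;crit;min;max` items. As a rough illustration of that format (this is not how process_perfdata.pl itself is implemented), the hypothetical `parse_perfdata` function below prints each label with its value. It assumes single-line output and labels without spaces or quotes.

```shell
#!/usr/bin/env bash
# Sketch: pull label/value pairs out of a plugin output line.
# parse_perfdata is a hypothetical helper written for illustration.

parse_perfdata() {
    local output=$1 perf item label rest value
    local -a items
    case $output in
        *'|'*) perf=${output#*'|'} ;;   # keep only the part after the pipe
        *)     return 0 ;;              # no performance data at all
    esac
    # Split on whitespace into an array (avoids filename globbing).
    read -r -a items <<< "$perf"
    for item in "${items[@]}"; do
        case $item in
            *=*) ;;
            *)   continue ;;
        esac
        label=${item%%=*}     # text before "="
        rest=${item#*=}       # value;warn;crit;min;max
        value=${rest%%;*}     # value including its unit, e.g. 0.5ms
        printf '%s %s\n' "$label" "$value"
    done
}

# Usage:
#   parse_perfdata 'PING OK - RTA = 0.50 ms|rta=0.500000ms;100;500;0 pl=0%;20;60;0'
```

Each label becomes one data series in the RRD files that PNP4Nagios maintains, which is why a check without performance data produces no graph.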

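### Graphing with PNP4Nagios and RRDTool

① Install PNP4Nagios

```
yum install rrdtool -y
tar zxvf pnp4nagios-0.6.21.tar.gz
cd pnp4nagios-0.6.21
# Point --with-rrdtool at the actual rrdtool binary (a yum install normally puts it at /usr/bin/rrdtool)
./configure --with-nagios-user=nagios --with-nagios-group=nagios --with-rrdtool=/usr/local/rrdtool/bin/rrdtool --with-perfdata-dir=/usr/local/nagios/share/perfdata
make all
make install
# Installs pnp4nagios.conf into the Apache conf.d directory (see step ⑦)
make install-webconf
make install-config
make install-init
```

② Enable PNP4Nagios debug logging: in process_perfdata.cfg, find LOG_LEVEL and change its value from 0 to 2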

③ Add the PNP4Nagios graph icon to Nagios. Clicking the icon opens the graph page, which gives the Nagios monitoring pages an entry point to the graphs.

Edit templates.cfg and add the following

```
define host {
    name                host-pnp
    action_url          /pnp4nagios/graph?host=$HOSTNAME$
    register            0
    process_perf_data   1
}

define service {
    name                services-pnp
    action_url          /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
    register            0
    process_perf_data   1
}
```

**Note: the string `graph` in these URLs was `index.php` in PNP4Nagios versions before 0.6; from 0.6 onward it is `graph`.**

④ Edit nagios.cfg so that Nagios outputs performance data. Find the following options in nagios.cfg and set them as shown

```
process_performance_data=1
host_perfdata_command=process-host-perfdata
service_perfdata_command=process-service-perfdata
enable_environment_macros=1
```
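If these edits are scripted rather than made by hand, something like the following sketch could be used. `set_cfg_option` is a hypothetical helper: it rewrites an existing uncommented `key=value` line, or appends one if the key is absent.

```shell
#!/usr/bin/env bash
# Sketch: set key=value in a Nagios-style configuration file.
# set_cfg_option is a hypothetical helper, not a Nagios tool.

set_cfg_option() {
    local file=$1 key=$2 value=$3 tmp status
    tmp=$(mktemp) || return 1
    awk -v key="$key" -v value="$value" '
        # Replace lines that start with "key=" (commented lines do not match).
        index($0, key "=") == 1 { print key "=" value; seen = 1; next }
        { print }
        # Append the option when no active line defined it.
        END { if (!seen) print key "=" value }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage (as root on the Nagios server):
#   cfg=/usr/local/nagios/etc/nagios.cfg
#   set_cfg_option "$cfg" process_performance_data 1
#   set_cfg_option "$cfg" host_perfdata_command process-host-perfdata
#   set_cfg_option "$cfg" service_perfdata_command process-service-perfdata
#   set_cfg_option "$cfg" enable_environment_macros 1
```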

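⑤ Define the perfdata commands by adding the following to commands.cfg (if commands.cfg already contains definitions with these names, as the sample file does, replace them rather than adding duplicates)

```
# 'process-host-perfdata' command definition
define command{
    command_name    process-host-perfdata
    command_line    /usr/local/pnp4nagios/libexec/process_perfdata.pl -d HOSTPERFDATA
}

# 'process-service-perfdata' command definition
define command{
    command_name    process-service-perfdata
    command_line    /usr/local/pnp4nagios/libexec/process_perfdata.pl
}
```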

⑥ To enable graphing for each monitored local service, append **services-pnp** to the template list on the `use` line of the service definition, and append **host-pnp** to the `use` line of the host definition, in the format shown below. The names host-pnp and services-pnp correspond to the templates defined in templates.cfg above.

```
#services-pnp
define service{
    use                     local-service,services-pnp
    host_name               nagios
    service_description     PING
    check_command           check_ping!100.0,20%!500.0,60%
}

#host-pnp
define host{
    use                     nagios,host-pnp
    host_name               nagios
    alias                   nagios server
    address                 127.0.0.1
}
```
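Mistakes in object definitions, such as a service whose host_name does not match any defined host, are reported by the built-in pre-flight check, `nagios -v`. A small wrapper along these lines can run it before each restart. The function name is made up for illustration, and the default paths assume the /usr/local/nagios prefix used in this article.

```shell
#!/usr/bin/env bash
# Sketch: validate the Nagios configuration before restarting the daemon.
# verify_nagios_config is a hypothetical wrapper around "nagios -v".

verify_nagios_config() {
    local bin=${1:-/usr/local/nagios/bin/nagios}
    local cfg=${2:-/usr/local/nagios/etc/nagios.cfg}
    if [ ! -x "$bin" ]; then
        echo "nagios binary not found at $bin" >&2
        return 127
    fi
    # -v makes Nagios parse and verify the configuration, then exit
    # with a non-zero status if it found errors.
    "$bin" -v "$cfg"
}

# Usage (as root):
#   verify_nagios_config && service nagios restart
```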

⑦ After PNP4Nagios is installed, its Apache configuration file pnp4nagios.conf is present in the Apache include directory /etc/httpd/conf.d, with the following content

```
# SAMPLE CONFIG SNIPPETS FOR APACHE WEB SERVER

Alias /pnp4nagios "/usr/local/pnp4nagios/share"

<Directory "/usr/local/pnp4nagios/share">
    AllowOverride None
    Order allow,deny
    Allow from all
    #
    # Use the same value as defined in nagios.conf
    #
    AuthName "Nagios Access"
    AuthType Basic
    AuthUserFile /usr/local/nagios/etc/htpasswd.users
    Require valid-user
    <IfModule mod_rewrite.c>
        # Turn on URL rewriting
        RewriteEngine On
        Options symLinksIfOwnerMatch
        # Installation directory
        RewriteBase /pnp4nagios/
        # Protect application and system files from being viewed
        RewriteRule "^(?:application|modules|system)/" - [F]
        # Allow any files or directories that exist to be displayed directly
        RewriteCond "%{REQUEST_FILENAME}" !-f
        RewriteCond "%{REQUEST_FILENAME}" !-d
        # Rewrite all other URLs to index.php/URL
        RewriteRule "^.*$" "index.php/$0" [PT]
    </IfModule>
</Directory>
```

This completes the configuration for the Nagios server monitoring itself. The result looks like this
![](http://easydone.qiniudn.com/nagios-06.png)

![](http://easydone.qiniudn.com/nagios-07.png)

![](http://easydone.qiniudn.com/nagios-08.png)
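## **Configuring the Remote Monitored Hosts**

Nagios monitors remote hosts through the NRPE add-on. Conceptually, the check_nrpe plugin on the Nagios server talks over SSL to the NRPE daemon on the client, and the client's NRPE daemon then tells Nagios-Plugin to run the requested check locally against the client's services. This is how remote monitoring with NRPE works, and it means that Nagios-Plugin must be installed on the client and that NRPE must also be installed on the server.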
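From the server's point of view, a remote check is just a local run of the check_nrpe plugin, which relays the output and exit status of whatever plugin ran on the client. Nagios plugins report state through their exit code: 0 is OK, 1 is WARNING, 2 is CRITICAL, and 3 is UNKNOWN. The sketch below illustrates that convention with two hypothetical helpers, `state_name` and `run_check`. The IP address in the usage comment is one of the example hosts from this article.

```shell
#!/usr/bin/env bash
# Sketch: run a plugin by hand and show its state the way Nagios reads it.
# state_name and run_check are hypothetical helpers for illustration.

state_name() {
    case $1 in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;   # 3 is the UNKNOWN code; other values are lumped in here
    esac
}

run_check() {
    local output first_line status=0
    # Capture the plugin output and remember its exit status.
    output=$("$@" 2>&1) || status=$?
    first_line=${output%%$'\n'*}
    printf '%s: %s\n' "$(state_name "$status")" "$first_line"
    return "$status"
}

# Usage on the Nagios server, once NRPE is set up on the client:
#   run_check /usr/local/nagios/libexec/check_nrpe -H 192.168.10.101 -c check_load
```

Running a check manually like this is a quick way to separate NRPE connectivity problems from mistakes in the Nagios object definitions.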

① Install NRPE on the Nagios server

```
tar zxvf nrpe-2.12.tar.gz
cd nrpe-2.12
./configure && make all
make install-plugin
make install-daemon
make install-daemon-config
```

The server needs a command definition that sends requests to NRPE. Add the following to commands.cfg

```
# 'check_nrpe' command definition
define command{
    command_name    check_nrpe
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
```

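② The table above introduced the two files host.cfg and services.cfg, which are needed here. They define the monitored hosts and the services on those hosts, respectively.

host.cfg has the following format, with host-pnp added for graph output. 192.168.10.101 and 192.168.10.102 are the addresses of the monitored hosts

```
define host {
    use         nagios,host-pnp
    host_name   web1
    alias       web1
    address     192.168.10.101
}

define host {
    use         nagios,host-pnp
    host_name   web2
    alias       web2
    address     192.168.10.102
}
```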

services.cfg has the following format, again with services-pnp added

```
define service{
    use                     local-service,services-pnp
    host_name               web1
    service_description     LOAD
    check_command           check_nrpe!check_load
}

define service{
    use                     local-service,services-pnp
    host_name               web1
    service_description     USERS
    check_command           check_nrpe!check_users
}

define service{
    use                     local-service,services-pnp
    host_name               web1
    service_description     DISK
    check_command           check_nrpe!check_disk
}
```
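The service definitions shown cover web1 only. Since every NRPE-based service differs only in host, description, and remote command name, the matching entries for web2 can be generated. The following sketch does so with a hypothetical `emit_nrpe_service` function. The template names are the ones used in this article, and the output is meant to be reviewed before it is added to services.cfg.

```shell
#!/usr/bin/env bash
# Sketch: print NRPE service definitions instead of typing them by hand.
# emit_nrpe_service and emit_web2_services are hypothetical helpers.

emit_nrpe_service() {
    local host=$1 desc=$2 nrpe_cmd=$3
    cat <<EOF
define service{
    use                     local-service,services-pnp
    host_name               $host
    service_description     $desc
    check_command           check_nrpe!$nrpe_cmd
}

EOF
}

# Print the same three checks for web2 that web1 already has.
emit_web2_services() {
    emit_nrpe_service web2 LOAD  check_load
    emit_nrpe_service web2 USERS check_users
    emit_nrpe_service web2 DISK  check_disk
}

# Usage: emit_web2_services    (then paste or redirect the output into services.cfg)
```

The name after `check_nrpe!` must match a `command[...]` entry in nrpe.cfg on the monitored host, otherwise NRPE reports that the command is not defined.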

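③ Install Nagios-Plugin and NRPE on the monitored client

Nagios-Plugin is installed the same way as above, so the steps are omitted. NRPE is installed the same way with one extra command at the end, which sets up nrpe to run as a service under xinetd

```
make install-xinetd
```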

④ Configure NRPE on the monitored client

/usr/local/nagios/etc/nrpe.cfg defines the command for each monitored service, as follows

```
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_disk]=/usr/local/nagios/libexec/check_disk -w 20% -c 10%
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 200 -c 300
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 30% -c 20%
```

As on the server, the monitoring scripts are all stored in /usr/local/nagios/libexec.
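A plugin is simply an executable that prints one line of status text, optionally followed by `|` and performance data, and exits with 0, 1, 2, or 3 for OK, WARNING, CRITICAL, or UNKNOWN. To illustrate how a custom check could sit alongside the stock ones, here is a minimal sketch of a hypothetical plugin that counts the files in a directory. Neither `check_file_count` nor the `check_spool` command name mentioned afterwards ships with Nagios-Plugin.

```shell
#!/usr/bin/env bash
# Sketch of a custom plugin: warn when a directory holds too many files.
# check_file_count is a hypothetical example, not a stock plugin.

check_file_count() {
    local dir=$1 warn=$2 crit=$3 count perf
    if [ ! -d "$dir" ]; then
        echo "FILES UNKNOWN - $dir is not a directory"
        return 3
    fi
    # Count regular files directly inside the directory.
    count=$(find "$dir" -maxdepth 1 -type f | wc -l)
    count=$((count))                     # normalize any padding from wc
    perf="files=$count;$warn;$crit;0"    # label=value;warn;crit;min
    if [ "$count" -ge "$crit" ]; then
        echo "FILES CRITICAL - $count files in $dir | $perf"
        return 2
    elif [ "$count" -ge "$warn" ]; then
        echo "FILES WARNING - $count files in $dir | $perf"
        return 1
    fi
    echo "FILES OK - $count files in $dir | $perf"
    return 0
}
```

Saved as an executable script in /usr/local/nagios/libexec with a final line `check_file_count "$@"`, it could be exposed through nrpe.cfg with an entry such as `command[check_spool]=/usr/local/nagios/libexec/check_file_count /var/spool/mail 100 200`. The command name, directory, and thresholds here are purely illustrative.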

⑤ Configure xinetd to run NRPE. Edit /etc/xinetd.d/nrpe so that the file reads as follows

```
# default: on
# description: NRPE (Nagios Remote Plugin Executor)
service nrpe
{
    flags           = REUSE
    socket_type     = stream
    port            = 5666
    wait            = no
    user            = nagios
    group           = nagios
    server          = /usr/local/nagios/bin/nrpe
    server_args     = -c /usr/local/nagios/etc/nrpe.cfg --inetd
    log_on_failure  += USERID
    disable         = no
    only_from       = 127.0.0.1 192.168.10.100
}
```

Here 192.168.10.100 is the IP address of the Nagios server.

Define the NRPE service port by adding the following line to /etc/services

```
nrpe 5666/tcp # nrpe
```

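⑥ Test whether NRPE was installed successfully. On success, the NRPE version is displayed, as follows

```
[root@web1 ~]# /usr/local/nagios/libexec/check_nrpe -H localhost
NRPE v2.12
```

⑦ Restart NRPE by restarting xinetd: `service xinetd restart`. If the test in ⑥ fails with a connection error, run this restart first.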

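This completes the Nagios + NRPE + PNP4Nagios configuration. One thing to keep in mind: all Nagios and PNP4Nagios directories must be owned by the nagios user and the nagios group.

⑧ The page for a monitored host looks like this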
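One way to check that ownership requirement is to list anything under the installation directories that belongs to someone else. The sketch below uses a hypothetical `list_not_owned_by` helper. The paths in the usage comment assume the default prefixes used in this article.

```shell
#!/usr/bin/env bash
# Sketch: list files whose owner (and optionally group) is not the expected one.
# list_not_owned_by is a hypothetical helper built on find.

list_not_owned_by() {
    local dir=$1 user=$2 group=${3:-}
    if [ -n "$group" ]; then
        # Report entries with the wrong owner or the wrong group.
        find "$dir" \( ! -user "$user" -o ! -group "$group" \) -print
    else
        find "$dir" ! -user "$user" -print
    fi
}

# Usage (as root):
#   list_not_owned_by /usr/local/nagios nagios nagios
#   list_not_owned_by /usr/local/pnp4nagios nagios nagios
# An empty result means everything is owned as required; anything listed
# can then be corrected with chown as appropriate.
```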