CHECK_NRPE: Error - Could not complete SSL handshake

I have an NRPE daemon running under xinetd on an Amazon EC2 instance, and a Nagios server on my local machine.

Running check_nrpe -H [amazon public IP] gives this error:

CHECK_NRPE: Error - Could not complete SSL handshake.

Both NRPE installations are the same version, and both were compiled with this option:

./configure  --with-ssl=/usr/bin/openssl --with-ssl-lib=/usr/lib/i386-linux-gnu/

The allowed_hosts entry contains my local IP address.

What else could be causing this error?

There are 15 answers

0
drewboswell On BEST ANSWER

To check whether you can reach it at all, try a simple telnet to the address and port, and a ping or traceroute to see where the connection is being blocked.

telnet IP port
ping IP
traceroute -p $port IP

Also check on the target server that the nrpe daemon is working properly.

netstat -at | grep nrpe

You should also compare the OpenSSL versions installed on both servers; I have seen a mismatch break the SSL handshake on occasion!
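
As a rough sketch of the reachability check above, assuming nc (netcat) is installed and NRPE listens on its default port 5666. The function name probe_nrpe_port is made up for illustration:

```shell
# Default NRPE port; change it if your nrpe.cfg sets a different server_port.
NRPE_PORT=5666

# Try a plain TCP connect to the NRPE port and report the result.
# Assumption: nc is available. If it is not, this reports "unreachable" too.
probe_nrpe_port() {
    host=$1
    port=${2:-$NRPE_PORT}
    if nc -z -w 5 "$host" "$port" 2>/dev/null; then
        echo "open: $host:$port"
    else
        echo "unreachable: $host:$port"
        return 1
    fi
}
```

Call it as probe_nrpe_port IP from the Nagios server. If it reports unreachable, the TCP connection itself is failing, so look at the network path and firewall rules before suspecting SSL.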

0
Michael Guthrie On

That's somewhat of a catch-all error message for NRPE. Check your firewall rules and make sure that port is open. Also try disabling SELinux and seeing if that lets the connection through. It's likely not an SSL issue, but just an issue with the connection being refused.

1
NOZUONOHIGH On

@jgritty was right. You should edit both nrpe.cfg and the xinetd nrpe config file to allow access from your master Nagios server:

vim /usr/local/nagios/etc/nrpe.cfg
allowed_hosts=127.0.0.1,172.16.16.150

and

vim /etc/xinetd.d/nrpe
only_from= 127.0.0.1 172.16.16.150

0
Florian Brucker On

If you are running Debian 9 then there is a known issue regarding this problem, caused by OpenSSL dropping support for the method NRPE uses to initiate anonymous SSL connections.

The issue seems to be fixed but the fix hasn't made it into the official packages, yet.

Currently there seems to be no secure work-around.

0
Gene Brotherton On

I'm running nrpe using the xinetd service.

Make sure also (in addition to the above basic steps) that your nagios user is authenticating properly. In my case:

Jun  6 15:05:52 gse2 xinetd[33237]: Unknown user: nagios [file=/etc/xinetd.d/nrpe] [line=9]
Jun  6 15:05:52 gse2 xinetd[33237]: Error parsing attribute user - DISABLING SERVICE [file=/etc/xinetd.d/nrpe] [line=9]
Jun  6 15:05:52 gse2 xinetd[33237]: Unknown group: nagios [file=/etc/xinetd.d/nrpe] [line=10]
Jun  6 15:05:52 gse2 xinetd[33237]: Error parsing attribute group - DISABLING SERVICE [file=/etc/xinetd.d/nrpe] [line=10]
Jun  6 15:05:52 gse2 xinetd[33237]: Service nrpe missing attribute user - DISABLING

This was showing in /var/log/messages.
It escaped me at first, but then I checked the ypbind service and found it was not started.
After I started ypbind, the nagios user and group resolved properly and the error went away.
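
A quick way to confirm this kind of failure, sketched under the assumption that the service account is called nagios as in the log above (user_resolves is a hypothetical helper name). The id command goes through the system's normal account lookup, so it should also fail when a NIS-backed account cannot be resolved:

```shell
# Report whether a user name can be resolved on this host.
user_resolves() {
    if id "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "unknown user: $1"
        return 1
    fi
}
```

Run user_resolves nagios on the monitored host. If it prints unknown user, xinetd will keep disabling the nrpe service until the account lookup works again.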

0
Mayank Jaiswal On

For me, setting the following in /etc/nagios/nrpe.cfg on the client worked:

dont_blame_nrpe=1

It's an Ubuntu 16.04 machine. For other possible problems, I recommend looking at the NRPE logs. Here is a good article on configuring logs.

0
em110905 On

It looks like you are running your Nagios server in a virtual machine on a host-only network. If this is so, this would stop any external access. Ensure that you have a NAT or Bridged Network available.

0
jgritty On

If you are running nrpe as a service, make sure you have this line in your nrpe.cfg on the client side:

# example 192. IP, yours will probably differ
allowed_hosts=127.0.0.1,192.168.1.100 

You say that is done. However, if you are running nrpe under xinetd, make sure to also edit the only_from directive in the file /etc/xinetd.d/nrpe.

Don't forget to restart the xinetd service:

service xinetd restart
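
To double-check the allowed_hosts side, here is a minimal sketch that looks for a literal IP in the last uncommented allowed_hosts line of a given nrpe.cfg. The helper name host_allowed is hypothetical, and it only does an exact string match, so entries written as networks or hostnames will not be recognised:

```shell
# Print "listed" if the IP appears literally in allowed_hosts of the given file.
host_allowed() {
    cfg=$1
    ip=$2
    # Take the last uncommented allowed_hosts line, drop the key and all spaces.
    hosts=$(grep -E '^[[:space:]]*allowed_hosts[[:space:]]*=' "$cfg" \
        | tail -n 1 \
        | sed -e 's/^[^=]*=//' -e 's/[[:space:]]//g')
    case ",$hosts," in
        *",$ip,"*) echo "listed"; return 0 ;;
        *) echo "not listed"; return 1 ;;
    esac
}
```

For example, host_allowed /usr/local/nagios/etc/nrpe.cfg 192.168.1.100, with the path and address adjusted to your setup.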

1
Özgür On

Check your /var/sys/system.log. In my case, it turned out my monitored IP was set to something other than the one I had set in the nrpe.cfg file. I don't know what caused this change, though.

0
user2315218 On

Make sure that you have restarted the Nagios Client Plugin as well.

0
decimal On

Check the configuration in /etc/xinetd.d/nrpe and verify the server IP. If it shows only_from = 127.0.0.1, change it to the Nagios server's IP.
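
A small sketch of that check, assuming a plain only_from = line with space-separated addresses (the helper name only_from_allows is made up, and it only matches literal addresses, not networks or hostnames):

```shell
# Print "allowed" if the IP is one of the literal values on an only_from line.
only_from_allows() {
    cfg=$1
    ip=$2
    # Keep just the values of uncommented "only_from = ..." lines.
    addrs=$(sed -n 's/^[[:space:]]*only_from[[:space:]]*=//p' "$cfg")
    for a in $addrs; do
        if [ "$a" = "$ip" ]; then
            echo "allowed"
            return 0
        fi
    done
    echo "not allowed"
    return 1
}
```

Run it as only_from_allows /etc/xinetd.d/nrpe followed by your Nagios server's IP, and restart xinetd after any change to the file.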

0
SielaQ On

In some edge cases, restarting nagios-nrpe-server doesn't help because the old process was not killed or was not restarted properly.

In that case, just kill it manually and then start it again.

0
pr0t On

In my case the problem had nothing to do with either of the configuration files (neither xinetd nor nrpe.cfg) - the problem has already been solved but one might find this solution helpful.

I had two clients (check_nrpe) - v4.X on a new Nagios server, v2.X on an old one - the old client was working fine with an old xinetd-based nrpe daemon running on Server X. The new client (v4.X) generated the SSL error while trying to communicate with Server X - what I did was the following:

  1. I copied the old check_nrpe (v2.X) from the old Nagios server to <nagios_dir>/libexec/check_nrpe_legacy on the new Nagios server

  2. I had access to all the *.so libraries the old check_nrpe used so I copied them as well and put into /lib64 on the new Nagios server (remember to use proper symbolic links - only putting libs into /lib64 would not be enough; use ldd ./check_nrpe to find what links you need)

  3. I updated ldconfig cache (execute ldconfig command, no parameters)

  4. I once again verified via ldd ./check_nrpe if I had all the libraries required for the older client to run on the new server

  5. I defined a new command check_nrpe_legacy:

define command {
            command_name    check_nrpe_legacy
            command_line    $USER1$/check_nrpe_legacy/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}

This is how I moved the old client onto the new server and made it work. The problem was gone.
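
Steps 2 and 4 come down to looking for "not found" entries in the ldd output. A minimal sketch (missing_libs is a hypothetical helper name):

```shell
# Print the shared libraries a binary needs but the loader cannot find.
# Prints nothing when every dependency resolves.
missing_libs() {
    ldd "$1" 2>/dev/null | grep 'not found'
}
```

Running missing_libs ./check_nrpe in the directory holding the copied binary should print nothing once all libraries and symbolic links are in place.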

0
Ricky On

About the SSL handshake error message: besides the allowed_hosts entry, there is something else you should check.

Your Nagios server is on a local LAN with a class C IP address such as 192.168.xxxx.

When the monitored target server sends its SSL reply back to your local Nagios server, the message first arrives at the public IP of your line. It cannot get past the public IP to your Nagios server, whose IP is an internal one.

You need NAT to route the SSL messages from the target server to the internal Nagios server.

Alternatively, use a "GET" method that just pulls monitoring data from the Nagios client side, such as SNMP, to monitor the local resources of remote Linux servers.

SSL needs traffic to flow in both directions.

Best Regards

1
dovetalk On

So many answers, none of them hit the reason why I ran into this issue.

It turns out that nagios has terrible cross-version support and this was caused by me having a version 2 "client" (machine being monitored) and a version 3 "server" (monitoring machine).

Once I upgraded the client to version 3, the problem went away and I could do a check_nrpe -H [client IP] without issues.
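
To compare versions: check_nrpe called with only -H (no -c) asks the daemon for its version and prints a string such as NRPE v2.15. That only works over a connection that already succeeds, so one option, assuming check_nrpe is installed on the monitored machine and 127.0.0.1 is in allowed_hosts, is to run it locally there. A small sketch for pulling the major version out of that string (nrpe_major is a hypothetical helper):

```shell
# Extract the major version from a string like "NRPE v2.15".
# Prints nothing if the input does not look like an NRPE version string.
nrpe_major() {
    printf '%s\n' "$1" | sed -n 's/^NRPE v\([0-9][0-9]*\)\..*/\1/p'
}
```

For example, nrpe_major "$(check_nrpe -H 127.0.0.1)" on each machine; if the two numbers differ, a version mismatch like the one described here is worth ruling out.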

Note that I am not sure if client/server are the right terms with nagios, as in the case of an NRPE call, the server is really the machine being called, but I digress.