April 19, 2004
CommuniGate Pro Healthcheck

So here's how I run my healthcheck against CommuniGate Pro to see if all is alive and well. This healthcheck works by using a pair of CGP servers which email each other at given intervals. If one of the servers doesn't receive such a "ping" email, it alerts me via SMS.

This method uses real emails instead of just connecting to the email server's ports (25 for SMTP, for example). The benefit is that it can reliably alert you in case of a "stuck" Queue, where a simple connection to port 25 would still work. A stuck Queue is most often caused by a crashed or hung external helper.

This method involves cron, a shell script to analyze the received "ping" emails, the CGPro PIPE module, and a Perl script to "pipe" the received emails directly to a file on the server's hard disk instead of a real mailbox.

For this tutorial, let us assume you have two CGPro servers, called server1.example.com and server2.example.com, which handle email for these two domains respectively. Let us also assume that a healthcheck interval of 30 minutes is good enough for us.

Configuration on the 2 CGPro Servers

You need to configure a special Router entry on both servers. It is the same on both of them. Add the following line to your Router records on both servers:

<pings> = "queue[PROC1] piper.pl"@pipe

The above Router entry makes your server accept emails to an address named "pings" in the PRIMARY domain (the licensed domain) of the server. When an email arrives for this address, the server's PIPE module passes it to a script called piper.pl.

If you haven't done so yet, you need to configure the CommuniGate Pro PIPE module using the Web-Admin -> Settings -> -> PIPE. For this tutorial, we assume you have configured the PIPE modules of both servers with the application directory set to /var/CommuniGate/apps. After you have configured the CGP PIPE module, make sure that the "apps" directory exists and has the appropriate permissions.
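
A minimal sketch for creating the application directory, in case it does not exist yet. The function name ensure_apps_dir and the mode 755 are assumptions made for this example, not CommuniGate requirements. Pick the owner and mode that match the user your CGPro server runs as.

```shell
#!/bin/sh
# ensure_apps_dir DIR: create the PIPE application directory if missing.
# The mode 755 is an assumption; tighten or loosen it to fit your setup.
ensure_apps_dir() {
    dir=$1
    mkdir -p "$dir" || return 1
    chmod 755 "$dir" || return 1
    # Show the result so ownership and mode can be checked by eye.
    ls -ld "$dir"
}
```

For the setup in this tutorial the call would be: ensure_apps_dir /var/CommuniGate/apps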

Here's the source of the piper.pl script:

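#!/usr/bin/perl
# Write the piped message to apps/pings.out. The path is relative, so this
# assumes the script is started with /var/CommuniGate as working directory.
use strict;
use warnings;
open(my $out, '>', 'apps/pings.out') or die "Cannot open apps/pings.out: $!";
while (<>) {
    print $out $_;
}
close($out);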

Create this piper.pl script in /var/CommuniGate/apps and make it executable:

chmod +x /var/CommuniGate/apps/piper.pl
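
Before involving the mail server at all, piper.pl can be tried by hand. The following is only a sketch: the function name test_piper is made up for this example, and it assumes the script is started from the directory above "apps", as the relative path in piper.pl suggests.

```shell
#!/bin/sh
# test_piper BASE: feed a fake message to apps/piper.pl on stdin, roughly
# the way a piped email would arrive, then check that apps/pings.out
# was written. BASE is the directory that contains "apps".
test_piper() {
    base=$1
    (
        cd "$base" || exit 1
        rm -f apps/pings.out
        printf 'Subject: ping\n\nping\n' | ./apps/piper.pl || exit 1
        # The file must exist and contain the message body.
        grep -q '^ping$' apps/pings.out
    )
}
```

A possible call is: test_piper /var/CommuniGate && echo "piper.pl works". Running it as the user your CGPro server runs as should make permission problems show up early.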

For now we're done with configuring CGPro. Next comes the shell script we will use to check whether the "ping" emails arrive properly. We'll call this script pings.sh and here's the source:

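#!/bin/sh -
#
# Alert if no "ping" email arrived since the last run.
PATH=/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin
cd /var/CommuniGate/apps
if [ -f pings.out ]; then
        # A ping arrived: move it away so the next run expects a fresh one.
        mv -f pings.out pings.out.old
else
        # No ping arrived since the last run: send the alert.
        echo "`date +%Y-%m-%d_%H:%M:%S` PIPE Hangs on Server2" | /usr/bin/mail -s "CGP ALERT" someone@pager
fi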

The above script will be slightly different on the two servers, as the text we echo to the mail command includes the name of the other server: Server2 in the script on Server1, and Server1 in the script on Server2. Adjust the server names to your liking and make sure you use your desired email or pager address instead of someone@pager. Store this script in /var/CommuniGate/apps as well and make it executable:

chmod +x /var/CommuniGate/apps/pings.sh
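
Since the two copies of pings.sh differ only in the server name, the check can also be written as a function that takes the name as a parameter. This is just a sketch of the same logic, not a replacement for the script above. The name check_ping is made up, and it prints the alert to stdout instead of mailing it, so it can be tried without bothering your pager.

```shell
#!/bin/sh
# check_ping DIR PEER: print an alert line if no ping arrived, else rotate.
# Returns 0 when a ping was found and 1 when the alert was printed.
check_ping() {
    dir=$1
    peer=$2
    if [ -f "$dir/pings.out" ]; then
        # A ping arrived since the last check: rotate it away.
        mv -f "$dir/pings.out" "$dir/pings.out.old"
    else
        # No ping since the last check: report the peer as suspect.
        echo "$(date +%Y-%m-%d_%H:%M:%S) PIPE Hangs on $peer"
        return 1
    fi
}
```

To send the alert only when there is one, something like this should do: alert=$(check_ping /var/CommuniGate/apps Server2) || echo "$alert" | /usr/bin/mail -s "CGP ALERT" someone@pager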

Now we have all the scripts we need in place and can continue setting up cron to fire all this off at specified intervals.

On Server1.example.com

Edit /etc/crontab and add the following lines:
# send pings to server 2
5,35 * * * * root echo "ping" | /usr/bin/mail -s ping pings@server2.example.com
# check pings received from server2
7,37 * * * * root /var/CommuniGate/apps/pings.sh

These cron entries send a ping to Server2 at 5 and 35 minutes past every hour, every day, and check for received pings 2 minutes later, at 7 and 37 minutes past the hour.

On Server2.example.com

Edit /etc/crontab and add the following lines:

# send pings to server 1
5,35 * * * * root echo "ping" | /usr/bin/mail -s ping pings@server1.example.com
# check pings received from server1
7,37 * * * * root /var/CommuniGate/apps/pings.sh

This is the same as on Server1, just with a different email address to send the pings to.
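
Because the two crontabs differ only in the address the pings go to, the entries can be generated from the peer's domain. A small sketch, assuming the same schedule and paths as above; the function name print_ping_cron is made up for this example.

```shell
#!/bin/sh
# print_ping_cron PEER_DOMAIN: print the /etc/crontab lines for one server.
# Sends at 5 and 35 past the hour and checks two minutes later.
print_ping_cron() {
    peer=$1
    cat <<EOF
# send pings to $peer
5,35 * * * * root echo "ping" | /usr/bin/mail -s ping pings@$peer
# check pings received from $peer
7,37 * * * * root /var/CommuniGate/apps/pings.sh
EOF
}
```

On Server1 you would run print_ping_cron server2.example.com, look over the output, and then add it to /etc/crontab by hand.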

Almost Done

This is all there is to this healthcheck setup. But read on!

PLEASE don't email me as soon as you hit some wall when you try to implement this. A day has only 24 hours, for me too.
So if you run into problems, please read this tutorial again and make sure you really followed it to the letter. It took me quite some time to write this down, and I'd really appreciate it if you could take at least the same amount of time reading it.
Thanks!

NOTES: Please note that in the event of a hung Queue, all "ping" mails will PILE up and will get delivered in ONE go once the Queue is working again. This will probably result in a bunch of alert mails being sent to your pager. If, for instance, the Queue on Server1 hangs, it can neither receive nor deliver ping mails. Thus it thinks Server2 hangs and queues up warning emails to your pager (or alert email address). Once the Queue is working again, you'll get these emails, which tell you Server2 is hung when in fact Server1 was hung. But these false alarms are easily identifiable.

Posted by seiz