cron vs. 'at': run command at varying intervals with self-registering 'at'
I wanted to check every 6 hours whether our business-critical web server is up.
At first I thought I would run a ping every 6 hours via 'cron'. But I want to ping more often once the server is detected down, and keep doing so until it comes back up.
So I came up with this self-registering 'at' script. Once the server is detected down, it pings again at incrementally longer intervals: first after 1 minute, then 2 minutes, then 3, 4, 5... If you replace "+1" with "*2", it will do exponential backoff instead: 1, 2, 4, 8, 16... When the server answers again, the script goes back to the 6-hour (360-minute) schedule.
$ cat ~/misc/myServerPing.at
# THISFILE should be full path or relative from $HOME
# Run this in bash by ". {this file}"
THISFILE=misc/myServerPing.at
INTERVAL=1
curl --silent --connect-timeout 8 http://ourserver.sun.com | grep "Our critical page" > /dev/null
if [ $? -ne 0 ]; then
  date | mailx -s "Failed ping to ourserver" my.mail.address@sun.com
  sed "s/^\(INTERVAL=\)[1-9]*$/\\1$(($INTERVAL+1))/" $THISFILE | at now + $INTERVAL minutes > /dev/null 2>&1
else
  at now + 360 minutes < $THISFILE > /dev/null 2>&1
fi
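To see what the "+1" versus "*2" choice does to the retry schedule without waiting on 'at', here is a small sketch that just prints the intervals. The function name backoff_sequence and its two arguments are made up for this illustration and are not part of the script above; it only mimics the $(($INTERVAL+1)) arithmetic.

```shell
# Hypothetical helper, for illustration only: print the first COUNT retry
# intervals (in minutes), starting from 1, for a given update rule.
#   $1  rule applied to the interval after each failed check, e.g. "+1" or "*2"
#   $2  how many intervals to print
backoff_sequence() {
  rule=$1
  count=$2
  interval=1
  out=""
  i=0
  while [ "$i" -lt "$count" ]; do
    # Append the current interval, space-separated.
    out="${out:+$out }$interval"
    # Same style of arithmetic as $(($INTERVAL+1)) in the script.
    interval=$(($interval $rule))
    i=$((i + 1))
  done
  echo "$out"
}

backoff_sequence "+1" 5   # incremental backoff: 1 2 3 4 5
backoff_sequence "*2" 5   # exponential backoff: 1 2 4 8 16
```

With "+1" the tenth retry comes 10 minutes after the ninth; with "*2" it would come 512 minutes later, which is already longer than the normal 6-hour check, so the exponential variant probably wants an upper limit.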
Isn't this the role of a tool like Nagios? What's the best one?
Posted by gerard on November 07, 2009 at 05:21 PM JST #
If the server could be down for 6 hours without you hearing about it, it's not very mission critical. Typically, mission-critical stuff is checked every 1-5 minutes at least, everywhere I have worked.
Posted by James Dickens on November 07, 2009 at 11:40 PM JST #