Monit.ConfigurationExamples History
April 10, 2013, at 08:25 AM CEST
by - Updates to example for core dump analysis (linux)
Changed lines 1069-1074 from:
sysctl
echo "ulimit -c unlimited" >> /etc/sysconfig/httpd
echo "CoreDumpDirectory /var/crash/core" >
to:
# pattern is: core.<executable>.<timestamp>.<pid>
sysctl -w kernel.core_pattern=/var/crash/core/core.%e.%t.%p
echo -e "bt\nquit" > /etc/gdb.batch
echo "ulimit -c unlimited" >> /etc/sysconfig/httpd
echo "CoreDumpDirectory /var/crash/core" > /etc/httpd/conf.d/core.conf
Changed line 1085 from:
if changed timestamp then exec "/bin/bash -c 'if [ `/bin/cat /tmp/monit_httpd_core.tmp | head -1` != `/bin/ls /var/crash/core/core.httpd* | tail -1` ]; then /usr/bin/gdb -x /etc/gdb.batch /usr/sbin/httpd `/bin/ls /var/crash/core/core.httpd* | tail -1 | tee /tmp/monit_httpd_core.tmp` | mail -s httpd_crash admin@foo.bar webmaster@foo.bar; fi'"
to:
if changed timestamp then exec "/usr/local/etc/monit/scripts/httpd_core_analysis.sh"
Changed lines 1088-1093 from:
[[#tcpdump1]] Start and stop tcpdump based on condition
As soon as the remote SMTP service of host bar is not available tcpdump is started
When the connection is available again, tcpdump is stopped. Only the first
occurrence is caught (a noexec flag is created to prevent monitoring of further outages).
to:
Script @@/usr/local/etc/monit/scripts/httpd_core_analysis.sh@@
Deleted lines 1089-1102:
if failed port 25 protocol smtp then exec "/bin/bash -c 'if [ ! -f /tmp/noexec ]; then touch /tmp/noexec; tcpdump -w /tmp/foo_bar.dump host bar; fi'" else if recovered then exec "killall tcpdump"
@]
[[#tcpdump2]] Rotate tcpdump until condition occurs
This lets tcpdump write its data to a file that is rotated to keep the size of
the dump small until a network problem occurs (there is no need to flood the filesystem
with data from healthy traffic). As soon as the problem occurs, monit sets the noexec flag,
so the dump also contains the data that preceded the problem.
Script for tcpdump and rotation created (/tmp/dumprotate):
[@
Changed lines 1091-1092 from:
to:
MONIT_HTTPD=/tmp/monit_httpd_timestamp.tmp
BIN_HTTPD=/usr/sbin/httpd
if [ -f $MONIT_HTTPD ]
Changed lines 1096-1099 from:
to:
for core in `find /var/crash/core -type f -name core.httpd\* -newer $MONIT_HTTPD`
do
( date; ls -l $core; /usr/bin/gdb -batch -x /etc/gdb.batch $BIN_HTTPD $core; echo ) | mail -s httpd_crash admin@foo.bar webmaster@foo.bar
done
Added line 1101:
touch $MONIT_HTTPD
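The script fragments above rely on a timestamp file: @@find -newer@@ selects only the core files modified after the marker was last touched, and the marker is touched again at the end of each run. A minimal sketch of that pattern in isolation follows. The helper name @@list_new_cores@@ is hypothetical, the directory and marker are passed as parameters rather than taken from the script, and the first-run behaviour (create the marker, report nothing) is an assumption, not something the fragments above confirm.

```shell
#!/bin/sh
# list_new_cores DIR MARKER
# Hypothetical helper: print core.httpd* files in DIR that are newer than
# MARKER, then refresh MARKER so the next run skips them.
list_new_cores() {
    dir=$1
    marker=$2
    # Assumption: with no marker there is nothing to compare against, so
    # create one and report nothing on this first run.
    if [ ! -f "$marker" ]; then
        touch "$marker"
        return 0
    fi
    find "$dir" -type f -name 'core.httpd*' -newer "$marker"
    # Refresh the marker; cores written between find and touch would be
    # missed, which is acceptable for a sketch but worth knowing.
    touch "$marker"
}
```

Calling it as @@list_new_cores /var/crash/core /tmp/monit_httpd_timestamp.tmp@@ would mirror the paths used in the example; each printed path could then be handed to gdb as in the loop above.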
Changed lines 1104-1108 from:
to:
[[#tcpdump1]] Start and stop tcpdump based on condition
As soon as the remote SMTP service of host bar is not available tcpdump is started.
When the connection is available again, tcpdump is stopped. Only the first
occurrence is caught (a noexec flag is created to prevent monitoring of further outages).
Changed lines 1110-1111 from:
to:
check host bar with address 10.1.1.2
if failed port 25 protocol smtp then exec "/bin/bash -c 'if [ ! -f /tmp/noexec ]; then touch /tmp/noexec; tcpdump -w /tmp/foo_bar.dump host bar; fi'" else if recovered then exec "killall tcpdump"
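The @@exec@@ line above uses a flag file as a run-once guard: the capture starts only if @@/tmp/noexec@@ does not exist yet, and creating the flag blocks every later trigger until someone removes it. A minimal sketch of that guard as a reusable function follows; the name @@run_once@@ and the caller-supplied flag path are assumptions made for illustration.

```shell
#!/bin/sh
# run_once FLAG COMMAND [ARGS...]
# Hypothetical helper: run COMMAND only if FLAG does not exist yet, and
# create FLAG first so that later calls are skipped.
run_once() {
    flag=$1
    shift
    if [ -f "$flag" ]; then
        return 0   # already triggered once; do nothing
    fi
    touch "$flag"
    "$@"
}
```

With this helper, @@run_once /tmp/noexec tcpdump -w /tmp/foo_bar.dump host bar@@ would behave like the inline test in the example. Removing the flag file re-arms the guard.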
Changed lines 1114-1116 from:
(with 5 minutes extent
to:
[[#tcpdump2]] Rotate tcpdump until condition occurs
This lets tcpdump write its data to a file that is rotated to keep the size of
the dump small until a network problem occurs (there is no need to flood the filesystem
with data from healthy traffic). As soon as the problem occurs, monit sets the noexec flag,
so the dump also contains the data that preceded the problem.
Script for tcpdump and rotation created (/tmp/dumprotate):
Changed lines 1124-1125 from:
to:
#!/bin/bash
killall tcpdump
if [ ! -f /tmp/noexec ]
then
tcpdump -w /tmp/foo_bar.dump host bar
fi
Changed lines 1132-1139 from:
[[#mysqldproc]] MySQL event driven process list
This obtains the process list of mysql threads as soon as mysql
refuses connections. For example, we needed to know why mysql
occasionally returned "Too many connections" to clients. (Note that
for simplicity this example shows the mysql root account without a
password - you really should use a restricted account ;)
to:
The script is started from cron every 30 minutes:
Changed lines 1134-1135 from:
if failed port 3306 protocol mysql then exec "/bin/bash -c '(date && /usr/bin/mysqladmin -u root processlist && echo) >> /tmp/mysql_processlist'"
to:
0,30 * * * * /tmp/dumprotate
Changed lines 1137-1139 from:
to:
Monit watches the host availability and, as soon as it fails, sets the noexec flag
(with a 5 minute extension):
Changed lines 1141-1149 from:
missingok
size 100k
postrotate
endscript
}
to:
check host bar with address 10.1.1.2
if failed port 25 protocol smtp then exec "/bin/bash -c 'sleep 300; touch /tmp/noexec'"
Changed lines 1145-1152 from:
[[#topsnap]] Getting top otput by mail on event
to:
[[#mysqldproc]] MySQL event driven process list
This obtains the process list of mysql threads as soon as mysql
refuses connections. For example, we needed to know why mysql
occasionally returned "Too many connections" to clients. (Note that
for simplicity this example shows the mysql root account without a
password - you really should use a restricted account ;)
Changed lines 1154-1155 from:
check file myfile with path /tmp/fo.bar
if changed timestamp then exec "/bin/bash -c 'top -bn1 | mail -s top admin@foo.bar'"
to:
check process mysqld with pidfile /var/run/mysqld.pid
if failed port 3306 protocol mysql then exec "/bin/bash -c '(date && /usr/bin/mysqladmin -u root processlist && echo) >> /tmp/mysql_processlist'"
@]
[[#logrotate]] Logrotate configuration for monit
[@
/var/log/monit.log {
missingok
notifempty
size 100k
create 0644 root root
postrotate
/bin/kill -HUP `cat /var/run/monit.pid 2>/dev/null` 2> /dev/null || true
endscript
}
@]
[[#topsnap]] Getting top output by mail on event
[@
check file myfile with path /tmp/foo.bar
Changed line 152 from:
check process snmpd with pidfile /var/run/snmpd
to:
check process snmpd with pidfile /var/run/snmpd.pid
Changed line 1165 from:
''mbmon'' required
to:
'''mbmon''' required
Changed lines 1179-1180 from:
'''Note:''' Read about ''mbmon'' before using it. It can crash your system.
to:
'''Note:''' Read about '''mbmon''' before using it. It can crash your system.
Changed line 1182 from:
''smartmontools'' required
to:
'''smartmontools''' required
Changed lines 74-76 from:
to:
->[[#CPUTemp | Monitor CPU Temperature]]
->[[#HDDTemp | Monitor HDD Temperature ]]
Changed lines 1161-1196 from:
to:
@]
[[#CPUTemp]] Monitor CPU Temperature
''mbmon'' required
[@
check program CPU with path "/usr/local/etc/monit/scripts/cpu_temp.sh"
if status > 60 then alert
group temperature
@]
Script @@/usr/local/etc/monit/scripts/cpu_temp.sh@@
[@
#!/bin/sh
TP=`mbmon -c 1 -r | grep TEMP1 | awk '{ printf "%d",$3 }'`
#echo $TP
exit $TP
@]
'''Note:''' Read about ''mbmon'' before using it. It can crash your system.
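The script above passes the temperature to Monit through its exit status, which @@check program@@ then compares against the threshold. Exit statuses only carry values from 0 to 255, and an empty @@$TP@@ makes @@exit@@ fall back to the status of the previous command, so a missing reading may go unnoticed. A minimal sketch of a more defensive parser follows. The function name @@temp_to_status@@ is hypothetical, the input format @@TEMP1 : 45.0@@ is an assumption inferred from the awk field @@$3@@ used above, and reporting 255 when no reading is found is a design choice of this sketch.

```shell
#!/bin/sh
# temp_to_status: read mbmon-style lines on stdin and print an integer
# that is safe to use as an exit status.
# Assumption: the line of interest looks like "TEMP1 : 45.0"
# (label, colon, value), as implied by the awk field $3 in the script above.
temp_to_status() {
    awk '$1 == "TEMP1" {
        t = int($3)
        if (t < 0)   t = 0      # exit statuses cannot be negative
        if (t > 255) t = 255    # and only hold values up to 255
        print t
        found = 1
        exit
    }
    END { if (!found) print 255 }'   # no reading: report the maximum so the alert fires
}
```

In the temperature script this could be used as @@exit "$(mbmon -c 1 -r | temp_to_status)"@@.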
[[#HDDTemp]] Monitor HDD Temperature (/dev/ada0 in example)
''smartmontools'' required
[@
check program HDD_80 with path "/usr/local/etc/monit/scripts/ada0_temp.sh"
if status > 45 then alert
group temperature
@]
Script @@/usr/local/etc/monit/scripts/ada0_temp.sh@@
[@
#!/bin/sh
TP=`/usr/local/sbin/smartctl -a /dev/ada0 | grep Temp | awk -F " " '{printf "%d",$10}'`
echo $TP # for debug only
exit $TP
@]
'''Note:''' Don't forget to enable SMART. Run @@/usr/local/sbin/smartctl -s on /dev/ada0@@ on every boot.
Changed line 999 from:
[[#subsonic]] Subsonic (straming app - daemon version)
to:
[[#subsonic]] Subsonic (streaming app - daemon version)
February 17, 2012, at 07:56 PM CET
by - added subsonic
Added line 72:
->[[#subsonic| Subsonic, gnu streaming app like Spotify.]]
Added lines 997-1003:
@]
[[#subsonic]] Subsonic (straming app - daemon version)
[@
check process streaming with pidfile /var/run/subsonic.pid
start program = "/etc/init.d/subsonic start"
stop program = "/etc/init.d/subsonic stop"
January 14, 2012, at 08:38 PM CET
by - fixes for kissdx configuration
Changed line 1000 from:
check process kissdx with pidfile /var/run/kissdx
to:
check process kissdx with pidfile /var/run/kissdx.pid
Changed line 1002 from:
stop program = "/usr/bin/killall kissdxp"
to:
stop program = "/usr/bin/killall kissdx"
Changed line 342 from:
[[#Nginx]] NginX (web server)
to:
[[#NginX]] NginX (web server)
Added line 33:
->[[#NginX | NginX(web server)]]
Added lines 340-350:
@]
[[#Nginx]] NginX (web server)
[@
check process nginx with pidfile /var/run/nginx.pid
start program = "/etc/init.d/nginx start"
stop program = "/etc/init.d/nginx stop"
group www-data (for ubuntu, debian)
June 05, 2010, at 06:32 PM CEST
by - added entry for Dovecot IMAP server using SSL
Added line 38:
->[[#dovecot | Dovecot (imap secure server)]]
Added lines 437-452:
@]
[[#dovecot]] Dovecot (imap secure server)
[@
check process dovecot with pidfile /var/run/dovecot/master.pid
start program = "/etc/init.d/dovecot start"
stop program = "/etc/init.d/dovecot stop"
group mail
if failed host mail.yourdomain.tld port 993 type tcpssl sslauto protocol imap for 5 cycles then restart
if 3 restarts within 5 cycles then timeout
depends dovecot_init
depends dovecot_bin
check file dovecot_init with path /etc/init.d/dovecot
group mail
check file dovecot_bin with path /usr/sbin/dovecot
group mail
February 15, 2010, at 12:34 AM CET
by - added kissdx
Added lines 68-69:
->[[#amule | aMule, p2p app.]]
->[[#kissdx | Kissdx, network streaming server for some DVDs]]
Added lines 967-974:
@]
[[#kissdx]] kissdx (Streaming app for some DVDs)
[@
check process kissdx with pidfile /var/run/kissdx
start program = "/etc/init.d/kissdx"
stop program = "/usr/bin/killall kissdxp"
if 5 restarts within 5 cycles then timeout
October 09, 2009, at 06:26 PM CEST
by - added top output by mail
Changed lines 68-69 from:
to:
->[[#topsnap | Getting top otput by mail on event]]
Added lines 1107-1110:
[[#topsnap]] Getting top otput by mail on event
[@
check file myfile with path /tmp/fo.bar
if changed timestamp then exec "/bin/bash -c 'top -bn1 | mail -s top admin@foo.bar'"
September 13, 2009, at 04:21 PM CEST
by - Added FreeRADIUS example
Added lines 21-22:
*AAA Services
->[[#radius | FreeRADIUS]]
Added lines 177-187:
if 5 restarts within 5 cycles then timeout
@]
!!AAA Services
[[#radius]] FreeRADIUS (SVN only, not Monit 5.0)
[@
check process radiusd with pidfile /var/named/chroot/var/run/radiusd/radiusd.pid
start program = "/etc/init.d/radiusd start"
stop program = "/etc/init.d/radiusd stop"
if failed host 127.0.0.1 port 1812 type udp protocol radius secret testing123 then alert
June 25, 2009, at 06:26 PM CEST
by - Added configuration for mongrel cluster
Added line 27:
->[[#mongrelcluster | Mongrel Cluster]]
Added lines 242-259:
[[#mongrelcluster]] Mongrel Cluster
Each mongrel instance will need its own entry; make sure to change the port (8000 in this example) to match your mongrel_cluster.yml file.
[@
check process mongrel8000
with pidfile /path/to/pidfile/mongrel.8000.pid
group mongrels
start program = "/bin/mongrel_rails cluster::start -C /path/to/mongrel_cluster.yml --clean --only 8000"
stop program = "/bin/mongrel_rails cluster::stop -C /path/to/mongrel_cluster.yml --clean --only 8000"
if failed port 8000 protocol HTTP
request /system/token
with timeout 10 seconds
then restart
if 5 restarts within 5 cycles
then timeout
@]
'''Note:''' @@/system/token@@ requests an empty file called @@token@@, as recommended in the apache section above.
June 04, 2009, at 09:10 AM CEST
by - fixed the first apache example. I don't know how to do the SSL one
Changed lines 218-219 from:
if failed host 192.168.1.1 port 80
protocol HTTP requesthttp://localhost/~hauk/monit/token then restart
to:
if failed host localhost port 80
protocol HTTP request "/~hauk/monit/token" then restart
December 04, 2008, at 04:35 PM CET
by - aMule example
Added lines 925-931:
@]
[[#amule]] aMule (p2p program - daemon version)
[@
check process aMule with pidfile /home/$USER/.aMule/muleLock
start program = "/etc/init.d/amule-daemon start"
stop program = "/etc/init.d/amule-daemon stop"
Changed lines 64-65 from:
to:
->[[#logrotate | Logrotate configuration]]
Added lines 1053-1066:
@]
[[#logrotate]] Logrotate configuration for monit
[@
/var/log/monit.log {
missingok
notifempty
size 100k
create 0644 root root
postrotate
/bin/kill -HUP `cat /var/run/monit.pid 2>/dev/null` 2> /dev/null || true
endscript
}