Author: David Fletcher
1. Introduction
The Monit "apache-status" protocol provides a way to monitor the internal performance of an Apache web server, and to take action early (i.e. while the server is still working) if something is going wrong. It does this through Apache's mod_status, which needs to be available to use these special Monit functions.
The original development of the Apache-Status code was in response to a failure of Apache's piped logging process which caused a server to lock up. The server was in a chroot jail, and this prevented the piped logging process from re-starting itself. The original idea was to use Monit to observe Apache from outside the chroot, and take action if it spotted a problem. However, following development of the code it became clear that many other aspects of Apache can be monitored, whether it is in a chroot jail or not.
Important: Since these tests use mod_status, they depend on the Apache server being able to respond. They should therefore be combined with other monitoring to cover the case of a complete server or connection failure. The tests will all work with the ExtendedStatus directive On or Off.
2. Install mod_status
Apache normally compiles with mod_status enabled and built in. To access the status information, the Apache configuration file (often at /etc/httpd/httpd.conf or /usr/apache/conf/httpd.conf) should include these lines within one of your hosts or virtual hosts:
    <Location /server-status>
        SetHandler server-status
        Order deny,allow
        Deny from all
        Allow from 127.0.0.1
    </Location>
The Allow statement ensures that mod_status is only available on the local machine, since it would be insecure to let everybody read the information. If Monit is connecting from a different IP number (i.e. if it is monitoring a remote machine) you should allow the IP from which Monit will connect.
3. Test mod_status
You can view the machine readable version of the Apache mod_status output for your server by entering the standard URL in a browser (Monit depends on this, rather than the human readable page):
http://www.example.co.uk/server-status?auto
This will only work from the allowed IP numbers mentioned in the section above. If everything is working well, you should see a page in your browser like the one below:
    Total Accesses: 26
    Total kBytes: 13
    CPULoad: .0103093
    Uptime: 970
    ReqPerSec: .0268041
    BytesPerSec: 13.7237
    BytesPerReq: 512
    BusyWorkers: 1
    IdleWorkers: 5
    Scoreboard: ____W_...........................................................
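The ?auto report is a plain list of "Key: value" lines, which makes it easy to read from a script. The Python sketch below is only an illustration of that layout, assuming the report text has already been fetched; the function name parse_auto_status and the shortened sample report are made up for this example, and this is not how Monit itself is implemented.

```python
def parse_auto_status(text):
    """Split a mod_status ?auto report into a dict of field name -> string value."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # ignore any line that has no colon
            fields[key.strip()] = value.strip()
    return fields


# Shortened sample report (hypothetical values, same layout as the real page).
report = """Total Accesses: 26
BusyWorkers: 1
IdleWorkers: 5
Scoreboard: ____W_....
"""

status = parse_auto_status(report)
print(status["BusyWorkers"])  # 1
print(status["Scoreboard"])   # ____W_....
```

All values are kept as strings here; a caller would convert the numeric fields it cares about.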
The important line is the Scoreboard, so don't worry if some of the other lines are missing. The scoreboard is where Monit gets its information about Apache. Each character represents one slot for an Apache child process, and can be decoded using this key:

    "_" Waiting for Connection
    "S" Starting up
    "R" Reading Request
    "W" Sending Reply
    "K" Keepalive (read)
    "D" DNS Lookup
    "C" Closing connection
    "L" Logging
    "G" Gracefully finishing
    "I" Idle cleanup of worker
    "." Open slot with no current process
4. Set up Monit
Once mod_status is giving its output you can ask Monit to read this information, and let you know if there is a problem. A problem is defined by using a percentage limit for the quantity monitored. This is the percentage of Apache child processes which you allow in a particular state before action is taken. A percentage is used rather than a fixed number to let the monitoring "scale" with the rise and fall in the number of Apache processes as the server load changes.
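To see why a percentage scales where a fixed number would not, here is a minimal Python sketch that turns a scoreboard string into per-state percentages. It assumes that open slots (".") are left out of the total, which is how "percentage of Apache child processes" reads; Monit's own calculation may differ in detail. The function name state_percentages and both sample scoreboards are hypothetical.

```python
def state_percentages(scoreboard):
    """Return {state character: percentage of child processes in that state}.

    Assumption: open slots ('.') hold no process and are not counted.
    """
    processes = [c for c in scoreboard if c != "."]
    if not processes:
        return {}
    total = len(processes)
    return {s: 100.0 * processes.count(s) / total for s in set(processes)}


# Same share of logging processes ("L"), different numbers of children.
small = "LLL__" + "." * 10      # 3 of 5 processes logging
large = "LLLLLL____" + "." * 5  # 6 of 10 processes logging

print(state_percentages(small)["L"])  # 60.0
print(state_percentages(large)["L"])  # 60.0
```

A fixed limit of, say, 5 logging processes would never fire on the small server and would fire on the large one, although both are in the same relative condition.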
Example 1: You would like to restart the server if more than 60% of Apache child processes are simultaneously writing to the logs. Such a high percentage would probably indicate a problem with the logs, which might be cleared by restarting the server. Add this to /etc/monitrc:
    check process apache with pidfile /var/run/httpd.pid
        start "/etc/init.d/httpd start"
        stop "/etc/init.d/httpd stop"
        if failed host 127.0.0.1 port 80
            protocol apache-status loglimit > 60%
        then restart
Example 2: This configuration can be used to alert you if more than 25 percent of Apache child processes are stuck performing DNS lookups:
    check process apache with pidfile /var/run/httpd.pid
        start "/etc/init.d/httpd start"
        stop "/etc/init.d/httpd stop"
        if failed host www.example.co.uk port 80
            protocol apache-status dnslimit > 25%
        then alert
In this case restarting the server would be unlikely to solve the problem, but it would be nice to know about it before the server comes to a halt.
5. How the limits work
Action can be triggered when each measured quantity rises above or falls below the percentage limit. However, with one exception, all of the percentage limits are likely to be most useful if taken as the level above which action is triggered. As in Example 2 above, an alert is sent when more than 25% of the Apache child processes are performing a DNS lookup.
The exception to this rule is when monitoring idle servers waiting for a connection. In this case it is much more useful to be alerted if there are too few free servers, so the action is best triggered when the measured level is less than the percentage limit.
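The two directions of comparison can be sketched in a few lines of Python. This is an illustration of the idea rather than Monit's code: the function name limit_exceeded is invented, the sample scoreboards are made up, and open slots (".") are assumed not to count as child processes.

```python
import operator

# The two comparisons a limit can use.
OPERATORS = {">": operator.gt, "<": operator.lt}


def limit_exceeded(scoreboard, state, op, limit_percent):
    """True if the share of processes in `state` is above/below the limit."""
    processes = [c for c in scoreboard if c != "."]  # assumption: skip open slots
    if not processes:
        return False
    percent = 100.0 * processes.count(state) / len(processes)
    return OPERATORS[op](percent, limit_percent)


busy_dns = "DDDW____" + "...."    # 3 of 8 processes in DNS lookup (37.5%)
few_idle = "WWWWWWWWW_" + "...."  # 1 of 10 processes waiting (10%)

print(limit_exceeded(busy_dns, "D", ">", 25))  # True: too many DNS lookups
print(limit_exceeded(busy_dns, "_", "<", 20))  # False: 50% are still idle
print(limit_exceeded(few_idle, "_", "<", 20))  # True: too few idle servers
```

The "greater than" form suits every state where a large share signals trouble; the "less than" form only makes sense for the waiting state.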
6. What happens when the server is close to collapse?
During testing it has been found that if httpd processes become locked because, for example, they can't log, a request for the server-status page generates a new child process, and gives a correct report on the condition of the server. Only if there are very rapid incoming connections, or a very low maximum number of httpd processes, will the server-status page become inaccessible. In this case the server will be re-started or an alert will be issued because the connection to it will fail.
7. Are all these tests useful?
Some tests are most useful as "alerts" rather than server restart conditions. For example if DNS lookups are taking too much time you want to be alerted, but restarting the Apache server will not help. "or" conditions can be used to look at several conditions at once, each with different limits. However, it is a waste of processing power to monitor too many of the Apache parameters unless there is a good reason. Most useful are likely to be the logging, free servers and DNS lookup limits. The others are available for any special cases where they become relevant.
8. Available tests
The following tests can be used: loglimit, closelimit, dnslimit, keepalivelimit, replylimit, requestlimit, startlimit, waitlimit, gracefullimit and cleanuplimit. Several tests can be or'ed together, as this final example demonstrates:
    check process apache with pidfile /var/run/httpd.pid
        start "/etc/init.d/httpd start"
        stop "/etc/init.d/httpd stop"
        if failed host www.example.co.uk port 80
            protocol apache-status
                dnslimit > 25% or
                loglimit > 80% or
                waitlimit < 20%
        then alert
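The effect of or'ing tests can be mimicked with a short Python sketch: the check fails as soon as any one rule matches. The mapping from test name to scoreboard character below is inferred from the test names and the scoreboard key, not taken from Monit's source, and the function name triggered_tests is hypothetical. Open slots (".") are assumed not to count as child processes.

```python
import operator

# Assumed mapping, inferred from the test names and the scoreboard key.
TEST_STATE = {
    "loglimit": "L", "closelimit": "C", "dnslimit": "D",
    "keepalivelimit": "K", "replylimit": "W", "requestlimit": "R",
    "startlimit": "S", "waitlimit": "_", "gracefullimit": "G",
    "cleanuplimit": "I",
}
OPERATORS = {">": operator.gt, "<": operator.lt}


def triggered_tests(scoreboard, rules):
    """Return the names of all rules (name, op, percent) that match."""
    processes = [c for c in scoreboard if c != "."]
    if not processes:
        return []
    hits = []
    for name, op, limit in rules:
        percent = 100.0 * processes.count(TEST_STATE[name]) / len(processes)
        if OPERATORS[op](percent, limit):
            hits.append(name)
    return hits


# The three rules from the final example.
rules = [("dnslimit", ">", 25), ("loglimit", ">", 80), ("waitlimit", "<", 20)]

hits = triggered_tests("WWWWWWWWW_" + "....", rules)
print(hits)        # ['waitlimit']
print(bool(hits))  # True: any match is enough to raise the alert
```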
Disclaimer
Neither the author, the distributors, nor any other contributor to this HOWTO is in any way responsible for physical, financial, moral or any other type of damage incurred by following the suggestions in this text.