Liberty Profile Application Server

Author: Lutz Mader

Introduction

This page shows an example of how to configure a more or less complex IBM Liberty Profile environment.

By default, Monit restarts the JVMs when they fail. If a JVM is stopped on demand, Monit does not restart it. Monit and the application server both run in a user context. Monit can also run in a root context, but the application server should not, and the configuration and wrapper scripts shown here must be modified to handle that case.

The application server and the HTTP server are started multiple times on the same system. The IBM HTTP Server is not required, because IBM Liberty Profile can serve HTTP itself, but it is sometimes useful to put the IBM HTTP Server in front of an IBM Liberty Profile application server.

Configuration basis

Some modifications to the ".monitrc" configuration file are useful, but the details depend on the system environment.

set daemon  60              # check services at 60 seconds intervals
    with start delay 240

set logfile /home/wlpuser/logs/monit.log

set limits {
    programOutput:     1024 B,    # check program's output truncate limit
    fileContentBuffer: 1024 B,    # limit for file content test
}

set httpd port 2812 and
    use address localhost  # only accept connection from localhost
    allow localhost        # allow localhost to connect to the server and
    allow admin:monit      # require user 'admin' with password 'monit'

include /home/wlpuser/monit/config/*.cfg

Monit runs in the same user context as the application server. A check interval of 60 seconds is usually fast enough, but this depends on the system environment.

Use a unique port for each instance if Monit is started multiple times on a system. If you use M/Monit, drop the "use address" statement and modify the "allow" statement accordingly.

Application Server

The application JVMs are monitored by the pid file.

check process Serv_appl1 with pidfile "/opt/IBM/wlp/servers/.pid/appl1.pid"
  start program "/usr/local/etc/monit/scripts/wlpserv.sh start" with timeout 180 seconds
  stop program "/usr/local/etc/monit/scripts/wlpserv.sh stop" with timeout 120 seconds
  restart program "/usr/local/etc/monit/scripts/wlpserv.sh restart" with timeout 300 seconds
  if not exist for 5 cycles then start
  if 5 restarts within 50 cycles then unmonitor
  group Liberty

Feel free to add port monitoring statements. The number of cycles to wait before a restart depends on the application startup time, because some applications take a long time to open their port at startup.

  if failed host applhost.local port 9081 for 10 cycles then restart
  if failed host applhost.local port 9081 then alert

To handle on-demand stop and start requests, capture some messages from the Liberty Profile "messages.log" log file and disable or enable the monitoring accordingly. Log messages are used because other procedures may stop or start the servers, and operators can continue to use the standard commands.

check file Serv_appl1_Out with path "/opt/IBM/wlp/servers/appl1/logs/messages.log"
  if not exist then exec "/usr/bin/touch /opt/IBM/wlp/servers/appl1/logs/messages.log"
#  if match "^.*SRVE....E: .*java.lang.OutOfMemoryError.*" then exec "/usr/local/etc/monit/scripts/wlpserv.sh restart Serv_appl1"
  if match "^.*SRVE0232E: Internal Server Error.*java.lang.OutOfMemoryError.*" then exec "/usr/local/etc/monit/scripts/wlpserv.sh restart Serv_appl1"
  if match "^.*CWWKE0036I: The server .* stopped after.*" then exec "/usr/local/etc/monit/scripts/wlpserv.sh unmonitor Serv_appl1"
  if match "^.*CWWKE0055I: Server shutdown requested.*" then exec "/usr/local/etc/monit/scripts/wlpserv.sh unmonitor Serv_appl1"
  if match "^.*CWWKE0001I: The server .* has been launched.*" then exec "/usr/local/etc/monit/scripts/wlpserv.sh monitor Serv_appl1"
  if match "^.*CWWKF0011I: The server .* is ready to run a smarter planet.*" then exec "/usr/local/etc/monit/scripts/wlpserv.sh monitor Serv_appl1"
  if match "^.*CWWKO0221E: TCP Channel .* initialization did not succeed.* The port might already be in use.*" then exec "/usr/local/etc/monit/scripts/wlpserv.sh restart Serv_appl1"
# Some alert only messages.
  if match "^.*SRVE0232E: Internal Server Error.*java.lang.NoClassDefFoundError.*" then alert
  if match "^.*SRVE0293E: .* java.lang.OutOfMemoryError.*" then alert
  if match "^.*SRVE8109W: Uncaught exception thrown by filter.*java.lang.OutOfMemoryError.*" then alert
  group Liberty
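
Before deploying such rules, it can help to try the patterns against sample lines outside of Monit. The following is a minimal sketch, assuming that "grep -E" behaves closely enough to Monit's matcher for these patterns. The function name "action_for_line" and the sample lines are made up for illustration, and the sample lines are reduced to the message text itself.

```shell
#!/bin/sh
# Sketch only: map a messages.log line to the wrapper action that the
# "check file" rules above would trigger. The patterns are copied from
# those rules; action_for_line is a hypothetical helper name.
action_for_line() {
  line="$1"
  if printf '%s\n' "$line" | grep -Eq 'CWWKE0036I: The server .* stopped after|CWWKE0055I: Server shutdown requested'; then
    echo unmonitor
  elif printf '%s\n' "$line" | grep -Eq 'CWWKE0001I: The server .* has been launched|CWWKF0011I: The server .* is ready to run a smarter planet'; then
    echo monitor
  elif printf '%s\n' "$line" | grep -Eq 'SRVE0232E: Internal Server Error.*java.lang.OutOfMemoryError|CWWKO0221E: TCP Channel .* initialization did not succeed.* The port might already be in use'; then
    echo restart
  else
    echo none
  fi
}

# Made-up sample lines for a quick check.
action_for_line 'CWWKE0055I: Server shutdown requested.'
action_for_line 'CWWKF0011I: The server appl1 is ready to run a smarter planet.'
```

The two calls print "unmonitor" and "monitor". Lines that match none of the patterns yield "none".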

If you define a national language in the jvm.options file, the message patterns must match the translated messages.

-Dfile.encoding=UTF-8 or iso-8859-1
-Duser.language=de
-Duser.region=DE

Handling messages that contain national characters may be tricky, because it is sometimes difficult to find the right character. Replacing such characters with a wildcard is an easy way that works well (e.g. "m.*glicherweise" instead of "möglicherweise").

  if match "^.*CWWKO0221E: Die Initialisierung des TCP-Kanals .* war nicht erfolgreich.* Der Port ist m.*glicherweise bereits im Gebrauch.*" then exec "/usr/local/etc/monit/scripts/wlpserv.sh restart
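
The wildcard approach can be checked quickly on the command line. This is a small sketch, assuming that "grep -E" is close enough to Monit's regular expression handling for this case. The helper name "match_result" is hypothetical, and the sample line contains only the relevant part of the message.

```shell
#!/bin/sh
# Sketch: report whether a log line matches an extended regular expression.
# match_result is a hypothetical helper, not part of Monit.
match_result() {
  # $1 = pattern, $2 = log line
  if printf '%s\n' "$2" | grep -Eq -- "$1"; then
    echo match
  else
    echo "no match"
  fi
}

pattern='Der Port ist m.*glicherweise bereits im Gebrauch'
match_result "$pattern" 'Der Port ist möglicherweise bereits im Gebrauch.'
```

Because ".*" stands for any run of characters, the pattern does not depend on how the umlaut is encoded in the log file.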

Unfortunately, Monit cannot count how often a message occurs. It only counts the number of cycles in which the message occurs, so there is no way to act on the number of occurrences.

  if match "^.*CWWKO0221E: TCP Channel .* initialization did not succeed.* The port might already be in use.*" then exec "/usr/local/etc/monit/scripts/wlpserv.sh restart Serv_appl1"
  if match "^.*CWWKO0221E: TCP Channel .* initialization did not succeed.* The port might already be in use.*" for 3 times within 10 cycles then exec "/usr/local/etc/monit/scripts/wlpserv.sh unmonitor Serv_appl1"
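
If the number of occurrences matters, counting can be done outside of Monit. The following is only a sketch of that idea and not part of the setup described here. The helper name "count_messages" is hypothetical, and it counts matching lines in the whole file, not just the lines added since the last cycle.

```shell
#!/bin/sh
# Sketch: count the lines of a log file that match an extended regex.
# count_messages is a hypothetical helper; it prints the number of matches.
count_messages() {
  # $1 = log file, $2 = pattern
  grep -Ec -- "$2" "$1"
}

# Small demonstration with a temporary file and made-up lines.
log=$(mktemp)
printf '%s\n' 'CWWKO0221E: first' 'other line' 'CWWKO0221E: second' > "$log"
count_messages "$log" 'CWWKO0221E'
rm -f "$log"
```

The demonstration prints 2. A script built around such a count could compare it with a threshold and report the result through its exit status.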

The "wlpserv.sh" script is a simple wrapper that calls the default commands provided by IBM. Some Monit environment variables are used to find the right environment for the JVM. This is useful because several JVMs (appl1, appl2, appl3, etc.) are started on the same system.

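PRG="$0"
# Called by Monit: derive the names from the MONIT_SERVICE variable.
if [ -n "$MONIT_SERVICE" ]; then
  SERV=`echo "$MONIT_SERVICE" | cut -f 1-2 -d '_'`   # e.g. Serv_appl1
  SRVR=`echo "$MONIT_SERVICE" | cut -f 2 -d '_'`     # e.g. appl1
  PROC=`echo "$MONIT_SERVICE" | cut -f 1-2 -d '_'`
else
  # Called manually: take the service name from the second argument.
  SERV=''
  SRVR=`echo "$2" | cut -f 2 -d '_'`
  PROC="$2"
fi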
export WLP_USER_DIR=/opt/IBM/wlp/servers
export WLP_OUTPUT_DIR=/opt/IBM/wlp/servers
export WLP_DATA_DIR=/opt/IBM/work/wlp/data
cd /opt/IBM/wlp/bin
case "$1" in
  'start')
    ./server $1 $SRVR ;;
  'stop')
    ./server $1 $SRVR ;;
  'restart')
    if [ "$SERV" = "$PROC" ]; then
      $PRG stop
      sleep 30
      $PRG start
    else
      [ -n "$PROC" ] && /usr/local/bin/monit restart $PROC
    fi  ;;
  'monitor')
    [ -n "$PROC" ] && /usr/local/bin/monit monitor $PROC ;;
  'unmonitor')
    [ -n "$PROC" ] && /usr/local/bin/monit unmonitor $PROC ;;
  *) ;;
esac
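
The name handling of the wrapper can be tried in isolation. The sketch below uses the same "cut" calls as the script above, wrapped in two helper functions whose names ("service_prefix", "server_name") are chosen here for illustration only. The service name "Ihs_appl2" is a made-up example.

```shell
#!/bin/sh
# Sketch: the cut-based field handling from the wrapper script,
# wrapped in functions so it can be tried without Monit.
service_prefix() {   # first two fields, e.g. Serv_appl1_Out -> Serv_appl1
  echo "$1" | cut -f 1-2 -d '_'
}
server_name() {      # second field, e.g. Serv_appl1_Out -> appl1
  echo "$1" | cut -f 2 -d '_'
}

for name in Serv_appl1 Serv_appl1_Out Ihs_appl2; do
  echo "$name: process=$(service_prefix "$name") server=$(server_name "$name")"
done
```

This shows why the "check process" and "check file" entries share a naming scheme: the file check "Serv_appl1_Out" resolves to the process "Serv_appl1" and the server "appl1".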

Http Server

If you decide to use the IBM HTTP Server (an IBM version of the Apache server), its process must be handled similarly. On-demand stop and start requests are handled by capturing some messages from the IBM HTTP Server "httpd_error.log".

check file Ihs_appl1_Out with path "/opt/IBM/wlp/servers/appl1/logs/httpd_error.log"
  if not exist then exec "/usr/bin/touch /opt/IBM/wlp/servers/appl1/logs/httpd_error.log"
  if match "^.*mpm_event:notice.* AH00491: caught SIGTERM, shutting down.*" then exec "/usr/local/etc/monit/scripts/wlpihs.sh unmonitor Ihs_appl1"
  if match "^.*mpm_event:notice.* AH00489: IBM_HTTP_Server/.* configured -- resuming normal operations.*" then exec "/usr/local/etc/monit/scripts/wlpihs.sh monitor Ihs_appl1"
# Some alert only messages.
  group Liberty

Unfortunately, Monit sometimes has timing issues. The "unmonitor" and "monitor" commands are executed in the wrong order internally when they are issued in quick succession.

Snippet from the "monit.log" log file.

[CET Feb 20 16:51:41] error    : 'Ihs_appl1_Out' content match:
[Wed Feb 20 16:50:41.791335 2019] [mpm_event:notice] [pid 38091:tid 140737352836864] AH00491: caught SIGTERM, shutting down

[CET Feb 20 16:51:41] info     : 'Ihs_appl1_Out' exec: '/usr/local/etc/monit/scripts/wlpihs.sh unmonitor Ihs_appl1'
[CET Feb 20 16:51:41] error    : 'Ihs_appl1_Out' content match:
[Wed Feb 20 16:50:49.003274 2019] [mpm_event:notice] [pid 141615:tid 140737352836864] AH00489: IBM_HTTP_Server/9.0.0.10 (Unix) configured -- resuming normal operations

[CET Feb 20 16:51:41] info     : 'Ihs_appl1_Out' exec: '/usr/local/etc/monit/scripts/wlpihs.sh monitor Ihs_appl1'
[CET Feb 20 16:51:41] info     : 'Ihs_appl1' monitor on user request
[CET Feb 20 16:51:41] info     : Monit daemon with PID 158832 awakened
[CET Feb 20 16:51:41] info     : Awakened by User defined signal 1
[CET Feb 20 16:51:41] info     : 'Ihs_appl1' unmonitor on user request
[CET Feb 20 16:51:41] info     : Monit daemon with PID 158832 awakened
[CET Feb 20 16:51:42] info     : 'Ihs_appl1' unmonitor action done
[CET Feb 20 16:51:42] info     : 'Ihs_appl1_Out' content doesn't match
[CET Feb 20 16:51:42] info     : 'Ihs_appl1_Out' content doesn't match
[CET Feb 20 16:52:42] info     : Awakened by User defined signal 1

This is ugly but not a real problem, and it occurs under high system load only.

The process is monitored by the pid file and the used port.

check process Ihs_appl1 with pidfile "/opt/IBM/wlp/servers/appl1/logs/httpd.pid"
  start program "/usr/local/etc/monit/scripts/wlpihs.sh start" with timeout 120 seconds
  stop program "/usr/local/etc/monit/scripts/wlpihs.sh stop" with timeout 120 seconds
  if failed host applhost.local port 8081 for 10 cycles then restart
  if failed host applhost.local port 8081 then alert
  if not exist for 5 cycles then start
  if 5 restarts within 50 cycles then unmonitor
  group Liberty

The "wlpihs.sh" script is a simple wrapper that calls the default commands provided by IBM. Some Monit environment variables are used to find the right environment for the server.

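PRG="$0"
# Called by Monit: derive the names from the MONIT_SERVICE variable.
if [ -n "$MONIT_SERVICE" ]; then
  SERV=`echo "$MONIT_SERVICE" | cut -f 1-2 -d '_'`   # e.g. Ihs_appl1
  SRVR=`echo "$MONIT_SERVICE" | cut -f 2 -d '_'`     # e.g. appl1
  PROC=`echo "$MONIT_SERVICE" | cut -f 1-2 -d '_'`
else
  # Called manually: take the service name from the second argument.
  SERV=''
  SRVR=`echo "$2" | cut -f 2 -d '_'`
  PROC="$2"
fi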
WLP_USER_DIR=/opt/IBM/wlp/servers
IHSFILE="$WLP_USER_DIR/$SRVR/conf/httpd.conf"
cd /opt/IBM/ihs/bin;
case "$1" in
  'start')
    ./apachectl -k $1 -f $IHSFILE ;;
  'stop')
    ./apachectl -k $1 -f $IHSFILE ;;
  'restart')
    if [ "$SERV" = "$PROC" ]; then
      $PRG stop
      sleep 30
      $PRG start
    else
      [ -n "$PROC" ] && /usr/local/bin/monit restart $PROC
    fi  ;;
  'monitor')
    [ -n "$PROC" ] && /usr/local/bin/monit monitor $PROC ;;
  'unmonitor')
    [ -n "$PROC" ] && /usr/local/bin/monit unmonitor $PROC ;;
  *) ;;
esac

Take notice

The scripts are simple wrappers that call the standard commands to start and stop the servers, JVMs, or applications. Whenever the scripts are called with "monitor", "unmonitor" or "restart", the appropriate Monit commands are used.

PROC=$MONIT_SERVICE
/usr/local/bin/monit $1 $PROC

The scripts are called with "unmonitor" to disable and "monitor" to enable the monitoring of servers or JVMs while they are stopped or started externally. "stop" and "start" are not used, because they would interfere with the externally running stop or start process or script. Unfortunately, Monit treats unavailable applications as failed services; a "stopped" status is not available. However, Monit handles "stop" and "unmonitor" similarly, and both result in a "not monitored" service, which is why "unmonitor" is used. To enable monitoring again, "monitor" is used.

Notification

For sending notifications, have a look at the samples described at https://mmonit.com/wiki/Notification/Notification

To send notifications via Monit itself, add an additional statement to the "check process" entry above.

  if not exist then exec "/usr/local/etc/monit/scripts/zexec.sh"
     else if succeeded then exec "/usr/local/etc/monit/scripts/zexec.sh"

Also change some alert statements to call a script. The first line below shows the original statement, the second its replacement.

  if match "^.*SRVE0232E: Internal Server Error.*java.lang.NoClassDefFoundError.*" then alert

  if match "^.*SRVE0232E: Internal Server Error.*java.lang.NoClassDefFoundError.*" then exec "/usr/local/etc/monit/scripts/zexec.sh"
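
The "zexec.sh" script itself is not shown on this page. As a rough idea of what such a script could do, the following hypothetical sketch builds a one-line message from the MONIT_* environment variables that Monit passes to programs started via "exec". The function name "format_notification", the message layout, and the fallback texts are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of a notification helper; the real zexec.sh is not
# shown in this document. Monit exports MONIT_* variables to exec'ed programs.
format_notification() {
  printf '%s %s: %s (%s)\n' \
    "${MONIT_HOST:-unknown-host}" \
    "${MONIT_SERVICE:-unknown-service}" \
    "${MONIT_DESCRIPTION:-no description}" \
    "${MONIT_EVENT:-no event}"
}

# Assumption: the message is simply written to standard output here.
# A real script would hand it to a mail, chat or ticket tool instead.
format_notification
```

When run outside of Monit, the variables are unset and the fallback texts are printed, which makes the script easy to test by hand.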

Sending the notifications via M/Monit is recommended and more useful.

Disclaimer

Use of the software is at your own risk.

Under no circumstances can anyone be held liable for damage to hardware or software, lost data, or any other damage arising directly or indirectly from the use of the software.

If you do not agree with these conditions, you may not use or distribute this software.