When JMX Is Overkill, Build a Lightweight Monitoring Framework

Developers often need to monitor a set of applications and perform recovery measures in the event of a failure. In my case, I had to design a monitoring server for a multi-tier telecom backend composed of several components distributed across multiple operating systems. I found JMX solutions to be overkill for my requirements, so I decided to implement a lightweight Java framework that could be easily extended and customized to build a monitoring system.

This article shows how to use this framework to not only monitor various backend processes like database, application, and Web servers, but also to invoke corrective actions such as relaunching the process, executing disk cleanup when disk usage thresholds are exceeded, and reporting or logging failures. You can download the framework source and sample client here.

Basics of Monitoring a Component

The following attributes are common to any component you may monitor:

  1. State of the component: There are three possible statuses: Ok, Not Ok, and Could not be Monitored.
  2. Corrective action: If the process goes down, the corrective action relaunches it. Not every component can be recovered from failure. For instance, if an unrecoverable critical process goes down, the system can raise an alarm instead.
  3. Frequency of monitoring: How often do you want the component to be monitored? The frequency counter is a configurable parameter specified in seconds or minutes, depending on the type of component.
  4. Criticality: This determines the importance of the monitored component. The framework defines three levels: High, Medium, and Low. Note that a component’s criticality can be high while its monitoring frequency is low.

Under the Framework’s Hood

As you can see in the class diagram (see Figure 1), the framework consists of four classes and one interface. The framework exposes only the Info interface and the LWMFramework class to the client code that invokes the framework. This way, the internal implementation of the framework stays hidden from the client.

Figure 1. Framework Class Diagram

All the components you need to monitor must implement the Info interface. The isOk() method checks the status and the process() method handles corrective action:

public interface Info {
    //checks the status
    //returns true if ok
    //throws Exception if it could not be monitored
    public boolean isOk() throws Exception;

    //handles corrective action
    public boolean process();
}
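To make the three outcomes of isOk() concrete, here is a minimal sketch of an Info implementation. The PortInfo class, its constructor parameters, and the choice of a TCP connection test are assumptions made for illustration and are not part of the framework download. A refused connection maps to Not Ok, while any other I/O problem (for example, an unresolvable host name) propagates as an exception, which corresponds to Could not be Monitored.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Same contract as the article's Info interface, repeated so the sketch compiles on its own.
interface Info {
    boolean isOk() throws Exception;
    boolean process();
}

// Hypothetical example: checks whether a TCP port accepts connections.
class PortInfo implements Info {
    private final String host;
    private final int port;
    private final int timeoutMillis;

    PortInfo(String host, int port, int timeoutMillis) {
        this.host = host;
        this.port = port;
        this.timeoutMillis = timeoutMillis;
    }

    @Override
    public boolean isOk() throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;                 // Ok: something is listening
        } catch (ConnectException refused) {
            return false;                // Not Ok: connection refused
        }
        // Any other IOException propagates: Could not be Monitored
    }

    @Override
    public boolean process() {
        // This sketch has no recovery step; a real component might
        // relaunch a service or raise an alarm here.
        return false;
    }
}
```

A client would register such an object with the framework in the same way as any other Info implementation.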

The framework uses two background threads:

  1. StatusMonitor for checking the status of each component
  2. StatusManager for executing corrective action and status reporting

The StatusMonitor (see Listing 1) polls every second (or at a configurable interval) to see if the counter of any monitorable object has expired. If so, it calls the monitor() method on the monitorable object.

Listing 1. StatusMonitor Class

public class StatusMonitor extends Thread {
    private final int monInterval; //in milliseconds
    private final Monitorable[] monArray;

    public void run() {
        while(true) {
            for(int cnt = 0; cnt
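The polling logic described above (count down each component's counter on every pass, and check the component when its counter expires) can be sketched as follows. This is an illustration under stated assumptions, not the framework's actual StatusMonitor: the PollingSketch and Entry names are hypothetical, and a Runnable stands in for the call to Monitorable.monitor().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a counter-based polling loop.
class PollingSketch {
    // One monitored item: a check to run plus its frequency in polling ticks.
    static final class Entry {
        final Runnable monitorCall;  // stands in for Monitorable.monitor()
        final int frequency;         // ticks between two checks
        int remaining;               // ticks left until the next check

        Entry(Runnable monitorCall, int frequency) {
            this.monitorCall = monitorCall;
            this.frequency = frequency;
            this.remaining = frequency;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    void add(Runnable monitorCall, int frequency) {
        entries.add(new Entry(monitorCall, frequency));
    }

    // One polling pass: decrement every counter and run the checks that expired.
    void tick() {
        for (Entry e : entries) {
            if (--e.remaining <= 0) {
                e.remaining = e.frequency;  // restart the countdown
                e.monitorCall.run();
            }
        }
    }

    // Calls tick() every intervalMillis until the thread is interrupted.
    void runLoop(long intervalMillis) {
        while (!Thread.currentThread().isInterrupted()) {
            tick();
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();  // leave the loop
            }
        }
    }
}
```

Keeping tick() separate from the sleeping loop makes the countdown logic easy to exercise without waiting on real time.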

Each monitorable object (see Listing 2) is associated with a StatusManager object and an Info object. Also, the attributes such as counter (the frequency of monitoring) and criticality are part of the monitorable object.

Listing 2. Monitorable Class

public class Monitorable {
    private int counter;
    private byte criticality;
    private byte prevStatus;
    private StatusManager manager;
    private Info info;

    public void monitor() {
        byte currentStatus = NOT_OK;
        try {
            if(info.isOk()) {
                currentStatus = Monitorable.OK;
            }
        } catch(Exception ex) {
            currentStatus = Monitorable.COULD_NOT_BE_MONITORED;
        }
        //call setChanged if there is a change in status
        //or if the status is not ok or could not be monitored
        if((currentStatus != prevStatus) || currentStatus != StatusManager.OK) {
            synchronized(this) {
                prevStatus = currentStatus;
            }
            manager.setChanged(this);
        }
    }

    //return the last monitored status
    public synchronized byte getStatus() {
        return prevStatus;
    }

    //recovery action
    public void execute() {
        info.process();
    }
}

The monitorable object keeps track of the status changes in the Info object and notifies the StatusManager thread. The communication between StatusMonitor and StatusManager is via the monitorable object.

StatusManager (see Listing 3) receives notification when the monitorable object calls the setChanged() method. It then fetches the monitorable object from the queue. If the status of the monitorable object is Not Ok, it invokes the execute() method on the monitorable object. The monitorable object in turn calls the associated Info object's process() method to perform the corrective action.

Listing 3. StatusManager Class

public class StatusManager extends Thread {
    //add to the queue and
    //notify that the status of monitorable has changed
    public synchronized void setChanged(Monitorable monitorable) {
        monQueue.add(monitorable);
        notifyAll();
    }

    //get the Monitorable object from the queue
    public synchronized Monitorable getMonitorable() {
        //wait until the queue has an object to return
        while(monQueue.isEmpty()) {
            try {
                wait();
            } catch(Exception ex) {}
        }
        return (Monitorable) monQueue.remove(0);
    }

    //report or execute corrective action
    private void processChange() {
        Monitorable monitorable = getMonitorable();
        //take corrective action
        if(monitorable.getStatus() == NOT_OK) {
            monitorable.execute();
        }
        //log or report the status
    }

    public void run() {
        while(true) processChange();
    }
}

Figure 2. Framework Sequence Diagram

Currently, the recovery task executes in the context of the StatusManager thread, so a slow recovery delays the handling of every other status change. StatusManager could be enhanced to maintain a pool of threads so that recovery tasks execute in separate threads.
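One possible shape for that enhancement is sketched below, assuming the standard java.util.concurrent executor classes (Java 5 and later). The RecoveryPool class is a hypothetical name and is not part of the framework; a recovery step such as an Info object's process() method could be passed in as a method reference (for example, info::process).

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: runs recovery tasks on a small fixed pool of
// worker threads instead of on the calling (manager) thread.
class RecoveryPool {
    private final ExecutorService pool;

    RecoveryPool(int threads) {
        this.pool = Executors.newFixedThreadPool(threads);
    }

    // Queues the recovery step and returns immediately; the Future
    // later yields whatever the recovery step returned.
    Future<Boolean> recover(Callable<Boolean> recovery) {
        return pool.submit(recovery);
    }

    // Stops accepting new tasks; already queued tasks still run.
    void shutdown() {
        pool.shutdown();
    }
}
```

With a helper like this, the manager thread could keep draining its queue while long-running recoveries proceed in the background. The pool size is a tuning choice that depends on how many recoveries may reasonably overlap.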

The startup class of the framework, LWMFramework, is used for configuring and initializing the framework. The sequence diagram in Figure 2 illustrates the interaction between various framework objects. The client code starts the framework after adding all the Info objects. During startup, the StatusManager and StatusMonitor threads are launched.


StatusManager maintains a thread-safe queue of monitorable objects. The StatusManager thread fetches the monitorable object from the queue and calls the getStatus() method on it. The access to the status variable of the monitorable object is also synchronized.
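As an alternative to hand-written wait() and notifyAll() calls, the same blocking hand-off can be sketched with a BlockingQueue from java.util.concurrent (Java 5 and later). This is a swapped-in technique shown for comparison, not the framework's implementation, and the ChangeQueue class name is hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical replacement for a synchronized list guarded by wait/notifyAll.
class ChangeQueue<T> {
    private final BlockingQueue<T> queue = new LinkedBlockingQueue<>();

    // Called by the monitoring side when a status change is detected.
    void setChanged(T item) {
        queue.add(item);  // an unbounded queue always accepts the item
    }

    // Called by the manager side; blocks until an item is available.
    T take() throws InterruptedException {
        return queue.take();
    }
}
```

Items come out in the order they were added, and interruption surfaces as an InterruptedException rather than being swallowed.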

Using the Framework

Say you want to monitor the Apache Web server process and relaunch it if it goes down. To do so, create a ProcessInfo class that implements the Info interface. Code the isOk() method to iterate through the list of processes and check whether the apache.exe process is running, and then code the process() method to launch apache.exe:

public class ProcessInfo implements Info {
    ProcessInfo(String processName, String execName) {..}

    //check if process is running or not
    public boolean isOk() throws Exception {..}

    //re-launch process
    public boolean process() {..}
}

Currently, no API-level support exists in Java for obtaining a list of processes and disk usage, CPU load, network connectivity, etc. For my project, I developed these functions in C for each OS and provided JNI wrappers so that the Java code could access them. Although very efficient, this solution is a bit risky as the JNI calls are executed in the context of the JVM process.

Alternatively, you could have a separate process for accessing native functionality. The sample client provided with the code download includes a Win32 application CheckProcess.exe that takes the process name as a command-line parameter and writes out TRUE or FALSE on to the stdout. The isOk() method of ProcessInfo executes CheckProcess.exe and reads the status from the process's input stream:

//requires java.io.BufferedReader and java.io.InputStreamReader
public boolean isOk() throws Exception {
    String[] cmdArray = new String[] {"CheckProcess.exe", "apache.exe"};
    Process proc = Runtime.getRuntime().exec(cmdArray);
    BufferedReader br = new BufferedReader(
        new InputStreamReader(proc.getInputStream()));
    try {
        //readLine() returns null if the process wrote nothing
        String status = br.readLine();
        return "TRUE".equals(status);
    } finally {
        br.close();
    }
}
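On Java 9 and later, the ProcessHandle API offers a further option that needs neither JNI nor a helper executable. The sketch below is an assumption-laden illustration rather than part of the sample client: the ProcessCheck class and its method names are hypothetical, and matching on the executable's file name is only one possible policy. Processes whose command line is not visible to the JVM are skipped.

```java
import java.util.Optional;

// Hypothetical helper based on java.lang.ProcessHandle (Java 9+).
class ProcessCheck {
    // Strips any directory part, accepting both / and \ as separators.
    static String baseName(String command) {
        int cut = Math.max(command.lastIndexOf('/'), command.lastIndexOf('\\'));
        return command.substring(cut + 1);
    }

    // Returns true if any visible process has the given executable name.
    static boolean isRunning(String execName) {
        return ProcessHandle.allProcesses()
            .map(handle -> handle.info().command())  // Optional<String>
            .flatMap(Optional::stream)               // drop processes without a visible command
            .map(ProcessCheck::baseName)
            .anyMatch(name -> name.equalsIgnoreCase(execName));
    }
}
```

An isOk() method could then simply return ProcessCheck.isRunning("apache.exe"), although what the JVM can see depends on operating system permissions.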

Next, implement a TestClient that configures the ProcessInfo object and starts the framework:

public class TestClient {
    public static void main(String[] args) throws Exception {
        ProcessInfo p = new ProcessInfo("apache.exe", "/apache.exe");
        int counter = 10; //10 seconds
        byte criticality = LWMFramework.HIGH;
        LWMFramework lwmf = new LWMFramework();
        //set minimum monitorable frequency 1 second
        lwmf.setMonitorInterval(1);
        //add info object
        lwmf.add(p, counter, criticality);
        //start the framework
        lwmf.start();
        //block the main thread so the client keeps running
        Thread.currentThread().join();
    }
}

Extending the Framework

You can extend the framework in many ways. One way is to implement new Info objects for monitoring various components such as remote processes, CPU load, and bandwidth usage. These new Info objects can be configured in an XML configuration file, Monitor.xml (see Listing 4).

Listing 4. XML Configuration (Monitor.xml)

Here, each monitorable object is an XML node with common attributes like name, counter, criticality, and class. The class attribute is used for instantiating the Info object. The elements or nodes under each monitorable node are used to configure the corresponding Info object. You can implement new Info objects and add them to the Monitor.xml, which allows you to add new components without modifying the framework source.
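Instantiating an Info object from the class attribute would typically rely on reflection. The following is a minimal sketch of that step only, assuming each Info implementation has a no-argument constructor. The InfoFactory and AlwaysOkInfo names are hypothetical, and the parsing of Monitor.xml itself is left out.

```java
// Same contract as the article's Info interface, repeated so the sketch compiles on its own.
interface Info {
    boolean isOk() throws Exception;
    boolean process();
}

// Hypothetical factory: turns a class name read from configuration into an Info object.
class InfoFactory {
    static Info create(String className) throws ReflectiveOperationException {
        Class<?> cls = Class.forName(className);
        // asSubclass throws ClassCastException if the class does not implement Info
        return cls.asSubclass(Info.class)
                  .getDeclaredConstructor()  // assumes a no-argument constructor
                  .newInstance();
    }
}

// Trivial implementation used only to demonstrate the factory.
class AlwaysOkInfo implements Info {
    @Override
    public boolean isOk() {
        return true;
    }

    @Override
    public boolean process() {
        return true;
    }
}
```

The child elements of each monitorable node would then be applied to the new object, for example through setter methods, which is a design decision this sketch does not cover.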

You can further extend the framework to interface with third-party applications for generating reports and statistics. StatusManager can also provide an HTTP interface so that the components can be administered from a Web browser.

A Lightweight JMX Alternative

The monitoring framework is easy to extend and comes in very handy, especially for monitoring server-side applications. The lightweight framework is not a replacement for existing management tools. In fact, it is complementary and can easily be integrated with high-end management software like HP OpenView.

