
Developers often need to monitor a set of applications and perform recovery measures in the event of a failure. In my case, I had to design a monitoring server for a multi-tier telecom backend composed of several components distributed across multiple operating systems. I found JMX solutions to be overkill for my requirements, so I decided to implement a lightweight Java framework that could be easily extended and customized to build a monitoring system.
This article shows how to use this framework not only to monitor backend processes such as database, application, and Web servers, but also to invoke corrective actions such as relaunching a process, running a disk cleanup when disk usage thresholds are exceeded, and reporting or logging failures. You can download the framework source and sample client here.
Basics of Monitoring a Component
The following attributes are common to any component you may monitor:
- State of the component: There are three possible states: Ok, Not Ok, and Could not be Monitored.
- Corrective action: If a process goes down, the usual corrective action is to relaunch it. Not all components can be recovered from failure, however. For instance, if an unrecoverable critical process goes down, the system can raise an alarm instead.
- Frequency of monitoring: How often do you want the component to be monitored? The frequency is a configurable parameter specified in seconds or minutes, depending on the type of component.
- Criticality: This determines the importance of the monitored component. The framework defines three levels: High, Medium, and Low. Criticality and frequency are independent of each other, so a component's criticality can be high while its monitoring frequency is low.
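To make these four attributes concrete, here is a minimal sketch of how they might be modeled in Java. It is not the framework's actual API: the names `State`, `Criticality`, `MonitoredComponent`, and `poll` are hypothetical and chosen only for illustration. The sketch also makes two assumptions of its own: a check that throws is reported as "could not be monitored", and the corrective action runs only when the state is "not ok".

```java
import java.time.Duration;
import java.util.function.Supplier;

/** The three states listed above. */
enum State { OK, NOT_OK, COULD_NOT_BE_MONITORED }

/** The three criticality levels listed above. */
enum Criticality { HIGH, MEDIUM, LOW }

/** Hypothetical holder for the four attributes; not the framework's real class. */
final class MonitoredComponent {
    final String name;
    final Criticality criticality;
    final Duration frequency;                 // how often the component is checked
    private final Supplier<State> check;      // probes the component
    private final Runnable correctiveAction;  // e.g. relaunch, or raise an alarm

    MonitoredComponent(String name, Criticality criticality, Duration frequency,
                       Supplier<State> check, Runnable correctiveAction) {
        this.name = name;
        this.criticality = criticality;
        this.frequency = frequency;
        this.check = check;
        this.correctiveAction = correctiveAction;
    }

    /** Runs one check and triggers the corrective action if the state is NOT_OK. */
    State poll() {
        State state;
        try {
            state = check.get();
        } catch (RuntimeException e) {
            // Assumption: a failing probe means the state is unknown, not bad.
            state = State.COULD_NOT_BE_MONITORED;
        }
        if (state == State.NOT_OK) {
            correctiveAction.run();
        }
        return state;
    }
}

class MonitoringSketch {
    public static void main(String[] args) {
        int[] relaunches = {0};
        // High criticality combined with a low monitoring frequency.
        MonitoredComponent db = new MonitoredComponent(
                "database", Criticality.HIGH, Duration.ofMinutes(5),
                () -> State.NOT_OK,       // stand-in for a real health check
                () -> relaunches[0]++);   // stand-in for relaunching the process
        System.out.println(db.name + ": " + db.poll()
                + ", corrective actions run: " + relaunches[0]);
    }
}
```

Keeping the check and the corrective action as separate pluggable pieces means a component that cannot be recovered can simply supply an action that raises an alarm rather than relaunching anything.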