Production System Concerns
Code instrumentation sounds nice in theory but it's intrusive. You may not want to mess with your byte codes. Any type of code injection or modification—no matter how benign—can potentially change the way your application behaves. Log scraping is less intrusive, but verbose traces and logs come at a cost. Any process that increases I/O (Disk or network) or takes resources away from your application, either directly or indirectly, will impact performance.
Other issues with profiling and monitoring in production environments include:
- Production platform outside your control
- You may not have the option of deploying profiling tools onto your live systems.
- Services such as rsh/telnet may be disabled or secured.
- Tools intrusiveness
- All approaches are intrusive to one extent or another—it is just a matter of degree. Tools can impact application performance. Measurement may affect the application itself or the actual metrics. Approaches such as web log scraping are external and therefore less intrusive but cannot tell you what's happening deeper inside your application.
If you built statistics collection from the ground up [e.g., via some homegrown mechanism or perhaps Aspects (AOP)], then you can leverage it in production. The extensibility and intrusiveness of such approaches depends on the design. Hard-coded approaches can be inflexible, and unless the design allows for some sort of fine-grained metric configuration and control, then new metrics may require a code change.
Information related to application and performance metrics is sometimes already present in your application logs—you just need to be able to tap into them. Log interception is a variant of log scraping. Whereas most tools process log files after the fact, log4j enables you to intercept and process log messages on the fly without incurring the cost of I/O. Log4j runs in the same process space as the application, so a log interceptor is somewhat similar to an agent. It can access the JVM for information on memory, threads, and timings. The issue with log interception is data interpretation: how to turn log messages into useful metrics.
Unlike Web server logs, application logs are application and component specific. Although log4j facilitates message formatting, text generated from an application is typically unstructured and difficult to process. Given that you know which text is significant, one way to interpret such data is using regular expressions. Regular expressions are ideally suited for matching patterns in free-form text; log messages can be filtered on the fly for text that matches metric-related inputs. Matches can then be collated with JVM information and values scraped from the message to derive application metrics.
Log interception is simple, flexible, and lightweight. The I/O overheads are low and the actual message processing very efficient as regular expressions can be pre-compiled. It is a production-friendly approach that can be plugged into mature applications without having to modify source or byte-codes.