We’ve all faced those irritating questions about our applications running in production. Typically a system administrator will spring one on you on a Friday afternoon just when you’re finishing out the week with a game of foosball.
- Why did this request fail?
- What is causing so many disk IO spikes?
- What requests are failing as a result of this error?
- Why is the application running so slowly?
- Why are all the resources being gobbled up on the Web server?
These questions often make us stare blankly for a while, mumble something, and then scramble back to our cave (or server room) for hours on end trying to provide answers.
What if things could be different? What if you could give the system administrator enough information to troubleshoot most basic issues and only come to you with the tough ones? What if, when she came to you, you could flip a switch and your application would start giving you the answers and you could get back to the Friday afternoon foosball tournament?
Sounds like a dream, right? Enter the Enterprise Instrumentation Framework (EIF). The EIF allows you to generate all sorts of data about your applications executing on the production servers. With a few well-placed, single-line static method calls into the EIF API, data can be streaming out of your applications, collected, parsed and displayed to answer any number of questions. And, of course, you provide access to the switch, turning up the volume on some transactions and hitting the mute button on others.
Instrumentation and the Goals of the EIF
The word “instrumentation” makes a simple process sound very complicated (as is often the case). As you may have guessed, instrumentation is a reference to actual instruments (like a tachometer) found in the physical world. A car’s engine, for example, provides diagnostic and performance data to both drivers and mechanics. This data gets transmitted and processed by “listening” instruments. The process of embedding sensors into a system (a physical vehicle or a software application) for the purpose of diagnosing problems is referred to as instrumentation. In fact, most cars today can be connected to the ultimate instrument, a computer. This computer can be used to turn on all kinds of diagnostic information (at runtime) and then provide mechanics useful data to solve most problems without going back to the manufacturer for a fix.
In the software realm, instrumentation involves trace code (sensors) placed inside key areas of the application (components). The goal is to allow administrators (like mechanics) to turn this diagnostic information on and off as required, and thus use this information to solve issues without having to go back to the application’s developer (or manufacturer).
Sounds like a great idea, and it is! However, in the real world, instrumentation gets pushed aside and is all too often left out of applications. The reasons for this are:
- The environment may not support good instrumentation
- The instrumentation code is verbose and thus compromises code readability
- The instrumentation code is hard to write, and even harder to justify given a typical, aggressive development schedule
- The instrumentation code does not provide enough good information to make it worthwhile
- The instrumentation code degrades performance
We designed the Enterprise Instrumentation Framework to eliminate every one of these reasons not to leverage instrumentation in your applications. In fact, the goals for the EIF are right in step with countering each of these issues:
- Provide a single, unified API for outputting instrumentation data to all data stores
- Provide a simple, one-line method call for instrumenting
- Provide a lot of good information in an easy-to-use format
- Provide the ability to trace a business request across method calls and across process boundaries
- Allow instrumentation to be turned on/off and up/down as required
Let’s take a look at how the EIF realizes these goals.
A Single, Unified Instrumentation API
If you’ve been around long enough to get an actual application into production (congratulations!), you have undoubtedly heard “runs on my machine” or “runs in the development environment” as the answer to the question, “Why is this action failing on the production server?” To help answer this question, you might have checked out the offending source code, littered it with writes to the event log, and then shoved the DLL up to the production server. You then sit and monitor the event log. Once you gather enough information (and hopefully fix the issue), you then have to strip all the event log calls out of your code and again update the server. This can be a tedious process, to be sure. EIF works in the same manner but without the tedium, and with a much more structured and controllable approach.
Without EIF, you have a lot of choices (and a big decision) to make regarding which data store you wish to target with your tracing code. You can do any of the following to trace your application:
- Write to the event log using the System.Diagnostics.EventLog class
- Fire messages to WMI using the System.Management namespace
- Use the System.IO namespace to write to a text file
- Call into the System.Messaging namespace to output information to a message queue
- Log trace messages to the database using System.Data
- Output messages with System.Diagnostics.Trace
- Target performance counters
Once you choose a means to output your tracing and target your choice’s associated event store, you are essentially stuck with that decision. All your trace code will be written to specifically target the underlying event store. In fact, you may be writing to the event log today but would prefer to write to SQL Server tomorrow, and that is no small change.
Clearly, what you have now is a separate API for every event store. What you want is a common approach to allow developers to instrument their code, targeting all existing event stores (WMI, event log, text file, etc.) and all new event stores that might be created post application release. This unifying tracing API is the EIF. Here is how it works.
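To make the idea concrete, here is a minimal sketch of what a single tracing API in front of interchangeable event stores can look like. It is not the EIF implementation; every type name here (ITraceSink, MemorySink, ConsoleSink, Instrumentation, OrderService) is hypothetical and exists only to show that the call site stays the same when the destination changes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names throughout; this models the idea, not EIF's own types.
public interface ITraceSink
{
    void Write(string message);
}

// Stands in for a real store such as the event log or a text file.
public sealed class MemorySink : ITraceSink
{
    public readonly List<string> Messages = new List<string>();
    public void Write(string message) { Messages.Add(message); }
}

public sealed class ConsoleSink : ITraceSink
{
    public void Write(string message) { Console.WriteLine(message); }
}

// The single entry point that application code calls.
public static class Instrumentation
{
    private static readonly List<ITraceSink> sinks = new List<ITraceSink>();

    // A real framework would drive this from configuration rather than code.
    public static void Configure(params ITraceSink[] activeSinks)
    {
        sinks.Clear();
        sinks.AddRange(activeSinks);
    }

    // Fan the message out to whichever sinks are currently active.
    public static void Raise(string message)
    {
        foreach (ITraceSink sink in sinks) { sink.Write(message); }
    }
}

public static class OrderService
{
    // The call site never names a data store.
    public static void PlaceOrder(int orderId)
    {
        Instrumentation.Raise("Placing order " + orderId);
    }
}
```

In this sketch, swapping `Instrumentation.Configure(new ConsoleSink())` for `Instrumentation.Configure(new MemorySink())` redirects every trace message without touching `OrderService`, which is the property you want from a unified API.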
Instrumentation is often one of the last tasks a developer does before checking in their code (or passing code review). Like commenting, instrumentation is easy to skip when the schedule is tight and the code is difficult to implement. In addition, you don’t want your instrumentation code to be a distraction when you view the core functionality of the application. To help solve these issues, instrumentation should be easy to implement and the code should be minimal.
You will soon see that the EIF API offers just that. In fact, applying the EIF is a straightforward, two-step process:
- Instrument your code with the EIF API calls (messages that trace important steps/data in your application)
- Create a configuration file that controls how many trace messages your application emits and which listeners receive them
Let’s take a look at a simple example of this in action.
The EIF in Action
To output messages to event sinks (listeners like the event log) with the EIF you simply call the static Raise method of the TraceMessageEvent class as follows:
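A minimal sketch of such a call is shown here. It assumes your project references the EIF assemblies; the namespace name and the single-string overload are assumptions to verify against your installed EIF SDK, and the class name and message text are placeholders.

```csharp
// Assumption: TraceMessageEvent lives in this namespace in the EIF SDK.
using Microsoft.EnterpriseInstrumentation.Schema;

public class OrderProcessor   // hypothetical class, for illustration only
{
    public void Process()
    {
        // Placeholder message; raised through the implicit application event source.
        TraceMessageEvent.Raise("Order processing started.");
    }
}
```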
This one-line API call makes good on the promise of a simple instrumentation API.
The next step is to generate (or create) a configuration file. This configuration file will allow you (or the administrator of your application) to turn on and off the output and capturing of trace messages in your application. Like all config files in .NET, this file is XML. Let’s take a quick look at the details of the XML contained in this file:
Event Sources: Used to indicate the source (or provider) of the trace message (application is the intrinsic event source)
Event Sinks: These are the consumers of the trace messages (event log for example)
Event Categories: Used to group messages (“Errors” or “Broken Rules” for example)
Event Filters: Used to bind event categories to event sinks
Filter Bindings: Used to bind event sources to event filters
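One way to picture how these five elements cooperate is as a small routing table: a filter binding leads from an event source to a filter, and the filter leads from a category to one or more sinks. The sketch below models that lookup in plain C#. It is an illustration only, not the EIF configuration schema or API; the TraceConfig and ConfigDemo types and the sink names "eventLog" and "textFile" are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of the five configuration concepts; not EIF's schema.
public sealed class TraceConfig
{
    // Event sinks: sink name -> messages that sink has received.
    public readonly Dictionary<string, List<string>> Sinks =
        new Dictionary<string, List<string>>();

    // Event filters: filter name -> (event category -> sink names).
    public readonly Dictionary<string, Dictionary<string, List<string>>> Filters =
        new Dictionary<string, Dictionary<string, List<string>>>();

    // Filter bindings: event source name -> filter name.
    public readonly Dictionary<string, string> FilterBindings =
        new Dictionary<string, string>();

    // Route one message; returns how many sinks received it.
    public int Raise(string source, string category, string message)
    {
        string filterName;
        if (!FilterBindings.TryGetValue(source, out filterName)) return 0;

        Dictionary<string, List<string>> filter;
        if (!Filters.TryGetValue(filterName, out filter)) return 0;

        List<string> sinkNames;
        if (!filter.TryGetValue(category, out sinkNames)) return 0;

        int delivered = 0;
        foreach (string sinkName in sinkNames)
        {
            List<string> sink;
            if (Sinks.TryGetValue(sinkName, out sink))
            {
                sink.Add(source + ": " + message);
                delivered++;
            }
        }
        return delivered;
    }
}

public static class ConfigDemo
{
    public static TraceConfig Build()
    {
        TraceConfig config = new TraceConfig();
        config.Sinks["eventLog"] = new List<string>();
        config.Sinks["textFile"] = new List<string>();

        // "Errors" go to both sinks; "Broken Rules" go only to the text file.
        Dictionary<string, List<string>> filter =
            new Dictionary<string, List<string>>();
        filter["Errors"] = new List<string> { "eventLog", "textFile" };
        filter["Broken Rules"] = new List<string> { "textFile" };
        config.Filters["default"] = filter;

        // Bind the intrinsic "Application" source to that filter.
        config.FilterBindings["Application"] = "default";
        return config;
    }
}
```

Under this model, turning output off for a category amounts to removing its entry from the filter, with no change to the code that raises the events.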
This configuration file is generated and manipulated for us at design time by the EIF. You will undoubtedly want to modify this configuration file yourself and perhaps share a common configuration file across your application. In addition, you may want to leverage the configuration API provided by the EIF to expose the contents of this file to editing by the system administrators.
Now let’s take a look at what our one-line API call wrote to the event log. Figure 1 provides a graphical depiction of the trace message inside the event log.
A Custom Event Source
In the previous example, you may have noticed that the event source name was simply “Application.” This is the implicit event source that gets used when calling the Raise method. However, if you want to filter and group related messages, you will want to explicitly define your event sources.
Thankfully, it is not difficult to create a custom event source with the EIF. You simply declare an EventSource type as in the following class-level declaration:
Next, to use this explicit event source, you simply call an overload of the Raise method that you used previously.
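Taken together, the declaration and the overloaded call might look like the following sketch. It assumes the EIF assemblies are referenced; the namespaces, the EventSource constructor taking a name, and the Raise overload taking the event source ahead of the message are assumptions to check against your installed EIF SDK. The class name, source name "OrderProcessing", and message text are placeholders.

```csharp
// Assumption: EventSource and TraceMessageEvent live in these EIF namespaces.
using Microsoft.EnterpriseInstrumentation;
using Microsoft.EnterpriseInstrumentation.Schema;

public class OrderProcessor   // hypothetical class, for illustration only
{
    // Class-level declaration of an explicit event source (placeholder name).
    private static EventSource orderSource = new EventSource("OrderProcessing");

    public void Process()
    {
        // Assumed overload: the event source is passed ahead of the message.
        TraceMessageEvent.Raise(orderSource, "Order processing started.");
    }
}
```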
Figure 2 illustrates the results.