ou very rarely hear “quality of life” and “developer” in the same sentence. This is mostly because developers lack tools (not skills) that easily and concisely allow them to identify transient failures (performance problems) in complicated software stacks. Java developers, however, finally have a tool combination that allows for an unprecedented view across all layers of software that are involved in business applications?DTrace and Java SE 6. Time to find a new hobby to fill your free time.
Understanding the performance characteristics of software has always been notoriously difficult. This is especially true of languages such as Java. Developers all knew what they should be looking for, but few (if any) tools enabled them to easily test out their hypotheses. Each of the available tools fell short in some respect. The probe-effect was too high and changed the behavior of the application, or there was no seamless mechanism to observe the multiple layers of software simultaneously (Java code, JVM, native code, and the kernel), or you had to modify or restart your applications to perform the tests?which all but prevented you from debugging production performance issues.
The situation became much better for native code users (C, C++) with the release of Solaris 10 and DTrace. DTrace allows for the dynamic instrumentation of an application (meaning you can deliver fully optimized binaries and DTrace will do the instrumentation when an experiment is run) and the ability to collect arbitrary data about the application. In addition, DTrace allows for the simultaneous observation of both user-land applications and the kernel from a single tool.
However, analyzing Java application performance still left something to be desired. With J2SE 5.0, you had the ability to use DTrace to sample Java stacks, which although useful, was limited. The DVM provider for DTrace allowed more observability, but at a price?the application had to be restarted and the non-enabled probe-effect was still higher than developers would have liked.
With the release of Java SE 6, the love between Java and DTrace has become much stronger. DTrace probes are first-class citizens in the eyes of Java when run on a Solaris 10 system (regardless of SPARC/x86/x64 architecture). The DTrace probes in Java have zero disabled probe-effect; they can be enabled/disabled dynamically, and they have a much lower enabled probe-effect than the DVM provider (although an enabled probe-effect exists, but more on that later). This means that from a single tool?DTrace?you can observe your Java code in action, the JVM itself, JNI code, and the kernel to identify and correct performance problems. With this true systemic observability, it is very difficult for performance problems to hide.
DTrace Probes in Java SE
If you are familiar with DTrace, then peering into the JVM is a simple matter of learning a few new probes. If not, then you should definitely take the time to read the Solaris Dynamic Tracing Guide. Anyone who can code in Java will have no trouble picking up the D language in almost no time.
The first thing you should understand is how to enable the DTrace probes in Java SE 6. By default, a small subset of the probes that have no measurable performance impact are enabled. These probes allow you to see events like garbage collection, method compilation, thread creation, and class loading.
The probes with measurable performance impact are disabled by default. These are probes for object allocation, method invocation, and Java monitor events. You can enable them on the command line or dynamically through by using the Java jinfo command. The following are the command line options:
The first three are self-explanatory, and the last option enables all DTrace probes in the JVM. As for their performance impacts, precisely quantifying what the probe-effect will be is impossible. However, it is safe to say that it will be less than a JVMTI-based approach and somewhat proportional to the frequency at which the probes fire. So for instance, method invocation?a very frequent event?will carry a higher probe-effect than monitor probes in most cases. Therefore, it is wise to enable only those probes that you need if possible.
A Simple Java/DTrace Example
It has become a bit of a standard to use one of the Java demo applications to showcase the Java/DTrace combo. This article uses the $JDK_HOME/demo/jfc/Java2D demo as an example since it is readily available (and you get to watch surfing Duke!). If you wanted to look at object allocation in this application, you could start your process with the appropriate double X option as follows:
java -XX:+DTraceAllocProbes -jar Java2Demo.jar
Currently, only ExtendedDTraceProbes can be enabled dynamically using jinfo. If you wanted to enable all of the DTrace probes in your application after it has started, you could do that with jinfo:
jinfo -flag +ExtendedDTraceProbes
Now that you have your probes enabled, you can dive in and see what information is available. First, you need to know that the provider name for the JVM probes is hotspot (see Listing 1). Using this provider, you have access to the probes within the JVM (there is also the hotspot_jni provider for observing JNI events, but that will be left for another article).
To use the hotspot provider, you append the processes ID of the JVM of interest to the provider name “hotspot”. So, if your process ID was 116977, your provider would be “hotspot116977”. Using that, you can begin to profile your application.
If the first thing you wanted to look at was object allocation (maybe you are seeing frequent garbage collections), you would use the object-allocation probe (see Listing 2).
With this data in hand, you might be curious as to where in your application you are allocating those Integer arrays. You could follow up with another simple DTrace script that will show you your stack at the point of allocation (see Listing 3).
One of the things you might note is that there are several native code frames on this stack. That is an artifact of the implementation and it is necessary for the DTrace interaction. If you have used one of the hotspot probes (as opposed to a native probe point), you can generally ignore these native code frames.
With just a couple of simple DTrace scripts, you have identified the objects you are allocating (as well as total size over the interval) and determined where in your code you induce an allocation of one of those objects.
A More Complex Java/DTrace Example
Let’s take a look at something a bit more complex. Perhaps you are seeing monitor contention that is affecting scalability in your application. One question you might have is: “How long do I wait from the time I attempt to acquire the contended monitor until I actually enter the monitor?” Now you can create a little DTrace script to answer that question (see Listing 4).
The output of this script shows that two threads had contended monitor events. It shows the thread ID, the values (which are nanosecond power-of-two buckets), and the count (which is the number of times the value fell in a particular bucket). Thread 14 was the most impacted with eight wait times of higher than 131 microseconds.
So far, you have looked only inside the JVM. The real power of the DTrace/Java relationship is being able to correlate events from across the software stacks (see Listing 5). If you were to have a method that was utilizing underlying packages that you felt may have some JNI dependencies, you could identify what they were quite easily. For this example, you will check whether one of the methods of your demo application has any libc (an operating system library) dependencies.
The output from this script shows the function name and the number of times that it was called, but only from a thread that is currently in the renderPath method or its children. In this way, you can easily identify which native code your Java application is inducing from its use of other classes or packages. This particular case has a fairly high rate of native lock calls (mutex_lock). This is one of those areas that has always been difficult for Java developers to observe. No matter how well you write your Java code, you could be at the mercy of a scalability issue from JNI libraries. With DTrace, you can easily track these down with very little effort. In fact, for native code that may be involved in your application, a Solaris command?plockstat(1M)?will report lock contention statistics (see Listing 6).
This output (from a different application than Java2Demo, since it is more interesting) shows that you do indeed have a high contention rate on malloc_lock from libc.so. This could easily be corrected by using an alternate allocator such as libumem(3LIB). Of course, you could apply the jstack() here to monitor the Java code responsible for the native calls.
No Hiding Place for Performance Problems
With just a few simple examples, you have seen how easy it is for you to gain tremendous insight into your Java application, the JVM, or any other aspect of the software upon which your business tasks depend. With systemic observability, performance problems have nowhere to hide. Pretty amazing that all the requisite components are completely free of charge. Oh happy day!