Alan Bateman

Main | Next month (Dec 2005) »

http://blogs.sun.com/alanb/date/20051128 Monday November 28, 2005

OutOfMemoryError looks a bit better!

OutOfMemoryError has always been a confusing error. For a long time the HotSpot Virtual Machine threw this error without a detail message or stack trace so typically the thread throwing the error would terminate with this:

  Exception in thread "main" java.lang.OutOfMemoryError

That was confusing. Is my java heap full or does it mean something else? Those familiar with the heap layout that the HotSpot VM uses will know that the "something else" might mean the "permanent generation". This is place where reflective data such as class and method objects are allocated. It is also the place where interned strings are stored. If you've got an application that loads a huge number of classes or interns millions of strings then it's possible that the OutOfMemoryError is because the permanent generation is full rather than the java heap.

In 5.0 the error is less confusing as there is a detail message. This means you will see something like this:

  Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
or:
  Exception in thread "main" java.lang.OutOfMemoryError: PermGen full

In Mustang, there has been further changes to the way that OutOfMemoryError is reported. One obvious one is that the HotSpot VM will attempt to include a stack trace. This means you should see something like this:

  Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at ConsumeHeap$BigObject.(ConsumeHeap.java:22)
        at ConsumeHeap.main(ConsumeHeap.java:47)

In this example we see the stack trace where the allocation failed. If the OutOfMemoryError is because the perm gen it full then you might see String.intern or ClassLoader.defineClass near the top of the stack.

Is a stack trace useful? In some limited cases it can be. For example, suppose you've got a thread looping and in the loop it is allocating objects and putting them into a collection. In that case the stack trace might direct you to a good starting place. More generally, the stack trace is not likely to be useful. If you've got a busy application and the heap is nearly full, then OutOfMemoryError will be likely be thrown in some random place by some random mutator.

One useful improvement is the -XX:+HeapDumpOnOutOfMemoryError option which tells the HotSpot VM to generate a heap dump when an allocation from the java heap or the permanent generation cannot be satisfied. There isn't any overhead to running with this option so it should be useful for production systems where OutOfMemoryError takes weeks (or longer) to surface. The heap dump is in HPROF binary format so it can be analyzed using any tools that can import this format. If you read Sundar's blog then you'll know that Mustang includes a useful tool called jhat which can be used to do rudimentary analysis of the dump. jhat supports Object Query Language so you can easily create your own queries and mine the heap dump.

The next improvement is visible when the system is almost out of swap space and an allocation from the native heap fails in the VM. In that case the VM aborts and you get a one-line error such as the following:

  Exception java.lang.OutOfMemoryError: requested 16 bytes for CHeapObj-new. Out of swap space?

This message has been known to confuse a lot of developers. At first glance it might it might look like that the java heap is full. Of course if you increase the size of the heap with the -mx option it might make the problem worse as the larger java heap means there is less native memory available.

In Mustang this condition will trigger the VM to invoke the fatal error handling mechanism. This means you should get a fatal error log as you would get with a normal (abnormal?) crash. The fatal error log is named hs_err_<pid>.log and contains useful information about the thread, process, and system at the time of the crash. In the case of native heap exhaustion then the heap memory and memory map information in the log can be useful. The exact format is version and platform specific and you will get more information in the J2SE Troubleshooting Guide.

Hopefully, developers will find these improvements useful. It should make OutOfMemoryError a little less confusing and it means the error is no longer a one-line wonder.

http://blogs.sun.com/alanb/date/20051121 Monday November 21, 2005

Two fine demos

I was in New York last week, and attended a Java SE User Group meeting. After the scheduled talks there was open discussion on a wide range of topics. One of those topics related to tutorials, documentation, and sample code. It is always interesting to hear how people learn new APIs. Some people start with the API documentation, some google for sample code, and books continue to be popular. The JDK "demo" directory was mentioned and a number of people remarked that they hadn't checked it out in years because the demo directory appeared to be stale. There are indeed some old demos in there but there are a number of more recent examples too. Today I thought I might highlight two of the more useful examples.


The first is JTop. You'll find it in %JDK_HOME%\demo\management\JTop (or $JDK_HOME/demo/management/JTop) where JDK_HOME is C:\jdk1.6 or wherever you have installed the JDK. Mandy Chung added this demo in Mustang b59 and it will work on JDK5.0 too. There's a README.txt file in the directory which explains what it does. In brief, it monitors the CPU usage of the threads in a remote application. If you've ever had to deal with a looping process and you tried to figure out which thread is looping then you will relate to this.

In API terms, this example is a demonstration of the monitoring and management package. If the Java virtual machine is able to measure thread CPU time then getThreadCpuTime method can be used to obtain the CPU time for a thread.

To use JTop, we start the application with the JMX agent configured for remote management. For demonstration purposes both authentication and SSL are disabled (which you won't want to do in production).

   java -Dcom.sun.management.jmxremote.port=7000
        -Dcom.sun.management.jmxremote.authenticate=false
        -Dcom.sun.management.jmxremote.ssl=false 
        -jar Application.jar

JTop is connected by passing it the hostname and port number so that it can connect to the JMX agent:

   java -jar %JDK_HOME%\demo\management\JTop\JTop.jar localhost:7000

JTop uses a TimerTask to refresh the thread list every 2 seconds. It gets the CPU time for each thread and then sorts the list by CPU time. You should see something like this:

It is primitive but it serves as a useful demonstration of the APIs. The source is in the src directory and with a little bit of effort it could be turned into a nice tool. For example, it currently just sorts the threads by total CPU usage and it might be nicer to highlight the threads that were busy in the proceeding interval. Also, it might be interesting to sample the stack traces of the busy threads to get an idea of the code that is executing.


The second example is heapViewer. You'll find it in %JDK_HOME%\demo\jvmti\heapViewer (or $JDK_HOME/demo/jvmti/heapViewer). This example has been included since JDK5.0. It's a native agent that uses the JVM Tool Interface to do some basic memory analysis. It uses IterateOverHeap to do a linear iteration over all objects in the heap. It then prints a summary in the form of a class-wise histogram to show which objects are taking up space in the heap.

The agent is an in-process agent that is started using the -agentlib or -agentpath command-line options. Here's one example:

java -agentpath:%JDK_HOME%\demo\jvmti\heapViewer\lib\heapViewer.dll
     -jar Application.jar

When the VM exits or the agent gets a DataDumpRequest event then it will print a histogram to summarize the objects in the heap. A DataDumpRequest is triggered by a Ctrl-Break on Windows, or Ctrl-\ (or SIGQUIT) on Solaris/Linux. Here is some sample output:

Heap View, Total of 123199 objects found.

Space      Count      Class Signature
---------- ---------- ----------------------
  19597872       7421 [I
   1420264      13037 [C
   1217872       2681 [B
   1194256       2381 Ljava/lang/Class;
    985624       3619 [S
    266688      11112 Ljava/lang/String;
    173576       4682 [Ljava/lang/Object;
    164800        824 Lsun/java2d/SunGraphics2D;
    123888       5162 Ljava/util/Hashtable$Entry;
    106560       4440 Ljava/awt/Rectangle;
     79552       2486 Ljavax/swing/text/html/parser/ContentModel;
     70816        818 [Ljava/util/Hashtable$Entry;
     67440        843 Ljavax/swing/text/html/InlineView;
     63504       2646 Ljavax/swing/SizeRequirements;
     57984        906 Ljava/awt/geom/AffineTransform;
     56600       1415 Ljava/util/WeakHashMap$Entry;
     55560       2315 Ljava/util/HashMap$Entry;
     52864        112 Ljavax/swing/plaf/metal/MetalScrollButton;
     46880       2930 Ljavax/swing/event/EventListenerList;
     45056        352 Ljavax/swing/text/html/LineView;
     44808       1867 Ljava/awt/Insets;
---------- ---------- ----------------------

In this example, there were 123199 objects in the heap, and integer arrays are taking up the most space. There are of course other ways to generate a heap histogram like this but this one is useful as it makes use of the standard APIs. The source for this example is in the src directory.


I hope you agree these are useful examples. There are a number of others in the demo/management and demo/jvmti directories which are worth checking out too.

http://blogs.sun.com/alanb/date/20051103 Thursday November 03, 2005

To poll or epoll: that is the question:

One of the updates in build 59 of Mustang (JavaTM SE 6) is that the New I/O Selector implementation will use the epoll event notification facility when running on the Linux 2.6 kernel. The epoll event mechanism is much more scalable than the traditional poll when there are thousands of file descriptors in the interest set. The work done by poll depends on the size of the interest set whereas with epoll (like Solaris /dev/poll) the registration of interest is separated from the retrieval of the events. A lot has been written on the topic. The C10K problem has been documenting I/O frameworks and strategies for several years. One relatively recent paper on Comparing and Evaluating epoll, select, and poll Event Mechanisms makes it clear the workloads where epoll performs a lot better than poll.

This isn't the first NIO Selector implementation to use epoll. The Blackdown folks added epoll support in their 1.4.x release. On Solaris, the /dev/poll based Selector has been default on Solaris 8 (and newer) since the original implementation of New I/O in J2SETM 1.4.

So if you are running on a Linux 2.6 system with an application that handles lots of simultaneous connections you might want to give b59 a test-run. The weekly builds have been appearing like clockwork on the binary snapshot release site so b59 should be available tomorrow (November 4). Will you see a difference? It depends on the workload. If you've registered lots of SelectableChannels with a Selector and you notice a lot of time spent in the kernel due to poll then you should see a difference. If you are doing test runs and you want to do a direct comparison with poll then you can set the java.nio.channels.spi.SelectorProvider system property to sun.nio.ch.PollSelectorProvider. This will select the poll-based Selector that will continue to be the default on 2.4 kernels. There is an epoll patch for 2.4 kernels but at this time anyway, the NIO implementation doesn't attempt to detect this.


[Update 08/04/2006: The epoll SelectorProvider will be included in 5.0 update 9. To enable it requires setting the system property java.nio.channels.spi.SelectorProvider to the value sun.nio.ch.EPollSelectorProvider. Sorry, it won't be the default in 5.0 but we must be very careful when changing anything in an update release.]


Valid HTML! Valid CSS!

This is a personal weblog, I do not speak for my employer.