Frank Kieviet
How to fix the dreaded "java.lang.OutOfMemoryError: PermGen space" exception (classloader leaks)
In the previous blog entry Classloader leaks: the dreaded "java.lang.OutOfMemoryError: PermGen space" exception I explained how this type of problem can originate in the application code that you deploy to an application server. In this post I'll explain how to track down the leak so that you can fix it.
Profilers
Memory leak? Use a profiler. Right? Well... generally speaking the answer is yes, but classloader leaks are a bit special...
To refresh your memory (pardon the pun), a memory leak is an object that the system unintentionally hangs on to, thereby making it impossible for the garbage collector to remove this object. The way that profilers find memory leaks is to trace references to a leaked object.
What do I mean by "tracing"? A leaked object can be referenced by another object which itself is a leak. That object may in turn be referenced by yet another leaked object, and so on. This repeats until an object is found that references a leaked object by mistake. This reference is where the problem is, and what you need to fix. Let me try to clarify this with a picture from my previous blog:
In this picture the AppClassloader, LeakServlet.class, STATICNAME, CUSTOMLEVEL, LeakServlet$1.class are all leaked objects. Due to static objects (e.g. STATICNAME) in the picture, that may in turn reference other objects, the number of leaked objects may be in the thousands. Going over each leaked object manually to check if there are any incidental references to it (the red reference in the picture) until you find the troublesome object (CUSTOMLEVEL) is laborious. You would rather have a program find the violating reference for you.
A profiler doesn't tell you which leaked object is interesting to look at (CUSTOMLEVEL). Instead it gives you all leaked objects. Let's say that you would look at STATICNAME. The profiler now should find the route STATICNAME to LeakServlet.class to AppClassloader to LeakServlet$1.class, to CUSTOMLEVEL to Level.class. In this route, the red line in the picture is the reference that actually causes the leak. I said the profiler should find this route. However, all the profilers that we tried stop tracing as soon as they reach a class object or classloader. There's a good reason for that: the number of traces grows enormous if the profiler follows the references through classes. And in most cases, these traces are not very useful.
So no luck with profilers! We need to try something else.
JDK 6.0 to the rescue
When Edward Chou and I worked on tracking down classloader leaks last year, we tried to run the JVM with HPROF and tried to trigger a memory dump; we looked at using Hat to interpret the dump. Hat stands for Heap Analysis Tool, and was developed to read dump files generated with HPROF. Unfortunately, the hat tool blew up reading our dump files. Because we didn't think it was difficult to parse the dump file, we wrote a utility to read the file and track the memory leak.
That was last year. This year we have JDK 6.0; this new JDK comes with a few tools that make looking at the VM's memory a lot simpler. First of all, there's a tool called jmap. This command line tool allows you to trigger a dump file without HPROF. It is as simple as typing something like:
jmap -dump:format=b,file=leak 3144
Here leak is the filename of the dump, and 3144 is the PID of the process. To find the PID, you can use jps.
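If you would rather trigger the dump from inside the JVM than from the command line, HotSpot-based JVMs also expose a management bean for it. The sketch below is only an illustration under a few assumptions and is not part of the workflow described in this post: the class name HeapDumpSketch, the helper dumpToTempFile and the file name leak.hprof are made up, and the code dumps the JVM it runs in, so to inspect an application server it would have to run inside that server (for example from a servlet).

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

class HeapDumpSketch {

    /** Writes a binary heap dump of the current JVM into a fresh temp directory. */
    static Path dumpToTempFile() {
        try {
            // The target file must not exist yet, so create a new directory for it.
            Path dir = Files.createTempDirectory("heapdump");
            Path file = dir.resolve("leak.hprof"); // hypothetical file name
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            // live = true: dump only objects that are still reachable,
            // which are the ones that matter when hunting a leak.
            bean.dumpHeap(file.toString(), true);
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = dumpToTempFile();
        System.out.println(Files.size(file) + " bytes written to " + file);
    }
}
```

The resulting file is in the same binary HPROF format that jmap produces, so it can be opened in the same way as the leak file above.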
Secondly, Hat is now part of the JDK. It is now called jhat. You can run it using a command line like:
jhat -J-Xmx512m leak
Here leak is the name of the dump file, and as you may have guessed, -J-Xmx512m is a parameter to specify how much memory jhat is allowed to allocate.
When you start jhat it reads the dump file and then listens on an HTTP port. You point your browser to that port (7000 by default) and through that you can browse the memory heap dump. It's a very convenient way of looking at what objects are in memory and how they are connected.
So, it seemed like a good idea to check out what can be done with these new tools to find classloader leaks.
... or not?
Unfortunately, jhat, just like the profilers we tried, also stops tracing when it encounters a class. Now what? I decided to download the JDK source code and find out what the problem is. Building the whole JDK is a difficult task from what I gather from the documentation. Fortunately, jhat is a nicely modularized program; I could just take the com.sun.tools.hat-packages out of the source tree, load them in my favorite editor and compile the code. The patched code was easily packaged and run: I just jar-ed it and added it to the lib/ext directory of the JDK:
jar -cf C:\apps\Java\jdk1.6.0\jre\lib\ext\ahat.jar -C hat\bin .
jhat leak
This was really as easy as pie. So after running the program in the debugger for some time, I figured out how it works and what changes I wanted to make. The change is that when you follow the references from a classloader, the modified jhat will follow through all traces from all the instances of the classes that it loaded. With that change, finding the cause of a classloader leak is simple.
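To make the effect of that change concrete, here is a small sketch. It is not the jhat code: it is a hand-built model of the picture, with made-up names (ReferenceChainSketch, Edge, chainFromRoot), and it assumes that Level.class counts as a root because it belongs to the JDK. The "implicit" edges stand for the links from an instance to its class and from a class to its classloader; the followImplicit flag switches between stopping at classes and following through them.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ReferenceChainSketch {

    /** "from" references "to"; implicit = instance -> class or class -> classloader. */
    static final class Edge {
        final String from;
        final String to;
        final boolean implicit;

        Edge(String from, String to, boolean implicit) {
            this.from = from;
            this.to = to;
            this.implicit = implicit;
        }
    }

    /** Hand-built model of the picture; not data read from a real heap dump. */
    static List<Edge> modelOfThePicture() {
        List<Edge> edges = new ArrayList<>();
        edges.add(new Edge("Level.class", "CUSTOMLEVEL", false));           // the accidental (red) reference
        edges.add(new Edge("CUSTOMLEVEL", "LeakServlet$1.class", true));    // instance -> its class
        edges.add(new Edge("LeakServlet$1.class", "AppClassloader", true)); // class -> its classloader
        edges.add(new Edge("AppClassloader", "LeakServlet.class", false));  // classloader -> loaded class
        edges.add(new Edge("LeakServlet.class", "STATICNAME", false));      // static field
        return edges;
    }

    /** Searches backwards from target to root; returns the chain root..target, or an empty list. */
    static List<String> chainFromRoot(List<Edge> edges, String root, String target,
                                      boolean followImplicit) {
        // Build the "who references me" map, optionally ignoring implicit edges.
        Map<String, List<String>> referrers = new HashMap<>();
        for (Edge e : edges) {
            if (e.implicit && !followImplicit) {
                continue;
            }
            referrers.computeIfAbsent(e.to, k -> new ArrayList<>()).add(e.from);
        }
        // next.get(x) is the object that x references on the way to the target.
        Map<String, String> next = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        next.put(target, null);
        queue.add(target);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(root)) {
                List<String> chain = new ArrayList<>();
                for (String n = current; n != null; n = next.get(n)) {
                    chain.add(n);
                }
                return chain;
            }
            for (String referrer : referrers.getOrDefault(current, Collections.emptyList())) {
                if (!next.containsKey(referrer)) {
                    next.put(referrer, current);
                    queue.add(referrer);
                }
            }
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        List<Edge> edges = modelOfThePicture();
        // Stopping at classes: no chain to the classloader is found.
        System.out.println(chainFromRoot(edges, "Level.class", "AppClassloader", false));
        // Following through classes and their instances: the chain shows up.
        System.out.println(chainFromRoot(edges, "Level.class", "AppClassloader", true));
    }
}
```

With the flag off the search from AppClassloader finds nothing, which mirrors what the unmodified tools showed; with the flag on it reports the chain Level.class, CUSTOMLEVEL, LeakServlet$1.class, AppClassloader.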
An example
Let's look at the example from my previous blog as depicted in the picture above. Using NetBeans I created the following servlet and deployed it to Glassfish:
package com.stc.test;

import java.io.*;
import java.net.*;
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.servlet.*;
import javax.servlet.http.*;

public class Leak extends HttpServlet {

    protected void processRequest(HttpServletRequest request, HttpServletResponse response)
    throws ServletException, IOException {
        response.setContentType("text/html;charset=UTF-8");
        PrintWriter out = response.getWriter();
        out.println("<html><body><pre>");
        // The anonymous Level subclass is loaded by the web application's classloader
        Level custom = new Level("LEAK", 950) {};
        Logger.getLogger(this.getClass().getName()).log(custom, "New level created");
        out.println("</pre></body></html>");
        out.close();
    }

    // HttpServlet methods (folded away in the editor) are not shown
}
I invoked the servlet to cause the leak. Next I undeployed the servlet. Then I triggered a heap dump:
jmap -dump:format=b,file=leak 3144
and fired up the modified jhat:
jhat -J-Xmx512m leak
and brought up the browser. The opening screen shows amongst other things, all classes that are found in the dump:
Finding objects that were leaked is easy since I know that I shouldn't see any objects of the classes that I deployed. Recall that I deployed a class com.stc.test.Leak; so I searched in the browser for the com.stc.test package, and found these classes (never mind the NoLeak class: I used it for testing).
Clicking on the link class com.stc.test.Leak brings up the following screen:
Clicking on the classloader link brings up the following screen:
Scrolling down, I see Reference Chains from Rootset / Exclude weak refs. Clicking on this link invokes the code that I modified; the following screen comes up:
And there's the link to java.util.logging.Level that we've been looking for!
Easy as pie!
Summarizing, the steps are:
- undeploy the application that is leaking
- trigger a memory dump
- run jhat (with modification)
- find a leaked class
- locate the classloader
- find the "Reference Chains from Rootset" link
- inspect the chains, locate the accidental reference, and fix the code
I'll contact the JDK team to see if they are willing to accept the changes I made to jhat. If you cannot wait, send me an email or leave a comment.
Update (April 2007): Java SE SDK 6.0 update 1 has the updated code.
Other Permgen space tidbits
After fixing the classloader leak, you of course want to test to see if the memory leak has disappeared. You could again trigger a memory dump and run jhat. What you also could try is to see if the amount of used permgen space memory goes up continuously after each deployment/undeployment of your application.
You can monitor permgen space usage using jconsole. You will see the memory usage go up when you repeatedly deploy and undeploy an application. However, this does not necessarily indicate a classloader or memory leak. As it turns out, it's difficult to predict when the garbage collector cleans up permgen space. Pressing the Run GC button in jconsole does not do the trick. Only when you encounter a java.lang.OutOfMemoryError: PermGen space exception can you be sure that there really was no memory left. This is a bit more involved than it should be!
How can we force the garbage collector to kick in? We can provoke a java.lang.OutOfMemoryError: PermGen space, release the memory, and then allocate a little more, which forces the garbage collector to kick in. I wrote the following servlet to do that:
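The numbers that jconsole plots are also available from within the JVM through the standard java.lang.management API. The following is a minimal sketch, not something taken from jconsole itself; the class name NonHeapUsageSketch and the method nonHeapUsage are made up for illustration. It lists the used bytes of every non-heap memory pool, which is where the permanent generation is reported.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

class NonHeapUsageSketch {

    /** Returns the used bytes of each non-heap memory pool, keyed by pool name. */
    static Map<String, Long> nonHeapUsage() {
        Map<String, Long> used = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.NON_HEAP) {
                MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
                if (usage != null) {
                    used.put(pool.getName(), usage.getUsed());
                }
            }
        }
        return used;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Long> entry : nonHeapUsage().entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue() + " bytes used");
        }
    }
}
```

Which pool names appear depends on the VM and its version, so treat them as an assumption about your setup: on HotSpot VMs that have a permanent generation the pool name contains "Perm Gen", while later HotSpot releases report a pool named "Metaspace" instead. The same caveat as for jconsole applies: a rising number does not by itself prove a leak.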
package com.stc.test;

import java.io.*;
import java.util.ArrayList;
import javax.servlet.*;
import javax.servlet.http.*;

public class RunGC extends HttpServlet {

    // Defines a class from raw bytes without consulting the parent classloader
    private static class XClassloader extends ClassLoader {
        private byte[] data;
        private int len;

        public XClassloader(byte[] data, int len) {
            super(RunGC.class.getClassLoader());
            this.data = data;
            this.len = len;
        }

        public Class findClass(String name) {
            return defineClass(name, data, 0, len);
        }
    }

    protected void processRequest(HttpServletRequest request, HttpServletResponse response)
    throws ServletException, IOException {
        response.setContentType("text/html;charset=UTF-8");
        PrintWriter out = response.getWriter();
        out.println("<html><body><pre>");

        try {
            // Load class data; a single read() call is not guaranteed to return the whole file
            byte[] buf = new byte[1000000];
            InputStream inp = this.getClass().getClassLoader()
                .getResourceAsStream("com/stc/test/BigFatClass.class");
            int n = 0;
            int count;
            while ((count = inp.read(buf, n, buf.length - n)) > 0) {
                n += count;
            }
            inp.close();
            out.println(n + " bytes read of class data");

            // Exhaust permgen
            ArrayList keep = new ArrayList();
            int nLoadedAtError = 0;
            try {
                for (int i = 0; i < Integer.MAX_VALUE; i++) {
                    XClassloader loader = new XClassloader(buf, n);
                    Class c = loader.findClass("com.stc.test.BigFatClass");
                    keep.add(c);
                }
            } catch (Error e) {
                nLoadedAtError = keep.size();
            }

            // Release memory
            keep = null;
            out.println("Error at " + nLoadedAtError);

            // Load one more; this should trigger GC
            XClassloader loader = new XClassloader(buf, n);
            Class c = loader.findClass("com.stc.test.BigFatClass");
            out.println("Loaded one more");
        } catch (Exception e) {
            e.printStackTrace(out);
        }

        out.println("</pre></body></html>");
        out.close();
    }

    // HttpServlet methods (doGet, doPost) that call processRequest() are not shown
}

This servlet instantiates a custom classloader and loads a class in it. That class is already present in the web classloader, but the custom classloader deliberately does not delegate to its parent; instead it defines the class itself from the bytes of the class obtained through getResourceAsStream().
The servlet loads as many of these custom classes as possible, i.e. until the memory exception occurs. Next, the memory is made eligible for garbage collection, and one more classloader is allocated, thereby forcing garbage collection.
The number of custom classes that can be loaded until a memory exception occurs is a good measure of how much permgen space memory is available. As it turns out, this metric is much more reliable than the one that you get from jconsole.
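The trick the servlet relies on, defining the same class bytes in many throwaway classloaders, can be tried in isolation. The sketch below is an illustration under assumptions, not the servlet's code: IsolatedDefineSketch and minimalClassBytes are made-up names, and instead of reading BigFatClass.class it hand-assembles the bytes of a tiny, member-less class called Dummy so that it runs without any extra files. It only shows that each loader ends up with its own Class object; it does not try to exhaust any memory.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class IsolatedDefineSketch {

    /** Assembles a minimal class file: a public class with no fields or methods. */
    static byte[] minimalClassBytes(String name) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(0xCAFEBABE);                            // magic number
            out.writeShort(0);                                   // minor version
            out.writeShort(49);                                  // major version 49 (Java 5 format)
            out.writeShort(5);                                   // constant pool count = entries + 1
            out.writeByte(1); out.writeUTF(name);                // #1 Utf8: name of this class
            out.writeByte(7); out.writeShort(1);                 // #2 Class -> #1
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #3 Utf8: name of the superclass
            out.writeByte(7); out.writeShort(3);                 // #4 Class -> #3
            out.writeShort(0x0021);                              // ACC_PUBLIC | ACC_SUPER
            out.writeShort(2);                                   // this_class
            out.writeShort(4);                                   // super_class
            out.writeShort(0);                                   // no interfaces
            out.writeShort(0);                                   // no fields
            out.writeShort(0);                                   // no methods
            out.writeShort(0);                                   // no attributes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Defines classes from raw bytes without asking the parent classloader first. */
    static final class XClassloader extends ClassLoader {
        private final byte[] data;

        XClassloader(byte[] data) {
            super(IsolatedDefineSketch.class.getClassLoader());
            this.data = data;
        }

        @Override
        public Class<?> findClass(String name) {
            return defineClass(name, data, 0, data.length);
        }
    }

    public static void main(String[] args) {
        byte[] data = minimalClassBytes("Dummy");
        Class<?> first = new XClassloader(data).findClass("Dummy");
        Class<?> second = new XClassloader(data).findClass("Dummy");
        System.out.println(first.getName() + " " + second.getName());
        System.out.println("same Class object: " + (first == second));
    }
}
```

Both classes are named Dummy, yet the second line prints "same Class object: false": every loader instance carries its own copy of the class, which is why the servlet's loop consumes more class memory with each iteration.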
And more
Edward Chou is thinking of some other ideas to further automate the process of determining exactly where the cause of a classloader leak is. E.g. it should be possible to identify the erroneous reference (the red line in the picture) automatically, since this reference is from one classloader to another. Check his blog in the coming days.
Update (April 2007): You can find an interesting usage of jhat's Object Query Language on Sundararajan's blog to compute histograms of reference chains.
Posted at 04:10PM Oct 19, 2006 by Frank Kieviet in Sun
Permalink: http://blogs.sun.com/fkieviet/entry/how_to_fix_the_dreaded
Thursday Oct 19, 2006

Posted by Matthias on October 20, 2006 at 06:30 AM PDT #
Posted by Kelly O'Hair on October 20, 2006 at 09:40 AM PDT #
Posted by Frank Kieviet on October 20, 2006 at 11:47 AM PDT #
Posted by Nick Stephen's blog on October 31, 2006 at 02:45 AM PST #
Posted by Ortwin Escher on November 03, 2006 at 05:10 AM PST #
Posted by Mickael on November 14, 2006 at 08:02 AM PST #
Hi Mickael,
Can you send me an email (frank dot kieviet at sun dot com)? I don't know your email address. The email you may have typed in the comments box is invisible to me: it's only known to the system so that it can send you an update if this thread is updated.
Frank
Posted by Frank Kieviet on November 14, 2006 at 09:48 AM PST #
Posted by Frank Kieviet on November 15, 2006 at 10:31 PM PST #
Posted by Sebastien Chausson on November 17, 2006 at 03:38 AM PST #
Hi Sebastien,
In the case of the problem with the Level, you could change your application code so that it does not use a new Level subclass. It's a workaround for a problem in code that you have no control over (i.e. the Level class), and as such it fixes your problem.
That's a common approach: often you cannot fix the problem properly (e.g. no control over the code, proper fix is too laborious/expensive) so you have to find a workaround.
Frank
Posted by Frank Kieviet on November 17, 2006 at 04:21 PM PST #
Posted by Matthias on December 02, 2006 at 02:54 AM PST #
> The fix has made it into JDK 7b3.
Thanks! Excellent! No more need for a patched jhat!
Frank
Posted by Frank Kieviet on December 04, 2006 at 10:10 AM PST #
Detlef
Posted by Detlef Kraska on December 14, 2006 at 04:01 AM PST #
Hi Detlef,
Working with EJBs or Servlets shouldn't make a difference. Did you contact our support department about this? I'm interested in what references you found. Can you send me an email? (frank dot kieviet at sun dot com).
Frank
Posted by Frank Kieviet on December 14, 2006 at 04:34 PM PST #
Posted by Chris James on May 24, 2007 at 07:15 AM PDT #
Frank,
I am trying to debug PermGen space OOM error and to get memory dump I added -Xrunhprof:heap=all,format=b,depth=4,file=data18
on my weblogic java options and later undeployed my application and triggered dump using ctrl+break on my weblogic console (on windows XP), weblogic is terminating the dump by throwing ClassNotFoundException, I increased the depth to 70 now I got the big dump file but Jhat on java1.6 is not showing me the classloader info. I am getting following on prompt
Snapshot read, resolving...
Resolving 0 objects...
WARNING: hprof file does not include java.lang.Class!
WARNING: hprof file does not include java.lang.String!
WARNING: hprof file does not include java.lang.ClassLoader!
Chasing references, expect 0 dots
Appreciate if you could point out the missing step.
Thanks
Kumar
Posted by Kumar Kartikeya on September 18, 2007 at 06:07 PM PDT #
Re Kumar:
I don't think you need to specify heap=all; I think the stacks are not dumped if you choose heap=dump which will make the dump a lot smaller.
On which VM are you running? I have only tested the Sun VM so far. If you're running on a different one, perhaps hprof doesn't work as advertised on that VM.
You could also try to specify to dump when an OOM occurs ( -XX:+HeapDumpOnOutOfMemoryError) in which case you don't need to specify hprof on the command line.
Lastly, did you try to use jmap?
Frank
Posted by Frank Kieviet on September 24, 2007 at 11:20 PM PDT #
Frank,
Thanks for the response, I am using Java 5, so could not use -XX:+HeapDumpOnOutOfMemoryError or jmap directly also looks like weblogic does not support Java 6, so I ported my application on Java 6 and used tomcat to produce dump, and after running jhat, I found that two of my classes (generated classes using wsdl2java) are held by classloader that also loads Apache Axis classes, particularly XMLUtils.java, I looked at source code and found that It is using ThreadLocal and looks like reference in thread local is leaking, here I might be wrong.
I am using axis 1.4 but Apache Axis2 also has same issue where if you redeploy the any sample application on any server (tomcat or weblogic) perm gen space keep on increasing.
Are you or anyone aware of this issue with Apache Axis?
Thanks for your time.
Kumar
Posted by Kumar Kartikeya on September 27, 2007 at 11:00 AM PDT #
The "exhaust PermGen space to force GC" technique works very nicely - except if you have the hprof agent loaded, which seems to be stopping classes from unloading.
Posted by Max Bowsher on October 27, 2007 at 05:25 PM PDT #
Are you available to help identify my memory leaks on my webserver?
Posted by Del Rundell on November 06, 2007 at 09:53 AM PST #
Hi, I think this article is very interesting. I'm working with Oracle App Server 10.1.3.1.0 on Solaris 10 (SPARC), which is not certified to work with JDK 6, so I would really appreciate it if you could send me the modified jhat version. Meanwhile, I'm trying to download JDK 6 for Solaris 10 (SPARC). I'm thinking of putting the jhat included in JDK 6 into my current JDK 5 installation, but I think the probability of failure is high. If you can help me I'd really appreciate it. I've been working on this issue for 2 or 3 weeks and this would help me a lot.
Thanks.
Posted by Hugo Martinez on November 14, 2007 at 11:22 AM PST #
Thanks so much for putting this together, it helped me identify two problems with my app, and one with a third party (Axis). Unfortunately no fix for the Axis issue, but at least there's a bug for it in that project. Problem is there for Axis 1.4 and 2.
My app issues were threads that run in infinite loops, that were not interrupted when the app undeployed. Still need to find how to setup a listener to determine when the app is being undeployed and interrupt those threads.
Posted by Alex Quezada on December 05, 2007 at 11:22 AM PST #
I am interested to know how to fix the problem in Leak class. Thanks!
Posted by bob on March 11, 2008 at 04:47 AM PDT #
This is a good article. But what really is the overall solution?? Without the existence of this article, or just the average Java developer perusing the JDK source, how in the heck would anyone know about this Level class issue?
Secondly what other hidden gems like this exist out there in the JDK or the multitude of other java libraries out there? auughh!!
Posted by borfnorton on March 18, 2008 at 05:43 PM PDT #
I read both of your postings and I am still having trouble figuring out how to find the perm gen leak and how to fix it. My web application has many classes left in memory after the undeploy. This includes many third-party jars. When I go into the dump using jhat, I follow your instructions. One of the last things you say is "And there's the link to java.util.Logging.Level that we've been looking for!"
What if I don't know what to look for?
The last step in your process is: "7. inspect the chains, locate the accidental reference, and fix the code"
How do I know which reference is the accidental reference?
Posted by Richard on March 27, 2008 at 05:59 AM PDT #
Re Richard:
The idea is that there should be no links whatsoever from any of the undeployed classes to a root object. So, when looking at the reference chains from one of your application classes, if there are still references after undeployment, these are leaks.
In the case of the Level class, I wasn't looking for the Level class, but for any remaining references. The Level class was holding a reference and hence was a leak.
To figure out which one is the accidental reference will require some work and insight into the code. For each of the links in the reference chain you would have to look at the source code and try to judge if that reference constitutes a memory leak.
Frank
Posted by Frank Kieviet on March 27, 2008 at 03:38 PM PDT #
I too am looking into web app memory leaks and the use of Enums.
using jmap and jhat, I am seeing my servlet class still in permgen space, after I had undeployed it. However, no rootset reference appears to point outside of my webapp class loader, unlike the mentioned example with the Logger.
maybe i am using jmap/jhat incorrectly?
Apologies for the vagueness of the post, I am trying to figure out if i really do have a leak or not.
Posted by Stuart Maclean on April 01, 2008 at 01:05 PM PDT #
Just to follow up,
I have a simple servlet class S, loaded on startup of the webapp. An enum E is declared in S, and used in S.init.
I am perplexed as why I see mention of java.lang.reflect.Field and org.apache.catalina.loader.ResourceEntry
in the list of 'references to this object' for the class object for E.
Is the use of enum somehow requiring some reflection? I have looked at the 1.6 source of Enum.java and don't see any static list storage issues like that of the logging example.
Stu
Posted by Stuart Maclean on April 01, 2008 at 01:16 PM PDT #
Great paper ! I really enjoyed figuring out these tricky points. But btw, in the 7th point of your summary, you explained "inspect the chains, locate the accidental reference, and fix the code". How the hell could we fix such a problem, as it seems that the "non freed" reference comes from outside of 'our' code ? Of course we could start changing the Level implementation, but I feel uncomfortable with that ;) ? Thanks for any useful information
Posted by Dual Action Cleanse on April 26, 2008 at 11:05 PM PDT #
Re Stu:
I haven't looked into this particular problem with Tomcat myself, but we did run into an issue where there is a bean utility in Apache that caches the accessors of Java beans. Since this cache is in the system classloader, it's a source of classloader leaks. Fortunately there's something like a flush() method on this class, so you could potentially try to call this after undeploying.
I'm not aware of anything special about enums.
Frank
Posted by Frank Kieviet on May 02, 2008 at 09:47 AM PDT #
Re Dual Action Cleanse:
Indeed, if you don't own the code you do have a problem. You could raise the issue with the owner of the code, try to fix it yourself, or try to find a workaround.
In the case of the Level class, the Sun JDK team is aware of the problem and it will hopefully be fixed in a future release; until then I'd simply recommend not using custom log levels.
Frank
Posted by Frank Kieviet on May 02, 2008 at 09:51 AM PDT #
Thanks for your blog. IMHO I found an issue in the JSF Reference Implementation following your instructions:
https://javaserverfaces.dev.java.net/issues/show_bug.cgi?id=742
Unfortunately this is not the only class leak we experience; we have to hunt down the other ones.
Posted by Olaf Flebbe on June 11, 2008 at 06:27 AM PDT #
Frank, thanks for your insights, but this memory leak may not happen because of AppClassLoader. The classloader of LeakServlet$1 is not AppClassLoader (the webapp classloader); it is AppClassLoader's parent. So LeakServlet$1 (the class) does not refer to AppClassLoader, which means AppClassLoader can be GCed. I have investigated this on Tomcat 6.0.13 (Java 1.5): AppClassLoader is GCed.
Posted by fangsoft on June 12, 2008 at 01:32 AM PDT #
Re fangsoft:
I'm confused by your statement that the servlet's classloader is different from the web classloader. How can that be?
Frank
Posted by Frank Kieviet on June 12, 2008 at 02:33 PM PDT #
Frank, the servlet's classloader is the same as the web classloader. But the classloader of LeakServlet$1 is not the web classloader but its parent, according to the web specification. So LeakServlet$1 may not reference the web classloader.
Posted by fangsoft on June 13, 2008 at 06:05 PM PDT #
frank, i think the application server used by your demo is implemented poorly. It can prevent AppClassLoader from referencing LeakServlet.class.
Please refer to org.apache.catalina.loader.WebappClassLoader in Tomcat6.0.14. It has a stop method to release loaded classes.It is a robust webapp's classloader implementation.
Posted by fangsoft on June 16, 2008 at 02:35 AM PDT #
Hi there! Really nice post and quite helpful. However, we're using Axis2 and a lot of references are shown. I think there's a general problem with Axis(1 & 2) as proved by Kumar and Alex.
Hope to see improvement really soon! :)
Posted by Alejandro Andres on June 24, 2008 at 05:00 AM PDT #