Wednesday November 10, 2004
Performance of JDK (1.4) String.intern()...
One thing I have always wondered about is how efficient String.intern() is (or is not). Since it is a native method, there has to be some overhead due to JNI. There are also some vague mentions in the JDK documentation indicating that intern()ing is not meant for a large number of Strings. Nonetheless, since the JDK itself has to rely quite a bit on intern() (all String constants in classes are automatically intern()ed when classes are loaded?), one might think the implementation should be fairly efficient.
Now, more important than the question of absolute speed is whether a simple all-Java replacement might actually be faster. A trivial thread-safe implementation of an interning class might look like this (it is the one I tested against plain String.intern()):
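To make the point about String constants concrete, here is a minimal sketch. The class name InternDemo is made up for illustration. It relies only on the documented behaviour that string literals are interned and that new String(...) always creates a separate object.

```java
// Illustrative only: the class name InternDemo is hypothetical.
final class InternDemo {
    static boolean literalIsInterned() {
        String literal = "hello";
        // new String(...) creates a distinct object with equal contents.
        String copy = new String("hello");
        // intern() returns the canonical instance, which is the literal's.
        return copy != literal && copy.intern() == literal;
    }

    public static void main(String[] args) {
        System.out.println(literalIsInterned()); // prints "true"
    }
}
```

The copy and the literal are different objects until intern() is called, at which point the copy maps back to the same instance the literal refers to.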
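import java.util.HashMap;

// Pure-Java alternative to String.intern(); the raw HashMap keeps it JDK 1.4 compatible.
final class Interner
{
    // Maps each distinct String value to its canonical instance.
    final static HashMap mMap = new HashMap();

    public static String intern(String str) {
        HashMap m = mMap;
        // A single lock on the map makes lookup-and-insert atomic.
        synchronized (m) {
            String interned = (String) m.get(str);
            if (interned == null) {
                // First occurrence: str itself becomes the canonical instance.
                interned = str;
                m.put(interned, str);
            }
            return interned;
        }
    }
}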
As it turns out (based on a simple micro-benchmark), the non-native version is about 3x as fast as calling String.intern() on SPARC, and about 6x as fast on AMD. The ratio stays about the same across a wide range of test data sizes (from 256 to 32000 Strings). These results are from JDK 1.4.2; I still need to check what JDK 1.5.0 gives.
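For anyone who wants to repeat the comparison, a harness along these lines would do. This is not the original benchmark, just a rough sketch: the class name InternBench, the round counts, and the "key-" test strings are all assumptions, and it uses generics and System.nanoTime(), so it needs Java 5 or later. Results on another JVM may differ considerably from the 1.4.2 figures quoted here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical harness; not the original benchmark behind the numbers above.
final class InternBench {
    // Pure-Java pool, same idea as the Interner class shown earlier.
    private static final Map<String, String> POOL = new HashMap<String, String>();

    static String javaIntern(String s) {
        synchronized (POOL) {
            String interned = POOL.get(s);
            if (interned == null) {
                POOL.put(s, s);
                interned = s;
            }
            return interned;
        }
    }

    // Builds `count` distinct strings; concatenation at run time yields
    // fresh instances that are not yet interned.
    static String[] makeInput(int count) {
        String[] input = new String[count];
        for (int i = 0; i < count; i++) {
            input[i] = "key-" + i;
        }
        return input;
    }

    // Runs `rounds` passes over the input and returns elapsed nanoseconds.
    static long time(String[] input, int rounds, boolean useNative) {
        long sink = 0; // accumulate a value so the calls cannot be optimized away
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            for (int i = 0; i < input.length; i++) {
                String s = useNative ? input[i].intern() : javaIntern(input[i]);
                sink += s.length();
            }
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) {
            System.out.println(sink); // never reached; only keeps `sink` live
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int[] sizes = {256, 32000}; // the two ends of the range mentioned above
        for (int size : sizes) {
            String[] input = makeInput(size);
            time(input, 20, true);  // warm-up passes (arbitrary count)
            time(input, 20, false);
            long nativeNs = time(input, 200, true);
            long javaNs = time(input, 200, false);
            System.out.println(size + " strings: String.intern() "
                + nativeNs / 1000000 + " ms, HashMap " + javaNs / 1000000 + " ms");
        }
    }
}
```

Like any micro-benchmark, this measures mostly the repeated-lookup case (every String after the first round is already in the pool), so treat the ratio as indicative rather than definitive.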
So why or where does this matter? Surely the class loading aspect is trivial overhead?
Class loading is indeed mostly irrelevant, but there are cases where heavy use of intern()ing can improve performance, because String comparisons can then be done with a cheap identity comparison (==) instead of slower String.equals() calls. And since the JDK's built-in classes such as java.util.HashMap (and String.equals() itself) check for identity first (since it often leads to a match), the speedup can be achieved fairly easily.
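A small sketch of the two comparison styles, with made-up names (IdentityCompare, sameInterned, sameAny). The pure identity check is only safe when both Strings are known to come from the same intern pool. The second form shows the identity-first-then-equals pattern that works for any Strings.

```java
// Hypothetical helper illustrating the identity fast path.
final class IdentityCompare {
    // Safe only when BOTH arguments are known to be interned via the same pool.
    static boolean sameInterned(String a, String b) {
        return a == b;
    }

    // General-purpose form: identity check first, then fall back to equals().
    static boolean sameAny(String a, String b) {
        return a == b || (a != null && a.equals(b));
    }

    public static void main(String[] args) {
        String x = new String("element").intern();
        String y = new String("element").intern();
        String z = new String("element"); // deliberately not interned
        System.out.println(sameInterned(x, y)); // prints "true"
        System.out.println(sameInterned(x, z)); // prints "false"
        System.out.println(sameAny(x, z));      // prints "true"
    }
}
```

The middle case is the trap: one un-interned String slipping into code that compares with == gives a silent false negative, so the fast path pays off mainly where all Strings pass through a single interning point.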
Now, obviously such an optimization can only lead to modest overall improvements... but applying the technique is rather easy. Also, maybe JavaSoft could consider implementing intern() as pure Java in the next JDK?
November 10, 2004 03:20 PM PST