method handles in a nutshell
1. Direct method handles
Given any method M that I am able to invoke, the JVM provides me a way to produce a method handle H(M). I can use this handle later on, even after forgetting the name of M, to call M as often as I want. Moreover, if I provide this handle to other callers, they also can invoke M through the handle, even if they do not have access rights to call M by name. If the method is non-static, the method handle always takes the receiver as its first argument. If the method is virtual or interface, the method handle performs the dispatch on the receiver.
A method handle will confess its type reflectively, as a series of
Class values, through the type operation.
In pseudo-code:
MHD h1 = H(Object.equals);
MHD h2 = H(System.identityHashCode);
MHD h3 = Hs(String.hashCode);
assert h1.type() == SIG[(Object,Object)boolean];
assert h1.invoke(r1,a1) == r1.equals(a1);
assert h2.invoke(a2) == System.identityHashCode(a2);
assert h3.invoke(r3) == r3.invokespecial:String.hashCode();
The actual name of the type MHD will be given shortly.
The actual API for H and Hs is uninterestingly straightforward, and
may be found at the end with the other details.
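For readers who want to try the H operator today, here is a minimal sketch. It assumes the java.lang.invoke API that later shipped in Java 7, not the java.dyn sketch given in this post; the class name DirectHandles and its helper methods are invented for the illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class DirectHandles {
    // Roughly H(Object.equals): a virtual method, so the receiver becomes argument 0.
    public static MethodHandle equalsHandle() {
        try {
            return MethodHandles.lookup().findVirtual(
                    Object.class, "equals",
                    MethodType.methodType(boolean.class, Object.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Roughly H(System.identityHashCode): a static method, so there is no receiver.
    public static MethodHandle identityHashHandle() {
        try {
            return MethodHandles.lookup().findStatic(
                    System.class, "identityHashCode",
                    MethodType.methodType(int.class, Object.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Calls r.equals(a) through the handle; the method name never appears here.
    public static boolean callEquals(Object r, Object a) {
        try {
            return (boolean) equalsHandle().invokeExact(r, a);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Calls System.identityHashCode(a) through the handle.
    public static int callIdentityHash(Object a) {
        try {
            return (int) identityHashHandle().invokeExact(a);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(equalsHandle().type());
        System.out.println(callEquals("foo", "foo"));
    }
}
```

The handle for equals reports a two-argument type because the receiver has been prepended, which matches SIG[(Object,Object)boolean] in the pseudo-code.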
To complete the low-level access (and fill a gap in the Core
Reflection API), there is a variation Hs(M) which forces static
linkage just like an invokespecial instruction, and is
allowed only if I have the right to issue an
invokespecial instruction on M.
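The Hs variation can be sketched with findSpecial from the java.lang.invoke API that shipped in Java 7 (an assumption; this post predates it). The classes Base and Derived are hypothetical. Only Derived may build the handle, mirroring the rule that only Derived may issue invokespecial against Base.name.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class SpecialHandles {
    // Hypothetical classes, invented for this illustration.
    static class Base {
        String name() { return "base"; }
    }

    static class Derived extends Base {
        @Override
        String name() { return "derived"; }

        // Built inside Derived, so the lookup class equals the special caller.
        static MethodHandle superName() {
            try {
                return MethodHandles.lookup().findSpecial(
                        Base.class, "name",
                        MethodType.methodType(String.class), Derived.class);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    // Statically linked call: runs Base.name even though the receiver is a Derived.
    public static String demoSpecial() {
        try {
            Derived d = new Derived();
            return (String) Derived.superName().invokeExact(d);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Ordinary virtual call for comparison: dispatches to Derived.name.
    public static String demoVirtual() {
        Base b = new Derived();
        return b.name();
    }

    public static void main(String[] args) {
        System.out.println(demoSpecial());
        System.out.println(demoVirtual());
    }
}
```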
From the JVM implementor’s point of view, there are probably three or four distinct subclasses of direct method handle, corresponding to the distinct varieties of invoke instruction. To round things out, one kind of method handle should work for invoking a method handle itself. These are low-level concerns, which hide nicely behind the H (and Hs) operator described above.
2. Invoking method handles
Given a method handle H, I can invoke it by issuing an invokeinterface bytecode against it. The signature I use
must exactly match the original signature of the target method. (Even
beyond the spelling, the linked meaning of class names must be the
same, in the argument and return types.) The method name I use must
always be invoke (not the name of the target method).
In pseudo-code:
MHI h1 = ...;
h1.invoke(a1...)
The type MHI is a special interface type known to the JVM.
(Its actual name will be given shortly.)
MHI functions as a marker interface to tell the JVM that this
occurrence of the invokeinterface bytecode must be treated
specially, different from all other interface invocations. For one
thing, normal JVM linking rules cannot apply, because the signature of
the call site relates to the target method, not to the marker
interface.
This kind of call site works on direct method handles (type MHD)
created in part 1 above. In a moment we will drop the other shoe
and observe that it works on other types of method handles.
The invokeinterface instruction is uniquely suited for
this sort of JVM extension, because the rules for bytecode
verification allow any object to serve as the receiver of an interface
invocation.
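The exact-match rule can be observed from Java source with a small sketch. It assumes the java.lang.invoke API that shipped in Java 7, where the call site is spelled invokeExact on the MethodHandle class rather than an invokeinterface named invoke; the class ExactInvoke is invented for the illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.WrongMethodTypeException;

public class ExactInvoke {
    // Handle for String.length, whose type is (String)int.
    static MethodHandle lengthHandle() {
        try {
            return MethodHandles.lookup().findVirtual(
                    String.class, "length", MethodType.methodType(int.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // The call site is compiled as (String)int, which matches the handle exactly.
    public static int exactCall(String s) {
        try {
            return (int) lengthHandle().invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // The call site is compiled as (Object)int: a near miss, so the call is rejected.
    public static boolean nearMissRejected() {
        try {
            Object o = "hello";
            int n = (int) lengthHandle().invokeExact(o);
            return false;
        } catch (WrongMethodTypeException e) {
            return true;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(exactCall("hello"));
        System.out.println(nearMissRejected());
    }
}
```

The signature at the call site comes from the static types of the argument expressions and the cast on the result, which is why changing the variable type from String to Object is enough to break the match.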
3. Adapting method handles
The type MHI provides a very flexible jumping off point, for the bytecodes of one method to call any other method, of any given signature. The next question is whether the calling method and receiving method have to agree exactly on the signature, and the answer is “no”. This brings us to the third and final major design point, of adapting method calling sequences.
The most important case of adaptation is partial invocation (sometimes known as currying or binding). A direct method handle by itself is really quite boring because, unlike nearly everything else in an object-oriented system, it is pure code, with no data to modify its meaning.
Thus, given a method handle and some arguments for it, the JVM will give me a partial invocation of that method handle, which is the new method handle that remembers those arguments, and, when invoked on the remaining arguments, will invoke the original method handle with the grand total set of arguments.
At the very least, the JVM is willing to let me specify the first argument R of a virtual or interface method handle H(M), because that lets it perform method dispatch when the handle is created, and hand me back a method handle Adapt(H(M),R) that not only remembers the argument R, but has also pre-resolved the method dispatch R.M. This special case of partial invocation, sometimes called “bound method references”, is enough of a hook to let programmers introduce the usual object-oriented flexibilities into method handles.
In pseudo-code:
MHD h1 = H(Object.equals); // SIG[(Object,Object)boolean]
MHB h2 = Bind(h1, (Object)"foo");
assert h2.type() == SIG[(Object)boolean];
assert h2.invoke(a2) == "foo".equals(a2);
The type MHB stands for a bound method reference. (Please wait a moment for its actual spelling.)
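The Bind pseudo-code corresponds closely to bindTo in the java.lang.invoke API that shipped in Java 7 (an assumption; the post itself only sketches java.dyn). A minimal version, with the class BoundHandle invented for the illustration:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class BoundHandle {
    // Builds the (Object,Object)boolean handle for equals, then binds the receiver.
    public static MethodHandle boundEquals(Object receiver) {
        try {
            MethodHandle h1 = MethodHandles.lookup().findVirtual(
                    Object.class, "equals",
                    MethodType.methodType(boolean.class, Object.class));
            return h1.bindTo(receiver); // remaining type: (Object)boolean
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Invokes the bound handle on the one remaining argument.
    public static boolean call(MethodHandle h2, Object a) {
        try {
            return (boolean) h2.invokeExact(a);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        MethodHandle h2 = boundEquals("foo");
        System.out.println(h2.type());
        System.out.println(call(h2, "foo"));
    }
}
```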
3.5 Further adaptation
As long as we are messing with arguments, there is a fairly unsurprising range of other adaptations that arise naturally from the richness of JVM signatures, and the conversions that apply between various data types. (The details of varargs and reflective invocation also bear on this design.)
Specifically, given two method signatures (A)T and (A')T', and a method handle H(M) of type (A)T, there is a library routine which will create me a new method handle H' = Adapt(H(M), (A')T'). It is my responsibility to help the library routine match up the corresponding arguments of the two signatures, to direct it to drop unneeded arguments in A', to supply preset values for arguments in A missing in A' (this is where partial invocation comes into the general picture), and to tell it of the presence of varargs in either signature. The library is happy to insert casts, primitive conversions, and boxing (or unboxing) to make the arguments match up completely.
Here are some pseudo-code examples:
MHD h1 = H(String.concat); // SIG[(String,String)String]
MHA h2 = Adapt(h1, SIG[(String,String)String], $1, $0);
MHA h3 = Adapt(h1, SIG[(String)String], $0, $0);
MHA h4 = Adapt(h1, SIG[(String)String], $0, ".java");
assert h2.invoke(a,b) == b.concat(a);
assert h3.invoke(c) == c.concat(c);
assert h4.invoke(c) == c.concat(".java");
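The three Adapt calls above can be approximated with separate combinators from the java.lang.invoke API that shipped in Java 7: permuteArguments for reordering and duplication, insertArguments for preset values. This is a sketch under that assumption, not the single Adapt routine described here; the class AdaptConcat is invented for the illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class AdaptConcat {
    // h1: String.concat with type (String,String)String.
    static MethodHandle concat() {
        try {
            return MethodHandles.lookup().findVirtual(
                    String.class, "concat",
                    MethodType.methodType(String.class, String.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // h2: swap the two arguments, so h2(a, b) computes b.concat(a).
    public static MethodHandle swapped() {
        MethodType t = MethodType.methodType(String.class, String.class, String.class);
        return MethodHandles.permuteArguments(concat(), t, 1, 0);
    }

    // h3: feed one incoming argument to both positions, so h3(c) computes c.concat(c).
    public static MethodHandle doubled() {
        MethodType t = MethodType.methodType(String.class, String.class);
        return MethodHandles.permuteArguments(concat(), t, 0, 0);
    }

    // h4: preset the second argument, so h4(c) computes c.concat(".java").
    public static MethodHandle withSuffix() {
        return MethodHandles.insertArguments(concat(), 1, ".java");
    }

    public static String call1(MethodHandle h, String a) {
        try {
            return (String) h.invokeExact(a);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static String call2(MethodHandle h, String a, String b) {
        try {
            return (String) h.invokeExact(a, b);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(call2(swapped(), "a", "b"));
        System.out.println(call1(doubled(), "ab"));
        System.out.println(call1(withSuffix(), "Foo"));
    }
}
```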
That is a longish step beyond bound method references, but I believe the sweet spot of the design will supply a flexible set of method signature adaptations (including currying), and let JVM implementors choose how much of that the JVM wants to take responsibility for.
At a minimum, bound method references must be special-cased by the JVM, but everything else could be supplied by a Java library (one which is willing to dynamically code-generate many of its adapter methods).
At a maximum, the JVM could supply a Swiss Army Knife combinator which interpretively handles all possible argument wrangling. This is probably the right way to go for HotSpot, since the HotSpot JIT is as well suited for optimizing complex adapters as simple ones, and having the complex ones appear to the compiler as single intrinsics is no big deal.
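The casts, boxing conversions and dropped arguments mentioned earlier in this section can also be tried out. The sketch below assumes the java.lang.invoke API that shipped in Java 7, using asType and dropArguments; the class ConvertHandle and its method names are invented for the illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class ConvertHandle {
    // Math.max with type (int,int)int.
    static MethodHandle maxHandle() {
        try {
            return MethodHandles.lookup().findStatic(
                    Math.class, "max",
                    MethodType.methodType(int.class, int.class, int.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Adapts to (Integer,Integer)Object: arguments are unboxed, the result is boxed.
    public static Object boxedMax(Integer a, Integer b) {
        try {
            MethodHandle boxed = maxHandle().asType(
                    MethodType.methodType(Object.class, Integer.class, Integer.class));
            Object result = boxed.invokeExact(a, b);
            return result;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Adapts to (String,int,int)int: the leading String is accepted and then dropped.
    public static int maxIgnoringLabel(String label, int a, int b) {
        try {
            MethodHandle h = MethodHandles.dropArguments(maxHandle(), 0, String.class);
            return (int) h.invokeExact(label, a, b);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(boxedMax(3, 7));
        System.out.println(maxIgnoringLabel("ignored", 3, 4));
    }
}
```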
Breaking the suspense: And the name of the winner is...
So we have four different types floating around:
- MHD - a direct handle to a user-requested method (either virtual or static)
- MHI - the magic type which warns the JVM of a method handle call site
- MHB - a bound method handle, which remembers the method receiver
- MHA - a more complex adapted method handle
The actual name of all four is the same: java.dyn.MethodHandle.
Clearly there will be other types under the covers, such as the
concrete types chosen by the JVM for specific direct method handles
(MHD), or various implementation classes of adapted methods (MHB,
MHA). But there is no reason to distinguish them to the user.
However, one specific case of bound method handles is important to
consider from the user’s viewpoint. If a receiver object R has
a public method (in a public API type) already named
invoke, with a signature of (S)T, then R is already
looking very much like a bound method handle for its own
invoke method, with signature (S)T.
For completeness of exposition, let’s give this kind of non-primitive method handle its own informal type name:
- MHJ - a Java object that implements MethodHandle and a type-consistent invoke operation
So, at the risk of adding a chore to the JVM implementor’s list,
I think an object of such a type (MHJ) should serve (uniformly in the
contexts described above) as a method handle. (It may be necessary
to ask that R implement the marker interface and the
type method; but this is something the system could also
figure out well enough on its own.) I admit that this is not a
necessary feature, but it could cut in half the number of small
method-like objects running around in some systems.
And the MHA implementation above probably requires an MHJ anyway.
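The observation that an object with a public invoke method already looks like a bound method handle can be made concrete. The sketch assumes the java.lang.invoke API that shipped in Java 7, where such an object does not serve as a handle on its own and has to be bound explicitly; the interface StringFn and the class InvokeObject are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class InvokeObject {
    // Hypothetical public API type with a method named invoke, signature (String)String.
    public interface StringFn {
        String invoke(String s);
    }

    // Turns the receiver R into a handle of type (String)String for its own invoke method.
    public static MethodHandle asHandle(StringFn r) {
        try {
            return MethodHandles.lookup().findVirtual(
                    StringFn.class, "invoke",
                    MethodType.methodType(String.class, String.class)).bindTo(r);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Calls R through the handle; the result matches r.invoke(s).
    public static String call(StringFn r, String s) {
        try {
            return (String) asHandle(r).invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        StringFn upper = new StringFn() {
            @Override
            public String invoke(String s) { return s.toUpperCase(); }
        };
        System.out.println(call(upper, "abc"));
    }
}
```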
Background: How did we get here?
One of the biggest puzzles for dynamic language implementors on the JVM, and therefore for the JSR 292 (invokedynamic) Expert Group, is how to represent bits of code as small but composable units of behavior. The JVM makes it easy to compose objects according to fixed APIs, but it is surprisingly hard to do this from the back end of a compiler, when (potentially) each call site is a little different from its neighbors, and none of them match some fixed API. The missing link is an object which will represent a chunk of callable behavior, but will not require an early commitment to a fixed calling sequence. In theory-language, we want an object whose API is polymorphic over all possible method signatures, so the compiler (and runtime call site linker, in turn) can manage calls in a common framework, not one framework per signature.
Put another way, we cannot represent all callees as
Runnable or Callable, because fixed
interfaces like those serve just a subset of all interesting call
signatures. APIs which attempt to represent all possible calls,
notably Java’s Core Reflection API, simulate all signatures by
boxing arguments into an array, but this is a simulation (with
telltale overheads) rather than a native JVM realization of each
signature.
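The difference between simulating a signature and realizing it natively shows up even in a tiny example. The sketch contrasts Core Reflection with a handle from the java.lang.invoke API that shipped in Java 7 (an assumption, since this post predates it); the class BoxingVsNative is invented for the illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

public class BoxingVsNative {
    // Core Reflection: both ints are boxed into an Object[] and the result comes back boxed.
    public static int viaReflection(int a, int b) {
        try {
            Method m = Math.class.getMethod("max", int.class, int.class);
            Object boxed = m.invoke(null, a, b);
            return (Integer) boxed;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Method handle: the call site itself is typed (int,int)int.
    public static int viaHandle(int a, int b) {
        try {
            MethodHandle mh = MethodHandles.lookup().findStatic(
                    Math.class, "max",
                    MethodType.methodType(int.class, int.class, int.class));
            return (int) mh.invokeExact(a, b);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(viaReflection(3, 4));
        System.out.println(viaHandle(3, 4));
    }
}
```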
We know signature polymorphism is powerful, from our experience with many dynamic and functional languages. (For an old example, consider the Lisp APPLY function, which is an efficient but universal call generator.) Integrating such polymorphism into the Java language is challenging; that’s why the function types in Neal Gafter’s closures proposal are a significant portion of the specification.
Happily, it is a simpler matter to integrate signature polymorphism into the JVM. As part of the JSR 292 process, I have been worrying about this for some time. The result is the present story of method handles which (a) JVMs can implement efficiently, which (b) are useful to language backends, and which (c) have a workable Java API. That last is actually the hardest, which is why I have not given it yet. (See previous paragraph.)
Before giving the API, I want to emphasize a few more points. First,
method handles (per se) are completely stateless and opaque. They
self-report their signature (S)T (via a type operation
on MethodHandle) but they reveal nothing else about their
target. They do not perform any of the symbol table queries supplied
by the Core Reflection API.
Every native call site for a method handle is hardwired with a particular signature. Compiler writers have every right to expect that, if the target method has a similar signature, the call will have only a few instructions of overhead. Likewise, a method handle’s signature is intrinsic to the handle, and completely rigid. Calls to near-miss signatures will fail, as will violations of class loader naming consistency.
Besides signature simulation, one serious overhead in the Core Reflection API is the requirement that, on every call to a reflected method, the JVM look at the caller’s identity and perform an access check to make sure that he is not calling someone else’s private method. The method handle design respects all such access checks, but performs them up front at handle creation, where (presumably) they are more affordable. But you can publish a handle to your own private method, if you choose.
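The up-front access check, and the option of publishing a handle to your own private method, can be sketched as follows. It assumes the java.lang.invoke API that shipped in Java 7, where the check happens when the lookup object resolves the method; the classes Vault and PublishPrivate are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical owner of a private method.
class Vault {
    private static String secret() { return "s3cret"; }

    // Access is checked here, once, against Vault's own lookup object.
    static MethodHandle publish() {
        try {
            return MethodHandles.lookup().findStatic(
                    Vault.class, "secret", MethodType.methodType(String.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}

public class PublishPrivate {
    // This class cannot call Vault.secret() by name, but it can call the published handle.
    public static String callPublished() {
        try {
            return (String) Vault.publish().invokeExact();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Asking for the private method by name through a public lookup is refused.
    public static boolean lookupByNameRefused() {
        try {
            MethodHandles.publicLookup().findStatic(
                    Vault.class, "secret", MethodType.methodType(String.class));
            return false;
        } catch (IllegalAccessException e) {
            return true;
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callPublished());
        System.out.println(lookupByNameRefused());
    }
}
```

No access check is repeated when the published handle is invoked; whoever holds the handle can call it.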
One use case (which I have used to test the quality of this design) is
whether it can be used to re-implement the invoke
functionality in the Core Reflection API, for better speed and code
compactness. This has long been a sore spot for language implementors
(for reasons detailed above). This is one reason I have included varargs
in the competency of the method adaptation API.
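As a rough illustration of this use case, a reflective-style invoke that takes a boxed argument array can be assembled from handle adaptations. The sketch assumes the java.lang.invoke API that shipped in Java 7 (unreflect, asType, asSpreader), not the adaptArguments routine sketched in this post; the class ReflectiveInvoke and its convention of passing the receiver as the first array element are invented for the illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Method;

public class ReflectiveInvoke {
    // Emulates a boxed reflective call; receiverAndArgs holds the receiver first (if any).
    public static Object invokeBoxed(Method m, Object... receiverAndArgs) {
        try {
            MethodHandle direct = MethodHandles.lookup().unreflect(m);
            // Erase every parameter and the return type to Object, inserting casts and boxing.
            MethodHandle generic = direct.asType(direct.type().generic());
            // Collapse all parameters into a single Object[] parameter.
            MethodHandle spread = generic.asSpreader(
                    Object[].class, generic.type().parameterCount());
            Object result = spread.invokeExact(receiverAndArgs);
            return result;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Calls "foo".concat("bar") through the emulated reflective path.
    public static Object demoConcat() {
        try {
            Method concat = String.class.getMethod("concat", String.class);
            return invokeBoxed(concat, "foo", "bar");
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demoConcat());
    }
}
```

A production version would cache the adapted handle per Method instead of rebuilding it on every call, since the adaptation is the expensive part.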
The calling sequence for a method handle (in part 2 above) will be
approximately as fast as today’s interface invocations.
Searching for an invoke method in a receiver is the same
sort of task as searching for an interface (and its associated
“vtable”, if you use such things). The search can be sped
up by the usual sorts of pre-indexing. A JVM-managed method handle
will advertise its signature prominently in its header, so that a
pointer equality check (remember, signature agreement is exact) is all
that needs to happen before the caller jumps through a hardware-level
function address.
Details and a hasty exit
Finally, here is a sketch of the API:
package java.dyn;
public interface MethodHandle /*>*/ {
    // T type(); public R invoke(A...);
    public MethodType type();
}
public interface MethodType {
    public Class<?> parameterType(int num); // -1 => return type
    public int parameterCount();
}
public class MethodHandles {
    public static MethodHandle findStatic(Class<?> defc, String name, MethodType type);
    public static MethodHandle findVirtual(Class<?> defc, String name, MethodType type);
    public static MethodHandle findSpecial(Class<?> defc, String name, MethodType type);
    public static MethodHandle unreflect(java.lang.reflect.Method m);
    public static MethodHandle convertArguments(MethodHandle mh, MethodType newType);
    public static MethodHandle insertArgument(MethodHandle mh, Object value);
    ...
    // The whole enchilada:
    public static MethodHandle adaptArguments(MethodHandle mh, MethodType newType, String argumentMovements, Object values);
}
That’s it, in a nutshell. Perhaps a rather large coconut shell. Actually, quite small, if you are used to Unix shells.
You will have noticed that there is no way to call these guys from
Java code, unless you assemble yourself a class file around the
required invokeinterface. It is simple enough to create
a Java API for calling method handles. Getting performance beyond the
reflective boxed-varargs style of calling is a little messier, but
doable. Dynamic language implementors solve this sort of thing as
they fight to remove simulation overheads from their system. Given
closures in Java, there would be nicer bridges for interoperability, to say
nothing of implementing closures on top of method handles.
But the point is not calling or using these things from Java; the point is using them, down near the metal, to assemble the next 700 witty and winsome programming languages.



At the risk of sounding like I don't get out much, I'm really pleased to see the progress being made on JVM, or should I say MLVM, changes.
As someone who uses one language evolution of Java that leverages the VM, AspectJ, I think that the VM changes are *the* crucial stream of work on Java 7, especially where the VM forces less efficient approaches.
Keep up the great work!
Posted by Neale on April 18, 2008 at 06:08 AM PDT #
Compiler error: You use MethodHandle.getType() in the pseudo code but you have defined MethodHandle.type() in the API sketch.
Posted by Andrea Francia on April 18, 2008 at 12:05 PM PDT #
Neale: Thanks. Working on it...
Andrea: Grazie. I simplified getType => type.
There's some conversation on this over at http://groups.google.com/group/jvm-languages/t/f8df67386ad3c17d .
Posted by John Rose on April 18, 2008 at 01:05 PM PDT #
Great Work.
Are you folks also considering a way to extend classes in Java the way "prototype" extends a class in JavaScript?
Posted by Mayur Patel on October 06, 2008 at 11:49 AM PDT #