Kelly O'Hair's Weblog (blogs.sun.com)
Friday Feb 29, 2008
A Short java_crw_demo User Guide
The java_crw_demo library is provided as a native code (C) demo of a BCI (Byte Code Instrumentation) library that can instrument class files. It is just a demo, but a working one: it has been used in the hprof VM agent and in various JVM TI demo agents delivered in the OpenJDK JVM TI Demo Sources (the built versions of these demos are typically in the demo/jvmti directory of the JDK installed on your system). The complete code for java_crw_demo can be found in the OpenJDK Mercurial Repository. In particular, the #include file java_crw_demo.h is of primary interest.
A complete description of the class file format can be found in Chapter 4 of the Java Virtual Machine Specification (or look at the Wikipedia entry on the class file format). Only part of the class file needs to be modified for basic instrumentation: the Code attribute (including the max_stack field), the constant pool, the exception table, the LineNumberTable, the LocalVariableTable, the LocalVariableTypeTable, the StackMapTable, and any StackMap attributes. java_crw_demo does not add methods or fields, and it does not change the exception data except to adjust the pc offsets (byte offsets) in the exception table. In fact, the basic table changes are just these pc or byte offset adjustments. Any instrumentation more detailed than this is best done with something other than a demo library like java_crw_demo; something like ASM or BCEL would make much more sense.
Please keep in mind this is just a demo library, and fairly primitive in what it can do. Its functionality was driven by what hprof needed to do, e.g. instrument method entry and exit, and instrument memory allocations.
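To make the "pc offset adjustment" idea concrete, here is a minimal sketch of how exception table entries could be remapped after bytes are injected into a method. This is not the java_crw_demo code: the names ExceptionEntry, build_pc_map and adjust_exception_table are hypothetical, and the sketch assumes that an old pc should map to the new position of the original instruction, after any bytes injected in front of it.

```c
#include <assert.h>
#include <stddef.h>

/* One exception table entry from a Code attribute (hypothetical struct). */
typedef struct {
    unsigned start_pc;    /* inclusive start of the protected range */
    unsigned end_pc;      /* exclusive end of the protected range */
    unsigned handler_pc;  /* start of the handler */
    unsigned catch_type;  /* constant pool index, not touched here */
} ExceptionEntry;

/* Build map[old_pc] = new_pc for old_pc in 0..code_len (inclusive).
 * inject[pc] is the number of bytes inserted in front of the instruction
 * that starts at old pc (0 everywhere else).
 * Assumption: an old pc maps to where the ORIGINAL instruction ends up. */
void build_pc_map(const unsigned *inject, unsigned code_len, unsigned *map)
{
    unsigned shift = 0;
    unsigned pc;

    for (pc = 0; pc < code_len; pc++) {
        shift += inject[pc];
        map[pc] = pc + shift;
    }
    map[code_len] = code_len + shift;  /* end_pc may equal code_len */
}

/* Rewrite the three pc fields of every entry; nothing else changes. */
void adjust_exception_table(ExceptionEntry *table, size_t count,
                            const unsigned *map, unsigned code_len)
{
    size_t i;

    for (i = 0; i < count; i++) {
        assert(table[i].start_pc < code_len);
        assert(table[i].end_pc <= code_len);
        assert(table[i].handler_pc < code_len);
        table[i].start_pc   = map[table[i].start_pc];
        table[i].end_pc     = map[table[i].end_pc];
        table[i].handler_pc = map[table[i].handler_pc];
    }
}
```

The same old-to-new map could in principle be reused for the LineNumberTable and LocalVariableTable offsets, which is why a single map is built first rather than adjusting each table independently.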
Basically there are just two functions in this library:
void java_crw_demo(
        unsigned class_number,
        const char * name,
        const unsigned char * file_image,
        long file_len,
        int system_class,
        char * tclass_name,
        char * tclass_sig,
        char * call_name,
        char * call_sig,
        char * return_name,
        char * return_sig,
        char * obj_init_name,
        char * obj_init_sig,
        char * newarray_name,
        char * newarray_sig,
        unsigned char ** pnew_file_image,
        long * pnew_file_len,
        FatalErrorHandler fatal_error_handler,
        MethodNumberRegister mnum_callback);

char * java_crw_demo_classname(
        const unsigned char * file_image,
        long file_len,
        FatalErrorHandler fatal_error_handler);
The java_crw_demo_classname function extracts the class name from a class
file image. This is needed because in some cases classes are loaded into the
VM without a name (see the defineClass method in java.lang.ClassLoader).
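For readers curious what such a lookup involves, the following is a rough, self-contained sketch of pulling the class name out of a class file image. It is not the library's implementation: sketch_classname and sample_image are hypothetical names, the constant pool entry sizes are taken from the class file format in the JVM specification, and error handling is reduced to returning NULL.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Read a big-endian u2 from the image. */
static unsigned read_u2(const unsigned char *p) { return ((unsigned)p[0] << 8) | p[1]; }

/* Hypothetical helper: returns a malloc()ed copy of the class name
 * (e.g. "java/lang/Object"), or NULL if the image looks malformed. */
char *sketch_classname(const unsigned char *image, long len)
{
    long pos = 8;      /* skip magic (4), minor_version (2), major_version (2) */
    long *entry;       /* entry[i] = offset of constant pool entry i (0 = none) */
    unsigned count, i, this_class, name_index, name_len;
    char *name = NULL;

    assert(image != NULL);
    if (len < 10 || image[0] != 0xCA || image[1] != 0xFE ||
        image[2] != 0xBA || image[3] != 0xBE) return NULL;
    count = read_u2(image + pos);
    pos += 2;
    entry = calloc(count ? count : 1, sizeof *entry);
    if (entry == NULL) return NULL;

    for (i = 1; i < count; i++) {          /* constant pool index 0 is unused */
        long size;
        if (pos + 3 > len) goto done;
        entry[i] = pos;
        switch (image[pos]) {
        case 1:  size = 3 + (long)read_u2(image + pos + 1); break; /* Utf8 */
        case 7: case 8: case 16: case 19: case 20: size = 3; break;
        case 15: size = 4; break;                                  /* MethodHandle */
        case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
            size = 5; break;
        case 5: case 6: size = 9; i++; break;   /* Long/Double take two slots */
        default: goto done;                     /* unknown tag */
        }
        pos += size;
    }
    if (pos + 4 > len) goto done;
    this_class = read_u2(image + pos + 2);      /* follows access_flags */
    if (this_class == 0 || this_class >= count || entry[this_class] == 0 ||
        image[entry[this_class]] != 7) goto done;           /* must be a Class */
    name_index = read_u2(image + entry[this_class] + 1);
    if (name_index == 0 || name_index >= count || entry[name_index] == 0 ||
        image[entry[name_index]] != 1) goto done;           /* must be a Utf8 */
    name_len = read_u2(image + entry[name_index] + 1);
    name = malloc(name_len + 1);
    if (name != NULL) {
        memcpy(name, image + entry[name_index] + 3, name_len);
        name[name_len] = '\0';
    }
done:
    free(entry);
    return name;
}

/* A tiny hand-built image, truncated right after this_class, for illustration. */
const unsigned char sample_image[] = {
    0xCA, 0xFE, 0xBA, 0xBE,  0x00, 0x00, 0x00, 0x31,  /* magic, minor, major */
    0x00, 0x03,                                       /* constant_pool_count */
    0x07, 0x00, 0x02,                                 /* #1 Class -> #2 */
    0x01, 0x00, 0x07, 'p', 'k', 'g', '/', 'F', 'o', 'o', /* #2 Utf8 */
    0x00, 0x21,                                       /* access_flags */
    0x00, 0x01                                        /* this_class = #1 */
};
```

Calling sketch_classname(sample_image, sizeof sample_image) on the image above yields the string pkg/Foo, which the caller must free().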
java_crw_demo is the function you call with a class file image to get
back an instrumented class file image. The arguments are defined below:
- unsigned class_number
  A unique identifying number for this class in your agent (you get to define what this will mean). This number will be given back to you when the instrumentation code is executed for method calls and method returns. If you are not doing method call or method return instrumentation, this has little value. It is assumed that you would have some kind of table in the agent code that could map this class number to the class name and its method tables if needed.
- const char * name
  The name of the class in the form "java/lang/Object".
- const unsigned char * file_image
  The class file image.
- long file_len
  The number of bytes in the file_image.
- int system_class
  Set to non-zero if this class is one that is loaded very early in the VM startup. Great care is needed when modifying these classes during VM startup.
- char * tclass_name
  The name of the Tracker class that will have the static methods we will call as part of the instrumentation code.
- char * tclass_sig
  The class signature for the Tracker class.
- char * call_name
  The name of the static method in the Tracker class that will be used for method entries or indications of method calls.
- char * call_sig
  The method signature for the call_name method.
- char * return_name
  The name of the static method in the Tracker class that will be used for method exits or indications of method returns.
- char * return_sig
  The method signature for the return_name method.
- char * obj_init_name
  The name of the static method in the Tracker class that will be used for object allocations.
- char * obj_init_sig
  The method signature for the obj_init_name method.
- char * newarray_name
  The name of the static method in the Tracker class that will be used for array allocations.
- char * newarray_sig
  The method signature for the newarray_name method.
- unsigned char ** pnew_file_image
  If instrumentation happens, this is set to point to the new instrumented class file image, in malloc() space.
- long * pnew_file_len
  The length of the new class file image returned in *pnew_file_image.
- FatalErrorHandler fatal_error_handler
  If non-NULL, provides a function to call when fatal errors are encountered while parsing or creating the new class file image.
- MethodNumberRegister mnum_callback
  If non-NULL, provides a callback function to get access to the method names and signatures in the class. The callback receives the class number you supplied, plus arrays of method names and signatures and a count of those methods. The method numbers (the index into the arrays is the method number) are baked into the instrumentation and passed to the instrumented method calls.
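The class_number and mnum_callback descriptions above imply an agent-side table that maps a class number and method number back to a name and signature. A minimal sketch of such a table follows. The typedef is an assumption that should be checked against java_crw_demo.h, and class_table, MAX_CLASSES, sketch_mnum_callback and sketch_method_name are hypothetical names. The strings are copied because this sketch assumes the arrays passed to the callback may not outlive the call.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed to mirror the callback type in java_crw_demo.h; verify there. */
typedef void (*MethodNumberRegister)(unsigned cnum, const char **names,
                                     const char **sigs, int count);

#define MAX_CLASSES 64   /* hypothetical fixed size, enough for a sketch */

typedef struct {
    int    count;   /* number of methods recorded for this class */
    char **names;   /* names[mnum] */
    char **sigs;    /* sigs[mnum] */
} ClassMethods;

static ClassMethods class_table[MAX_CLASSES];

static char *copy_string(const char *s)
{
    char *c = malloc(strlen(s) + 1);
    if (c != NULL) strcpy(c, s);
    return c;
}

/* Would be passed as the mnum_callback argument: records the methods. */
void sketch_mnum_callback(unsigned cnum, const char **names,
                          const char **sigs, int count)
{
    ClassMethods *cm;
    int i;

    assert(names != NULL && sigs != NULL);
    if (cnum >= MAX_CLASSES || count <= 0) return;
    cm = &class_table[cnum];
    cm->names = calloc((size_t)count, sizeof(char *));
    cm->sigs  = calloc((size_t)count, sizeof(char *));
    if (cm->names == NULL || cm->sigs == NULL) {
        free(cm->names);
        free(cm->sigs);
        cm->names = cm->sigs = NULL;
        return;
    }
    for (i = 0; i < count; i++) {
        cm->names[i] = copy_string(names[i]);
        cm->sigs[i]  = copy_string(sigs[i]);
    }
    cm->count = count;
}

/* Look up the name the Tracker method would report as (cnum, mnum). */
const char *sketch_method_name(unsigned cnum, unsigned mnum)
{
    if (cnum >= MAX_CLASSES || mnum >= (unsigned)class_table[cnum].count) {
        return NULL;
    }
    return class_table[cnum].names[mnum];
}
```

With a table like this, a native Tracker method that receives a class number and method number from the instrumented code can turn them back into readable names without asking the VM.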
It's assumed that the JVM TI agent code would request some kind of class load event; a good example is the heapTracker demo. When it gets a ClassFileLoadHook event (JVMTI_EVENT_CLASS_FILE_LOAD_HOOK), it effectively passes the class image to java_crw_demo:
    newImage = NULL;
    newLength = 0;
    java_crw_demo(cnum, classname,
                  class_data, class_data_len, systemClass,
                  "HeapTracker", "LHeapTracker;",
                  NULL, NULL,
                  NULL, NULL,
                  "newobj", "(Ljava/lang/Object;)V",
                  "newarr", "(Ljava/lang/Object;)V",
                  &newImage, &newLength,
                  NULL, NULL);
This call only instruments object and array allocations, and it doesn't use the callbacks.
This demo doesn't use java_crw_demo as fully as hprof does: the hprof Tracker class is a complete Tracker class, while the HeapTracker class is just a partial tracking class. The VM agent needs to implement and register the native methods for these Tracker classes.
The Tracker class doesn't have to use native methods, but since hprof was a native code agent, and most VM agents are native code, somehow the information captured via the class file instrumentation needs to get back into the native agent anyway.
A pure Java VM agent via the Java agent mechanisms is probably a better way to go, but at this time I don't have a simple demo of a Java agent.
-kto
Posted at 06:24PM Feb 29, 2008 by kto in Java | Comments[2]

You're the best source for BCI using C language. I'm interested in adding a static method in the Tracker class that will be used for indications of method invocations. I would like to track inside the method to be instrumented whenever this method invokes a particular method of a particular class. Any tips'n'tricks for adding to java_crw_demo a method that will be called when invoke opcodes are encountered in the code of the method to be instrumented? Basically, I want to instrument the methods of a class in order to be able to track when these methods invoke a particular method of a particular class. Would you have any suggestions to this problem?
Posted by lau on March 25, 2008 at 01:26 AM PDT #
Modifying the bytecodes is tricky to get right. I'd write a little C program that accepts a classfile and creates a classfile, to test the java_crw_demo changes you will make. Then use the 'javap' utility to dump out the class files and compare them. It has an option to dump everything in the classfile; use that.
Then run your test program over all of the JDK classes to make sure it doesn't fall over. Maybe create a new, say, tools.jar with all the classes modified and see if it still works.
Don't forget the impact on max_stack. The handling is crude now and may be OK as is; it depends on how the stack is impacted by your new bytecodes.
The modified Tracker class needs to be in the boot classpath.
Good Luck.
-kto
Posted by Kelly O'Hair on March 25, 2008 at 12:42 PM PDT #