Tuesday May 19, 2009

Locations of visitors to this page

Dimitris Andreou suggested that I try the Boyer Moore string search algorithm (instead of String.search) to see how it would impact the fast grep described earlier. It turns out that while Boyer Moore searching takes much less time than String.contains, the bottleneck in the fast grep is not in the searching step. But still, considering that BoyerMoore is faster, I have incorporated it in the fast grep.

First, here is the source code of a class that performs BoyerMoore searching (initially copied from here):

package com.subhajit.util.grep;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
* String matching Boyer-Moore. Copied from
* http://es.geocities.com/luisja80/rep/StringMatchingBM.java, followed by
* refactoring.
*/
public class BoyerMoore {

public static final int ALPHABET_SIZE = 256 ;
private int [] bmBC;
private int [] bmGs;
public int comparisons;

private final String pattern;
private final char [] patternChars;
private final int patternCharsLen;

public BoyerMoore ( String pattern ) {
super () ;
this .pattern = pattern;
patternChars = this .pattern.toCharArray () ;
patternCharsLen = patternChars.length; // pattern length
preBmBc ( patternChars ) ;
preBmGs ( patternChars ) ;
}

public void preBmBc ( char [] x ) {

int m = x.length;
bmBC = new int [ ALPHABET_SIZE ] ;

Arrays.fill ( bmBC, m ) ;

for ( int i = 0 ; i < m - 1 ; i++ ) {
bmBC [ x [ i ]] = m - i - 1 ;
}
}

public int [] suffixes ( char [] x ) {

int m = x.length;
int suff [] = new int [ m ] ;
int g = m - 1 ;
int f = m - 1 ;

suff [ m - 1 ] = m;

for ( int i = m - 2 ; i >= 0 ; --i ) {
if ( i > g && ( i + m - 1 - f ) < m && suff [ i + m - 1 - f ] < i - g ) {
suff [ i ] = suff [ i + m - 1 - f ] ;
} else {
g = i;
f = g;

while ( g >= 0 && x [ g ] == x [ g + m - 1 - f ]) {
--g;
}

suff [ i ] = f - g;
}
}

return suff;
}

public void preBmGs ( char [] x ) {

int m = x.length;
bmGs = new int [ m ] ;

int suff [] = suffixes ( x ) ;

Arrays.fill ( bmGs, m ) ;

int j = 0 ;

for ( int i = m - 1 ; i >= - 1 ; --i ) {
if ( i == - 1 || suff [ i ] == i + 1 ) {
for ( ; j < m - 1 - i; ++j ) {
if ( bmGs [ j ] == m ) {
bmGs [ j ] = m - 1 - i;
}
}
}
}

for ( int i = 0 ; i < m - 1 ; i++ ) {
bmGs [ m - 1 - suff [ i ]] = m - 1 - i;
}

}

public List<Integer> search ( String text ) {
char [] textChars = text.toCharArray () ;
int textLen = textChars.length; // string length
List<Integer> resultado = new ArrayList<Integer> () ;

int j = 0 ;
int i = 0 ;
comparisons = 0 ;

while ( j <= textLen - patternCharsLen ) {
for ( i = patternCharsLen - 1 ; i >= 0
&& patternChars [ i ] == textChars [ i + j ] ; i-- ) {
comparisons++;
}

if ( i < 0 ) {
resultado.add ( j ) ;
j += bmGs [ 0 ] ;
} else {
j += Math.max ( bmGs [ i ] , bmBC [ textChars [ i + j ]] - patternCharsLen
+ 1 + i ) ;
}
}

return resultado;
}

public boolean isPresentIn ( String text ) {
char [] textChars = text.toCharArray () ;
int textLen = textChars.length; // string length

int j = 0 ;
int i = 0 ;
comparisons = 0 ;

while ( j <= textLen - patternCharsLen ) {
for ( i = patternCharsLen - 1 ; i >= 0
&& patternChars [ i ] == textChars [ i + j ] ; i-- ) {
comparisons++;
}

if ( i < 0 ) {
return true ;
} else {
j += Math.max ( bmGs [ i ] , bmBC [ textChars [ i + j ]] - patternCharsLen
+ 1 + i ) ;
}
}

return false ;
}
}

The original source code has been changed in the following ways:

1. The cost of initializing a "BoyerMoore" object has been moved to the constructor.

2. Several key data members have been declared final with a view to making the "match" method thread safe.

3. A new method "isPresentIn" has been added that returns whether the pattern being sought is present in the given text or not, without returning all the positions where it occurs (as does the "match" method).

4. Some variables have been renamed for easier understanding.

The Grep method has been changed to use "BoyerMoore.isPresentIn(...)" instead of String.contains(...):

// Setup the output queue containing FileContent objects
// representing matches. The output queue is populated by the file
// finders object.
final BoyerMoore bm = new BoyerMoore ( text ) ;
final BlockingQueue<FileContent> outputQueue = new LinkedBlockingQueue<FileContent> () ;
final QueueProcessor<FileContent, FileContent> fileFinders = new QueueProcessor<FileContent, FileContent> (
contentQueue, outputQueue, 5 ,
new IProcessor<FileContent, FileContent> () {
public FileContent process ( FileContent input )
throws InterruptedException,
InvocationTargetException {
if ( bm.isPresentIn ( input.getContent ()) ){
return input;
} else {
return null ;
}
}
}) ;

As I said, this makes a marginal improvement in the overall execution of grep on my setup. Your mileage might vary.

Wednesday May 13, 2009

Dimitris Andreou commented on my previous posting on fast directory scanning by suggesting that the individual threads doing the work of listing directories were not evenly loaded. In other words, the solution presented did not represent an even partitioning of the work between the threads. The uneven distribution results from the files and directories each directory contains: the more this number, the more “work” the thread that processes the directory has to do.

To understand how to didvide the work between the threads, it is important to note that these threads do two things: obtain directory listings (via File.listFiles()) and examine each entry in the array of files returned by that method. The first method (listFiles()) is a library method which must be used as it is (which is another way to say that it cannot be subdivided into constituent parts at the application level). The other task performed by each thread, that is the examination of each directory entry, is the only candidate for partitioning.

One obvious way to perform this partitioning is to have each thread examine a fixed number of directory listing entries, instead of all of these. This fixed number, called an “installment”, is an input parameter to the directory scan. Accordingly, I wrote the following class to “fairly” load the threads. Note that the FairScanner class presented below still does not guarantee that each thread examines the same number of directory elements: all it does is increase the likelihood that this happens at runtime.

Without further ado, here is the code:

public class FairScanner {
    private final Semaphore sem = new Semaphore(1);
    private final AtomicInteger waitingCount = new AtomicInteger(0);
    private final List<ThreadWithList> threads = new ArrayList<ThreadWithList>();
    private final int installmentSize;

    private static class DirInfo {
        private final File dir;
        private final File[] listing;
        private int index;

        public DirInfo(File dir) {
            super();
            this.dir = dir;
            this.listing = dir.listFiles();
            this.index = 0;
        }

        public int getIndex() {
            return index;
        }

        public void setIndex(int index) {
            this.index = index;
        }

        public File getDir() {
            return dir;
        }

        public File[] getListing() {
            return listing;
        }
    }

    private final BlockingQueue<DirInfo> queue;

    private final class ThreadWithList extends Thread {
        private final List<File> files;
        private int useCount = 0;

        public ThreadWithList() {
            super();
            files = new ArrayList<File>();
        }

        public List<File> files() {
            return this.files;
        }

        public void incrementUseCount() {
            useCount++;
        }

        public int getUseCount() {
            return useCount;
        }

        public void run() {
            while (true) {
                if (FairScanner.this.scan0()) {
                    break;
                }
            }
        }
    }

    private FairScanner(int threads, int installmentSize) {
        super();
        this.installmentSize = installmentSize;
        queue = new LinkedBlockingQueue<DirInfo>();
        for (int i = 0; i < threads; i++) {
            ThreadWithList t = new ThreadWithList();
            this.threads.add(t);
        }
    }

    public Collection<File> scan(final File dir) throws InterruptedException,
            ExecutionException {
        sem.acquire();
        queue.add(new DirInfo(dir));
        for (ThreadWithList t : threads) {
            t.start();
        }
        sem.acquire();
        List<File> ret = new ArrayList<File>();
        for (ThreadWithList t : threads) {
            ret.addAll(t.files);
        }
        return ret;
    }

    private boolean scan0() {
        waitingCount.incrementAndGet();
        // Remove the next item from the queue.
        ThreadWithList thread = ((ThreadWithList) Thread.currentThread());
        DirInfo dirInfo = null;
        try {
            if (waitingCount.get() == threads.size() && queue.isEmpty()) {
                sem.release();
                return true;
            }
            dirInfo = queue.take();
        } catch (InterruptedException exc) {
            Thread.currentThread().interrupt();
            return true;
        } finally {
            waitingCount.decrementAndGet();
        }

        int index = dirInfo.getIndex();
        File[] listing = dirInfo.getListing();
        int upperBound = Math.min(index + installmentSize, listing.length);
        for (int i = index; i < upperBound; i++) {
            thread.files.add(listing[i]);
            if (listing[i].isDirectory()) {
                DirInfo subdirInfo = new DirInfo(listing[i]);
                try {
                    queue.put(subdirInfo);
                } catch (InterruptedException exc) {
                    Thread.currentThread().interrupt();
                    return true;
                }
            }
        }
        if (upperBound != listing.length) {
            dirInfo.setIndex(upperBound);
            queue.add(dirInfo);
        }
        thread.useCount += (upperBound - index);

        return false;
    }

    public void close() {
        for (ThreadWithList t : threads) {
            if (t.isAlive()) {
                t.interrupt();
            }
        }
        for (ThreadWithList t : threads) {
            try {
                t.files.clear();
                t.join();
                System.out.print(t.getUseCount() + " ");
            } catch (InterruptedException exc) {
                Thread.currentThread().interrupt();
                return;
            }
        }
        System.out.println();
        queue.clear();
    }

    private static final List<FairScanner> scanners = new ArrayList<FairScanner>();

    public static Collection<File> listAllContentsUnder(File dir, int threads,
            int installmentSize) throws InterruptedException,
            ExecutionException {
        FairScanner scanner = new FairScanner(threads, installmentSize);
        scanners.add(scanner);
        return scanner.scan(dir);
    }

    public static void flush() {
        for (FairScanner scanner : scanners) {
            scanner.close();
        }
    }

}

Note the following items about this class:

1. There is a bounded queue (named “queue”) that holds “DirInfo” objects. These objects contain information about a directory such as a reference to the directory, an array of files representing the directory listing, and an “index” into this array representing the next item to be examined.

2. There is a custom Thread (called “ThreadWithList”) that endlessly loops and calls “scan0”.

3. The “scan0” method (which is endlessly called by instances of ThreadWithList), takes a DirInfo object from the queue, examines the next so many elements of the list of files, adds entries to the list of files, and adds new DirInfo objects to the queue for those entries that happen to be directories.

4. The scan is deemed to be complete when all the threads are waiting to take an element from the queue, and the queue is empty.

5. There are two arguments that can be used to fine tune the behavior of the FairScanner: the “threads” argument specifies the number of threads to use, and the “installmentSize” argument specifies the number of directory listing elements each thread examines at a time.

I have changed the “main” method used to test these classes as follows:

public static void main(String[] args) {
    int status = 0;
    try {
        Set<File> files0 = new HashSet<File>();
        Collection<File> files1 = null;

        for (int threads = 5; threads < 20; threads++) {
            for (int installmentSize = 15; installmentSize < 20; installmentSize++) {
                int iterations = 5;
                long t0 = System.nanoTime();
                for (int i = 0; i < iterations; i++) {
                    files0 = SerialDirScanner
                            .listAllContentsUnder(new File(args[0]));
                }
                t0 = System.nanoTime() - t0;
                long ticks0 = (t0 / 10);
                System.out.println((t0 / 10) + " ticks.");
                t0 = 0;
                for (int i = 0; i < iterations; i++) {
                    long t1 = System.nanoTime();
                    files1 = FairScanner.listAllContentsUnder(new File(
                            args[0]), threads, installmentSize);
                    t1 = System.nanoTime() - t1;
                    t0 += t1;
                    FairScanner.flush();
                }
                System.out.println("Threads = " + threads
                        + ",InstallmentSize = " + installmentSize);
                long ticks1 = (t0 / 10);
                System.out.println((t0 / 10) + " ticks.");
                System.out.println(files0.size() + "," + files1.size());
                System.out.println(((ticks1 * 100) / ticks0) + " %");
                System.out.println();
            }
        }
    } catch (Throwable exc) {
        status = 1;
        exc.printStackTrace();
    } finally {
        System.exit(status);
    }
}

Observations

Some general observations made so far:

1. Increasing the installment size to a size of fifteen to twenty results in the best performance.

2. A thread count of about ten seems to result in the best performance.

I will post detailed data once I collect it.

Finally, note that this is a work in progress, so please do suggest improvements, shortcomings, etc.. Thanks to Josef and Dimitris for suggesting improvements.

Tuesday May 12, 2009

After posting the "DirScanner" class for fast directory scanning this morning (see my earlier post), some possible improvements to DirScanner occured to me.

One "low hanging fruit" was the common concurrent map shared between all the threads in DirScanner, and its potential to serve as a "point of contention". Note that since all threads write to this map, it is (internally, and partially) locked each time it is updated. The concurrent hash map uses lock striping to alleviate contention, but does not eliminate it altogether.

In the improved version shown below, each thread (of the fixed thread pool) is a custom thread (ThreadWithList) containing a list of files. (Josef, a reader, also mentioned this in his comment to my earlier posting). The "scan0" method, which is called within a thread, can thus safely add to the particular thread's list of files without any synchronization.

There is an instance method named "listOfLists", and each ThreadWithList adds a reference to its instance member "list" to listOfLists.

Once all the work is done, the "scan" method simply iterates through the list and accumulates and returns all of the entries in each list.

Ok, so here is the source code of the DirScanner3 class:

public class DirScanner3 {
    private final ExecutorService executor;
    private final Semaphore sem = new Semaphore(1);
    private final AtomicInteger count = new AtomicInteger();

    private final List<List<File>> listOfLists = new ArrayList<List<File>>();

    /**
     * Custom {@link Thread} returned by the {@link ThreadFactory} used by the
     * {@link ExecutorService} data member <tt>executor</tt>.
     */
    private final class ThreadWithList extends Thread {
        private final List<File> files;

        public ThreadWithList(Runnable r) {
            super(r);
            files = new ArrayList<File>();
            DirScanner3.this.listOfLists.add(files);
        }

        public List<File> files() {
            return this.files;
        }
    }

    private DirScanner3(int threads) {
        super();
        executor = Executors.newFixedThreadPool(threads, new ThreadFactory() {
            public Thread newThread(Runnable r) {
                return new ThreadWithList(r);
            }
        });
    }

    public Collection<File> scan(final File dir) throws InterruptedException,
            ExecutionException {
        sem.acquire();
        executor.submit(new Runnable() {
            public void run() {
                scan0(dir);
            }
        });
        sem.acquire();
        List<File> ret = new ArrayList<File>();
        for (List<File> files : listOfLists) {
            ret.addAll(files);
        }
        return ret;
    }

    private void scan0(File dir) {
        for (final File file : dir.listFiles()) {
            ((ThreadWithList) Thread.currentThread()).files.add(file);
            if (file.isDirectory()) {
                count.incrementAndGet();
                executor.submit(new Runnable() {
                    public void run() {
                        DirScanner3.this.scan0(file);
                    }
                });
            }
        }
        if (count.decrementAndGet() < 0) {
            sem.release();
        }
    }

    public void close() {
        executor.shutdown();
    }

    public static Collection<File> listAllContentsUnder(File dir, int threads)
            throws InterruptedException, ExecutionException {
        DirScanner3 scanner = new DirScanner3(threads);
        try {
            return scanner.scan(dir);
        } finally {
            scanner.close();
        }
    }
}

It is important to note that DirScanner3 cannot be instantiated. Instead, it must be used via the listAllContentsUnder static method. I am doing this to save time clearing each element of listOfLists, and resetting the counter to zero, in the “scan” method. Perhaps this is a bad idea, and I will have to think about this some more.

In the meantime, please enjoy and keep the comments coming.

In this post, I present two approaches to directory scanning, one serial and one parallel. Both approaches return a collection of all files and directories contained by a given directory, including the contents (recursively) of each sub-directory thereof. For example, scanning the following directory tree:

…/dir/

…/dir/picture0.jpg

…/dir/picture1.jpg

…/dir/vacation-pictures/

…dir/vacation-pictures/grand-canyon1.jpg

returns a collection containing four entries, viz. …/dir/picture0.jpg, …/dir/picture1.jpg, …/dir/vacation-pictures/ and …/dir/vacation-pictures/grand-canyon1.jpg.

Serial Scanning

In serial scanning, a recursive method lists all files in a given directory, adds these to a list, and, if the file being added is a directory, invokes itself: The SerialDirScanner class, shown below, performs this task:

public class SerialDirScanner {
    private final ConcurrentMap<File, Boolean> files = new ConcurrentHashMap<File, Boolean>();

    public SerialDirScanner() {
        super();
    }

    public Set<File> scan(File dir) throws InterruptedException,
            ExecutionException {
        scan0(dir);
        return files.keySet();
    }

    private void scan0(File dir) {
        for (final File file : dir.listFiles()) {
            files.putIfAbsent(file, Boolean.TRUE);
            if (file.isDirectory()) {
                scan0(file);
            }
        }
    } 
    public static Set<File> listAllContentsUnder(File dir)
            throws InterruptedException, ExecutionException {
        SerialDirScanner scanner = new SerialDirScanner();
        return scanner.scan(dir);
    }
}

The “scan” method invokes the recursive “scan0” method, which does the real work of scanning directories. That method fills up a concurrent map with files and directories as it works. Finally, the “scan” method returns all keys of the concurrent map.

Parallel scanning

Looking at the “scan0” method of the serial scanner shown above, many opportunities for parallelization become apparent. For example, the listing of different sibling directories (that is, directories that both have the same parent) can proceed in parallel, as can the listing of directories that are unrelated to each other (that is, one directory is not an ancestor of the other). The “DirScan” class shown below performs parallel scanning:

public class DirScanner {
    private final ConcurrentMap<File, Boolean> files = new ConcurrentHashMap<File, Boolean>();
    private final ExecutorService executor;
    private final Semaphore sem = new Semaphore(1);
    private final AtomicInteger count = new AtomicInteger();

    public DirScanner(int threads) {
        super();
        executor = Executors.newFixedThreadPool(threads);
    }

    public Set<File> scan(File dir) throws InterruptedException,
            ExecutionException {
        sem.acquire();
        scan0(dir);
        sem.acquire();
        return files.keySet();
    }

    private void scan0(File dir) {
        for (final File file : dir.listFiles()) {
            files.putIfAbsent(file, Boolean.TRUE);
            if (file.isDirectory()) {
                count.incrementAndGet();
                executor.submit(new Runnable() {
                    public void run() {
                        DirScanner.this.scan0(file);
                    }
                });
            }
        }
        if (count.decrementAndGet() < 0) {
            sem.release();
        }
    }
    public void close(){
        executor.shutdown();
    }
    public static Set<File> listAllContentsUnder(File dir, int threads)
            throws InterruptedException, ExecutionException {
        DirScanner scanner = new DirScanner(threads);
        return scanner.scan(dir);
    }
}

In common with the “SerialDirScanner” class, the “DirScanner” class has a public method named “scan” that takes a directory as an argument and returns a Set of File objects. The “scan” method acquires a semaphore of count 1, calls the “scan0” method, and waits to re-acquire the same semaphore. The “scan0” method performs the actual work of scanning directory contents. The “DirScanner” class contains an ExecutorService object named “executor” that is used to schedule scanning tasks.

Note that the “DirScanner” class accepts an integer constructor parameter named “threads”. This parameter is used to initialize the ExecutorService with a fixed thread pool of this size.

The “scan0” method submits a new task to the executor whenever it finds a sub-directory while performing a directory listing. The new “task”, in this case, happens to be a recursive invocation of itself. The “scan0” method increments a counter (an AtomicInteger instance object named “counter”) before the task is actually submitted. Each time the “scan0” method completes listing a directory, it decrements the counter, releasing the semaphore when the counter’s value falls below zero. This allows the “scan” method, which has been waiting to acquire this semaphore, to “wake up” and continue, effectively returning a Set of File objects from the key set of the concurrent map “files”.

Performance comparison and thread counts

To test the performance improvement, if any, of the parallelized directory scanner vis-a-vis the serial scanner, I wrote the following “main” method:

public static void main(String[] args) {
    int status = 0;
    try {
        Set<File> files0 = null;
        Set<File> files1 = null;
        long t0 = System.nanoTime();
        files0 = SerialDirScanner
                .listAllContentsUnder(new File(args[0]));
        t0 = System.nanoTime() - t0;
        System.out.println(t0 + " ticks.");
        t0 = System.nanoTime();
        files1 = DirScanner.listAllContentsUnder(new File(args[0]), 5);
        t0 = System.nanoTime() - t0;
        System.out.println(t0 + " ticks.");
        System.out.println(files0.size() + "," + files1.size());
        System.out.println(files0.size() == files1.size());
    } catch (Throwable exc) {
        status = 1;
        exc.printStackTrace();
    } finally {
        System.exit(status);
    }
}

The argument I pass in to the main method is a directory that is well populated and quite deep. Running this on my development laptop (with the specs Core 2 Duo 2.26 MHz, 4GB, running Windows 7 Evaluation build No. 7100), I get:

1. Initially, before the directory structure is cached by the operating system:

3666120093 ticks.
2375654123 ticks.
39592,39592
true

2. Subsequently, after the directory structure has been cached:

3551936784 ticks.
2340802880 ticks.
39592,39592
true

So, there is a performance gain of roughly 33% using the parallel scanner using five threads (note that the “listAllContentsUnder” method accepts a thread count as the second argument). Using three threads, I get:

3594733052 ticks.
2335042793 ticks.
39592,39592
true

which is not much different from the readings obtained from using five threads. Using nine threads, I get:

3563738170 ticks.
2368012499 ticks.
39592,39592
true

which shows that the performance has actually been reduced by increasing the number of threads. The explanation for this curious observation is that the concurrent map, into which all threads write data, faces increased contention as the number of threads goes up.

Another interesting observation is that the parallel scanner is actually slower at scanning shallow directories. Consider the following output while scanning an example:

2602454 ticks.
10136304 ticks.
17,17
true

Conclusion

Using the parallel directory scanner speeds up the scanning of deep (and well populated) directories by roughly 33%. Shallow directories are better scanned with the serial directory scanner.

Monday Apr 20, 2009

I improved URLClassLoaderX (see post) using the ideas presented in a later post about supporting custom URL’s) . The improved (and much shortened source code) is given here.

Here is a list of the improvements:

1. After removing the entire bunch of classes involved with managing path urls, there are just five classes left.

2. I renamed IURLManagementStrategy to ResourceManager. The implementation class is renamed to ResourceManagerImpl.

3. Two new classes, namely DirectoryEntryContent and JarEntryContent, to deal with directories and jar files,respectively, have been added.

Again, the updated source code may be downloaded here.

Introduction

In a prior post, I described an alternative to URLClassLoader (see that post) named URLClassLoaderX. While implementing URLClassLoaderX, I had to build in support for URL's with a custom protocol (“path”). After finishing that implementation, I was not completely happy with the way I had to build ad-hoc support for the “path” protocol. Since that post was not primarily concerned with building support for custom URL protocols, I did not spend too much time improving the code supporting this feature in that post. In the following post, I present a set of classes illustrating how to rapidly build support for generic URL's.

First of all, what exactly do I mean by “custom” or “generic” URL's? Taking a look at URL support ready built into Java, one finds that certain URL protocols (such as “file:” and “http:”) are supported out of the box. So, one can directly construct an URL for a file and an HTTP site as follows:

URL fileURL = new URL(“file:/C:/var/tmp”);

URL httpURL = new URL(“http://www.cnn.com”);

But try constructing an URL using “new URL(“mem:foo”)”, and you get a MalformedURLException. It turns out that some extra work is required to support such “custom” URL's (see the documentation for “java.net.URLStreamHandlerFactory” for more information). Basically, one has to create a class implementing the URLStreamHandlerFactory class, and create another class that extends the URLConnection class. In the code presented in this post, I illustrate a scheme that minimizes the number of classes that must be implemented to build this support.

There are two central classes in the codebase presented below, namely an interface named “com.subhajit.url.Content” and a class named “com.subhajit.url.GenericURLStreamHandlerFactory” that implements URLStreamHandlerFactory.

The Content interface, shown below, describes objects that are capable of accessing some data from somewhere:

public interface Content {
    /**
     * Returns an {@link InputStream} from which the content of this object can
     * be read.
     *
     * @return {@link InputStream} to the content of this object.
     * @throws IOException
     */
    InputStream openStream() throws IOException;

    /**
     * Returns a reference to the content of this object that is of a type equal
     * to the first match of the given {@link Class}s, or <tt>null</tt> if the
     * content cannot be returned as any of the given {@link Class}s.
     *
     * @param klass
     * @return
     * @throws IOException
     */
    Object getContent(Class<?>... klass) throws IOException;

    /**
     * Returns the {@link URL} representing this object.
     *
     * @return
     * @throws MalformedURLException
     */
    URL toURL() throws MalformedURLException;

    /**
     * Establishes a connection to the content of this object.
     *
     * @throws IOException
     */
    void connect() throws IOException;
}

The Content interface has a default abstract implementation named “AbstractContent”, which defines some abstract methods that sub classes must implement, and some methods that subclasses may optionally override:

public abstract class AbstractContent implements Content {
    protected abstract String getProtocol();

    protected abstract String getUrlFile();

    protected URL url;

    protected String getHost() {
        return null;
    }

    protected int getPort() {
        return 0;
    }
    public void connect() throws IOException {
    }

    public final URL toURL() throws MalformedURLException {
        if (url == null) {
            url = new URL(getProtocol(), getHost(), getPort(),
                    getUrlFile(),
                    new GenericURLStreamHandlerFactory()
                            .createURLStreamHandler(
                                    getProtocol()));
            new URLMapperFactory().getURLMapper().putData(
                    url.toString(), this);
        }
        return url;
    }
}

The GenericURLStreamHandlerFactory class, which is another piece of the puzzle, is shown below:

public class GenericURLStreamHandlerFactory implements URLStreamHandlerFactory {
    public URLStreamHandler createURLStreamHandler(String protocol) {
        return new URLStreamHandler() {
            @Override
            protected URLConnection openConnection(URL url) throws IOException {
                final Content data = new URLMapperFactory().getURLMapper()
                        .getData(url);
                return new URLConnection(url) {
                    @Override
                    public void connect() throws IOException {
                        data.connect();
                    }

                    @Override
                    public Object getContent() throws IOException {
                        return data.getContent();
                    }

                    @SuppressWarnings("unchecked")
                    @Override
                    public Object getContent(Class[] classes)
                            throws IOException {
                        return data.getContent(classes);
                    }

                    @Override
                    public InputStream getInputStream() throws IOException {
                        return data.openStream();
                    }
                };
            }
        };
    }
}

Note how the overridden methods of this class delegate invocations to the enclosed “data” object, and how the “data” object is obtained from the URL via an “URLMapperFactory” class.

The URLMapperFactory class creates URLMapper classes, which map URL's to Content objects. URLMapper is actually an interface:

public interface URLMapper {
    Content getData(URL url);
    public void putData(String urlStr, Content data);   
}

The default implementation of this interface is GenericURLMapper, which is a glorified wrapper around a concurrent map:

class GenericURLMapper implements URLMapper {
    private static final GenericURLMapper instance = new GenericURLMapper();
    private final ConcurrentMap<String, Content> map = new ConcurrentHashMap<String, Content>();

    public Content getData(URL url) {
        return map.get(url.toString());
    }

    public void putData(String urlStr, Content data) {
        map.put(urlStr, data);
    }

    public static URLMapper getInstance() {
        return instance;
    }
}

Application

So, given these classes, how do you go about implementing a custom URL? It turns out that all you have to do is implement the Content interface, as shown below. But first, here is an example of how you might create a custom URL with the "jarentry" protocol:

URL jarEntryUrl = new JarEntryContent( new File( "pathToJarFile" ),"/my/company/classes/Class1.class").toURL();

Note that custom URL's are obtained indirectly from their Content objects, not directly constructed.

As an example of implementing the Content interface, let us assume that you wish to implement URL's with a “mem” protocol. These URL's access content stored in memory. The “MemoryContent” class shown below shows you how to do this:

public class MemoryContent extends AbstractContent implements Content {
    private final byte[] bytes;
    public MemoryContent(byte[] bytes) {
        super();
        this.bytes = bytes;
    }

    public Object getContent(Class<?>... klasses) throws IOException {
        if (klasses.length == 0) {
            return openStream();
        }
        for (Class<?> klass : klasses) {
            if (klass == byte[].class) {
                byte[] copy = new byte[bytes.length];
                System.arraycopy(bytes, 0, copy, 0, bytes.length);
                return copy;
            } else if (klass == InputStream.class) {
                return openStream();
            }
        }
        return null;
    }

    public InputStream openStream() throws IOException {
        return new ByteArrayInputStream(bytes);
    }

    @Override
    protected String getProtocol() {
        return "mem";
    }

    @Override
    protected String getUrlFile() {
        return this.toString();
    }
}

As another example, the “JarEntryContent” class, shown below, provides access to the bytes of a named entry in a specified JAR file.

public class JarEntryContent extends AbstractContent implements Content {
    private final File jarFile;
    private final String entry;
    private final List<byte[]> holder;

    public JarEntryContent(File jarFile, String entry) {
        super();
        this.jarFile = jarFile;
        this.entry = entry;
        this.holder = new ArrayList<byte[]>();
    }

    private byte[] getBytes() throws ZipException, IOException {
        if (holder.isEmpty()) {
            ZipFile zipFile = null;
            try {
                zipFile = new ZipFile(jarFile);
                InputStream in = zipFile.getInputStream(new ZipEntry(entry));
                if (in == null) {
                    throw new IOException("Zip file contains no such entry - "
                            + jarFile.getAbsolutePath() + "\t" + entry);
                }
                try {
                    holder.add(StreamUtils.readFully(in));
                } finally {
                    in.close();
                }
            } finally {
                if (zipFile != null) {
                    zipFile.close();
                }
            }
        }
        return holder.get(0);
    }

    public Object getContent(Class<?>... klasses) throws IOException {
        if (klasses.length == 0) {
            return getContent(byte[].class);
        }
        for (Class<?> klass : klasses) {
            if (klass == byte[].class) {
                byte[] copy = new byte[getBytes().length];
                System.arraycopy(getBytes(), 0, copy, 0, copy.length);
                return copy;
            } else if (klass == InputStream.class) {
                return new ByteArrayInputStream(getBytes());
            }
        }
        return null;
    }

    public InputStream openStream() throws IOException {
        return (InputStream) getContent(InputStream.class);
    }

    @Override
    protected String getProtocol() {
        return "jarentry";
    }

    @Override
    protected String getUrlFile() {
        return jarFile.getName() + "#" + entry;
    }

}

Here are some code snippets showing how to use JarEntryContent:

URL jarEntryUrl = new JarEntryContent( new File( "pathToJarFile" ),"/my/company/classes/Class1.class").toURL();

Note that custom URL's are obtained indirectly from their Content objects, not directly constructed.

Source Code

Complete source code for these classes can be found here. To use, download the zip file into a temporary directory and unzip it. This should create the following files:

a) generic-url.jar, which contains a built version of these classes,

b) src.zip, which contains the source code, and

c) util-lite.jar, which is a small utility library that I use. I will be happy to post complete source code for util-lite.jar upon request.


Sunday Apr 19, 2009

Introduction

In this post, I present an alternative to java.net.URLCLassLoader that has some desirable properties that the stock URLClassLoader does not have. The class presented here, com.subhajit.urlclassloader.URLClassLoaderX, extends URLClassLoader, and can therefore be used as a replacement for URLClassLoader in your source code.

URLClassLoaderX has certain desirable traits that URLClassLoader does not have. URLClassLoader holds locks to its URL's during its lifetime, which makes it impossible to update these URL's on operating systems like Windows. (This drawback has been recognized, and is due to be fixed in JDK 7 via the introduction of a new “close” method). URLClassLoaderX, on the other hand, does not behave in this manner. It briefly inspects its URL's upon initialization, caching information about the elements present therein, during which time it does lock the URL's. Immediately following initialization, however, URLClassLoaderX releases locks on the URL's, so that these can be updated on the file system if desired.

The ability to update or delete URL's is important in applications where you generate source code in your applications, compile it, and then load the resulting classes. In this situation, you would typically save the source code in temporary files, and load the compiled byte code from another temporary directory. The trouble with URLClassLoader is that once you have loaded the temporary byte codes, you cannot get rid of the temporary directory from which the byte code was read as long as the URLClassLoader instance is alive (on operating systems like Windows which maintains mandatory locks on files). There might be other situations where you might need to delete temporary directories or files from which you have loaded classes, and once again, using URLClassLoader leads to “locked” files and directories.

As mentioned earlier, URLClassLoaderX does not lock files or directories from which it loads classes and resources for its entire lifetime. Instead, it creates an internal cache of the URL's and the paths of the resources they contain, during initialization. For example, if a directory URL contains two resources named “com/subhajit/test/A.class” and “com/subhajit/test/picture.gif”, URLClassLoaderX initializes its internal cache such that there is an entry for the URL containing a set of strings populated with “com/subhajit/test/A.class” and “com/subhajit/test/picture.gif”. When resources and classes must be loaded, it consults these entries to determine which URL's to load these from. A resource is loaded from the first URL (in the list of URL's used to initialize the URLClassLoaderX instance) in which it is found.

URLClassLoaderX uses a “strategy” instance to which it delegates most calls. Let us first look at the strategy class.

IURLManagementStrategy

The “strategy” instance must implement the “com.subhajit.urlclassloader.IURLManagementStrategy” interface:

public interface IURLManagementStrategy {

/**

* Finds the resource <tt>name</tt> by iterating through the list of

* {@link URL}s being managed, and returns a "path" URL encoding the URL in

* which the resource was found, and a path to the resource within that URL.

*

* @param name

* @return An encoded "path" {@link URL} to the resource if the resource is

* found in one of the {@link URL}s being managed, or null if the

* resource <tt>name</tt> is not found.

* @throws MalformedURLException

*/

URL findResource(String name) throws MalformedURLException;

/**

* Find the resource named <tt>name</tt> in all the {@link URL}s being

* managed.

*

* @param name

* @return

* @throws MalformedURLException

*/

Enumeration<URL> findResources(String name) throws MalformedURLException;

/**

* Resets this object so that it can be reused.

*/

void reset();

}

The source code provided in this post contains a default implementation of IURLManagerStrategy named (somewhat unimaginatively) DefaultURLManagementStrategy.

DefaultURLManagementStrategy contains a ConcurrentMap<URL,Set<String>> named “urlContentNames”, which is initialized when the object is constructed. The initialization code uses a java.util.concurrent.ExecutorService to concurrently open and parse the specified URL's.

The “findResource” method iterates through the URL's looking for the first entry in urlContentNames for which the corresponding Set<String> contains an entry for the resource name being sought. If such an entry is found, it constructs and returns a special URL called a “path” URL that points to the resource.

Path URL's are special URL's that have a protocol named “path”, and contain two pieces of information, viz. the URL in which the resource was found, and the location of the resource within that URL. An example of a path URL (in String form) is “path:file:/C:/var/clearcase/lite-all/tools-lite/url-class-loader-test/./test/temp/#com/subhajit/jgrep/Location.class”. Note the following characteristics of this URL:

  1. It has a protocol named “path”.

  2. The URL containing the resource may be constructed by stripping away the “path:” prefix, parsing the resulting string into tokens separated by “#”, and using the first resulting token (which, in this case, is “file:/C:/var/clearcase/lite-all/tools-lite/url-class-loader-test/./test/temp/”).

  3. The location of the resource is obtained by parsing the string into tokens separated by “#” and using the second token.

As with all custom URL protocols, the “path” protocol requires a set of helper classes in order to be used. These classes are present in the source code provided in the “com.subhajit.tools.urlclassloader” package (see “PathURLConnection”, “PathURLStreamHandler” and “PathURLStreamHandlerFactory”). As a refinement, the findResource method caches URL's generated for resources found in previous calls, for the benefit of subsequent calls to find these same resources.

Note that findResource does not actually open, load or cache resources: it just creates (path) URL's that may be used to refer to these resources.

URLClassLoaderX

Having described the strategy class, let us move on to the URLClassLoaderX class itself. This class implements two types of methods, viz. Those that are relevant to class and resource loading, and other methods such as “equals”, “hasCode”, etc.

URLClassLoaderX implements the following methods relating to class and resource loading:

  1. findResource

  2. findResources

Both methods directly delegate the call to the strategy object as shown below:

@Override

public URL findResource(String name) {

try {

return strategy.findResource(name);

} catch (MalformedURLException e) {

e.printStackTrace();

throw new RuntimeException(e);

}

}

@Override

public Enumeration<URL> findResources(String name) throws IOException {

return strategy.findResources(name);

}

Remaining work

At this time, URLClassLoaderX is functionally complete. The next step is to improve its performance. Some ideas for enhancement that come to find are better URL management strategies (perhaps those that lazily defer work as much as possible), and a way to allow users to specify different types of URL's (and their handler classes) in an open and flexible manner (currently, only “file” URL's are supported).

Using the source code

The source code is provided as a “zip” file named “url-class-loader.zip”. To use it, download it to a temporary directory and unzip it. This results in three files, viz. “src.zip” (which contains the source code), “url-class-loader.jar” (which contains a pre-built version of the source code, compiled for Java 1.5) and “util-lite.jar”, which is a lightweight utility library that I have written. I will be happy to post complete source code for the “util-lite.jar” file if requested. The source code is available here.

Saturday Feb 21, 2009

Correction: The files you need to download to try out the profiler are:

ajweaver-agent.jar

ajweaver-agent-profiler.jar

The main body of the article contains incorrect links to these files. Sorry for the confusion caused.

In this post, I present a Java agent that calls back a custom listener class (which you provide) whenever methods in a set of classes (which you specify via configuration) are invoked. The only requirement on the listener class is that it contain a public static method with a specified signature:

public static void someMethod(int operation, String methodSignature, Object targetObject, Object…args), where:

operation “0” (start of method execution), “1” (normal end of method execution) or “2” (end of method execution with an uncaught exception)
methodSignature signature of the invoked method
targetObject object on which the method was invoked.
args variable number of arguments indicating the arguments the method was invoked with.

It is trivially simple writing listener classes. Here, in its entirety, is a “logging” listener:

public class SimpleListener{

public static void callme(int op, String signature, Object obj,
        Object... args) {

if ( op == 0 ){

System.out.println( signature );

}

}

}

Later, I provide a more involved (and useful) listener class in the code examples that allows the profiling of method executions.

What I really like about this project is that the agent is implemented using AspectJ, but AspectJ is fully embedded (“hidden” from view) from the user’s viewpoint. In other words, the user does not provide any resources, additional JAR’s or configuration that AspectJ requires, nor even “needs to know” that AspectJ is being used under the covers. Look at the “NotifierMain” and “PrepareAjWeaverInputs” and “Transformer” classes for details on how I did this. I do not go into this further in this post since this is irrelevant to the ideas I wish to discuss here.

Using the agent

To use the agent, you need the file “ajweaver-agent.jar” which can be downloaded from here. Next, you must provide some command line arguments to the command that launches your application, to wit “-javaagent:{absolute path to the ajweaver-agent.jar file}. Finally, you must provide a set of system properties that tell the agent what classes must have their method invocations reported, what your custom listener class is called, and what the name of the method is (in that class) that must be called back by the agent. For the example listener shown above (SimpleListener), these system properties would be defined as:

instrument-packages com.justthispackage.*,com.thispackageandallsubpackages..*,com.yet.another.package.*
(Note that the values are comma separated).
callback-class SimpleListener
callback-method callme

The command line looks like this:

java …… –javaagent:${home}/ajweaver-agent.jar –Dinstrument-packages=com.sun.example.*,com.sun.allexamples..* –Dcallback-class=SimpleListener –Dcallback-method=callme ……

With these instructions out of way, let us jump right in and try out the agent (and a cool profiler listener class).

Profiling method calls

To get started with the agent, let us look at a custom listener class (com.subhajit.ajprofiler.Profiler) that provides basic profiling of method invocations by tracking the number of times individual methods are called, and the minimum, average and maximum execution time of the invocations. These classes can be configured to optionally setup JMX functionality from which the profile information can be extracted remotely.

As with all agent callback classes, the Profiler class has a public, static method with the signature: public static void {methodName}(int,String,Object,Object..). In Profiler, the method is named “profilerCallback”:

public static void profilerCallback(int op, String methodSignature,
        Object obj, Object... args) {
    switch (op) {
    case 0: {
        final long timestamp = System.nanoTime();
        defaultInstance.begin(obj,methodSignature, timestamp);
        break;
    }

    case 1:
    case 2: {
        final long timestamp = System.nanoTime();
        defaultInstance.end(obj, methodSignature, timestamp);
        break;
    }

    default:
        return;
    }
}

The profilerCallback method treats the two cases when the first parameter is “0” (execution start) and “1” or “2” (execution ended normally or execution ended with uncaught exception) separately. The first case is executed as method bodies are entered, and the other case is executed when method bodies are exited. These cases invoke the “begin” and “end” methods of a default instance of Profiler, respectively.

The “begin” method creates a “CflowKey” object to represent the program flow at this point using the target object (whose instance method body has started executing), the current thread, the method signature and the current timestamp. This object is next appended to a list of such objects dedicated to this method instance. This effectively records the start of execution of this method’s body in the program flow:

public void begin(Object obj, String methodSignature, long timestamp) {
    List<CflowKey> flowList = flows.get(methodSignature);
    if (flowList == null) {
        flowList = new ArrayList<CflowKey>();
        flows.putIfAbsent(methodSignature, flowList);
        flowList = flows.get(methodSignature);
    }
    synchronized (flowList) {
        flowList.add(new CflowKey(obj, Thread.currentThread(),
                methodSignature, timestamp));
    }
}

The “end” method creates a new CflowKey instance using the target object, the current thread and the method signature, and then iterates through the list of stored CflowKey objects for this method signature looking for the last element that matches this CflowKey (with respect to the target object, method signature and current thread) and for which the stored timestamp is less than the timestamp of this CflowKey object. What we are effectively doing in this step is finding the last recorded “start of execution” of this method in this program flow. Given that one object can be executing only one method in the same thread at any time, the last time we recorded the start of this method for the same object and the same thread must correspond to the start of this selfsame flow of execution. Having found this CflowKey item, we record its timestamp, and then remove it from the list of stored CflowKey items.

public void endHelper(Object obj, String methodSignature, long timestamp) {
    List<CflowKey> flowList = flows.get(methodSignature);
    if (flowList == null) {
        return;
    }
    CflowKey searchFlowKey = new CflowKey(obj, Thread.currentThread(),
            methodSignature, timestamp);
    boolean executionTimeHasBeenSet = false;
    long executionTime = 0;
    synchronized (flowList) {
        int index = 0;
        int foundIndex = -1;
        for (CflowKey flowKey : flowList) {
            if (flowKey.getTimestamp() < searchFlowKey.getTimestamp()
                    && flowKey.equals(searchFlowKey)) {
                foundIndex = index;
            }
            index++;
        }
        if (foundIndex != -1) {
            executionTime = searchFlowKey.getTimestamp()
                    - flowList.get(foundIndex).getTimestamp();
            executionTimeHasBeenSet = true;
            flowList.remove(foundIndex);
        }
    }

    if (executionTimeHasBeenSet) {
        // Time to update methodInfoMap.
        MethodExecutionInfo methodExecutionInfo = methodInfoMap
                .get(methodSignature);
        if (methodExecutionInfo == null) {
            methodExecutionInfo = new MethodExecutionInfo();
            methodInfoMap.putIfAbsent(methodSignature, methodExecutionInfo);
            methodExecutionInfo = methodInfoMap.get(methodSignature);
        }
        synchronized (methodExecutionInfo) {
            methodExecutionInfo.update(executionTime);
        }
    }
}

The second part of this method (starting with the line “if (executionTimeHasBeenSet)”), updates a “MethodExecutionInfo” object with the time taken by this execution of the method. The “update” method of the MethodExecutionInfo class updates its internal state using the execution time:

public void update(long executionTime) {
    if (executionTime < minimumExecutionTime) {
        minimumExecutionTime = executionTime;
    }
    if (executionTime > maximumExecutionTime) {
        maximumExecutionTime = executionTime;
    }
    averageExecutionTime = (averageExecutionTime * invocationCount + executionTime)
            / (++invocationCount);
}

Trying it out

To try out this agent and the profiler, download the two files ajweaver-agent.jar and ajweaver-agent-profiler.jar (see the top of this article under "Correction" for the download links) and place them in a temporary directory (say “C:\tmp”). Next, modify the launch script for your application by introducing the following terms (prior to the specification of the class to be executed):

java….-Djavaagent:c:\tmp\ajweaver-agent.jar –Dinstrument-packages={list of packages} –Dcallback-class=com.subhajit.ajprofiler.Profiler –Dcallback-method=profilerCallback …

If this is a long running application (such as a server), also add the term: –Dprofiler-port={a port number}, and you will be able to access profiling information via JConsole. For example, if you entered port “1234”, start JConsole, choose “Remote Process”, enter “localhost:1234” and click on the “Connect” button. Once connected, go to the “MBeans” tab, locate the tree node for “com.subhajit.ajprofiler” and open it out fully. Selecting “Operations”, followed by clicking on the “getData” button, fetches a Map of accumulated profile information from the Profiler class.

To try this with Apache tomcat, make the following changes:

1. Make a copy of the “bin\catalina.bat” file and call it “bin\cataprofiler.bat”.

2. Add the following lines to “bin\cataprofiler.bat”:

set JAVA_OPTS=-Dinstrument-packages=org.apache.catalina..*,org.apache.coyote..*,org.apache.naming..*,org.apache.tomcat..* -javaagent:C:\var\tmp\ajweaver-agent.jar -Dcallback-method=profilerCallback -Dcallback-class=com.subhajit.ajprofiler.Profiler -Xms1024m -Xmx1256m -Dprofiler-port=1234

set CLASSPATH=C:\tmp\ajweaver-agent-profiler\dist\ajweaver-agent-profiler.jar;%CLASSPATH%

changing the package names in the first line to whatever packages you are interested in.

3. In the “bin\setclasspath.bat” file, locate the following stanza and modify it as shown:

rem Set standard CLASSPATH
rem Note that there are no quotes as we do not want to introduce random
rem quotes into the CLASSPATH
if not exist "%JAVA_HOME%\lib\tools.jar" goto noJavac
set CLASSPATH=%JAVA_HOME%\lib\tools.jar;C:\tmp\ajweaver-agent-profiler\dist\ajweaver-agent-profiler.jar

4. Start tomcat using the command: “bin\cataprofiler.bat run”.

5. After tomcat starts up, start JConsole using the command: “jconsole”, which assumes that you have the JDK’s “bin” directory in your path. You should see:

aj1

Select “Remote Process”, enter “localhost:1234” as shown and click on the “Connect” button. This takes you to:

aj2

Select the “MBeans” tab:

aj3

Clicking on the “getData” button pops up a dialog showing accumulated profile data:

aj4

To use a custom GUI program, download ajweaver-agent-profiler-gui.jar. To use it, run “java –jar ajweaver-agent-profiler-gui.jar”. This brings up:

aj5

Click the “Refresh” button to bring up a dialog:

aj6


Enter the hostname and port where the profiler is available. If the profiled application is running on the same host as the GUI, you could enter “localhost” for the host. For the port, enter the value of the “profiler-port” system property that was specified when the application was started. Clicking the “Ok” button brings up:

aj7

Clicking the “Refresh” button refreshes the display.

Other applications

Some applications of this facility that immediately come to mind are method tracing, method logging, heap tracing (dynamically tracing the amount of heap space used by an application as it runs), application health monitoring (trends of method invocations resulting in uncaught exceptions vs. “normal” execution), etc.. At development time, this framework can provide lightweight profiling capabilities.

Saturday Feb 14, 2009

Introduction

I recently completed an experimental project implementing “bound beans”, or a collection of beans which are wired up at run time so that some attribute changes on some of these beans result in other attribute changes on the same or a subset of the other beans. So, assume that had two instances of a hypothetical “Point” class with a couple of integer attributes named “x” and “y”, called “p0” and “p1”. Next, assume that these beans were bound to each other in such a way that setting some value to “p0.x” (by invoking “p0.setX(…)”) resulted in the same value being set into “p1.y”. Symbolically, you could express this binding relationship as “p0.x—>p1.y”. Now, you could throw an instance of a “Line” class into the mix (named “line”) , where the “Line” class contains two “Point” attributes named “start” and “end” (to represent the start and end points of a line, say). Further, you could specify an additional binding rule “p1.y—>line.start.x”. Then, setting “p0.x” results in the new value of this attribute cascading first to “p1.y”, and then to “line.start.x”.

The question is: what could you do with such a system?

Before tackling this question, however, let us look at an approach to implementing bean binding. The inputs to the the binding process are:

1. The bean instances which must be bound.

2. The binding relations (which I call “propagation rules”). The “source” of a propagation rule is the bean instance and its attribute that cause changes to occur in the same or other beans. For example, the “source” of the propagation rule “p0.x—>p1.y” is “p0.x”.

Pilot Implementation

My pilot implementation utilizes a Java agent to instrument some bean classes at load time in such a way that their “setXXX” methods are enhanced. The enhancement consists of invoking a static method call on a named “listener” class, passing the object (on which the “setXXX” method has been called) as a parameter. So, if the “setX” method of the “Point” class was originally:

public void setX(int x){

this.x = x;

}

its enhanced form is:

public void setX(int x){

this.x = x; // This is from the original version

com.subhajit.boundbeans.impl.Listener.attributeChanged(this);

}

The “Listener” class’s “attributeChanged” method knows which bean it is that has just been affected (from its argument, which is the “this” of the bean passed in via the call to “attributeChanged”). But how does it know which attribute of the bean has been affected? It simply creates a dummy RuntimeException, fills in its stack trace, and starts walking the stack trace elements down one by one until it reaches an element that was contributed by the class of the bean. From this element, it extracts the name of the method that was last called on the class of the bean (which, for a “Point” instance is going to be “setXXX”), from which, ultimately extracts the name of the bean attribute that has just been set).

Having determined both the bean instance that was affected and the particular attribute in the affected bean, the Listener class examines the propagation rules, looking for all rules in which the “source” (or the action that causes the rule to fire) is a change of this attribute on this bean. For all such rules, it determines the bean attributes (of the same or other) beans that are bound to the change in this attribute for this bean, and commences to set those attributes one by one. Of course, setting these attributes for the affected beans potentially triggers more updates (if target attribute being set on a target bean happens to be the “source” of another propagation rule).

Needless to say, I provide a facility to check for “cycles” that might intentionally or inadvertently arise in a set of propagation rules. The presence of cycles can potentially lead to “endless” property propagation resulting in an eventual StackOverflowError. For example, if the propagation rules specified are “p0.x—>p1.y” and “p1.y—>p0.x”, then changes to “p0.x” are propagated to “p1.y”, which are, in turn, propagated back to “p0.x”, to start the propagation cycle once again.

Applicability

To be honest, I cannot find myself coming up with a meaningful way to apply this “bean binding” facility. I can think of MVC GUI’s where setting a bean attribute in a model might propagate the value to a view (and vice versa) as one example of use. Can you help me with others?

Friday Feb 13, 2009

This is a test post from Windows Live Writer. Ok, just checked and the test worked. So, let me describe exactly what I had to do to get this working.

First, obviously, download and install Windows Live Writer. Next, fire it up and start configuring it. As for myself, positing to my blogs at Sun required that I first go to my blogs and change my password from its (then) current setting to a new one. (The “default” password that works for Sun employees is apparently their LDAP password. It is a good idea to choose a different password to access your blogs, since authoring tools might choose not to secure it when they pass it along to the server). Next, I went to the “Preferences” page (in my blog), changed to the “Maintenance” sub-tab, and cleared my page load cache.

Continue filling up the entries in the rest of the configuration dialog. That is it! Enjoy!

This blog copyright 2009 by Subhajit Dasgupta