Frank Kieviet
This blog is continued on http://frankkieviet.blogspot.com
A few months ago I decided to look at a different place to host my blog. I compared Google's blogger and wordpress, and finally decided to go with the former.
The full URL is: http://frankkieviet.blogspot.com/ .
Posted at
10:00AM Mar 17, 2009
by Frank Kieviet in Sun |
Comments[0]
Permalink: http://blogs.sun.com/fkieviet/entry/this_blog_is_continued_on
JavaOne 2008
It's a week after JavaOne 2008 now. I finally have time to post a blog. I've been extremely busy for and before JavaOne: not only with the presentations that I gave at JavaOne, but also because the Java CAPS 6 code freeze was the week before JavaOne.
At JavaOne I gave three presentations:
For Java University (the day before JavaOne), I presented a part of Joe Boulenouar's class "How Java EE 5 and SOA Help in Architecting and Designing Robust Enterprise Applications". In my part I covered ESBs, JBI and Composite Applications.
A technical session: TS-5301 Sun Java Composite Application Platform Suite: Implementing Selected EAI Patterns. I presented this with Michael Czapski, a colleague in Sun's field organization in Australia. He's also the author of the book Java CAPS Basics: Implementing Common EAI Patterns. In this session we went over a number of EAI patterns from Hohpe and Woolf's book and showed that when you use the right Integration Middleware, you use these patterns almost without realizing it.
A Birds-of-a-feather session: BOF-6211: Transactions and Java Business Integration (JBI): More Than Java Message Service (JMS). I presented this with Murali Pottlapelli, a colleague in Monrovia. Since there was interest in the slides that we presented, and because unlike Sessions, the slides of BOFs are not made available by the JavaOne organization, you can download the slides of Transactions and JBI: More Than JMS from my blog. I also recorded the sound using my MP3 player, but the quality of the recording is pretty bad. Nevertheless, I've also uploaded the mp3 of Transactions and JBI: More Than JMS.
What's next? Now that CAPS 6 is almost out of the door, we're going to focus on the next release. Even more than in the past, we'll be doing this in open source. More to come!
Posted at
10:46AM May 21, 2008
by Frank Kieviet in Sun |
Comments[4]
Permalink: http://blogs.sun.com/fkieviet/entry/javaone_2008
Server side Internationalization made easy
Last year I wrote a blog entry on my
gripes with Internationalization in Java for server side
components. Sometime in January I built a few utilities for JMSJCA that makes
internationalization for server side components a lot easier. To make
it available to a larger audience, I added the utilities to the tools
collection on http://hulp.dev.java.net.
What do these utilities do?
Generating resource bundles automatically
The whole point was that when writing Java code, I would like to
keep internationalizable texts close to my Java code. Rather than in
resource bundles, I prefer to keep texts in my Java code so that:
- While coding, you don't need to keep switching between a Java file and a resource bundle.
- No more missing messages because of typos in error identifiers; no more obsolete messages in resource bundles
- You can easily review code to make sure that error messages make
sense in the context in which they appear, and you can easily check
that the arguments for the error messages indeed match.
In stead of writing code like this:
sLog.log(Level.WARNING, "e_no_match_w_pattern", new Object[] { dir, pattern, ex}, ex);
Prefer code like this:
sLog.log(Level.WARNING, sLoc.t("E131: Could not find files with pattern {1} in directory {0}: {2}"
, dir, pattern, ex), ex);
Here's a complete example:
public class X {
Logger sLog = Logger.getLogger(X.class.getName());
Localizer sLoc = Localizer.get();
public void test() {
sLog.log(Level.WARNING, sLoc.t("E131: Could not find files with pattern {1} in directory {0}: {2}"
, dir, pattern, ex), ex);
}
}
Hulp has an Ant task that goes over the generated classes and extracts these phrases and writes them to a resource bundle. E.g. the above code results in this resource bundle:
# DO NOT EDIT
# THIS FILE IS GENERATED AUTOMATICALLY FROM JAVA SOURCES/CLASSES
# net.java.hulp.i18ntask.test.TaskTest.X
TEST-E131 = Could not find files with pattern {1} in directory {0}\: {2}
To use the Ant task, add something like this to your Ant script,
typically between <javac>
and <jar>:
<taskdef name="i18n" classname="net.java.hulp.i18n.buildtools.I18NTask" classpath="lib/net.java.hulp.i18ntask.jar"/>
<i18n dir="${build.dir}/classes" file="src/net/java/hulp/i18ntest/msgs.properties" prefix="TEST" />
How does the Ant task know what strings should be copied into the
resource bundle? It uses a regular expression for that. By default it
looks for strings that start with a single alpha character, followed by
three digits followed by a colon, which is this regular expression: [A-Z]\d\d\d: .*.
Getting messages out of resource bundles
With the full English message in the Java code, how is the proper
localized message obtained? In the code above, this is done in this
statement:
sLoc.t("E131: Could not find files with pattern {1} in directory {0}: {2}", dir, pattern, ex)
public static class Localizer extends net.java.hulp.i18n.LocalizationSupport {
public Localizer() {
super("TEST");
}
private static final Localizer s = new Localizer();
public static Localizer get() {
return s;
}
}
Using the compiler to enforce internationalized code
It would be nice if the compiler could force internationalized
messages to be used. To do that, Hulp includes a wrapper around java.util.logging.Logger that
only takes objects of class LocalizedString
instead of just String.
The class LocalizedString
is a simple wrapper around String.
The Localizer class
produces these strings. By avoiding using java.util.logging.Logger
directly, and instead using net.java.hulp.i18n.Logger
the compiler will force you to use internationalized texts. Here's a
full example:
public class X {
net.java.hulp.i18n.Logger sLog = Logger.getLogger(X.class);
Localizer sLoc = Localizer.get();
public void test() {
sLog.warn(sLoc.x("E131: Could not find files with pattern {1} in directory {0}: {2}"
, dir, pattern, ex), ex);
}
}
Logging is one area that requires internationalization, another is
exceptions. Unfortunately there's no general approach to force
internationalized messages in exceptions. You can only do that if you
define your own exception class that takes the LocalizedString in the
constructor, or define a separate exception factory that takes this
string class in the factory method.
Download
Go to http://hulp.dev.java.net
to download these utilities. The jars (the Ant
task and utilities)
are also hosted on the Maven
repository on java.net.
Posted at
04:32PM Oct 07, 2007
by Frank Kieviet in Sun |
Comments[2]
Permalink: http://blogs.sun.com/fkieviet/entry/server_side_internationaliation_made_easy
Using Nested Diagnostics Contexts in Glassfish
What is a Nested Diagnostics Context?
Let's say that we're writing a message driven bean (MDB) that we'll
deploy on Glassfish. Let's say
that the MDB's onMessage()
method grabs the payload of the message and calls into a stateless
session bean (SLSB) for processing. Let's say that the implementation
of the SLSB
calls into org.apache.xparser:
my.company.MDB > my.company.SLSB > org.apache.xparser
Let's say that he xparser
package may log some warnings if the
payload is not properly formatted. No problem so far. Now let's say
that the application is put in production together with a dozen other
applications and that many of these applications use this library. The
administrator once in a while finds these warnings in Glassfish's server.log:
[#|2007-09-17T18:36:03.247-0400|WARN|sun-appserver9.1|org.apache.xparser.Parser |_ThreadID=18; ThreadName=ConsumerMessageQueue:(1); |Encoding missing, assuming UTF-8|#]
Let's say that the administrator wants to relay this information to the developer responsible for this application. Using the category name org.apache.xparser.Parser, the administrator can find out what code is responsible (a third party component in this case), but how can the administrator find out which application is responsible for this log output?
One approach is to always log
the application name before calling into the SLSB, so that the
administrator can find the application name using the _ThreadID: he would look at the
_ThreadID of the warning,
then look for a message earlier in the log that has the same _ThreadID that identifies the
application. Not only is this
cumbersome, it's also a big problem that the application now fills up
the log with the application name just in case the SLSB would log
something.
It would be nice if somehow the MDB could associate the thread with
the application name, so that if code downstream logs anything, the log
message will be adorned with the application name:
[#|2007-09-17T18:36:03.247-0400|WARN|sun-appserver9.1|org.apache.xparser.Parser |_ThreadID=18; Context=Payrollsync; ThreadName=ConsumerMessageQueue:(1); |Encoding missing, assuming UTF-8|#]
In Log4J, this is quite simple using Log4J's NDC class: before the MDB calls
into the SLSB, it would call NDC.push("Payrollsync")
to push the context onto the stack, and after the SLSB it would call NDC.pop(). NDC stands for
Nested Diagnostic Context. It's called nested because it maintains a
stack, so that the SLSB could push another context onto the stack,
hiding the context of the MDB, and pop the context off the stack to
restore the
stack in its original state before returning. Of course each thread has
to have its own stack.
The NDC is a nice facility in Log4J. Unfortunately, in java.util.logging there's no such facility. Let's build one!
Building a Nested Diagnostic Context
The Nested Diagnostics Context will have to keep a stack per thread.
When
the logging mechanism logs something, it needs to peek at the stack
and add the top most item on the stack to the log message. The stack
needs to be accessible somehow by both the application that sets the
context
and by the logging mechanism. A complicating factor in this is that it
needs to work with both delegating-first
and self-first classloaders.
The latter is found in some web applications (special setting in sun-web.xml) and in some JBI
components. Furthermore, we would like to use this mechanism in
Glassfish and avoid having to make changes
to the Glassfish codebase. Lastly, we need to avoid making
changes to the application that would cause the application to be no
longer portable to other application servers.
Logger.getLogger("com.sun.EnterContext").fine("Payrollsync");
slsb.process(msg.getText());
Logger.getLogger("com.sun.ExitContext").fine("Payrollsync");
The loggers com.sun.EnterContext
and com.sun.ExitContext
are special loggers that we'll develop; messages written to these
loggers directly interact with the context stack. Through these special
loggers, this example will result in adding the context to any log
messages that are
produced in the slsb.process(msg)
call. On other application servers without these special loggers, this
will result in logging the
context at FINE level
before and after the call to the SLSB is made, so that one can
associate
a log message using the _ThreadID;
it will not do anything if FINE
logging is turned off.
What if we want to add more than one context parameter to the log message? For instance, what if we want to add the ID of the message that we're processing?
Logger.getLogger("com.sun.EnterContext")
.log(Level.FINE, {0}={1}, {2}={3}, new Object[] {"Application", "Payrollsync", "Msgid", msg.getMessageID()});
slsb.process(msg.getText());
Logger.getLogger("com.sun.ExitContext")
.log(Level.FINE, {0}={1}, {2}={3}, new Object[] {"Application", "Payrollsync", "Msgid", msg.getMessageID()});
The special logger will take the Object[] and push these on the
stack. The message string "{0}={1},
{2}={3}" is there merely for portability: if the the code is
deployed onto an
application server to which we didn't install the NDC facilities, this
will simply log the context parameters at FINE level.
Implementation
In a stand alone Java application, you would simply set your own LogManager and implement the NDC functionality there. Glassfish already comes with its own LogManager, and we don't want to override that. Rather, we want to plug in new functionality without any changes to the existing code base. Here's what we need to do:- create the special loggers com.sun.EnterContext and com.sun.ExitContext
- hookup these special loggers
- hook into the log stream to print out the context
To create the special loggers, we can simply create a new class that
derives from java.util.logging.Logger,
say EntryLogger. Next, we
need to make sure that when someone calls Logger.getLogger("com.sun.EnterContext"),
it will be this class that is returned. Without making any changes to
the LogManager, the way
that that can be accomplished is by instantiating the new EntryLogger and registering it
with the LogManager
immediately. This has to be
done before anybody calls Logger.getLogger("com.sun.EnterContext").
In other words, we should do this before any application starts. In
Glassfish there's an extensibility mechanism called LifeCycleListeners. An object
that implements this interface can be loaded by Glassfish automatically
upon startup.
Lastly, we need to find a way to add the context to the log entries
in the log. Glassfish already has a mechanism to add key-value pairs to
each log entry: when formatting a LogRecord
for printing, Glassfish calls LogRecord.getParameters()
and checks each object in the returned Object[] for objects that
implement java.util.Map
and java.util.Collection.
For objects that implement java.util.Map,
Glassfish adds the key-value pairs to the log message. For objects that
implement java.util.Collection,
Glassfish adds each entry as a String to the log message.
If each LogRecord can
somehow be intercepted before it reaches Glassfish's Formatter, the context can be
added as an extra parameter to the LogRecord's parameter list.
This can be done by adding a new java.util.logging.Handler
to the root-Logger before
Glassfish's own Handler.
For each LogRecord that
this new Handler
receives, it will inspect the Context stack and add a Map with the Context to the LogRecord. Next, the root-Logger will send the LogRecord to Glassfish's own Handler which takes care of
printing the message into the log. Once again, the LifeCycleListener
is the ideal place to register the new Handler.
Give it a spin!
You can download the jar that has these new classes and/or download the sources. Put the jar in Glassfish's lib directory. Restart the server and install the LifeCycleListener:

Posted at
11:49AM Sep 30, 2007
by Frank Kieviet in Sun |
Comments[3]
Permalink: http://blogs.sun.com/fkieviet/entry/using_nested_diagnostics_contexts_in
JavaOne 2007
All of last week I was at JavaOne. It was an exhausting but very
interesting week. Like last year, there were many interesting sessions,
too many to list them here. Let me just mention the one I enjoyed most
was the one by Neal Gafter on Closures
for the Java Programming Language (BOF-2358). I can't wait until
they're in the Java language!
Not only did I attend sessions and BOFs, I also presented BOFs. Three of them to be
precise. I recorded the audio on my MP3 player. Unfortunately the
quality of the audio is pretty bad. I'm posting the audio recordings
below. I'm also posting the slides. Here they are:
BOF8847: Developing Components for Java Business Integration: Binding Components and Service Engines
Presented by Frank Kieviet, Alex Fung, Sherry Weng, and Srinivasan
Chikkala
Attendance: about 100
You cannot cover how to write JBI components in just 45 minutes. We
were also not sure about what the audience was interested in. That's
why we assumed that the audience would consist mostly of people who
have never written a JBI component before, and are relatively new to
JBI. That's why we decided to talk mostly about general information on
JBI and JBI components, and highlight the power of JBI and discuss how
to go about developing one.
As an experiment I wanted to try a new format (at least new for me):
rather than slicing up the session into four parts of 10 minutes, we
cast the session into a "discussion forum". Of course the questions and
answers (and even the jokes) were well rehearsed.
Unfortunately, the audio/visual people that control the meeting
rooms, had forgotten to start the session timer. As a result the audio
was cut unexpectedly just a minute before we could finish up.
Nevertheless, I think it was an interesting session.
Presentation
JavaOne07-BOF8847 (pdf)
BOF8745: Leveraging Java EE in JBI and vice versa
Presented by Frank Kieviet and Bhavanishankara Sapaliga
Attendance: about 60
This BOF was originally to be presented by Vikas Awasthi and
Bhavanishankara Sapaliga, but Vikas couldn't make it, so I replaced
him. We focused the session on how JBI and EE can play together, trying
to make it interesting for both JBI application developers as well as
for EE developers. At the end I ran a demo with NetBeans showing three
different scenarios. The demo-gods were with me: the demo went very
smoothly. Unfortunately I forgot to demo how to add an EJB to a
composite application. Another valuable lesson learned.
Presentation
JavaOne07-BOF8745 (pdf)
BOF9982: The java.lang.OutOfMemoryError: PermGen Space error
demystified
Presented by Edward Chou and Frank Kieviet
Attendance: about 116
This session was on Thursday night at 10pm. That night was the
JavaOne After dark bash. Free beers, music and snacks for everyone.
Therefore we didn't expect much of an attendance: memory leaks are a
rather dry subject, and why leave the party early to go to this
session? Also, some of our thunder had been stolen by SAP who demo-ed a
tool to track memory leaks in a morning-session earlier that week. So
we were quite surprised when about 116 people turned up for our
session. Most stayed until the very end, and there were also quite a
few interesting questions. Apparently a lot of people struggle with
memory leaks in permgen space -- in my presentation I mention that I
get about a hundred hits on my blog every
day from people who search for this memory exception in Google.
Presentation JavaOne07-BOF9982 (pdf)
Posted at
10:01AM May 15, 2007
by Frank Kieviet in Sun |
Comments[1]
Permalink: http://blogs.sun.com/fkieviet/entry/javaone_2007
JavaOne / memory leaks revisited...
Memory leaks in print
A few months ago, Gregg Sporar together with A. Sundararajan started an article on memory leaks in the magazine Software Test & Performance. While writing that, he stumbled upon my blog and decided to cover the "java.lang.OutOfMemoryError: PermGen space" exception too. I offered to collaborate on the article. The article eventually grew so much it was split in two. Part one was published a month ago. Yesterday, part two was published.
Memory leaks at JavaOne
Edward Chou submitted a proposal for a BOF at JavaOne 2007. He and I will be presenting a BOF on the "java.lang.OutOfMemoryError: PermGen space" exception. I'll try to record the session with my MP3 player and post it on my blog.
In preparation for our presentation, we've been looking at some real-life examples of permgen memory leaks. We took a few memory dumps that came from actual customers in actual production environments. We discovered a few more improvements we could make to jhat: it was already fairly simple to track the leaks with jhat; with these changes it becomes really simple. We were actually quite surprised how simple. More on that in a future entry, either on my blog or on Edward's.
More at JavaOne
Speaking about JavaOne... I have my hands full. Next to the memory leaks BOF, I'm also presenting a BOF on JBI ("How to develop JBI components") and I'll be co-presenting another BOF on "EE and JBI."
Posted at
09:56PM May 02, 2007
by Frank Kieviet in Sun |
Comments[11]
Permalink: http://blogs.sun.com/fkieviet/entry/javaone_memory_leaks_revisited
Using Resource Adapters outside of EE containers
Java EE (J2EE) application servers usually
are the ideal choice to host your back-end applications, especially if
your applications require transactions, security or state management.
Running in an application server, your application can easily connect
to external systems provided by Resource Adapters. E.g. if your
application would have to connect to a CRM (e.g. PeopleSoft), to an ERP
(e.g. SAP) or to a system such as JMS, or even to a database (e.g.
Oracle), it would use a Resource Adapter. Through this, the application
would use advanced features such as connection pooling, connection
failure detection and recovery, transactions, security propagation, etc.
As said, EE or J2EE application servers are ideal containers for
hosting business side logic. But it's not the best choice in literally
every situation. There are
cases where you would prefer to write your
application as a stand alone Java program. There's nothing strange
about that: you should always use the best tools for the job at hand,
and no single tool is best in all situations.
Now say that you need to write a stand-alone Java application, and
in that application you would need to connect to an external system.
Wouldn't it be nice to be able to use an off-the-shelf Resource Adapter
for this connectivity, so that you would not have to hand-code features
such as connection pooling, connection failure detection and recovery
etc? As I will show, this is not that difficult.
Hacking a Resource Adapter
Resource Adapters are distributed in RAR files, i.e. a file with a .rar extension. A RAR is
nothing more than a ZIP file. When you open up a RAR, you will see a
bunch of jars and a descriptor file with the name ra.xml. In a nutshell, this is
what you need to do to use a Resource Adapter:
- Add the jars to your application's classpath
- Instantiate the Resource Adapter classes (you can find out which
ones by
examining the ra.xml)
- Configure the Resource Adapter by calling a few setter methods (you can find which ones by examining the ra.xml)
- Activate the Resource Adapter (you can find out how by reading the Java Connector API specification, or just read on)
These four steps are basically what the application server does when
it loads a Resource Adapter. There's a lot of logic involved in this,
but in a stand-alone Java application we can take a lot of short-cuts
so that the logic that we need to write is not too involved.
There's quite a big difference in how a Resource Adapter is used to
provide outbound connectivity
versus inbound connectivity.
With outbound connectivity, I
mean a situation where an application obtains a connection to an
external system and
reads or writes data to it. With inbound
I mean a situation where the Resource Adapter listens for events from
the external system and calls into
your application when such an event
occurs.
As an example we will use the JMSJCA
resource adapter. This is an open-source adapter for JMS
connectivity to various JMS servers.
Outbound connectivity
In a nutshell what we need to do is instantiate the Resource Adapter
class, configure it, instanatiate a Managed Connection Factory, and
from that obtain the application-facing connection factory.
When you open up the ra.xml
file, you can find out class implements the ResourceAdapter interface:
<resourceadapter>
<resourceadapter-class>com.stc.jmsjca.unifiedjms.RAUnifiedResourceAdapter</resourceadapter-class>
The Resource Adapter class has per the specification a no-args
constructor
and should implement the javax.resource.spi.ResourceAdapter
interface. You could instantiate the ResourceAdapter simply by doing
this:
com.stc.jmsjca.unifiedjms.RAUnifiedResourceAdapter ra = new com.stc.jmsjca.unifiedjms.RAUnifiedResourceAdapter();
The drawback of this is that, although the chances of that being
small, if the classname changes in future versions of the Resource
Adapter, your
application would no longer compile. A better approach would be to read
the classname dynamically from the ra.xml. I'll leave that as "an
excercise for the reader".
The Resource Adapter is a Java Bean with getters and setters. This
is how you can configure the Resource Adapter. For example, we could
set the connection URL as follows:
ra.setConnectionURL("stcms://localhost:18007");
Next, we need to instantiate a ManagedConnectionFactory.
Again from the ra.xml,
you can find the
classname:
<outbound-resourceadapter>
<connection-definition>
<managedconnectionfactory-class>com.stc.jmsjca.core.XMCFUnifiedXA</managedconnectionfactory-class>
Again, the ManagedConnectionFactory
must have a no-arg constructor
and is a Java bean, so you can simply instantiate and configure one as
follows:
com.stc.jmsjca.core.XMCFUnifiedXA mcf = new com.stc.jmsjca.core.XMCFUnifiedXA();
mcf.setUserName("Administrator");
mcf.setPassword("STC");
Next, you may need to associate the newly created ManagedConnectionFactory with
the ResourceAdapter.
Those that require this association
(most likely all of them), implement the javax.resource.spi.ResourceAdapterAssociation
interface. This leads to the following code:
mcf.setResourceAdapter(ra);
Lastly, you create the application-facing connection factory. In
case of JMS, this is a javax.jmx.ConnectionFactory.
You can find evidence of this in the ra.xml:
<outbound-resourceadapter>
<connection-definition>
<managedconnectionfactory-class>com.stc.jmsjca.core.XMCFUnifiedXA</managedconnectionfactory-class>
...
<connectionfactory-interface>javax.jms.ConnectionFactory</connectionfactory-interface>
<connectionfactory-impl-class>com.stc.jmsjca.core.JConnectionFactoryXA</connectionfactory-impl-class>
<connection-interface>javax.jms.Connection</connection-interface>
</connection-definition>
This leads to the following code:
javax.jms.ConnectionFactory f = (javax.jms.ConnectionFactory) mcf.createConnectionFactory();
Now, putting it all together, this is what you would need to create
a JMS connection factory from the JMSJCA Resource Adapter:
com.stc.jmsjca.unifiedjms.RAUnifiedResourceAdapter ra = new com.stc.jmsjca.unifiedjms.RAUnifiedResourceAdapter();
com.stc.jmsjca.core.XMCFUnifiedXA mcf = new com.stc.jmsjca.core.XMCFUnifiedXA();
ra.setConnectionURL("stcms://localhost:18007");
ra.setUserName("Administrator");
ra.setPassword("STC");
ra.setOptions("JMSJCA.NoXA=true");
mcf.setResourceAdapter(ra);
javax.jms.ConnectionFactory f = (javax.jms.ConnectionFactory) mcf.createConnectionFactory();
And that's all there's to it
As I mentioned, one of the advantages of using a Resource Adapter
over using a client runtime directly, is that an Resource Adapter
typically provides connection pooling and other nifty features. JMSJCA
for instance provides a powerful and configurable connection manager
with blocking behavior, time-out behavior, connection failure
detection, etc. It even enlists the connection in the transaction if it
detects that there is a transaction active when the connection is
created.
Not all resource adapters will provide such a comprehensive
connection manager: check the documentation of the resource adapter
that you're planning to use. If it doesn't provide a connection manager
to your liking, you can provide your own connection manager. If you
need to write one, take a look at the one in JMSJCA: you may want to
use it as a starting point.
Inbound connectivity
Inbound connectivity is where a resource adapter is used to receive
messages from an external system; the resource adapter delivers these
messages to a Message Driven Bean. In the case of JMS, this is javax.jms.MessageListener, with
its void
onMessage(javax.jms.Message) method. For other types of resource
adapers, you will find other Message Driven Bean interfaces. Look in
the ra.xml to find out
which one.
Seting up inbound connectivity is a bit more involved than outbound
connectivity. That is because with inbound, you need to explain to
Resource Adapter how it should obtain a new instance of the Message
Driven Bean, and how to obtain a thread that will call the onMessage() method or
equivalent method.
We start with instantiating and configuring a ResourceAdapter object; the
class can be found in ra.xml
as I showed in the Outbound
Connectivity section:
com.stc.jmsjca.unifiedjms.RAUnifiedResourceAdapter ra = new com.stc.jmsjca.unifiedjms.RAUnifiedResourceAdapter();
ra.setConnectionURL("stcms://localhost:18007");
ra.setUserName("Administrator");
ra.setPassword("STC")
Next, we need to call start()
on the ResourceAdapter
object. This method takes a javax.resource.spi.BootstrapContext
object. You need to provide an implementation for this class; the most
important method that you need to implement is the public
javax.resource.spi.work.WorkManager getWorkManager() method. As
you can see, this method should return a WorkManager object. This class
has a number of methods that provide access to a threadpool. It would
help at this point if you have a bit of knowledge of the internals of
the Resource Adapter that you're trying to work with: some Adapters
don't use a WorkManager,
and the ones that do, typically only use one of the methods on the WorkManager object. For
instance, here's a WorkManager
that could be used with JMSJCA:
public class XWorkManager implements WorkManager {
private java.util.concurrent.Executor mPool;
public XWorkManager(int poolsize) {
mPool = new PooledExecutor(new LinkedQueue(), poolsize);
}
public void scheduleWork(Work work) throws WorkException {
try {
mPool.execute(work);
} catch (InterruptedException e) {
throw new WorkException(e);
}
}
// other methods just throw an exception
}
Fortunately we can make use
of the java.util.concurrent
tools introduced in JDK 5.0 for a threadpool. Next, we need to provide
an implementation for javax.resource.spi.endpoint.MessageEndpointFactory.
This is an interface with only two methods. One is there to indicate if
the message delivery should be transacted, and the other one to is
there to create a MessageEndpoint.
This is a class that is a proxy around the Message Driven Bean. This
proxy should implement the onMessage()
or equivalent method which should simply delegate to the MessageListener or equivalent
object in your application. The proxy should also implement three
additional methods:
void afterDelivery()
void beforeDelivery(Method method)
void release()
The beforeDelivery()
and afterDelivery()
methods are called just before the Resource Adapter calls the onMessage() or equivalent
method. You could start and commit a transaction in these methods; if
you're not using transactions, you can just leave these methods
unimplemented. The release()
method is called by the Resource Adapter when it's done using a Message
Driven Bean. If you don't implement some sort of pooling mechanism for
Message Driven Beans, and you probably won't, you can leave this method
empty as well.
Now that you have implementations for the BootstrapContext, WorkManager, MessageEndpointFactory, and MessageEndpoint, you can finally tell the Resource Adapter to start delivering messages to your Message Driven Bean. You do that by calling the void endpointActivation(javax.resource.spi.endpoint.MessageEndpointFactory, javax.resource.spi.ActivationSpec) method on the ResourceAdapter. The ActivationSpec object is implemented by the Resource Adapter; you can find the classname in the ra.xml file. It is a Java bean that is used to configure the message delivery. For instance, in the case of JMS, you specify from which queue or topic to get messages.
As you can see, the inbound connectivity case is not as straight
forward as the outbound connectivity case, but still very much doable.
Real examples
For a full sample source listing, take a look at the test suite in JMSJCA You can also look at the
JMSBC
as part of the Open
JBI Components project. This Binding Component uses the JMSJCA
Resource Adapter as described in this blog entry. By doing so, a lot of
development effort was saved dealing with all the idiosyncracies of
various JMS server implementations, connection management, etc.
Posted at
08:03PM Feb 04, 2007
by Frank Kieviet in Sun |
Comments[10]
Permalink: http://blogs.sun.com/fkieviet/entry/using_resource_adapters_outside_of
JMSJCA, a feature rich JMS Resource Adapter, is now a java.net project
JMSJCA is now a java.net project. It can be found here: http://jmsjca.dev.java.net.
The project is currently used in Java CAPS, JMS Grid and the JMS BC as part of the open-jbi-components project.
The connector can be used as a J2EE 1.4 Resource Adapter, but its libraries can also be used as an abstraction layer to JMS servers from non J2EE-code. As such, the adapter acts like a library that hides the complexities of transactions, concurrency, connection failure detection, JMS server implementation idiosyncracies, etc. That is how it is used in the JMS BC as part of the open-jbi-components project.
Posted at
10:22AM Jan 21, 2007
by Frank Kieviet in Sun |
Comments[19]
Permalink: http://blogs.sun.com/fkieviet/entry/jmsjca_a_feature_rich_jms
Logless transactions
A few months ago, in my blog entry Transactions,
disks, and performance
I went into the importance of minimizing the number of writes.
Transaction logging is one of those cases where minimizing the number
of writes greatly enhances performance. In this entry, I'll describe a
way to avoid transaction logging altogether.
What is transaction logging? Transaction logging
refers to persisting the state of a two-phase transaction so that in
the event of a crash, the transaction can either be committed or rolled
back (recovered). I won't go into the details of what XA is; more
information about XA transactions can be found elsewhere, e.g. in Mike Spille's XA
Exposed.
Let me illustrate what recovery is using a
"diagram".
Consider an XA two phase transaction with three Resource Managers (RMa, RMb, and RMc). To
indicate what happens at what time, I'll put all actions in a table;
each row corresponds to a different time.
| time |
RMa |
RMb |
RMc |
Coordinator |
| t1 |
start(xid1a,
TMNOFLAGS) |
|||
| t2 |
start(xid1b, TMNOFLAGS) | |||
| t3 |
start(xid1c, TMNOFLAGS) | |||
| t4 |
end(xid1a, TMSUCCESS) |
|||
| t5 |
end(xid1b, TMSUCCESS) | |||
| t6 |
end(xid1c, TMSUCCESS) | |||
| t7 |
prepare(xid1a) |
|||
| t8 |
prepare(xid1b) | |||
| t9 |
prepare(xid1c) | |||
| t10 |
log | |||
| t11 |
commit(xid1a, false) |
|||
| t12 |
commit(xid1b, false) | |||
| t13 |
commit(xid1c, false) | |||
| t14 |
delete from log |
At t10
the transaction manager records the decision
to commit to the log.
Let's say that the system crashes after t10, say between t11 and t12. When the system restarts,
it will call recover() on
all known Resource Managers and it will read the transaction log. In
the transaction log it will find that xid1x was marked for
commit.
Through recover() it will
find that xid1b
and xid1c are in doubt. It knows that these two
need to be committed because of the commit decision in the log.
What happens if the system crashes before the
commit decision is written to the log, for example between t8 and t9? Upon recovery, the recover() method of RMa, RMb and RMc return xid1a and xid1b (but not xid1c because
prepare was not
called on RMc
yet). The
transaction manager will rollback RMa
and RMb
because no commit
decision was found in the log.
SeeBeyond's Logless XA Transactions
Let's take a look at the recover() method on the XAResource. This method returns
an array of Xid objects.
Each Xid object holds two
byte[]
arrays. These two arrays represent the global transaction ID and the
branch
qualifier. They are typically random numbers picked by the transaction
manager. The Resource Managers that receive these Xids should use these objects
as identifiers and return them in the recover() method unmodified.
At SeeBeyond, Jerry Waldorf and Venugopalan
Venkataraman came up with an idea to use the storage space in the byte[] arrays of the Xid as a way to persist the
transaction state. Here's how it works. Let's modify the above
example by removing transaction logging:
| time |
RMa |
RMb |
RMc |
Coordinator |
| t1 |
start(xid1a,
TMNOFLAGS) |
|||
| t2 |
start(xid1b, TMNOFLAGS) | |||
| t3 |
start(xid1c, TMNOFLAGS) | |||
| t4 |
end(xid1a, TMSUCCESS) |
|||
| t5 |
end(xid1b, TMSUCCESS) | |||
| t6 |
end(xid1c, TMSUCCESS) | |||
| t7 |
prepare(xid1c) |
|||
| t8 |
prepare(xid1b) | |||
| t9 |
prepare(xid1a) | |||
| t10 |
commit(xid1c, false) | |||
| t11 |
commit(xid1b, false) | |||
| t12 |
commit(xid1a, false) |
A commit decision is still
being made, but this decision is no longer
persisted in a separate transaction log. In stead, it is persisted in xid1a. If the system
finds xid1a
upon recovery, it knows
that a commit decision was made. If it doesn't find xid1a, it knows that
a commit
decision was not made. Note that the order in which both prepare and commit are called on the three
Resource Managers is very important.
As in the first example, if the system crashes before a commit
decision has been made, it will rollback any resources upon recovery.
E.g. if the system crashes between t8 and t9, it will encounter xid1c and xid1b and will call rollback() on these because it
cannot find a record of a commit-decision for xid1, i.e. it cannot find xid1a. Hence, xid1b and xid1c need to be
rolled back.
If the system crashes after a commit decision has been made, for
example between t10 and t11, it will find xid1b and xid1a. Since xid1a signifies a
commit
decision, both xid1b
and xid1a
should be committed.
So far so good. But how does the transaction manager know that if it
encounters xidb
it should
look for xida to figure out if a
commit
decision was made? This is where the transaction manager uses the byte[] of the Xid: it stores this information
in one of them.
Complicating factors
A problem in this scheme occurs when the prepare(xid1a) method returns XA_RDONLY. If that happens, commit(xid1a, false)
cannot be
called, and RMa will not return xid1a
upon calling recover().
Recall that xid1a had special significance!
Hence it is important to order the Resource Managers such that the
first one on which prepare()
is called, is both reliable and will not return XA_RDONLY. However, in normal
EE applications, the application prescribes in which order resources
are enlisted in a transaction. Hence, to use this logless transaction
scheme, the application server either needs to be extended with a way
to
specify resources a priori, or the application server needs to be
extended with a learning capability so that it knows which resources
are enlisted in a particular operation so that it can pick the right
resource manager to write the commit decision to.
The SeeBeyond logless transaction approach is one of the ways that
transaction logging can be made less exensive. In a future blog, I'll
cover additional ones.
Posted at
11:58AM Jan 15, 2007
by Frank Kieviet in Sun |
Comments[2]
Permalink: http://blogs.sun.com/fkieviet/entry/logless_transactions
Short note: Running Java CAPS on Java SE 6
Today Java SE 6 was
released. It comes with many new features and cool tools. One of them
being jmap as described in a previous log on permgen
exceptions. The Integration Server is not officially supported on
SE 6 yet. However, if you want to run the Java CAPS integration server
on SE 6, this is what you can do:
- Install JDK 6 somewhere, e.g. c:\java
- Install the IS somewhere, e.g. c:\logicalhost
- Rename c:\logicalhost\jre to c:\logicalhost\jre.old
- Copy c:\java\jre1.6.0 to c:\logicalhost\jre
- Copy c:\java\jdk1.6.0\lib\tools.jar to c:\logicalhost\jre\lib
- Copy c:\logicalhost\jre\bin\javaw.exe to c:\logicalhost\jre\bin\is_domain1.exe
- Copy c:\logicalhost\jre\bin\javaw.exe to c:\logicalhost\jre\bin\ isprocmgr_domain1.exe
- Edit c:\logicalhost\is\domains\domain1\config\domain.xml and comment out these lines:
<!--
<jvm-options>-Dcom.sun.org.apache.xalan.internal.xsltc.dom.XSLTCDTMManager=com.sun.org.apache.xalan.internal.xsltc.dom.XSLTCDTMManager</jvm-options>
<jvm-options>-Dorg.xml.sax.driver=com.sun.org.apache.xerces.internal.parsers.SAXParser</jvm-options>
<jvm-options>-Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl</jvm-options>
<jvm-options>-Dcom.sun.org.apache.xerces.internal.xni.parser.XMLParserConfiguration=com.sun.org.apache.xerces.internal.parsers.XIncludeParserConfiguration</jvm-options>
<jvm-options>-Djavax.xml.transform.TransformerFactory=com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl</jvm-options>
<jvm-options>-Djavax.xml.parsers.SAXParserFactory=com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl</jvm-options>
<jvm-options>-Djavax.xml.soap.MessageFactory=com.sun.xml.messaging.saaj.soap.ver1_1.SOAPMessageFactory1_1Impl</jvm-options>
<jvm-options>-Djavax.xml.soap.SOAPFactory=com.sun.xml.messaging.saaj.soap.ver1_1.SOAPFactory1_1Impl</jvm-options>
<jvm-options>-Djavax.xml.soap.SOAPConnectionFactory=com.sun.xml.messaging.saaj.client.p2p.HttpSOAPConnectionFactory</jvm-options>
-->
i.e. add <!-- before and add --> after these lines.
Also comment out this line:
<!-- <jvm-options>-server</jvm-options>-->
After these changes you can run the Integration Server with Jave SE 6. These are not “official” recommendations (as mentioned, there’s no support for SE 6 just yet); also the lines commented out are optimizations, that need to be re-established for SE 6 tet, so don’t do any performance comparisons just yet.
Posted at
10:41PM Dec 11, 2006
by Frank Kieviet in Sun |
Comments[0]
Permalink: http://blogs.sun.com/fkieviet/entry/short_note_running_java_caps
Using java.util.logging in BEA Weblogic
Some weeks ago I was making sure that the JMSJCA connector runs
properly on BEA Weblogic 9.1. Of course I ran into some issues: never
assume that your code is really portable until you try it out. I also
found some issues in Weblogic. Now I
don't want to make this into a bashing session of Weblogic: since I
work for Sun Microsystems, and Weblogic can be considered a competitor
or Glassfish, that would be in bad taste. However, there's one thing
that bugged me a lot and for which I want to share a solution: logging.
Dude, where's my log output?
At SeeBeyond (and of course at Sun) we standardized on the use of java.util.logging. You could
argue whether java.util.logging
is technically the best solution, but since it's a standard it's a lot
better if you use that package than it is to write your own or use
other
third party logging tools. Standardization gives portability! So,
JMSJCA, STCMS, etc all use java.util.logging.
When I started to run my test suite on Weblogic, I was surprised
that the log output from the tests did not end up in one of the
Weblogic log
files. Instead, output
appeared on the console. Surely there was a configuration option
somewhere in Weblogic to fix this, right? I couldn't find one. The
problem became even more annoying when I tried to enable debug logging.
In Glassfish you go to the logging configuration screen, and you simply
type the name of the logging category that you want to change, select
the level and that's it. Not so in Weblogic. What was going on? It was
time to look a little deeper into this.
How Weblogic uses java.util.logging.
I must admit that the Weblogic documentation is good, and very
easily accessible. I wish we had the same situation at SeeBeyond.
Anyway, from the documentation it appeared that Weblogic fully supports
java.util.logging: I quote
from the documentation:
To distribute messages, WebLogic Server supports Java based
logging by default. The LoggingHelper
class provides access to the java.util.logging.Logger
object used for server logging. [snip] If your application is
configured for Java Logging or Log4j, in
order to publish application events using WebLogic logging services,
create a custom handler or appender that relays the application events
to WebLogic logging services using the message catalogs or Commons API.
Eh, what do you mean... LoggingHelper
class? Can't I just use java.util.logging
without depending on any Weblogic classes? Do I have to write custom code?
And how do I change log levels dynamically? Isn't there an option in
the admin console to do that? An MBean perhaps? I was really surprised
to find this recommendation:
If you use the DEBUG severity level, BEA recommends that you
create a "debug mode" for your application. For example, your
application can create an object that contains a Boolean value. To
enable or disable the debug mode, you toggle the value of the Boolean.
Then, for each DEBUG message, you can create a wrapper that outputs the
message only if your application's debug mode is enabled.
For example, the following code can produce a debug message:
private static boolean debug =
Boolean.getBoolean("my.debug.enabled");
if (debug) {
mylogger.debug("Something
debuggy happened");
}
[snip]
To enable your application to print this message, you include the
following Java option when you start the application's JVM:
-Dmy.debug.enabled=true
That was a good recommendation... TEN YEARS AGO!
A fix
Fortunately, things are not as grim as they look. It's quite easy to
write a little bit of code that is deployed globally independently of
your application, so that you can continue to use java.util.logging without
sprinkling Weblogic dependencies all over your application. First of
all, what's all this with this BEA logger? As it turns out, during
startup, Weblogic instantiates a java.util.logging.Logger
object and hooks a number of Handler
objects to it. So if you log to that particular logger, your log output
will appear in one of the Weblogic log files. Can't you get access
to this Logger using Logger.getLogger(name)? No you
cannot: Weblogic is not using the LogManager at all. That particular
logger is not registered in the LogManager.
That is the reason why you need to use the weblogic.logging.LoggingHelper.getServerLogger()
method.
Once we have a reference to this special Logger object, we can get
a list of the Handler
objects. Next, we can assign this list to the root logger object. As a
result, whenever you use a Logger
object obtained through Logger.getLogger(),
the output will go to the Weblogic handlers. Problem solved!
Of course we want to centralize this code: afterall we don't want to
introduce this dependency in our application code. There are two ways
of doing this: we can write our own LogManager and set that as the
JVM's LogManager
singleton. We can do this because Weblogic doesn't set a LogManager. The other approach
is to write a Weblogic startup class. An instance of this class is
created at server startup and its startup()
method is called. This is the approach I've taken:
package com.sun.bealog;
import java.util.Hashtable;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogManager;
import java.util.logging.Logger;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import weblogic.common.T3ServicesDef;
import weblogic.common.T3StartupDef;
import weblogic.logging.LoggingHelper;
/**
* Allows logging by applications and components using the java.util.logging
* package by sending output from java.util.logging.Logger-s to the
* WebLogic trace file.
*
* @author fkieviet
*/
public class BeaLog implements T3StartupDef {
public String startup(String string, Hashtable hashtable) throws Exception {
String result = "";
Logger wllogger = LoggingHelper.getServerLogger();
Handler[] wlhandlers = wllogger.getHandlers();
// Change log level on log file handler
for (int i = 0; i < wlhandlers.length; i++) {
if (wlhandlers[i].getLevel().intValue() >= 530) {
result += "log level changed of " + wlhandlers[i] + "; ";
wlhandlers[i].setLevel(Level.FINEST);
}
}
// Copy handlers from wllogger to rootlogger
Logger rootlogger = Logger. getLogger("");
Handler[] toremove = rootlogger.getHandlers();
for (int j = 0; j < toremove.length; j++) {
rootlogger.removeHandler(toremove[j]);
}
for (int i = 0; i < wlhandlers.length; i++) {
rootlogger.addHandler(wlhandlers[i]);
}
result += "root handler now has " + rootlogger.getHandlers().length + " handlers; ";
// Register wllogger so that it can be manipulated through the mbean
boolean alreadythere = !LogManager.getLogManager().addLogger(wllogger);
// Register mbean
InitialContext ctx = null;
String[] names = new String[] {"java:comp/jmx/runtime", "java:comp/env/jmx/runtime"};
for (int i = 0; i < names.length; i++) {
try {
ctx = new InitialContext();
MBeanServer mbeanServer = (MBeanServer) ctx.lookup(names[i]);
mbeanServer.registerMBean(LogManager.getLoggingMXBean(),
new ObjectName("java.util.logging:type=Logging"));
result += "mbean registered; ";
} catch (Exception e) {
// ignore
} finally {
safeClose(ctx);
}
}
// For Java CAPS: disable two commonly used loggers that are used to
// relay call context information
Logger.getLogger("com.stc.EnterContext").setLevel(Level.OFF);
Logger.getLogger("com.stc.ExitContext").setLevel(Level.OFF);
return this.getClass().getName() + result;
}
private void safeClose(Context ctx) {
if (ctx != null) {
try {
ctx.close();
} catch (NamingException ignore) {
// ignore
}
}
}
public void setServices(T3ServicesDef t3ServicesDef) {
}
}
As you may have noticed, there's another problem that's solved in
the code snippet above: dynamice log level configuration. The JVM's
logging package ships with an MBean that allows you to set the log
levels dynamically. The only thing that needs to be done is to register
this MBean in Weblogic's MBeanServer
(Weblogic has an
unusual way to access its MBeanServer
as you can see in the code). From then on, you can use a tool like jmx-console to access the
logging mbean and set the log levels.
Download
a NetBeans project that contains the BeaLog code.
Posted at
02:16PM Nov 16, 2006
by Frank Kieviet in Sun |
Comments[9]
Permalink: http://blogs.sun.com/fkieviet/entry/using_java.util.logging_in_bea_weblogic
More on... How to fix the dreaded "java.lang.OutOfMemoryError: PermGen space" exception (classloader leaks)
I got quite a few comments on my last blog (How to fix the dreaded "java.lang.OutOfMemoryError: PermGen space" exception (classloader leaks)). Apparently more people have been struggling with this problem.
Why bring this up? What's the news? Edward Chou continued to
explore options to diagnose classloader leaks. First of all, he
explored how to generate a list of orphaned classloaders with jhat. An orphaned classloader
is a classloader that is not referenced by any object directly but
cannot be garbage collected. The thinking behind this is that programs
that create classloaders (e.g. application servers) do maintain
references to them. So if there's a classloader that is no longer directly referenced, this
classloader is probably a leak. Read about it on his blog (Find
Orphaned Classloaders).
Still we were not satisfied: when examining some memory dumps from
code that we were not familiar with, we explored yet some other options
to diagnose classloader leaks: duplicate classes and duplicate
classloaders. Let me explain.
Let's say that your application has a com.xyz.Controller class. If
you find many instances of this
class object, you likely have a classloader leak. Note the
phrase "instances of this class object".
What I mean by this: the class com.xyz.Controller
is loaded multiple times, i.e. multiple instances of the com.xyz.Controller.class
are present. You can use jhat
to run this query: simply list all instances of java.lang.Class.
Edward modified jhat
to generate a list of all classloader instances that have an identical
set of classes that it loaded. Typically there's no reason why someone
would create two classloader instances and load exactly the same set of
classes into them. If you find any in your memory dump, you should get
suspicious and take a closer look. Monitor Edward's blog for more
details on this.
One more thing: Edward found out that the method java.lang.String.intern()
allocates memory in PermGen space. So if your application frequently
uses this method with different strings, watch out. Fortunately these
strings are subject to garbage collection. But if your application
holds references to these strings, thereby making garbage collection
impossible, your application may cause the dreaded "java.lang.OutOfMemoryError: PermGen
space" exception. No classloaders involved this time.
Posted at
10:29PM Nov 15, 2006
by Frank Kieviet in Sun |
Comments[0]
Permalink: http://blogs.sun.com/fkieviet/entry/more_on..._how_to_fix
How to fix the dreaded "java.lang.OutOfMemoryError: PermGen space" exception (classloader leaks)
In the previous blog entry Classloader leaks: the dreaded "java.lang.OutOfMemoryError: PermGen space" exception I explained how this type of problem can originate in the application code that you deploy to an application server. In this post I'll explain how to track down the leak so that you can fix it.
Profilers
Memory leak? Use a profiler. Right? Well... generally speaking the answer is yes, but classloader leaks are a bit special...
To refresh your memory (pardon the pun), a memory leak is an object that the system unintentionally hangs on to, thereby making it impossible for the garbage collector to remove this object. The way that profilers find memory leaks is to trace references to a leaked object.
What do I mean by "tracing"? A leaked object can be referenced by another object which itself is a leak. In turn, this object may also be a leak, and so on. This repeats until an object is found that references a leaked object by mistake. This reference is where the problem is, and what you need to fix. Let me try to clarify this by illustrating this with a picture from my previous blog:
In this picture the AppClassloader, LeakServlet.class, STATICNAME, CUSTOMLEVEL, LeakServlet$1.class are all leaked objects. Due to static objects (e.g. STATICNAME) in the picture, that may in turn reference other objects, the number of leaked objects may be in the thousands. Going over each leaked object manually to check if there are any incidental references to it (the red reference in the picture) until you find the troublesome object (CUSTOMLEVEL) is laborious. You would rather have a program find the violating reference for you.
A profiler doesn't tell you which leaked object is interesting to look at (CUSTOMLEVEL). Instead it gives you all leaked objects. Let's say that you would look at STATICNAME. The profiler now should find the route STATICNAME to LEAKSERVLET.class to AppClassloader to LeakServlet1$1.class, to CUSTOMLEVEL to Level.class. In this route, the red line in the picture is the reference that actually causes the leak. I said the profiler should find this route. However, all the profilers that we tried, stop tracing as soon as they reach a class object or classloader. There's a good reason for that: the number of traces grows enormous if it follows through the references through classes. And in most cases, these traces are not very useful.
So no luck with profilers! We need to try something else.
JDK 6.0 to the rescue
When Edward Chou and I worked on tracking down classloader leaks last year, we tried to run the JVM with HPROF and tried to trigger a memory dump; we looked at using Hat to interpret the dump. Hat stands for Heap Analysis Tool, and was developed to read dump files generated with HPROF. Unfortunately, the hat tool blew up reading our dump files. Because we didn't think it was difficult to parse the dump file, we wrote a utility to read the file and track the memory leak.
That was last year. This year we have JDK 6.0; this new JDK comes with a few tools that make looking at the VM's memory a lot simpler. First of all, there's a tool called jmap. This command line tool allows you to trigger a dump file without HPROF. It is as simple as typing something like:
jmap -dump:format=b,file=leak 3144
Here leak is the filename of the dump, and 3144 is the PID of the process. To find the PID, you can use jps.
Secondly, Hat is now part of the JDK. It is now called jhat. You can run it using a command line like:
jhat -J-Xmx512m leak
Here leak is the name of the dump file, and as you may have guessed, -J-Xmx512m is a parameter to specify how much memory jhat is allowed to allocate.
When you start jhat it reads the dump file and then listens on an HTTP port. You point your browser to that port (7000 by default) and through that you can browse the memory heap dump. It's a very convenient way of looking at what objects are in memory and how they are connected.
So, it seemed like a good idea to check out what can be done with these new tools to find classloader leaks.
... or not?
Unfortunately, jhat, just like the profilers we tried, also stops tracing when it encounters a class. Now what? I decided to download the JDK source code and find out what the problem is. Building the whole JDK is a difficult task from what I gather from the documentation. Fortunately, jhat is a nicely modularized program; I could just take the com.sun.tools.hat-packages out of the source tree, load them in my favorite editor and compile the code. The patched code was easily packaged and run: I just jar-ed it and added it to the lib/ext directory of the JDK:
jar -cf C:\apps\Java\jdk1.6.0\jre\lib\ext\ahat.jar -C hat\bin .
jhat leak
This was really as easy as pie. So after running the program in the debugger for some time, I figured out how it works and what changes I wanted to make. The change is that when you follow the references from a classloader, the modified jhat will follow through all traces from all the instances of the classes that it loaded. With that change, finding the cause of a classloader leak is simple.
An example
Let's look at the example from my previous blog as depicted in the picture above. Using NetBeans I created the following servlet and deployed it to Glassfish:
1 package com.stc.test; 2 3 import java.io.*; 4 import java.net.*; 5 import java.util.logging.Level; 6 import java.util.logging.Logger; 7 import javax.servlet.*; 8 import javax.servlet.http.*; 9 10 public class Leak extends HttpServlet { 11 12 protected void processRequest(HttpServletRequest request, HttpServletResponse response) 13 throws ServletException, IOException { 14 response.setContentType("text/html;charset=UTF-8"); 15 PrintWriter out = response.getWriter(); 16 out.println("<html><body><pre>"); 17 Level custom = new Level("LEAK", 950) {}; 18 Logger.getLogger(this.getClass().getName()).log(custom, "New level created"); 19 out.println("</pre></body></html>"); 20 out.close(); 21 } 22+ HTTPServlet methods. Click on the + sign on the left to edit the code 48 } 49
I invoked the servlet to cause the leak. Next I undeployed the servlet. Then I triggered a heap dump:
jmap -dump:format=b,file=leak 3144
and fired up the modified jhat:
jhat -J-Xmx512m leak
and brought up the browser. The opening screen shows amongst other things, all classes that are found in the dump:
Finding objects that were leaked is easy since I know that I shouldn't see any objects of the classes that I deployed. Recall that I deployed a class com.stc.test.Leak; so I searched in the browser for the com.stc.test package, and found these classes (never mind the NoLeak class: I used it for testing).
Clicking on the link class com.stc.test.Leak brings up the following screen:
Clicking on the classloader link brings up the following screen:
Scrolling down, I see Reference Chains from Rootset / Exclude weak refs . Clicking on this link invokes the code that I modified; the following screen comes up:
And there's the link to java.util.Logging.Level that we've been looking for!
Easy as pie!
Summarizing, the steps are:
- undeploy the application that is leaking
- trigger a memory dump
- run jhat (with modification)
- find a leaked class
- locate the classloader
- find the "Reference chains from root set"
- inspect the chains, locate the accidental reference, and fix the code
I'll contact the JDK team to see if they are willing to accept the changes I made to jhat. If you cannot wait, send me an email or leave a comment.
Update (April 2007): Java SE SDK 6.0 update 1 has the updated code.
Other Permgen space tidbits
After fixing the classloader leak, you of course want to test to see if the memory leak has disappeared. You could again trigger a memory dump and run jhat. What you also could try is to see if the amount of used permgen space memory goes up continuously after each deployment/undeployment of your application.
You can monitor permgen space usage using jconsole. You can see the memory usage go up when you repeatedly deploy and undeploy an application. However, this may not be a classloader / memory leak. As it turns out, it's difficult to predict when the garbage collector cleans up permgen space. Pressing the button in Run GC in jconsole does not do the trick. Only when you encounter a java.lang.OutOfMemoryError: PermGen space exception can you be sure that there really was no memory. This is a bit more involved than it should be!
How can we force the garbage collector to kick in? We can force a java.lang.OutOfMemoryError: PermGen space and then releasing the memory after which we force the garbage collector to kick in. I wrote the following servlet to do that:
package com.stc.test; import java.io.*; import java.util.ArrayList; import javax.servlet.*; import javax.servlet.http.*; public class RunGC extends HttpServlet { private static class XClassloader extends ClassLoader { private byte[] data; private int len; public XClassloader(byte[] data, int len) { super(RunGC.class.getClassLoader()); this.data = data; this.len = len; } public Class findClass(String name) { return defineClass(name, data, 0, len); } } protected void processRequest(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException { response.setContentType("text/html;charset=UTF-8"); PrintWriter out = response.getWriter(); out.println("<html><body><pre>"); try { // Load class data byte[] buf = new byte[1000000]; InputStream inp = this.getClass().getClassLoader() .getResourceAsStream("com/stc/test/BigFatClass.class"); int n = inp.read(buf); inp.close(); out.println(n + " bytes read of class data"); // Exhaust permgen ArrayList keep = new ArrayList(); int nLoadedAtError = 0; try { for (int i = 0; i < Integer.MAX_VALUE; i++) { XClassloader loader = new XClassloader(buf, n); Class c = loader.findClass("com.stc.test.BigFatClass"); keep.add(c); } } catch (Error e) { nLoadedAtError = keep.size(); } // Release memory keep = null; out.println("Error at " + nLoadedAtError); // Load one more; this should trigger GC XClassloader loader = new XClassloader(buf, n); Class c = loader.findClass("com.stc.test.BigFatClass"); out.println("Loaded one more"); } catch (Exception e) { e.printStackTrace(out); } out.println("</pre></body></html>"); out.close(); }In this servlet a custom classloader is instantiated which loads a class in that classloader. That class is really present in the web classloader, but the custom classloader is tricked by not delegating to the parent classloader; instead the classloader is instantiating the class using the bytes of the class obtained through getResourceAsStream().
In the servlet it tries to allocate as many of these custom classes as possible, i.e. until the memory exception occurs. Next, the memory is made eligible for garbage collection, and one more classloader is allocated thereby forcing garbage collection.
The number of custom classes that can be loaded until a memory exception occurs, is a good measure of how much permgen space memory is available. As it turns out, this metric is a much more reliable than the one that you get from jconsole.
And more
Edward Chou is thinking of some other ideas to further automate the process of determining exactly where the cause of a classloader leak is. E.g. it should be possible to identifiy the erroneous reference (the red line in the picture) automatically, since this reference is from one classloader to another. Check his blog in the coming days.
Update (April 2007): You can find an interesting usage of jhat's Object Query Language on Sundarajan's blog to compute histograms of reference chains.
Posted at
04:10PM Oct 19, 2006
by Frank Kieviet in Sun |
Comments[65]
Permalink: http://blogs.sun.com/fkieviet/entry/how_to_fix_the_dreaded
Classloader leaks: the dreaded "java.lang.OutOfMemoryError: PermGen space" exception
Did you ever encounter a java.lang.OutOfMemoryError: PermGen space error when you redeployed your application to an application server? Did you curse the application server, while restarting the application server, to continue with your work thinking that this is clearly a bug in the application server. Those application server developers should get their act together, shouldn't they? Well, perhaps. But perhaps it's really your fault!
Take a look at the following example of an innocent looking servlet.
package com.stc.test; import java.io.*; import java.util.logging.*; import javax.servlet.*; import javax.servlet.http.*; public class MyServlet extends HttpServlet { protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException { // Log at a custom level Level customLevel = new Level("OOPS", 555) {}; Logger.getLogger("test").log(customLevel, "doGet() called"); } }Try to redeploy this little sample a number of times. I bet this will eventually fail with the dreaded java.lang.OutOfMemoryError: PermGen space error. If you like to understand what's happening, read on.
The problem in a nutshell
Application servers such as Glassfish allow you to write an application (.ear, .war, etc) and deploy this application with other applications on this application server. Should you feel the need to make a change to your application, you can simply make the change in your source code, compile the source, and redeploy the application without affecting the other still running applications in the application server: you don't need to restart the application server. This mechanism works fine on Glassfish and other application servers (e.g. Java CAPS Integration Server).
The way that this works is that each application is loaded using its own classloader. Simply put, a classloader is a special class that loads .class files from jar files. When you undeploy the application, the classloader is discarded and it and all the classes that it loaded, should be garbage collected sooner or later.
Somehow, something may hold on to the classloader however, and prevent it from being garbage collected. And that's what's causing the java.lang.OutOfMemoryError: PermGen space exception.
PermGen space
What is PermGen space anyways? The memory in the Virtual Machine is divided into a number of regions. One of these regions is PermGen. It's an area of memory that is used to (among other things) load class files. The size of this memory region is fixed, i.e. it does not change when the VM is running. You can specify the size of this region with a commandline switch: -XX:MaxPermSize . The default is 64 Mb on the Sun VMs.
If there's a problem with garbage collecting classes and if you keep loading new classes, the VM will run out of space in that memory region, even if there's plenty of memory available on the heap. Setting the -Xmx parameter will not help: this parameter only specifies the size of the total heap and does not affect the size of the PermGen region.
Garbage collecting and classloaders
When you write something silly like
private void x1() {
for (;;) {
List c = new ArrayList();
}
}
you're continuously allocating objects; yet the program doesn't run out of memory: the objects that you create are garbage collected thereby freeing up space so that you can allocate another object. An object can only be garbage collected if the object is "unreachable". What this means is that there is no way to access the object from anywhere in the program. If nobody can access the object, there's no point in keeping the object, so it gets garbage collected. Let's take a look at the memory picture of the servlet example. First, let's even further simplify this example:
package com.stc.test; import java.io.*; import java.net.*; import javax.servlet.*; import javax.servlet.http.*; public class Servlet1 extends HttpServlet { private static final String STATICNAME = "Simple"; protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException { } }
After loading the above servlet, the following objects are in memory (ofcourse limited to the relevant ones):
In this picture you see the objects loaded by the application classloader in yellow, and the rest in green. You see a simplified container object that holds references to the application classloader that was created just for this application, and to the servlet instance so that the container can invoke the doGet() method on it when a web request comes in. Note that the STATICNAME object is owned by the class object. Other important things to notice:
- Like each object, the Servlet1 instance holds a reference to its class (Servlet1.class).
- Each class object (e.g. Servlet1.class) holds a reference to the classloader that loaded it.
- Each classloader holds references to all the classes that it loaded.
To illustrate this, let's see what happens when the application gets undeployed: the Container object nullifies its references to the Servlet1 instance and to the AppClassloader object.
As you can see, none of the objects are reachable, so they all can be garbage collected. Now let's see what happens when we use the original example where we use the Level class:
package com.stc.test; import java.io.*; import java.net.*; import java.util.logging.*; import javax.servlet.*; import javax.servlet.http.*; public class LeakServlet extends HttpServlet { private static final String STATICNAME = "This leaks!"; private static final Level CUSTOMLEVEL = new Level("test", 550) {}; // anon class! protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException { Logger.getLogger("test").log(CUSTOMLEVEL, "doGet called"); } }
Note that the CUSTOMLEVEL's class is an anonymous class. That is necessary because the constructor of Level is protected. Let's take a look at the memory picture of this scenario:
In this picture you see something you may not have expected: the Level class holds a static member to all Level objects that were created. Here's the constructor of the Level class in the JDK:
protected Level(String name, int value) { this.name = name; this.value = value; synchronized (Level.class) { known.add(this); } }
Here known is a static ArrayList in the Level class. Now what happens if the application is undeployed?
Only the LeakServlet object can be garbage collected. Because of the reference to the CUSTOMLEVEL object from outside of AppClassloader, the CUSTOMLEVEL anyonymous class objects (LeakServlet$1.class) cannot be garbage collected, and through that neither can the AppClassloader, and hence none of the classes that the AppClassloader loaded can be garbage collected.
Conclusion: any reference from outside the application to an object in the application of which the class is loaded by the application's classloader will cause a classloader leak.
More sneaky problems
I don't blame you if you didn't see the problem with the Level class: it's sneaky. Last year we had some undeployment problems in our application server. My team, in particular Edward Chou, spent some time to track them all down. Next to the problem with Level, here are some other problems Edward and I encountered. For instance, if you happen to use some of the Apache Commons BeanHelper's code: there's a static cache in that code that refers to Method objects. The Method object holds a reference to the class the Method points to. Not a problem if the Apache Commons code is loaded in your application's classloader. However, you do have a problem if this code is also present in the classpath of the application server because those classes take precedence. As a result now you have references to classes in your application from the application server's classloader... a classloader leak!
I did not mentiond yet the simplest recipe for disaster: a thread started by the application while the thread does not exit after the application is undeployed.
Detection and solution
Classloader leaks are difficult. Detecting if there's such a leak without having to deploy/undeploy a large number of times is difficult. Finding the source of a classloader leak is even trickier. This is because all the profilers that we tried at the time, did not follow links through classloaders. Therefore we resorted to writing some custom code to find the leaks from memory dump files. Since that exercise, new tools came to market in JDK 6. The next blog will outline what the easiest approach today is for tracking down a glassloader leak.
Posted at
11:40AM Oct 16, 2006
by Frank Kieviet in Sun |
Comments[53]
Permalink: http://blogs.sun.com/fkieviet/entry/classloader_leaks_the_dreaded_java
Transactions, disks, and performance
So far I've written only a handful of blogs. My blogs are
technical and dry and consequently the hit count of my blog is low.
However, it has generated some interest from people in far away places
and generated some interesting discussions through email. One of the
things that came up is data integrity and disk performance. And that's
the topic of this blog...
Maintaining data integrity
Consider a simple scenario: an MDB reads from a JMS queue, and for each message that it receives, it writes a record to a database. Let's say that you want to minimize the chance that you would lose or duplicate data in the case of failure. What kind of failure? Let's say that power failures or system crashes are your biggest worries.
How do we minimize the chance of losing or duplicating data? First
of all, you want to make sure that receiving the message from the
queue and the writing to the database happens in one single
transaction. One of the easiest ways of doing that is to use an
application server like Glassfish. How does Glassfish help? It starts a
distributed (two phase) transaction. The transactional state is
persisted by the JMS server and database, and by Glassfish through its
transaction log. In the event of a system crash, the system will come
back up and by using the persisted transactional state, Glassfish can
make sure that either the transaction is fully committed or fully
rolled back. Ergo, no data is lost or duplicated. The details of this
are fascinating, but let's look at that some other time.
Thus, reliable persistence that is immune to system crashes (e.g.
the Blue Screen of Death (BSOD)) or power failures is important. That
means that when a program thinks that the data has been persisted on
the disk, it in fact should have been written in the magnetic media of
the disk. Why do I stress this? Because it's less trivial than you
might think: the operating system and drive may cache the data in the
write cache. Therefore you will have to disable the write cache in the
OS on the one hand; in your application you will have to make sure that
the data is
indeed sync-ed to disk on the other hand.
Turning off the write cache, syncing to disk... no big deal, right?
Unfortunately there are many systems in which the performance drops
like a stone when you do that. Let's look at why that is, and how good
systems solve this performance problem.
The expense of writing to disk
As you know, data on a disk is organized in concentrical circles.
Each circle (track) is divided up in sectors. When the disk controller
writes data to the disk, it has to position the write head to the right
track and wait until the disk has rotated such that the beginning of
the sector is under the write head. These are mechanical operations;
they are not measured in nano seconds or micro seconds, but rather in
milliseconds. As it turns out, a drive can do about one hundred of
these operations per second. So, as a rule of thumb, a drive can
perform one hundred writes per second. And this hasn't changed much in
the past few decades, and it won't change in the next decade.
Writes per second is one aspect. The number of bytes that can be
written per second is another. This is measured in megabytes per
second. Fortunately over the past decades this rate has steadily
increased and it likely will continue to do so. That's why formatting a
300 Gb drive today doesn't take 10000 times longer than it took to
format a 30 Mb drive 15 years ago.
Key is that the more you can send to the disk in one write
operation, the more data you can write. To demonstrate this, I've
done some measurements on my desktop machine.

This chart shows that the number of write operations per second
remains constant up to approximately 32 kb. (Note that the x-axis is
logarithmic) That means that whether you try to write 32768 bytes or
just one single byte per write operation, the maximum number of writes
per
second remains the same. Of course the amount of data you can write to
the disk goes up if you put more data in a single write operation. Put
another way, if you need to write 1 kb chunks of data, you can
process 32 times faster if you combine your chunks into one 32 kb chunk.

What you can see in this chart is that the amount of data you can
write to the disk per second (data rate) increases with the number of
bytes per write. Of course that can not increase infinitely: the data
rate eventually becomes constant. (Note that the x-axis and y-axis are
both logarithmic). The data rate at a write size of 32 kb is approx 3.4
Mb/sec and levels off at a write size of approx 1 Mb to 6 Mb/sec.
These measurements were done on a $300 PC with a cheap hard drive
running Windows. I've done the same measurements on more expensive
hardware with more expensive drives and found similar results: most
drives perform slightly better, but the overall picture is the same.
The lesson: if you have to write small amounts of data, try to
combine these in one write operation.
Multi threading
Back to our example of JMS and a database. As I mentioned, for each
message picked up from
the queue and written to the database, there are several write
operations. For now, let's assume that the data written for the
processing
of single message cannot be combined into one write operation. However,
if we process the messages
concurrently, we can combine the data being written from multiple
threads in one write operation!
Good transaction loggers, good databases that
are optimize for transaction processing, and good JMS servers, all try
to combine the data
from multiple concurrent transactions into as few write operations as
possible. As an example, here are some measurements I did against STCMS
on the same hardware as above. STCMS is the JMS server that ships with
Java CAPS.

As you can see, for small messages the throughput of the system
increases almost linearly with the number of threads. This happens
because one of the design principles of STCMS is to consolidate as much
data as possible in a single write operation. Of course when you
increase the size of the messages, or if you increase the number of
threads indefinitely, you will eventually hit the limit on the
data rate. This is why the performance for large messages does not
scale like
it does for small messages. Of course when increasing the number of
threads to large numbers, you
will also hit other limits due to the overhead in thread switching.
Note that the measurements in this chart were done on a
queue-to-queue and topic-to-topic scenario about three years ago. The
STCMS shipping today performs a lot better, e.g. there's no difference
in performance anymore between queues and topics, so don't use these
numbers as benchmark numbers for STCMS; I'm just using them to prove
the point of the power of write-consolidation.
Other optimizations in the transaction manager
Combining data that needs to be written to the disk from many
threads into one write operation is definitely a good thing for both
resource managers (the JMS server and database server in our example)
and the transaction manager. Are there
other things that can be done specifically in the area of the
transaction manager? Yes, there are: let's see if we can minimize
the number of times data needs to be written for each message.
Last agent commit The first
thing that comes to mind is last-agent commit optimization.
That means that in our example, one of the resource managers will not
do a two phase commit, but instead only do a single phase commit,
thereby saving
another write. Most transaction managers can do this, e.g. the
transaction manager in Glassfish does this.
Piggy backing on another
persistence store Instead of writing to its own persistence
store, the transaction log could write its data to the persistence
store of one of the resource managers participating in the transaction.
By doing so, it can push its data into the same write operation that
the resource manager needs to do anyway. For instance, the transaction
manager can write its transaction state into the database. An extra
table in the database is all it takes. Alternatively, the JMS server
could provide some extensions so that the transaction log can write its
data to the persistence store of the JMS server.
Logless transactions If the
resource manager (e.g. JMS) doesn't have an API that allows the
transaction manager to use its persistence store, the transaction
manager could sneak in some data of its own in the XID. The XID is just
a string of bytes that the resource manager treats as an opaque entity.
The transaction manager could put some data in the XID that identifies
the last Resource Manager; in the case of a system crash, the
transaction manager will query all known resource managers and obtain a
list of in-doubt XIDs. The presence of the in-doubt XID of the last
resource in the transaction signifies that all resources were
successfully prepared and
that commit should be
called; the absence signifies that the prepare phase was incomplete and
that rollback should be
called. This mechanism was invented by SeeBeyond a few years ago
(patent pending). There are some intricacies in this system; perhaps a
topic for a future blog? Leave a comment to this blog if you're
interested.
Beyond the disk
The scenario we've been looking at is one in which data needs to be
persisted so that no data gets lost in case of a system crash or power
failure. What about a disk failure? We could use a RAID system.
Are there other ways? Sure! We could involve multiple computers:
clustering! If each machine that needs to safeguard transaction data
would also write its data to another node in a cluster, that data would
survive a crash of the primary node. Assuming that a crash of two nodes
at the same time is very unlikely, this forms a reliable solution. On
both nodes, data can be written to the disk using a write caching
scheme so that the number of writes per second is no longer the
limiting factor.
We can go even further for short-lived objects. Consider this
scenario:
insert into A
values ('a', 'b')
update A set
value = 'c' where pk = 'a'
delete A where
pk = 'a'
why write the value ('a', 'b') to the disk at all? Hence, a smart write
cache can avoid any disk writes for short lived objects. Of course
typical data that the transaction manager generates are short lived
objects. JMS messages are another example of objects that may have a
short life span.
A pluggable transaction manager... and what about Glassfish?
It would be nice if transaction managers had a pluggable
architecture so that depending on your specific scenario, you could
choose the appropriate transaction manager log persistence strategy.
Isn't clustering something difficult and expensive? Difficult to
develop yes, but not difficult to use. Sun Java System Application
Server 8.x EE already has strong support for clustering. Glassfish will
soon have support for clustering too. And with that clustering will
become available to everybody!
Posted at
10:30PM Oct 11, 2006
by Frank Kieviet in Sun |
Comments[6]
Permalink: http://blogs.sun.com/fkieviet/entry/transactions_disks_and_performance
A free course in Java EE (link)
I found this note in my mail:
The 11th session of "Java EE Programming (with Passion!)" free online course will start from October 23rd, 2006. The course contents are revised with new "look and feel" and new hands-on labs. Please see the following course website for more information. Course website: http://www.javapassion.com/j2ee Course FAQ: http://www.javapassion.com/j2ee/coursefaq.html
If you have anybody who wants to learn Java EE programing through hands-on style using NetBeans, please let them about this course.
Posted at
12:24PM Sep 18, 2006
by Frank Kieviet in Sun |
Comments[0]
Permalink: http://blogs.sun.com/fkieviet/entry/a_free_course_in_java
Funny videos from Sun
Who knew... Sun makes pretty funny commercials. See here: Sun videos. While you're there, also take a look at Project LookingGlas... impressive stuff!
Posted at
02:05PM Sep 14, 2006
by Frank Kieviet in Sun |
Comments[0]
Permalink: http://blogs.sun.com/fkieviet/entry/funny_videos_from_sun
Testing connection failures in resource adapters
In a previous blog (see J2EE JCA Resource Adapters: Poisonous pools) I talked about the difficulty to detect faulty connections to the external system (EIS) that the Resource Adapter is interfacing with. A common cause of connection failures is simply connection loss. In my previous blog I detailed the problems of detecting connection loss in the RA and presented a number of ways to address this problem.
What I didn't discuss is how you can build unit tests so that you
can easily test connection failures and connection failure recovery.
That is the topic of this blog.
What is it that I will be looking at? Say that you have a resource
adapter that connects to an external system (EIS) using one or more
TCP/IP connections. The failure I want to test is that of a simple
connection drop. This can happen if the EIS is rebooted, crashes, or
simply because the network temporarily fails (e.g. someone stumbles
over a network cable).
An automated way to induce failures
How would you manually test this type of failure? You could simply
kill the EIS, or unplug the network connection. Simple and effective,
but very manual. What I prefer is an automated way so that I
can include test failures in an automated unit test suite. Here's
a way to do that:
use a port-forwarder proxy. A port forwarder proxy is a process that
listens on a particular port. Any time a connection comes in on that
port, it will create a new connection to a specified server and port.
Bytes coming in from the client are sent to the server unmodified.
Likewise, bytes coming in from the server are sent to the client
unmodified. The
client is the RA in this case; the server is the EIS. Hence, the port
-forwarder proxy sits between the Resource Adapter and the EIS.
A failure can be induced by instructing the port-forwarder proxy to
drop all the connections. To simulate a transient failure, the
port-forwarder proxy should then again accept connections and forward
them to the EIS.
Here is an example of how this port-forwarder proxy can be used in a
JUnit test. Let's say that I'm testing a JMS resource adapter. Let's
say that I'm testing the RA in an application server: I have an MDB
deployed in the application server that reads JMS messages from one
queue (e.g. Queue1) and forwards them to a different queue (e.g.
Queue2). The test would look something like this:
- start the port-forwarder proxy; specify the server name and port
number that the EIS is listening on
- get the port number that proxy is listening on
- generate a connection URL that the RA will use to connect to the EIS; the new connection URL will have the proxy's server name and port number rather than the server name and port number of the EIS
- update the EAR file with the new connection URL
- deploy the EAR file
- send 1000 messages to Queue1
- read these 1000 message from Queue2
- verify that the port-forwarder has received connections; then tell the port-forwarder to kill all active connections
- send another batch of 1000 messages to Queue1
- read this batch of 1000 messages from Queue2
- undeploy the EAR file
As you can see I'm assuming in this example that you are using an
embedded resource adapter, i.e. one that is embedded in the EAR file.
Ofcourse you can also make this work using global resource adapters;
you just need a way to automatically configure the URL in the global
resource adapter.
A port forwarder in Java
My first Java program I ever wrote was a port-forwarder proxy
(Interactive Spy). I'm not sure when it was, but I do remember that
people just started to use JDK 1.3. In that version of the JDK there
was only blocking IO available. Based on that restriction a way to
write a port-forwarder proxy is to listen on a port and for each
incoming connection create two threads: one thread exclusively reads
from the client and sends the bytes to the server; the other thread
reads from the server and sends the bytes to the client. This was not a
very scalable or elegant solution. However, I did use this solution
successfully for a number of tests in the JMS test suite at SeeBeyond
for the JMS Intelligent Queue Manager in Java CAPS (better known to
engineers as STCMS). However, when I was developing the connection
failure tests for the JMSJCA Resource Adapter for JMS, I ran into these
scalability issues, and I saw test failures due to problems in the test
setup rather than to bugs in the application server, JMSJCA or STCMS.
A better approach is to make use of the non-blocking NIO
capabilities in JDK 1.4. For me this was the opportunity to explore the
capabilities of NIO. The result is a relatively small class that fully
stands on its own; I pasted the source code of the resulting class
below. The usual restrictions apply: this code is provided "as-is"
without warranty of any kind; use at your own risk, etc. I've omitted
the unit test for this class.
This is how you use it: instantiate a TCPProxyNIO passing it the
server name and port number of the EIS. The proxy will find a port to
listen on in the range of 50000. Use getPort() to find out what the
port number is. The proxy now listens on that port and is ready to
accept connections. Use killAllConnections()
to kill all connections. Make sure to destroy the proxy after use: call
close().
/*
* The contents of this file are subject to the terms
* of the Common Development and Distribution License
* (the "License"). You may not use this file except
* in compliance with the License.
*
* You can obtain a copy of the license at
* glassfish/bootstrap/legal/CDDLv1.0.txt or
* https://glassfish.dev.java.net/public/CDDLv1.0.html.
* See the License for the specific language governing
* permissions and limitations under the License.
*
* When distributing Covered Code, include this CDDL
* HEADER in each file and include the License file at
* glassfish/bootstrap/legal/CDDLv1.0.txt. If applicable,
* add the following below this CDDL HEADER, with the
* fields enclosed by brackets "[]" replaced with your
* own identifying information: Portions Copyright [yyyy]
* [name of copyright owner]
*/
package com.stc.jmsjca.test.core;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.IdentityHashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Random;
import java.util.Set;
import java.util.concurrent.Semaphore;
import java.util.logging.Level;
import java.util.logging.Logger;
/**
* A proxy server that can be used in JUnit tests to induce connection
* failures, to assure that connections are made, etc. The proxy server is
* setup with a target server and port; it will listen on a port that it
* chooses itself and delegates all data coming in to the server, and vice
* versa.
*
* Implementation: each incoming connection (client connection) maps into
* a Conduit; this holds both ends of the line, i.e. the client end
* and the server end.
*
* Everything is based on non-blocking IO (NIO). The proxy creates one
* extra thread to handle the NIO events.
*
* @author fkieviet
*/
public class TcpProxyNIO implements Runnable {
private static Logger sLog = Logger.getLogger(TcpProxyNIO.class.getName());
private String mRelayServer;
private int mRelayPort;
private int mNPassThroughsCreated;
private Receptor mReceptor;
private Map mChannelToPipes = new IdentityHashMap();
private Selector selector;
private int mCmd;
private Semaphore mAck = new Semaphore(0);
private Object mCmdSync = new Object();
private Exception mStartupFailure;
private Exception mUnexpectedThreadFailure;
private boolean mStopped;
private static final int NONE = 0;
private static final int STOP = 1;
private static final int KILLALL = 2;
private static final int KILLLAST = 3;
private static int BUFFER_SIZE = 16384;
/**
* Constructor
*
* @param relayServer
* @param port
* @throws Exception
*/
public TcpProxyNIO(String relayServer, int port) throws Exception {
mRelayServer = relayServer;
mRelayPort = port;
Receptor r = selectPort();
mReceptor = r;
new Thread(this, "TCPProxy on " + mReceptor.port).start();
mAck.acquire();
if (mStartupFailure != null) {
throw mStartupFailure;
}
}
/**
* Utility class to hold data that describes the proxy server
* listening socket
*/
private class Receptor {
public int port;
public ServerSocket serverSocket;
public ServerSocketChannel serverSocketChannel;
public Receptor(int port) {
this.port = port;
}
public void bind() throws IOException {
serverSocketChannel = ServerSocketChannel.open();
serverSocketChannel.configureBlocking(false);
serverSocket = serverSocketChannel.socket();
InetSocketAddress inetSocketAddress = new InetSocketAddress(port);
serverSocket.bind(inetSocketAddress);
}
public void close() {
if (serverSocketChannel != null) {
try {
serverSocketChannel.close();
} catch (Exception ignore) {
}
serverSocket = null;
}
}
}
/**
* The client or server connection
*/
private class PipeEnd {
public SocketChannel channel;
public ByteBuffer buf;
public Conduit conduit;
public PipeEnd other;
public SelectionKey key;
public String name;
public PipeEnd(String name) {
buf = ByteBuffer.allocateDirect(BUFFER_SIZE);
buf.clear();
buf.flip();
this.name = "{" + name + "}";
}
public String toString() {
StringBuffer ret = new StringBuffer();
ret.append(name);
if (key != null) {
ret.append("; key: ");
if ((key.interestOps() & SelectionKey.OP_READ) != 0) {
ret.append("-READ-");
}
if ((key.interestOps() & SelectionKey.OP_WRITE) != 0) {
ret.append("-WRITE-");
}
if ((key.interestOps() & SelectionKey.OP_CONNECT) != 0) {
ret.append("-CONNECT-");
}
}
return ret.toString();
}
public void setChannel(SocketChannel channel2) throws IOException {
this.channel = channel2;
mChannelToPipes.put(channel, this);
channel.configureBlocking(false);
}
public void close() throws IOException {
mChannelToPipes.remove(channel);
try {
channel.close();
} catch (IOException e) {
// ignore
}
channel = null;
if (key != null) {
key.cancel();
key = null;
}
}
public void listenForRead(boolean on) {
if (on) {
key.interestOps(key.interestOps() | SelectionKey.OP_READ);
} else {
key.interestOps(key.interestOps() &~ SelectionKey.OP_READ);
}
}
public void listenForWrite(boolean on) {
if (on) {
key.interestOps(key.interestOps() | SelectionKey.OP_WRITE);
} else {
key.interestOps(key.interestOps() &~ SelectionKey.OP_WRITE);
}
}
}
/**
* Represents one link from the client to the server. It is an association
* of the two ends of the link.
*/
private class Conduit {
public PipeEnd client;
public PipeEnd server;
public int id;
public Conduit() {
client = new PipeEnd("CLIENT");
client.conduit = this;
server = new PipeEnd("SERVER");
server.conduit = this;
client.other = server;
server.other = client;
id = mNPassThroughsCreated++;
}
}
/**
* Finds a port to listen on
*
* @return a newly initialized receptor
* @throws Exception on any failure
*/
private Receptor selectPort() throws Exception {
Receptor ret;
// Find a port to listen on; try up to 100 port numbers
Random random = new Random();
for (int i = 0; i < 100; i++) {
int port = 50000 + random.nextInt(1000);
try {
ret = new Receptor(port);
ret.bind();
return ret;
} catch (IOException ignore) {
// Ignore
}
}
throw new Exception("Could not bind port");
}
/**
* The main event loop
*/
public void run() {
// ===== STARTUP ==========
// The main thread will wait until the server is actually listening and ready
// to process incoming connections. Failures during startup should be
// propagated back to the calling thread.
try {
selector = Selector.open();
// Acceptor
mReceptor.serverSocketChannel.configureBlocking(false);
mReceptor.serverSocketChannel.register(selector, SelectionKey.OP_ACCEPT);
} catch (Exception e) {
synchronized (mCmdSync) {
mStartupFailure = e;
}
}
// ===== STARTUP COMPLETE ==========
// Tha main thread is waiting on the ack lock; notify the main thread.
// Startup errors are communicated through the mStartupFailure variable.
mAck.release();
if (mStartupFailure != null) {
return;
}
// ===== RUN: event loop ==========
// The proxy thread spends its life in this event handling loop in which
// it deals with requests from the main thread and from notifications from
// NIO.
try {
loop: for (;;) {
int nEvents = selector.select();
// ===== COMMANDS ==========
// Process requests from the main thread. The communication mechanism
// is simple: the command is communicated through a variable; the main
// thread waits until the mAck lock is set.
switch (getCmd()) {
case STOP: {
ack();
break loop;
}
case KILLALL: {
PipeEnd[] pipes = toPipeArray();
for (int i = 0; i < pipes.length; i++) {
pipes[i].close();
}
ack();
continue;
}
case KILLLAST: {
PipeEnd[] pipes = toPipeArray();
Conduit last = pipes.length > 0 ? pipes[0].conduit : null;
if (last != null) {
for (int i = 0; i < pipes.length; i++) {
if (pipes[i].conduit.id > last.id) {
last = pipes[i].conduit;
}
}
last.client.close();
last.server.close();
}
ack();
continue;
}
}
//===== NIO Event handling ==========
if (nEvents == 0) {
continue;
}
Set keySet = selector.selectedKeys();
for (Iterator iter = keySet.iterator(); iter.hasNext();) {
SelectionKey key = (SelectionKey) iter.next();
iter.remove();
//===== ACCEPT ==========
// A client connection has come in. Perform an async connect to
// the server. The remainder of the connect is going to be done in
// the CONNECT event handling.
if (key.isValid() && key.isAcceptable()) {
sLog.fine(">Incoming connection");
try {
Conduit pt = new Conduit();
ServerSocketChannel ss = (ServerSocketChannel) key.channel();
// Accept
pt.client.setChannel(ss.accept());
// Do asynchronous connect to relay server
pt.server.setChannel(SocketChannel.open());
pt.server.key = pt.server.channel.register(
selector, SelectionKey.OP_CONNECT);
pt.server.channel.connect(new InetSocketAddress(
mRelayServer, mRelayPort));
} catch (IOException e) {
System.err.println(">Unable to accept channel");
e.printStackTrace();
// selectionKey.cancel();
}
}
//===== CONNECT ==========
// Event that is generated when the connection to the server has
// completed. Here we need to initialize both pipe-ends. Both ends
// need to start reading. If the connection had not succeeded, the
// client needs to be closed immediately.
if (key != null && key.isValid() && key.isConnectable()) {
SocketChannel c = (SocketChannel) key.channel();
PipeEnd p = (PipeEnd) mChannelToPipes.get(c); // SERVER-SIDE
if (sLog.isLoggable(Level.FINE)) {
sLog.fine(">CONNECT event on " + p + " -- other: " + p.other);
}
boolean success;
try {
success = c.finishConnect();
} catch (RuntimeException e) {
success = false;
if (sLog.isLoggable(Level.FINE)) {
sLog.log(Level.FINE, "Connect failed: " + e, e);
}
}
if (!success) {
// Connection failure
p.close();
p.other.close();
// Unregister the channel with this selector
key.cancel();
key = null;
} else {
// Connection was established successfully
// Both need to be in readmode; note that the key for
// "other" has not been created yet
p.other.key = p.other.channel.register(selector, SelectionKey.OP_READ);
p.key.interestOps(SelectionKey.OP_READ);
}
if (sLog.isLoggable(Level.FINE)) {
sLog.fine(">END CONNECT event on " + p + " -- other: " + p.other);
}
}
//===== READ ==========
// Data was received. The data needs to be written to the other
// end. Note that data from client to server is processed one chunk
// at a time, i.e. a chunk of data is read from the client; then
// no new data is read from the client until the complete chunk
// is written to to the server. This is why the interest-fields
// in the key are toggled back and forth. Ofcourse the same holds
// true for data from the server to the client.
if (key != null && key.isValid() && key.isReadable()) {
PipeEnd p = (PipeEnd) mChannelToPipes.get(key.channel());
if (sLog.isLoggable(Level.FINE)) {
sLog.fine(">READ event on " + p + " -- other: " + p.other);
}
// Read data
p.buf.clear();
int n;
try {
n = p.channel.read(p.buf);
} catch (IOException e) {
n = -1;
}
if (n >= 0) {
// Write to other end
p.buf.flip();
int nw = p.other.channel.write(p.buf);
if (sLog.isLoggable(Level.FINE)) {
sLog.fine(">Read " + n + " from " + p.name + "; wrote " + nw);
}
p.other.listenForWrite(true);
p.listenForRead(false);
} else {
// Disconnected
if (sLog.isLoggable(Level.FINE)) {
sLog.fine("Disconnected");
}
p.close();
key = null;
if (sLog.isLoggable(Level.FINE)) {
sLog.fine("Now present: " + mChannelToPipes.size());
}
p.other.close();
// Stop reading from other side
if (p.other.channel != null) {
p.other.listenForRead(false);
p.other.listenForWrite(true);
}
}
if (sLog.isLoggable(Level.FINE)) {
sLog.fine(">END READ event on " + p + " -- other: " + p.other);
}
}
//===== WRITE ==========
// Data was sent. As for READ, data is processed in chunks which
// is why the interest READ and WRITE bits are flipped.
// In the case a connection failure is detected, there still may be
// data in the READ buffer that was not read yet. Example, the
// client sends a LOGOFF message to the server, the server then sends
// back a BYE message back to the client; depending on when the
// write failure event comes in, the BYE message may still be in
// the buffer and must be read and sent to the client before the
// client connection is closed.
if (key != null && key.isValid() && key.isWritable()) {
PipeEnd p = (PipeEnd) mChannelToPipes.get(key.channel());
if (sLog.isLoggable(Level.FINE)) {
sLog.fine(">WRITE event on " + p + " -- other: " + p.other);
}
// More to write?
if (p.other.buf.hasRemaining()) {
int n = p.channel.write(p.other.buf);
if (sLog.isLoggable(Level.FINE)) {
sLog.fine(">Write some more to " + p.name + ": " + n);
}
} else {
if (p.other.channel != null) {
// Read from input again
p.other.buf.clear();
p.other.buf.flip();
p.other.listenForRead(true);
p.listenForWrite(false);
} else {
// Close
p.close();
key = null;
if (sLog.isLoggable(Level.FINE)) {
sLog.fine("Now present: " + mChannelToPipes.size());
}
}
}
if (sLog.isLoggable(Level.FINE)) {
sLog.fine(">END WRITE event on " + p + " -- other: " + p.other);
}
}
}
}
} catch (Exception e) {
sLog.log(Level.SEVERE, "Proxy main loop error: " + e, e);
e.printStackTrace();
synchronized (mCmdSync) {
mUnexpectedThreadFailure = e;
}
}
// ===== CLEANUP =====
// The main event loop has exited; close all connections
try {
selector.close();
PipeEnd[] pipes = toPipeArray();
for (int i = 0; i < pipes.length; i++) {
pipes[i].close();
}
mReceptor.close();
} catch (IOException e) {
sLog.log(Level.SEVERE, "Cleanup error: " + e, e);
e.printStackTrace();
}
}
private PipeEnd[] toPipeArray() {
return (PipeEnd[]) mChannelToPipes.values().toArray(
new PipeEnd[mChannelToPipes.size()]);
}
private int getCmd() {
synchronized (mCmdSync) {
return mCmd;
}
}
private void ack() {
setCmd(NONE);
mAck.release();
}
private void setCmd(int cmd) {
synchronized (mCmdSync) {
mCmd = cmd;
}
}
private void request(int cmd) {
setCmd(cmd);
selector.wakeup();
try {
mAck.acquire();
} catch (InterruptedException e) {
// ignore
}
}
/**
* Closes the proxy
*/
public void close() {
if (mStopped) {
return;
}
mStopped = true;
synchronized (mCmdSync) {
if (mUnexpectedThreadFailure != null) {
throw new RuntimeException("Unexpected thread exit: " + mUnexpectedThreadFailure, mUnexpectedThreadFailure);
}
}
request(STOP);
}
/**
* Restarts after close
*
* @throws Exception
*/
public void restart() throws Exception {
close();
mChannelToPipes = new IdentityHashMap();
mAck = new Semaphore(0);
mStartupFailure = null;
mUnexpectedThreadFailure = null;
mReceptor.bind();
new Thread(this, "TCPProxy on " + mReceptor.port).start();
mStopped = false;
mAck.acquire();
if (mStartupFailure != null) {
throw mStartupFailure;
}
}
/**
* Returns the port number this proxy listens on
*
* @return port number
*/
public int getPort() {
return mReceptor.port;
}
/**
* Kills all connections; data may be lost
*/
public void killAllConnections() {
request(KILLALL);
}
/**
* Kills the last created connection; data may be lost
*/
public void killLastConnection() {
request(KILLLAST);
}
/**
* Closes the proxy
*
* @param proxy
*/
public void safeClose(TcpProxyNIO proxy) {
if (proxy != null) {
proxy.close();
}
}
}
Other uses of the port-forwarding proxy
What else can you do with this port-forwarding proxy? First of all,
its use is not limited to outbound connections: ofcourse you can also
use it to test connection failures on inbound connections. Next,
you can also use it to make sure that connections are infact being made
the way that you expected. For example, the CAPS JMS server can use
both SSL and non SSL connections. I added a few tests to our internal
test suite to test this capability from JMSJCA. Here, just to make sure
that the test itself works as expected, I'm using the proxy to find out
if the URL was indeed modified and that the connections are indeed
going to the JMS server's SSL port. Similary, you can also use the
proxy to count the number connections being made. E.g. if you want to
test that connection pooling indeed works, you can use this proxy and
assert that the number of connections created does not exceed the
number of connections in the pool.
Missing components
In the example I mentioned updating the EAR file and automatically
deploying the EAR file to the application server. I've created tools
for those too so that you can do those things programmatically as well.
Drop me a line if you're interested and I'll blog about those too.
Posted at
03:59PM Sep 10, 2006
by Frank Kieviet in Sun |
Comments[1]
Permalink: http://blogs.sun.com/fkieviet/entry/testing_connection_failures_in_resource
"When Connection.close() should not close", or the J2EE JCA ManagedConnection life cycle
One of the things I've struggled with most when I was developing the JMSJCA Resource Adapter at SeeBeyond, was the life cycle of the ManagedConnection. I wasted numerous hours going into fruitless directions simply because I did not fully grasp the intricacies of the ManagedConnection life cycle. If you're involved in developing Resource Adapters, you're likely to stumble onto the same problems, so read on!
What do you mean, close() should not close?
Let's take a look at how you might use a JMS Connection in an EJB:
@Resource(name="jms/cf") ConnectionFactory cf;
@Resource(name = "jmx/q1") Queue q;
@TransactionAttribute(TransactionAttributeType.REQUIRED)
public void myBusinessMethod() throws Exception {
Connection c = cf.createConnection();
Session s = c.createSession(false, Session.AUTO_ACKNOWLEDGE);
s.createProducer(q).send(s.createTextMessage("Hello world"));
c.close();
}
Should c.close()
really close the connection?
Keep in mind that since J2EE 1.4, JMS providers interface with the
application server using JCA. JMS connections should be pooled so that
repeated execution of the statement fact.createConnection() is
cheap. So the answer is no,
the connection should not really be closed.
Should c.close()
return the connection to the pool then, so that another EJB could use
it? Remember that there is a transaction in progress, so the container
should hold on to the connection until the transaction is completed.
Only then should the container return the connection to the pool.
Are you appreciating yet the complexities that the application
server
has to deal with? If not, let's add the following to the method above:
@TransactionAttribute(TransactionAttributeType.REQUIRED)
public void myBusinessMethod() throws Exception {
ConnectionFactory fact = (ConnectionFactory) new InitialContext().lookup("jms/cf");
Connection c = fact.createConnection();
Session s = c.createSession(false, Session.AUTO_ACKNOWLEDGE);
s.createProducer(s.createQueue("q")).send(s.createTextMessage("Hello world"));
c.close();
otherMethod();
}
public void otherMethod() throws Exception {
ConnectionFactory fact = (ConnectionFactory) new InitialContext().lookup("jms/cf");
Connection c = fact.createConnection();
Session s = c.createSession(false, Session.AUTO_ACKNOWLEDGE);
s.createProducer(s.createQueue("q")).send(s.createTextMessage("Goodbye world"));
c.close();
}
When myBusinessMethod()
is called, which in turn calls otherMethod(),
how many connections are created? Good application servers like the
Java CAPS Integration Server, the Sun Java System Application Server,
and Glassfish use only one connection in this example. The
connection that is used in otherMethod()
is the same connection that was created in myBusinessMethod().
Let's look in more detail at what is happening under the covers,
and why that is important.
Alphabet soup
Before we begin, let me reiterate some of the terms and abbreviations used. JCA stands for the Java Connector Architecture. It provides the interfacing between applications running in an application server, and external systems such as CRM packages, but also systems such as JMS.
A Resource Adapter is a set of classes that implement the JCA. The
central interface for outbound communications is the ManagedConnection. An
application, e.g. an EJB, never gets access to a a ManagedConnection directly.
Instead, it gets a connection handle.
The handle in the above example is the JMS Connection (actually, it is the
JMS Session -- but let's
not
go into that right now). This Connection
object is not the JMS Connection
that is implemented by the JMS provider, but is a wrapper around such a
connection. The wrapper is implemented by the Resource Adapter.
A ManagedConnection
holds the physical connection to the external system, e.g. the JMS Connection and Session. Because the physical
connection and hence the ManagedConnection
is expensive to create, the application server tries to pool the ManagedConnections.
State transitions, and why they are important
We've already seen that a ManagedConnection can be in an idle state or pooled state and it can be in use. It would be very nice if the application server would tell the ManagedConnection when these state transitions happen. But it doesn't. Why would the ManagedConnection care to know when it is being used or when it is being returned to the pool? Let's look at an example in JMS. Let's change the example a little bit:
@TransactionAttribute(TransactionAttributeType.REQUIRED)
public void usingTemp() throws Exception {
ConnectionFactory fact = (ConnectionFactory) new InitialContext().lookup("jms/cf");
Connection c = fact.createConnection();
Session s = c.createSession(false, Session.AUTO_ACKNOWLEDGE);
Queue temp = s.createTemporaryQueue();
Message request = s.createTextMessage("541897-9841");
request.setJMSReplyTo(temp);
sendRequest(request);
Message reply = s.createConsumer(temp).receive();
c.close();
}
The sendRequest()
method would somehow start a new transaction and send the request
message. The question is when the temporary destination should be
deleted. According to the JMS spec, the temporary destination should be
invalidated as soon as the connection is closed. However, a transaction
is still in progress, so the JMS provider will throw an exception if
the
temporary destination is deleted when c.close() is called. Can't we
just ignore the temporary destination? After all, it will be deleted some time, e.g. when the physical
JMS connection is closed. However, since the connection is pooled, it
may take a long time before the physical JMS connection is closed.
During that time temporary destinations just keep piling up. This
approach will exhaust the JMS server.
The temporary destination can be deleted safely when the ManagedConnection is returned
to the pool. And
that's why it's important to know the state transitions.
How to detect state transitions
What hints does the application server give the ManagedConnection about the
state transitions?
Let's take a look at the ManagedConnection
interface:
public interface ManagedConnection {
Object getConnection(Subject subject, ConnectionRequestInfo connectionRequestInfo) throws ResourceException;
void destroy() throws ResourceException;
void cleanup() throws ResourceException;
void associateConnection(Object object) throws ResourceException;
void addConnectionEventListener(ConnectionEventListener connectionEventListener);
void removeConnectionEventListener(ConnectionEventListener connectionEventListener);
XAResource getXAResource() throws ResourceException;
LocalTransaction getLocalTransaction() throws ResourceException;
ManagedConnectionMetaData getMetaData() throws ResourceException;
void setLogWriter(PrintWriter printWriter) throws ResourceException;
PrintWriter getLogWriter() throws ResourceException;
}
The getConnection()
method tells the ManagedConnection
to
create a new connection handle, so after that method the ManagedConnection knows that it
is not in the
pooled state.
The destroy() method
destroys the ManagedConnection
so it
signals a state transition to "non-existent".
Doesn't the cleanup()
method indicate that a connection is no longer used and will be
returned to the pool? It depends on the application server when exactly
this method is called. Most application servers will call cleanup() immediately when the
application calls Connection.close().
At that moment the connection may still be enlisted in a transaction,
and may be reused as we've seen in the examples above.
As it turns out, there are two states that the ManagedConnection needs to keep
track of so
that it can detect a state transition from "in-use" to "pooled".
The two diagrams keep track of whether the application has access to
the ManagedConnection
through a
connection handle. In other words, it keeps track of whether the ManagedConnection has any
outstanding
connection handles. See the figure below.

The second state is a transactional state: it keeps
track of whether the ManagedConnection
is enlisted in a transaction. See the figure below:

Let's look at a few examples that illustrate this concept:
@Resource EJBContext ctx;
@Resource(name="jms/cf") ConnectionFactory cf;
@Resource(name = "jmx/q1") Queue q;
public void ex1() {
ctx.getUserTransaction().begin();
Connection c = cf.createConnection();
Session s = c.createSession(false, Session.AUTO_ACKNOWLEDGE);
s.createProducer(q).send(s.createTextMessage("Hello world"));
c.close();
ctx.getUserTransaction().commit();
}
In this example the getConnection()
method is called upon cf.createConnection().createSession():
the ManagedConnection is
in use. At the
same time the connection is enlisted in the transaction; this is done
through the XAResource
obtained through ManagedConnection.getXAResource().
When c.close() is called,
the ManagedConnection is
no longer
accessible to the application: all the connection handles are closed.
The application server may and probably will call ManagedConnection.cleanup().
However, the connection is still enlisted in the transaction, so the
connection is not returned to the pool yet. That happens when getUserTransaction().commit()
is called. While the connection is still enlisted, the resource adapter
should not try to delete any objects such as temporary destinations in
the example mentioned above.
In the following example the enlistment happens a little later. See
the inline comments for the state transitions.
public void ex2() {
Connection c = cf.createConnection();
Session s = c.createSession(false, Session.AUTO_ACKNOWLEDGE); // Accessible, but not enlisted
ctx.getUserTransaction().begin(); // Accessible and enlisted
s.createProducer(q).send(s.createTextMessage("Hello world"));
c.close(); // Inaccessible and enlisted
ctx.getUserTransaction().commit(); // Inaccessible and not enlisted
// Return to pool
}
There is also a possible transition from "Inaccessible and enlisted"
back to "Accessible and enlisted":
public void ex3() {
Connection c = cf.createConnection();
Session s = c.createSession(false, Session.AUTO_ACKNOWLEDGE); // Accessible, but not enlisted
ctx.getUserTransaction().begin(); // Accessible and enlisted
s.createProducer(q).send(s.createTextMessage("Hello world"));
c.close(); // Inaccessible and enlisted
c = cf.createConnection();
s = c.createSession(false, Session.AUTO_ACKNOWLEDGE); // Accessible and enlisted
s.createProducer(q).send(s.createTextMessage("Hello world"));
c.close(); // Inaccessible and enlisted
ctx.getUserTransaction().commit(); // Inaccessible and not enlisted
// Return to pool
}
The following example shows that there can be multiple connection handles:
public void ex4() {
Connection c = cf.createConnection();
Session s = c.createSession(false, Session.AUTO_ACKNOWLEDGE); // Accessible, but not enlisted
ctx.getUserTransaction().begin(); // Accessible and enlisted
s.createProducer(q).send(s.createTextMessage("Hello world"));
Connection c2 = cf.createConnection();
Session s2 = c.createSession(false, Session.AUTO_ACKNOWLEDGE);// Accessible and enlisted
s2.createProducer(q).send(s2.createTextMessage("Hello world"));
c2.close(); // Accessible and enlisted
ctx.getUserTransaction().commit(); // Accessible and not enlisted
c.close(); // Inaccessible and not enlisted
// Return to pool
}
How to monitor transaction enlistment
To monitor enlistment and the commit()
or rollback() method on
the XAResource it is
possible to write a wrapper around the XAResource of the underlying
provider. There are some drawbacks associated with that, and it is
better to monitor the transaction through a javax.transaction.Synchronization
object. I've described this in more detail in my blog of July
23, 2006: J2EE JCA Resource Adapters: The problem with XAResource
wrappers .
Conclusion
By keeping track of both the number of outstanding connection
handles given out to the application and the enlistment in a
transaction, the ManagedConnection
can figure out when it is returned to the pool so that it can undertake
necessary actions such as destruction of objects indirectly created by
the application.
Posted at
10:56PM Aug 26, 2006
by Frank Kieviet in Sun |
Comments[18]
Permalink: http://blogs.sun.com/fkieviet/entry/when_connection_close_should_not
JMS request/reply from an EJB
Sometimes something that appears trivial turns out to be a lot more
involved. Doing JMS request/reply from an EJB is such a case. Let's
take a look...
Request/reply
JMS can be used as a form of inter process communication. JMS is
meant for asynchronous calls, but it can also used for synchronous
calls. In that case, the first process sends a message to a queue or
topic, and waits until another process has replied to the message. The
reply is sent to a different queue or topic. Advantages of using JMS
for interprocess communication is that the two processes are completely
decoupled. The two processes only need to agree on what is in the
message.
Synchronous communication in JMS is also called request/reply.
The JMS api provides two helper classes to make request/reply
easier. They are the QueueRequestor
and TopicRequestor classes.
Here is an example of how they can be used:
QueueConnection c = (QueueConnectionFactory) (new
InitialContext().lookup("qcf"));
QueueSession s = c.createQueueSession(false, Session.AUTO_ACKNOWLEDGE);
Queue q = (Queue) (new InitialContext().lookup("requestqueue"));
QueueRequestor r = new QueueRequestor(s, q);
Message reply = r.request(s.createTextMessage("1233442342"));
c.close();
Looks simple, right? However, when this code is used in a typical
EJB, it will likely not work. Why not? Is there something wrong with
the QueeuRequestor? Well,
yes, yet that is not the problem. Let's look at the implementation of
the QueueRequestor:
public class QueueRequestor {
QueueSession session;
Queue queue;
TemporaryQueue tempQueue;
QueueSender sender;
QueueReceiver receiver;
public QueueRequestor(QueueSession session, Queue queue) throws JMSException {
this.session = session;
this.queue = queue;
tempQueue = session.createTemporaryQueue();
sender = session.createSender(queue);
receiver = session.createReceiver(tempQueue);
}
public Message request(Message message) throws JMSException {
message.setJMSReplyTo(tempQueue);
sender.send(message);
return receiver.receive();
}
public void close() throws JMSException {
session.close();
tempQueue.delete();
}
}
So we can rewrite the above code sample as follows:
QueueConnection c = (QueueConnectionFactory) (new InitialContext().lookup("qcf"));
QueueSession s = c.createQueueSession(false, Session.AUTO_ACKNOWLEDGE);
Queue q = (Queue) (new
InitialContext().lookup("requestqueue"));
Message request = s.createTextMessage("1233442342");
request.setJMSReplyTo(s.createTemporaryQueue();
s.createSender(q).send(m);
Message reply = s.createReceiver(q).receive();
c.close();
There's nothing obviously wrong with that, so why does this
typically not work when this code is in an EJB?
Transactions
Before we look at the explanation of why the above code doesn't work
in an EJB, let's review some basics about application servers and JMS
servers. A JMS server is typically a stand alone process. Applications
can communicate with this server using a client runtime. The client
runtime implements the JMS API as defined in the JMS specification. If
you're building a stand-alone application, you use the client runtime
directly. Example, when you're using the JMS server shipped with Java
CAPS you would be using com.stc.stcjms.jar
to talk to the server process (stcms.exe).
The client runtime implemented in com.stc.jms.stcjms.jar
implements the JMS api and communicates with the server (stcms.exe) over tcp/ip sockets.
However, when your code lives in an EJB, you're not using the JMS
client runtime directly. For example, when you obtain a connection
factory from JNDI in an EJB, you will not get the connection factory
object that belongs to the JMS provider itself, but you get a factory
that is implemented by a resource adapter. As of J2EE 1.4, the
interfacing between the application server and JMS is done through a
Resource Adapter. In the example for Java CAPS, you would obtain a
connection factory implemented by the JMSJCA resource adapter instead
of the com.stc.jms.client.STCQueueConnectionFactory
that belongs to the STCMS
client runtime.
The Resource Adapter acts like a wrapper around the objects that are
implemented in the JMS client runtime. One of the responsibilities of
the Resource Adapter is to make sure that JMS acts like a transactional
resource. This means that JMS connection (technically, the JMS session)
is enlisted with the transaction manager of the applicaiton server. In
fact, under the covers the Resource Adapter really uses an XASession rather than an AUTO_ACKNOWLEDGE session. What
effect did it have to specify the arguments to
c.createQueueSession(false, Session.AUTO_ACKNOWLEDGE);
In fact, these arguments are ignored, so they have no effect whatsoever.
Use of JMS in a transaction
Let's say that we have a bean managed EJB (BMT) with the following code:
EJBContext ctx = ...;
ctx.getUserTransaction().begin();
QueueConnection c = (QueueConnectionFactory) (new
InitialContext().lookup("qcf"));
QueueSession s = c.createQueueSession(false, Session.AUTO_ACKNOWLEDGE);
Queue q = (Queue) (new InitialContext().lookup("requestqueue"));
Message request = s.createTextMessage("1233442342");
request.setJMSReplyTo(s.createTemporaryQueue();
s.createSender(q).send(m);
Message reply = s.createReceiver(q).receive();
c.close();
ctx.getUserTransaction().commit();
The message will be sent to the queue only when the statement getUserTransaction().commit()
is executed. Before this statement, the message will not be visible
anywhere. So when the receive()
method is executed, the message was really not sent to the queue. The
other process never gets the request, and hence, a reply will never be
received. Since no timeout is specified, the application simply hangs.
Now that we know that, the solution is pretty simple: just commit
the transaction before receiving the reply.
ctx.getUserTransaction().begin();
QueueConnection c = (QueueConnectionFactory) (new InitialContext().lookup("qcf"));
QueueSession s = c.createQueueSession(false, Session.AUTO_ACKNOWLEDGE);
Queue q = (Queue) (new InitialContext().lookup("requestqueue"));
Message request = s.createTextMessage("1233442342");
request.setJMSReplyTo(s.createTemporaryQueue();
s.createSender(q).send(m);
ctx.getUserTransaction().commit();
ctx.getUserTransaction().begin();
Message reply = s.createReceiver(q).receive();
c.close();
ctx.getUserTransaction().commit();
It depends on the business logic if you can simply split up the
transaction in two. Perhaps something else happens in the transaction,
for example you could update a database in the same transaction. In
that case, you may not want to commit the transaction yet! How can we
deal with that situation?
The solution would be to suspend the transaction, create a new
transaction, send the request, commit the new transaction, unsuspend
the first transaction, and continue. Sounds complicated? It is, but it
can be simplified using an EJB: simply add a new business method to the
EJB, make sure that the method's trans-attribute
is set to NotSupported.
Next, instantiate a new EJB and call that method. By calling that
method, the transaction will automatically be suspended.
Note that it is crucial to instantiate a new EJB: if you just call
the new business method from the same EJB instance, the application
server will not intercept the method and will not suspend the
transaction.
When the EJB is CMT
Pretty much the same goes if the EJB is not BMT but CMT. If a
transaction was started, the same deadlock occurs: the send() is never committed until
the transaction is committed, so the receive() method will never
receive a reply. In that case, add a new business method and make sure
that the trans-attribute
is set to RequiresNew.
The application server will then suspend the main transaction and start
a new transaction automatically. As with BMT, make sure that you call
the new business method on a new instance of the EJB rather than on the
same instance.
When there is no transaction
What happens if there is no transaction when you call send()? Well, that depends on the implementation. The specification is not clear on what should happen when an XA resource is used outside of a transaction. The JMS CTS implies that the behavior should be that of auto-commit (or AUTO_ACKNOWLEDGE in JMS parlance). CTS stands for Compliance Test Suite. The CTS for JMS consists of a few thousand tests. Every implementation that claims that be compliant with the JMS spec must pass these tests. Hence, most implementations, including the one that ships with Java CAPS will exhibit AUTO_ACKNOWLEDGE behavior. In that case, the above code will work and the QueueRequestor can be used.What about the QueueRequestor?
Can't we use the QueueRequestor
with transactions? The simple answer is no. Why did the Resource
Adapter not take care of this? The problem is that the QueueRequestor is implemented
by the JMS api, and not by the JMS provider or Resource Adapter.
Perhaps this is another example why it is generally a bad idea to
provide implementation classes in API-s?
There's another problem with the QueueRequestor class: there is
no way to specify a timeout value. You would probably prefer a timeout
exception rather than having a hanging application.
Conclusion
Knowing what happens under the covers explains why request/reply
cannot work in a single transaction. Simply post the request in a
separate transaction.
Posted at
04:35PM Aug 13, 2006
by Frank Kieviet in Sun |
Comments[23]
Permalink: http://blogs.sun.com/fkieviet/entry/request_reply_from_an_ejb
J2EE JCA Resource Adapters: Poisonous pools
Introduction
As a developer of a JCA Resource Adapter (RA) you're responsible for all aspects between the EIS and the EJB, including connection failures so that poisoned pools are avoided.
Wait! Too many acronyms in one sentence? Poisoned pools? What am I
talking about? Here's a short
refresher. JCA is the Java Connector Architecture and defines how
Enterprise Java Beans (your application) can communicate with
Enterprise Information Systems (EIS). Examples of Enterprise
Information Systems are ERP systems, CRM systems and as well as other
enterprise systems such as databases and JMS. The "conduit" between the
EJB and the EIS is the Resource Adapter (RA). The communication can
originate both from the EIS and from the EJB. The former is called
inbound, the latter is outbound. In this write-up I'm looking at
outbound only.
Creating a connection from your application to an EIS is often expensive. That is why the container (the application server) provides for connection pooling so that connections can be reused rather than getting recreated. This brings with it that there is a risk that faulty connections accumulate in the pool, thus causing a poisoned pool. In such a situation the application can no longer communicate with the EIS.
Seem's simple enough, doesn't it? Let's look a bit closer...
Typical time scales
One of the services that an application server provides to applications is the pooling of resources. As such, when your application uses an outbound connection of a resource adapter, the application server will maintain a pool of connections. When your application needs a connection, the application server tries to satisfy this request first by checking the pool of idle connections; if there are no idle connections, a new connection is created or the application is blocked until a connection is returned to the pool. When the application closes a connection, it is returned to the pool.
Creating a new connection typically involves creating one or more new TCP/IP connections, authentication by the EIS, creating an internal session in the EIS and its associated memory structures and data, etc. This makes creating a new connection expensive, the time scale of connection creation is usually in the order of 50-300 ms. These expensive operations can be avoided when reusing an idle connection: the time scale of reuse is measured in microseconds rather than microseconds. Next to consider is the time scale of connection use by your application, typically often in the range of 1-5 ms.
To show the effects of connection pooling on throughput, let's
assume that your application takes 3 ms to process a request, and that
the time it takes to create a new connection is 300 ms, while re-using
a connection takes 0.03 ms. The processing time is 3.03 ms with
pooling, and 303 ms without pooling. A difference of a factor hundred!
Sure, I made up the numbers in this example, but they are likely close
to what you'll encounter in every day practice.
In addition to the sub-second time scale of connection use and creation, there is another timescale to consider: the typical time scale of the duration of a failure. Communication failures are most often caused by the EIS becoming unavailable temporarily. This can have two causes: a loss of network connectivity, or because of the EIS being restarted. The latter is more probable and will be considered here as the typical failure scenario. Restarting an EIS typically takes from half a minute to several minutes. It is important to keep this timescale in mind when considering error handling strategies.
Mechanics of connection pooling
An outbound connection from the application to the EIS is represented by a ManagedConnection. The ManagedConnection holds the physical connection to the EIS. The lifecycle of a ManagedConnection is under control of the application server: a resource adapter creates a ManagedConnection when the application server tells it to; likewise the resource adapter destroys a connection only when instructed to do so by the application server.
A problem occurs when there is a communication failure with the EIS. For example, if a resource adapter connects to an external EIS, and this external server is restarted, the connections in the pool are all invalid. If the application were to use one of these connections, a failure would certainly occur. The failure would be propagate to your application code through an exception. A likely result would be that the transaction would be rolled back, and that the operation would be attempted again. The application server may not be able to distinguish this communication failure due to a faulty connection from other errors, so it may use the same faulty connection again on the next attempt, thereby ensuring that the same problem will happen for the next transaction. Effectively, the whole application has become inoperable because of the “poisoned pool”. To break out of this cycle, the resource adapter should let the application server know that connections are faulty so that the application server then can make the resource adapter recreate a new connection and avoid putting faulty connections back in the pool. There are several ways to do this.
- Signal to the application server that a connection is no longer valid
- Respond negatively when the application server asks whether a connection is valid
Signalling the application server that a ManagedConnection is faulty
After the application server instructs the Resource Adapter to create a new ManagedConnection, it calls the following method on that ManagedConnection:
public interface ManagedConnection
{
addConnectionEventListener(ConnectionEventListener listener)
// other methods omitted for clarity
}
The managed connection uses this ConnectionEventListener object to notify the application server of connections being closed. This object can also be used to let the application server know that the connection is faulty using the CONNECTION_ERROR_OCCURRED event. Upon receiving this event, the application server will typically destroy the connection immediately.
This approach can only be used if the resource adapter has some way of finding out when the connection is broken. In practice this turns out to be quite difficult: most resource adapters are not written from from scratch but rather make use of some client jar that takes care of the communication with the EIS. Often, the vendor of the resource adapter is not the vendor of the EIS, and even if this were the case, the vendor of the EIS most likely needs to make a client jar available independent from a resource adapter anyways. It is not uncommon that these client jars don't provide any mechanism to propagate connection failures to the caller. For instance, if the EIS is JMS, the client jar will expose only the JMS API and there is no way in the JMS API to tell the caller of a method that the physical connection is faulty.
Because the application server will destroy the connection immediately upon receiving the CONNECTION_ERROR_OCCURRED event and will roll back the transaction, it is important that the ManagedConnection does not throw this event as a result of an application condition rather than a faulty connection.
Fortunately there are alternatives to the CONNECTION_ERROR_OCCURRED mechanism.
The ValidatingManagedConnectionFactory
Rather than telling the application server that the connection is faulty, the managed connection can also wait for the application server to ask the managed connection whether it is still a valid connection. To make that work, the managed connection factory needs to implement the ValidatingManagedConnectionFactory interface:
public interface
ValidatingManagedConnectionFactory
{
Set getInvalidConnections(Set connectionSet)
}
The application server will check if the managed connection factory implements this interface, and if so, it will call the getInvalidConnections() method periodically.
Again it is often a problem for the managed connection factory to know if a connection is valid or not. For example, if the resource adapter wraps a database connection, there is often no way to find out if a connection is still "live" or not. E.g. on a JDBC connection the isClosed() method does not return any status information on the physical connection but only returns whether the close() method was called.
Passive negative checks
One way for a managed connection to keep track of possible
connection problems is to monitor exceptions being thrown from the
client runtime to the application. If there is a way for the
ManagedConnection to discern application errors (e.g. a syntax
error in a prepared statement) from communication failures, the managed
connection can assume that it may be faulty if the exception count is
greater than zero. For example, if the resource adapter wraps a JMS
provider, it could reasonably assume that exceptions from methods like send(), publish() etc. indicate
connectivity problems.
Passive positive checks
If it is not possible to passively monitor connection failures,
perhaps it is possible to keep track of when a connection was used
without any problem for the last time. If a connection was not used for
more than say 30 seconds, you could mark that connection as invalid. Of
course there is a risk that the application uses the connection less
often than once every 30 seconds; if that is the case, the expense of
recreating a connection may not be that bad.
Active validity check
Another way is for the managed connection to actively check the
connection validity. If the managed connection uses an Oracle
connection underneath, it could do a select on the DUAL table. However, it is
important that this check is not very time consuming.
Keep an eye on expenses!
Above it was mentioned that the applicaton server will call
the getInvalidConnections()
method periodically. How
often does the application server call this method? The application
server may have a timer thread that will go over all the idle
connections in the connection pool and check to see if they are still
valid. There are some serious problems with this approach if it is the
only time that the application server calls this method: when the
system is processing at or near capacity, the application server will
hardly ever find idle connections in the pool.
That's why application servers typically will call the getInvalidConnections() method before it gives out a connection to the application. A simple but expensive approach is for the application server to call this method every time an application is given to the application. A smarter approach is to do this not more often than every so many seconds, a value that is configurable for the server. This value is chosen based on the expected failure duration. As was mentioned earlier, the expected failure duration is likely greater than 30 seconds. Hence, it makes little sense for the application server to call getInvalidConnections() more often than every 30 seconds.
Keep in mind however that there is no standard on what application servers do, so it is important to make sure that the getInvalidConnections() method is fast on average. If calling an expensive method is the only way to find out if a connection is valid, the managed connection factory could keep track of when it was called last, so that it will not call this expensive method more than every so many seconds. A guess can be made what a reasonable time span is by looking at how expensive the check is, and keeping the timescales of connection failures in mind, again 30 seconds being a reasonable ballpark number.
Desparate measures
If there's really no way for the managed connection to find out
anything about the validity of the connection, it could resort to a
crude but effective workaround: it can set a limit on the lifetime of
the connection, e.g. one minute. Again, this time interval is based on
the timescales of connectivity failures. This will have a small adverse
effect on performance: connections are destroyed and recreated more
often than they need to be. This effect will not be very big however:
most of the time the application can in fact reuse an existing
connection. In the example above with a connection time of 300 ms, the
throughput goes down by 0.5% when the maximum lifetime of a connection
is 1 minute.
Should a connection failure occur, the faulty connection will be reused for less than one minute, so the problem will eventually correct itself. If the application is used continuously, and if the expected downtime is more than one minute, this will not make any difference to the application because during the one minute in which the connection is faulty: the EIS is unavailable anyways.
Complicating factor: transaction enlistment
Resource adapters declare in the ra.xml what level of transactions they support. There are three levels: XATransaction, LocalTransaction and NoTransaction. If a resource adapter supports XATtransaction, this means that the resource adapter supports XA; the application srver will call getXAResource() on the ManagedConnection to get hold of the XAResource object to control the transaction. Resource adapters that support LocalTransaction return an instance of the LocalTransaction class when the application server calls getLocalTransaction() on the ManagedConnection. This interface has methods begin(), commit() and rollback(). Resource adapters that only support NoTransaction don’t participate at all in transactions.
If a resource adapter supports XATransaction,
the managed connection will have to be enlisted each time for every
transaction. The transaction manager in the application server will
call start() on the XAResource. The start() call is the very first
method that the application server calls on the managed connection
after getting it out of the pool. The start() method will typically
call into theEIS, causing an exception if the exception is faulty. The
best way of dealing with this is for the application server to discard
the connection, i.e. call destroy()
and remove the connection from the pool. Some application servers (e.g.
the Integration Server in Java CAPS) do that. So for these application
servers it may suffice to do nothing in your resource adapter and still
avoid poisoned connection pools. However, there are plenty of other
application servers that will propagate the exception to the
application and return the connection to the pool. And you do want your
resource adapter to work well with any application server, don't you?
For application server that don't destroy connections when the
enlistment fails, it is critical that the resource adapter has to
provide for a fault detection strategy. Unfortunately, an exception on
the start() method is
difficult to
detect for most resource adapters, because resource adapters often
expose the XAResource from the client runtime directly to the
application server's transaction manager. There's good reason for this,
because there's an inherent problem with XAResource wrappers as I noted
in my previous blog. In these situations the passive positive check as
I outlined above may be useful.
Conclusion
When developing a resource adapter, it's crucial to provide for
connection failure detection. Keep in mind:
- different application servers behave differently, e.g. different
frequency of calling getInvalidConnections(),
different behavior when the enlistment of a connection fails
- transaction enlistment failures may be the only failures that occur; can you detect them?
- There are different ways of guessing if a connection is valid, even if the monitoring of failures doesn't work:
- track when a connection was used without failure
- assign a maximum lifetime
- Keep an eye on expenses! Make sure that connections are not recreated every time, and make sure that active health checks don't happen too often.
- how long it takes to create a new connection
- how long a typical connection failure lasts
- how many requests an application is likely to process per second
Posted at
10:33PM Jul 31, 2006
by Frank Kieviet in Sun |
Comments[6]
Permalink: http://blogs.sun.com/fkieviet/entry/j2ee_jca_resource_adapters_poisonous
J2EE JCA Resource Adapters: The problem with XAResource wrappers
Let's say that you're writing an
outbound JCA Resource Adapter. Let's say that it supports XA. Let's say
that you would need to know when the transaction is committed. You
would be tempted to provide a wrapper around the "native" XAResource.
If you are, read on: there are some problems you need to consider
before doing that!
Warning: technical warning alert! The remainder of this posting is full
of technical terms like XAResource and ManagedConnection.
First let me explain in more detail what I am talking about.
Introduction
A client runtime typically is a library that takes care of the communication between a Java client and a server. For instance, a JMS client runtime is one or more jars that implement the JMS api and takes care of the communication with the JMS server. Likewise, a JDBC client runtime library implements the JDBC api to provide connectivity with a database server.
Many adapters that support XA either wrap around or build on top of an existing client runtime. For example, a JMS resource adapter typically wraps around an existing JMS client runtime. A workflow engine adapter may internally use a JDBC connection for persistence so it can be said that it builds on top of a JDBC client runtime.
How does the native XAResource fit in with the JCA container? When the application server requests
the XAResource from the managed connection through the
getXAResource() call, the managed connection may return its own
implementation of the XAResource object, or it may return the
XAResource implemented by the client runtime (the "native" XAResource). The former type is
essentially a wrapper around the XAResource implemented by the client
runtime.
Why is this important? Often it is necessary for a
managed connection to be notified of the progress of a transaction: a
managed connection may need to update its state after the transaction
has been committed or rolled back. The JCA spec does not provide a
standard way of doing this other than through the XAResource. This
may invite you (the developer of the adapter) to write a wrapper around the
XAResource instead of exposing the XAResource of the underlying
client runtime directly.
There are some problems associated with the
wrapper-approach, which will next be discussed in detail.
How should isSameRM() be implemented?
The isSameRM() method is called by the transaction manager to find out if two XAResource-s use the same underlying resource manager. If this is the case, instead of creating a new transaction branch, the transaction manager will join the second XAResource into the same transaction branch.The method isSameRM() can be implemented as follows:
class WrappedXAResource implements XAResource {
private XAResource delegate;
public boolean isSameRM(XAResource other) {
if (other instanceof WrappedXAResource) {
return delegate.isSameRM(other.delegate);
} else {
return delegate.isSameRM(other);
}
}
}
Let's look at a scenario where there are three resources to be enlisted in the same transaction. Two resources belong to the same resource adapter (say W1 and W3), and the other resource belongs to an unknown entity, say R2. Let's assume that the the underlying resource manager is the same. This can happen for example when the resource adapter builds on top of JDBC driver, and the other entity is in fact a database connection to the same database.
Let's say that the application server enlists the resources in this order in the transaction: W1, R2, W3. The transaction manager may call the isSameRM() method as follows:
Case A
- enlist W1:
- enlist R2:
- W1.isSameRM(R2); // returns true; R2 is joined into W1
- enlist R3:
- W1.isSameRM(W3); // returns true; W3 is joined into W1
In this case, all resources are joined, i.e. one branch with W1 receiving the prepare/commit/rollback calls.
Alternatively, the transaction manager may invoke the isSameRM() call as follows:
Case B
- enlist W1:
- enlist R2:
- R2.isSameRM(W1); // returns false
- enlist R3:
- W3.isSameRM(W1); // returns true; W3 is joined with W1
In this case there will be two transaction branches with W1 and R2 receiving both the prepare/commit/rollback calls.
Exactly how the transaction manager invokes the isSameRM() method depends on the implementation of the transaction manager and may be differ from one implementation to another.
Now let's look at what happens if the resources happen to be enlisted in this order: R1, W2, W3
Case C
- enlist R1
- enlist W2
- R1.isSameRM(W2); // returns false
- enlsit W3
- R1.isSameRM(W3); // returns false
- W2.isSameRM(W3); // returns true; W3 is joined into W2
In this case there will be two branches with R1 and W2 receiving prepare/commit/rollback calls
Case D
- enlist R1
- enlist W2
- W2.isSameRM(R1); // returns true; W2 is joined into R1
- enlist W3
- W3.isSameRM(W1); // returns true; W3 is joined into R1
This case results in one transaction branch with R1 receiving the prepare/commit/rollback calls, and W2 or W3 receiving none.
To avoid case D where none of the wrappers receive prepare/commit/rollback calls, the implementation of isSameRM() should only consider other wrappers, and never consider an unwrapped XAResource:
public boolean isSameRM(XAResource other) {
if (other instanceof WrappedXAResource) {
return delegate.isSameRM(other.delegate);
} else {
return false;
}
}
This will also take care of the intransitive behavior of isSameRM() where W1.isSameRM(R2) returns true, while R2.isSameRM(W1) returns false.
Note that if multiple wrappers are
joined together, only one wrapper will receive the
prepare/commit/rollback calls. It is possible to keep track of all
resources that are joined together, but this code becomes rather
complicated although feasible. A simpler approach is to always return
false in the isSameRM() method:
public boolean isSameRM(XAResource other) {
return false;
}
The obvious drawback is that this will result in more transaction branches and will be more expensive.
There's another complication that may result in the wrapper not getting any commit/rollback calls. This has to do with optimizations in the resource manager.
XAResource.prepare()
If R1 and W2 are really using the same resource manager, but the isSameRM() call returned false, there will be two transaction branches from the perspective of the transaction manager. The underlying resource manager however will see two branches of the same with the same global transaction id. The resource manager may then decide to join these two branches together internally. The result is that when the transaction manager calls XAResource.prepare() on W2, the underlying XAResource may return XA_RDONLY. If the tranaction manager receives this signal, it should not call commit() or rollback() on that resource.
The wrapper can provide more code to deal with this situation: instead of delegating the call to prepare() to the underlying XAResource and returning the return value to the caller (the transaction manager), the wrapper should make sure that it will never return XA_RDONLY. It should store this fact in its internal state, so that when the transaction manager calls commit() or rollback(), the wrapper will check if it had overruled XA_RDONLY and not call commit() or rollback() on the underlying XAResource.
The expense of having multiple branches
The performance difference between a transaction with a single branch and a transaction with two branches is enormous. In the case of a single branch, the transaction manager can skip the call to prepare() and only needs to call commit(onephase=true). The transaction manager does not need to log any state to its transaction log. Any write operation to the disk, both by the underlying resource manager and the transaction manager writing to the transaction log is expensive. This is because to be able to guarantee transactional integrity, the write-operations will have to guarantee that the data is in fact on the disk, and not in some write cache. This is done by “syncing” the data to disk. This is an expensive operation; even a fast hard drive can not sync to the disk faster than say 100 times per second. So, changing a single branch transaction to a transaction with two branches, is in fact very expensive.
An alternative
Instead of using wrappers around the XAResource, it's also possible to register interest in the outcome of the transaction by registering a javax.transaction.Synchronization object with the transaction manager. This interface declares two methods: beforeCompletion() and afterCompletion(). The latter takes an argument to indicate if the transaction was committed or rolled back.
The Synchronization object needs to be registered with the javax.transaction.Transaction object using the registerSynchronization(Synchronization sync) method; this object can be obtained from the javax.transaction.TransactionManager object using the getTransaction() method. The question is how to obtain a handle to the TransactionManager. This is not specified in the J2EE spec, and different application servers make the transaction manager available in different ways. As it turns out, most application servers bind the transaction manager in JNDI and for a few others, some extra code is necessary to invoke some methods on some classes. A notable exception is IBM WebSphere that does not provide access to the javax.transaction interfaces, but provides its own proprietary interfaces. However, with some extra code, the same behavior can be obtained. The bottom line is that it is doable to develop some code that can register a Synchronization object on all current application servers.
The approach using a Synchronization object does not suffer from the performance penalty of causing multiple transaction branches when only one would suffice. Hence, this is a better alternative than using wrappers.
Posted at
04:35PM Jul 23, 2006
by Frank Kieviet in Sun |
Comments[3]
Permalink: http://blogs.sun.com/fkieviet/entry/j2ee_jca_resource_adapters_the
Resource adapters at JavaOne
At JavaOne 2006 I gave the presentation "Developing J2EE Connector Architecture Resource Adapters". I did that together with Sivakumar Thyagarajan. He works in Sun's Bangalore office.That he works there, while I work in Monrovia California... that's one of the interesting things about working at Sun: it's very international. Sun has people in all corners of the world. Last week I was on a phone call with perhaps 20 other people, and 12 of them were from other countries.
I talked to Sivakumart a few times on the phone before JavaOne, but met him only face to face for the first time at JavaOne. That's also one of the nice things about JavaOne: you get to meet people face to face you otherwise only talk with on the phone.
The presentation went pretty well, and the audience agreed: at the end of the presentation all audience members can fill in an evaluation form. Here are the feedback results
| | Overall quality | Speakers |
| Our presentation | 4.32 | 4.28 |
| Average at JavaOne | 3.99 | 3.90 |
If you're interested, here's some more information:
JavaOne session information of "Developing J2EE Connector Architecture Resource Adapters"
Slide presentation "Developing J2EE Connector Architecture Resource Adapters"
I've recorded the presentation on my MP3 player:
Audio presentation of "Developing J2EE Connector Architecture Resource Adapters" (.wav)
Audio presentation of "Developing J2EE Connector Architecture Resource Adapters" (.mp3)
Posted at
10:43PM Jul 22, 2006
by Frank Kieviet in Sun |
Comments[0]
Permalink: http://blogs.sun.com/fkieviet/entry/resource_adapters_at_javaone
Next: JCA Resource adapters
At Sun I'm responsible for the application server and the JMS server that are shipping as part Java CAPS. As such I've been involved quite a bit in resource adapters (Java Connector Architecture, or JCA).All this serves as an introduction to some more technical blogs on resource adapters.
Posted at
09:59PM Jul 22, 2006
by Frank Kieviet in Sun |
Comments[0]
Permalink: http://blogs.sun.com/fkieviet/entry/next_jca_resource_adapters
Automatic log translation
Why would I want to translate logs from one locale to another?Say that I were to build a product, and I do a nice job to make it internationalizable. The product is a success, and it is localized into various languages. Next, a customer in a far-away place sends an email complaining about the server failing to start up. For my convenience, he attached the log file to it. Since I did a good job writing decent error and logging messages, I expect it to be no problem to diagnose what's going wrong. But oops, since I did such a nice job internationalizing the product, the log is in some far-away language, say Japanese! Now what?
Let me give exemplify the example. Let's say that the log contains this entry:
(F1231) Bestand bestelling-554 in lijst bestellingen kon niet geopend worden: (F2110) Een verbinding metLooks foreign to you? (It doesn't to me, but I would have great problems if this example were in Japanese).
server 35 kon niet tot stand gebracht worden.
Fortunately, through the error codes, we could make an attempt to figure out what it says by looking up the error codes. However, if there are many log entries, this becomes a laborious affair. Would it be possible to obtain an English log file without tedious lookups and guess work?
I think there are a few different approaches to this problem:
- always create two log files: an English one and a localized one.
- store the log in a language-neutral form, and use log viewers that render the log in a localized form
- try to automatically "translate" the localized log file into English
try {
...
throw new Exception(msgcat.get("F2110: Could not establish connection with server {0}", serverid));
} catch (Exception e) {
String msg = msgcat.get("F1231: Could not open file {1} in directory {0}: {2}", dir, file, ex);
throw new IOException(msg, ex);
}
The exception message is already localized when it is thrown. E.g. in
the catch-block, there is already no language neutral message anymore.
It would have been nice if there were a close cousin to the Object.toString() method: one that takes the locale: toString(Locale) and if the Exception class would take an Object instead of limiting itself to a String.In a previous product where I had more control over the complete codebase, I approaches this problem by introducing a specialized text class that supported the toString(Locale) method, and Exception classes that could return this text class. This solution was also ideal for storing text in locale-neutral form in a database, so that different customers could view the data in different locales.
There is a kludgy work-around: we could change the msg.get() method so that it returns a String that is locale neutral rather than localized. A separate method would convert the locale neutral String into a localized String, e.g. msg.convert(String, Locale). This method would have to be called any time a String would be prepared for viewing, e.g. in the logging for a localized log.
In the products that I am currently working on, these approaches to support locale-neutral strings are not feasible because they would require widespread. So let's take a look at option 3.
Given the resource bundle
F1231 = Bestand {1} in lijst {0} kon niet geopend worden: {2}
F2110 = Een verbinding met de server {0} kon niet tot stand gebracht worden.
andF1231 = Could not open file {1} in directory {0}: {2}
F2110 = Could not establish connection with server {0}let's see if there is a way to automatically translate the message (F1231) Bestand bestelling-554 in lijst bestellingen kon niet geopend worden: (F2110) Een verbinding metinto
server 35 kon niet tot stand gebracht worden.
(F1231) Could not open file bestelling-554 in directory bestellingen: (F2110) Could not establish aI think it is possible to build a tool that can do that. The tool would read in all known resource bundles (possibly by pointing it to the installation image, after which the tool would scan all jars to exttact all resource bundles), and translate them into regular expressions. It would have to be able to recognize error codes (e.g. \([A..Z]dddd\) ) and use these to successively expand the error message into its full locale neutral form. In the example, the neutral form is:
connection with server 35.
[F1231, {0}=bestellingen, {1}=bestelling-554, {2}=[F2110, {0}=35]]
The neutral form then can be easily converted into the localized English form.
Posted at
05:50PM Jul 16, 2006
by Frank Kieviet in Sun |
Comments[4]
Permalink: http://blogs.sun.com/fkieviet/entry/log_relocalization
Internationalization of logging and error messages for the server side (cont'd)
In Thursday's entry I proposed that a tool is needed to make it easier to internationalize sources where the English error messages are kept in the source file, and the foreign language messages are in resource bundles.Let's talk about this tool. There are a few different approaches to go about this tool. As Tim Foster remarked in his comments, it's possible to parse the source code. This approach is doable, especially when existing tools are used (Tim mentioned http://open-language-tools.dev.java.net.
Another approach is to parse the compiled byte code. Using tools like BCEL, it's fairly simple to read a .class file, and extract all the strings in there. It could easily be run on the finished product: just add some logic to go over the installation image, find all jars, and then iterate over all .class files in the jars.
Fortunately the compiler makes a string that is split up over multiple lines in the source into one:
String msg = msgcat.get("F1231: Could not open file {1}"
+ " in directory {0}: {2}", dir, file, ex);
is found in the .class file as:F1231: Could not open file {1} in directory {0}: {2}
So it's simple to extract strings from a .class file. But how can we discern strings that represent actual error
messages from other strings? Error messages can be discerned from other strings because
they start with an error message number. The error message number
should follow a particular pattern. In the exampleString msg = msgcat.get("F1231: Could not open file {1} in directory {0}: {2}", dir, file, ex);
the pattern (regular expression) is[A-Z]dddd\:Note that a similar trick would have been used when parsing source code, unless some logic is applied to find only those strings thar are used in particular constructs, like calling methods on loggers, or constructors of exceptions. This can quickly become very complicated because often log wrapper classes are used instead of java.util.Loggers.
This is also the answer to a question that I didn't pose yet: in the following code,
String msg = msgcat.get("F1231: Could not open file {1} in directory {0}: {2}", dir, file, ex);
throw new IOException(msg, ex);
how does the msgcat object localize messages? It does that by extracting the error code from the message (F1231)
applying the same regular expression, or by splitting the string on the
colon. In either case, it's important to have a convention on how the
error message looks like or is embedded into the message.Next problem: how to re-localize an existing log so that an American support engineer can read a log from his product that was created on a Japanese system?
Posted at
03:45PM Jul 15, 2006
by Frank Kieviet in Sun |
Comments[1]
Permalink: http://blogs.sun.com/fkieviet/entry/internationalization_of_logging_and_error2
Internationalization of logging and error messages for the server side (cont'd)
I ended my previous blog with "Isn't there a better way"? Well...
how would I like to use error messages in Java
source code? I would simply want to write something like this:
String msg = msgcat.get("F1231: Could not open file {1} in directory {0}: {2}", dir, file, ex);
throw new IOException(msg, ex);The advantages are clear:- No switching to a different file to add the error message
- The error message is visible right there in the source code -- easy to review!
- It's easy to see that the arguments in {0}, {1} are correct
How is this source file internationalized? We need a tool! The tool will
- locate all error messages in the code base
- generate properties files for all desired languages if they don't exist
- print out a list of all properties files that need to be localized
msgs_en_US.propertiesand
# AUTOMATICALLY GENERATED# DO NOT EDIT
# com.stc.jms.server.SegmentMgr
F1231 = Could not open file {1} in directory {0}: {2}
msgs_nl_NL.properties
# com.stc.jms.server.SegmentMgr
# F1231 = Could not open file {1} in directory {0}: {2}
# ;;TODO:NEW MESSAGE;;F1231 =
so that a human translator can easily add the translations to the foreign properties files. As you can see, it includes the location (Java class) where the message was encountered.
msgs_nl_NL.propertiesOfcourse the tool would have a way of handling if you would change the error message in the source code. The translated properties file would look like this:
# com.stc.jms.server.SegmentMgr
# F1231 = Could not open file {1} in directory {0}: {2}
F1231 = Bestand {1} in lijst {0} kon niet geopend worden: {2}
msgs_nl_NL.propertiesAny ideas on how to build this tool?
# com.stc.jms.server.SegmentMgr
# F1231 = File {1} in directory {0} could not be opened: {2}
# F1231 = Could not open file {1} in directory {0}: {2}
# ;;TODO: MESSAGE CHANGED;;
F1231 = Bestand {1} in lijst {0} kon niet geopend worden: {2}
Posted at
11:30PM Jul 13, 2006
by Frank Kieviet in Sun |
Comments[2]
Permalink: http://blogs.sun.com/fkieviet/entry/internationalization_of_logging_and_error1
Internationalization of logging and error messages for the server side
This is how one would throw an internationalizable message:String msg = msgcat.get("EOpenFile", dir, file, ex);
throw new IOException(msg, ex);
The localized message is typically in a .properties file, e.g.EOpenFile=F1231: Could not open file {1} in directory {0}: {2}
Each language has its own .properties file. The msgcat class is a utility class that loads the .properties file. Logging messages to a log file typically uses the same constructs.Looks cool, right? So what's my gripe?
- when coding this, you would have to update .properties file in addition to the .java file you're working on
- It's easy to make a typo in the message identifier, EOpenFile in the example above; there is no compile time checking for these "constants".
- It's difficult to check that the right parameters are used in the right order ({0}, {1}) etc.
- When reviewing the .java file, it's difficult to check that the error message is used in the right context and that the error message is meaningful in the context.
- When reviewing the .properties files, it's difficult to determine where these error messages are used (if at all!) -- you can only find out through a full text search
Posted at
10:26PM Jul 10, 2006
by Frank Kieviet in Sun |
Comments[0]
Permalink: http://blogs.sun.com/fkieviet/entry/internationalization_of_logging_and_error
Hello world
Hello world... this is my first ever blog. So, who am I?
I came to Sun through the acquisition of SeeBeyond. There are quite
a lot of differences between SeeBeyond and Sun.
SeeBeyond was a bit of a secretive company. Contact with the "outside
world" was not exactly stimulated, rather frowned upon. Sun on the
other hand is all for openness and communication with the world outside
of Sun. Hence this blog.
What can I tell about my time before Sun and SeeBeyond? Well, to begin at the beginning: a defining moment in my life was when I took a programming class in high school. As soon as I could, I bought a Commodore 64 on which I toiled countless hours away coding tools in the immensely primitive BASIC of that machine. Ever since, my one great interest has been developing software.
For the past ten-or-so years, I have been involved in development of
commercial software products. Early on in my career I was in software
for the chemical process industry (my background is really chemical
engineering); soon I discovered that I should not limit myself to chemical
engineering and I moved into enterprise software. First into B2B software, then
into integration software.
In a previous life I spent five years working at the Eindhoven University of Technology
in the department of Chemical Engineering. Much of it was in
compuational modelling and laboratory automation; I had a tiny softare
company on the side for the evening hours. After I earned my PhD I left
for the USA.
What will I do with this blog? No personal musings... but I hope to write plenty of technical ones.
Posted at
09:59PM Jul 09, 2006
by Frank Kieviet in Sun |
Comments[0]
Permalink: http://blogs.sun.com/fkieviet/entry/hello_world
Tuesday Mar 17, 2009
