Earthly Powers
Naming, buying and reading
The title of this blog may be misconstrued as a pithy philosophical statement about life, the universe and everything.
Naming things can be tricky and inspiration comes from many sources.
While pondering a name for this blog, a recurring thought distracted me yet again: "I must buy and read Earthly Powers by Anthony Burgess!". Free association did the rest, and the title would serve a dual purpose as a memento.
This week my copy of Earthly Powers arrived in the post.

Now I have to remind myself to read it, and an opportunity presents itself: I am on holiday from the 8th of September for two weeks. This is a big tome, 650 pages of small print, requiring sustained concentration and time, both of which are scarce commodities when an energetic toddler demands 100% of both. So... we shall see...
Posted at 02:25PM Sep 05, 2009 by Paul Sandoz in General | Comments[0]
Concurrency with Scala (was Concurrency with Erlang)
Steve Vinoski has written some really interesting articles on Erlang concurrency and Erlang reliability. I think I am going to buy that book on Erlang he recommends.
After I read his article on concurrency I wondered whether the same (contrived but instructive) example he presents could be written in Scala and how it would compare. If nothing else, it is a good learning exercise.
Here is the simple non-concurrent (tail call) recursive pow function in Scala:
def pow(n: Int, m: Int): BigInt = {
  // Tail-recursive helper: multiply the accumulator by n, m times.
  def _pow(m: Int, acc: BigInt): BigInt = m match {
    case 0 => acc
    case _ => _pow(m - 1, acc * n)
  }
  _pow(m, 1)
}
and here is the concurrent cpow function using Scala's event-based actors:
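import scala.actors.Actor._

def cpow(n: Int, m: Int): BigInt = {
  // Create m event-based actors; each replies to its sender with the Int it receives.
  val actors = for (_ <- List.range(0, m)) yield
    actor { react { case x: Int => sender ! x } }
  // Send n to every actor.
  actors foreach ( a => a ! n )
  // Receive one reply per actor and fold the replies into a product.
  (actors foldLeft BigInt(1)) { (t, a) => self ? match { case x: Int => t * x } }
}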
The value actors, a list of m actors, is obtained using a for comprehension. As in Steve's example, each actor just pings back the value it receives from the sender. Then a message with the value of n is sent to each actor. Finally, all the values received from the actors are multiplied together (as BigInt values) using the foldLeft function. Rather elegant, and similar in size to the Erlang code.
I measured the time it took for both functions to calculate 50^2000 on my Acer Ferrari 3400 laptop running Solaris, Java SE 5, and Scala 2.6.0. The pow function took 17.86 ms and the cpow function took 81.28 ms.
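For readers more at home in Java, the same fan-out-and-fold shape can be sketched with CompletableFuture. This is only an illustration under stated assumptions: the class name PowSketch is made up, the tasks run on the common thread pool rather than on event-based actors, and CompletableFuture needs Java 8 or later, so this is not the code that was measured above.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch; not the Scala or Erlang code discussed in the post.
final class PowSketch {

    // Sequential version: multiply an accumulator by n, m times.
    static BigInteger pow(int n, int m) {
        BigInteger acc = BigInteger.ONE;
        for (int i = 0; i < m; i++) {
            acc = acc.multiply(BigInteger.valueOf(n));
        }
        return acc;
    }

    // Concurrent version: start m tasks that each "ping back" n, then fold
    // the replies into a product, mirroring the shape of the actor example.
    static BigInteger cpow(int n, int m) {
        List<CompletableFuture<Integer>> replies = new ArrayList<>();
        for (int i = 0; i < m; i++) {
            replies.add(CompletableFuture.supplyAsync(() -> n));
        }
        BigInteger product = BigInteger.ONE;
        for (CompletableFuture<Integer> reply : replies) {
            // join() blocks until this task has produced its value.
            product = product.multiply(BigInteger.valueOf(reply.join()));
        }
        return product;
    }

    public static void main(String[] args) {
        // Both versions should agree on the result.
        System.out.println(pow(50, 2000).equals(cpow(50, 2000)));
    }
}
```

As in the Scala version, the concurrency here buys nothing for such trivial tasks; the point is only the structure of sending work out and folding the replies.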
Below I re-iterate one of Steve's final points with a slight modification:
..., many developers are comfortable with OO programming. I’d like to advise such developers not to let Erlang’s or Scala's functional nature scare you away, ...
Given that Scala is a hybrid object-oriented and functional programming language, it can enable developers to transition smoothly between the imperative and functional programming styles in small, manageable steps, i.e. Scala is less scary than it might initially appear.
Posted at 03:21PM Nov 01, 2007 by Paul Sandoz in General | Comments[12]
Skittr cloning Twitter
Skittr is a clone of Twitter written by David Pollak in 884 lines of Scala. It uses the lift web framework. What caught my eye was the following:
It can handle 1M+ users on a two Intel Core 2 Duo boxes (one as the web front end and the other as the message broker.) With a weeks of full-time coding, it could be modified to scale linearly with the number of boxes in the system.
Impressive scalability in conjunction with the Java VM and a Servlet engine.
A side question: can functional programming languages help programmers better develop scalable applications for multi-core chips? Because at the moment there is an issue that is nicely expressed by Sean McGrath:
The processor will stop doubling in speed and halving in cost. Instead, you will find more and more processors shipping in each computer.
This is the future because the hardware people are creating it that way. The software people need to realize that fact and start figuring out how to use all the processors. This future does not just involve just re-compiling your software. It involves turning it on its head in most cases.
Posted at 11:42AM Jun 08, 2007 by Paul Sandoz in General | Comments[4]
Web of Services for Enterprise Computing workshop
I attended, with Marc, the Web of Services for Enterprise Computing workshop. It was a useful two days hearing various positions put forward.
WADL was very well received; I think it is gaining some traction. It would be fair to say that WSDL 2.0 was not so well received in the context of being Web friendly.
What surprised me was the general acceptance of the Web, and the REST architectural style, as a good thing for building Web applications. I guess I was expecting a little more heat from opposing sides! My impression was that the main opposing positions were 'please stop doing more specs and maintain what we have' and 'we need more specs'. It is clear that there is a general need for interoperable protocols in the enterprise, but some were questioning whether this is something the W3C should really be involved with.
My personal view is that the W3C should concentrate on WS-* interoperability for what we have (limiting new specifications), encourage good and best practices for applications that are exposed on the Web using the REST architectural style, fix the pain points for Web interoperability, and bridge the gap between the Web and WS-* where it makes sense.
With respect to the last point, I think Atom and the Atom Publishing Protocol (APP) have the potential to make inroads into the enterprise where real-time requirements are less demanding. An Atom service combined with something like the Post Once Exactly (and PUT too?) pattern of use may provide reliable services in a manner quite different from that of traditional enterprise publish and subscribe technologies, yet still be Web friendly. For instance, imagine an APP facade over a JMS queue or topic.
Posted at 02:00PM Mar 05, 2007 by Paul Sandoz in General | Comments[0]
ITU-T Recommendations are freely available
The ITU-T has started the new year with a great gesture of good will.
All published ITU-T Recommendations are currently freely available from here. The ITU-T also offers Recommendations in languages other than English, such as French and Spanish, and Chinese and Arabic for more recent Recommendations. In my experience the editors at the ITU-T do a great service providing high-quality Recommendations, and the ITU-T's commitment to multiple languages is truly commendable.
The free access will last at least until the next ITU Council meeting (the beginning of September 2007).
Unfortunately Fast Infoset (X.891) is still in the pre-published state, so it is not possible to get it directly for free (but one can download it for free using the '3 free Recommendations' option). Fast Web Services (X.892) is available. It is interesting looking at a specification I helped write in Arabic!
Posted at 01:59PM Jan 08, 2007 by Paul Sandoz in General | Comments[3]
Merging of Japex configuration files
I recently added a new feature to Japex to support merging configuration files. This feature was required for a large benchmark, to make it easier to test multiple groups of test cases without having to create a new configuration file for each group. (This benchmark has been used to produce results for the W3C Efficient XML Interchange Working Group.)
Now you can pass more than one configuration file as arguments to japex when the '-merge' option is present, for example:
japex.sh -merge <file_1> ... <file_n>
and Japex will merge each file to produce one test suite configuration. For example, there could be one configuration file consisting of parameters and drivers and one or more configuration files consisting of test cases and Japex could be used like this:
japex.sh -merge configWithDrivers.xml testCasesGroup1.xml
I modified the Japex configuration schema so that param, driver, testCase and groups of these are all root elements. This means that a group of test cases is a valid standalone Japex configuration file, and it allows JAXB to easily unmarshal such configuration files.
In fact JAXB made it really easy to merge the configuration files since it was simply a matter of copying information from one JAXB bean to another one. Now JAXB is also used to create the configuration file that is stored with the Japex reports: it is just a simple marshal of the JAXB bean representing the test suite.
Posted at 02:45PM Aug 25, 2006 by Paul Sandoz in General | Comments[0]
A ramble on characters in XML documents
When I was a rookie XML 1.0 user I was not aware that there were restrictions on the characters that are allowed in element/attribute names, text content and attribute values. It raised some mild eyebrows when I found out and looked more closely at the W3C XML 1.0 Recommendation! XML's foundation is Unicode characters, right? So why the subset?
Take for example the specified character range of a character that is allowed as part of text content:
Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
So that means no 'control' characters other than tab, line feed and carriage return: no 'NULL' or 'BELL', for example (see here for a good description of the issues, which may explain why control characters were disallowed in XML 1.0). A character code of '0' is not allowed as part of the text content of an XML document. I think this makes sense from the perspective of C/C++, since '0' is used as the terminator for strings and allowing it would cause all sorts of issues.
So the following XML document is not well-formed:
<element>&#0;</element>
For more information on this I highly recommend looking at Tim Bray's most excellent annotated XML 1.0.
Having said that, the W3C XML 1.1 Recommendation opened the door for 'control' characters! The character range of a character is now:
Char ::= [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
The 'NULL' character code is still disallowed. I think XML 1.1 is an improvement on XML 1.0, since XML 1.0 also restricted the character codes that were allowed in element/attribute names. Now more languages can be used for element/attribute names.
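The two Char productions translate directly into range checks on code points. The following is a minimal sketch of that translation; the class and method names are made up for illustration and it is not taken from any particular parser.

```java
// Hypothetical helper that mirrors the two Char productions quoted above.
final class XmlChars {

    // XML 1.0: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isXml10Char(int codePoint) {
        return codePoint == 0x9 || codePoint == 0xA || codePoint == 0xD
            || (codePoint >= 0x20 && codePoint <= 0xD7FF)
            || (codePoint >= 0xE000 && codePoint <= 0xFFFD)
            || (codePoint >= 0x10000 && codePoint <= 0x10FFFF);
    }

    // XML 1.1: [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isXml11Char(int codePoint) {
        return (codePoint >= 0x1 && codePoint <= 0xD7FF)
            || (codePoint >= 0xE000 && codePoint <= 0xFFFD)
            || (codePoint >= 0x10000 && codePoint <= 0x10FFFF);
    }

    public static void main(String[] args) {
        int bell = 0x7;
        System.out.println("BELL in XML 1.0: " + isXml10Char(bell));
        System.out.println("BELL in XML 1.1: " + isXml11Char(bell));
        System.out.println("NULL in XML 1.1: " + isXml11Char(0x0));
    }
}
```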
Posted at 12:15PM Jun 05, 2006 by Paul Sandoz in General | Comments[0]
Example of a pluggable transport in JAX-WS
The re-architected JAX-WS RI supports pluggable transport components, so it is possible to transmit SOAP messages over transports other than HTTP, for example JMS, SMTP or TCP/IP.
One such example, explored in more detail, is a pluggable JMS transport. Alexey Stashok has created an example JMS Web service transport component at java.net. This simple example is specifically designed to show how transports can be easily developed.
If you are interested in developing transports for JAX-WS check this out!
Posted at 04:08AM May 16, 2006 by Paul Sandoz in General | Comments[2]
Grouping of test cases in Japex
I mentioned to Santiago that it would be useful to group test cases in Japex so that they could share common parameters instead of having to repeat common parameters per test case. A day later he implemented it!
So now you can do the following (which is an actual fragment from a real test case we use for measuring XML and JAXB performance):
value="data/Invoice/instance/inv1.xml"/>
value="data/Invoice/instance/inv10.xml"/>
value="data/Invoice/instance/inv100.xml"/>
Before it was possible to group tests, it was necessary to set the contextPath parameter on each test case (this parameter is passed to a JAXBContext).
On the subject of reuse, a useful tip when using Japex is to separate the test cases from the configuration file (with the drivers) by using entities. The following is an example Japex configuration file used by the XMLStreamBuffer project (see here) for measuring the performance of creating a buffer compared to creating a DOM:
<!DOCTYPE project [
  <!ENTITY testCases SYSTEM "testcases.xml">
]>
<testSuite name="parse" xmlns="http://www.sun.com/japex/testSuite">
  <param name="japex.warmupTime" value="5"/>
  <param name="japex.runTime" value="5"/>
  <param name="japex.resultUnit" value="ms"/>
  <driver name="XercesJAXPSAXDriver" normal="true">
    <param name="japex.driverClass"
           value="com.sun.japex.jdsl.xml.parsing.sax.XercesJAXPSAXDriver"/>
    <param name="jdsl.doNotReportSize" value="true"/>
  </driver>
  <driver name="JAXPSAXParserCreatorDriver">
    <param name="japex.driverClass"
           value="com.sun.xml.stream.buffer.japex.SAXParserCreatorDriver"/>
  </driver>
  <driver name="XercesJAXPDOMParser">
    <param name="japex.driverClass"
           value="com.sun.japex.jdsl.xml.parsing.dom.XercesJAXPDOMDriver"/>
    <param name="jdsl.deferNodeExpansion" value="false"/>
    <param name="jdsl.doNotReportSize" value="true"/>
  </driver>
  &testCases;
</testSuite>
When the XML document is parsed, the parser will replace &testCases; with the contents of the file testcases.xml.
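The substitution can be observed with the JAXP DOM parser that ships with the JDK. The sketch below is an assumption-laden illustration, not Japex code: to stay self-contained it declares testCases as an internal entity with inline content instead of a SYSTEM entity pointing at testcases.xml, it drops the Japex namespace, and the class name and test case names are made up.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical sketch: shows a parser expanding an entity reference into elements.
final class EntityExpansionSketch {

    // Internal entity used in place of the external testcases.xml file.
    static final String SUITE =
        "<!DOCTYPE testSuite [\n"
      + "  <!ENTITY testCases \"<testCase name='a'/><testCase name='b'/>\">\n"
      + "]>\n"
      + "<testSuite name=\"parse\">&testCases;</testSuite>";

    // Parses the document and counts the testCase elements the parser produced.
    static int countTestCases(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("testCase").getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        // The entity reference has been replaced by the two elements it stands for.
        System.out.println(countTestCases(SUITE));
    }
}
```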
In fact, now that test groups are supported, Japex is much more friendly to such composability using XML Include.
Posted at 06:40PM Mar 13, 2006 by Paul Sandoz in General | Comments[0]
Optimizations to JAX-WS
Kohsuke writes about how he is neck deep in re-architecting the JAX-WS RI. He definitely is, as are others working on JAX-WS, but they are having no problems holding their heads above water!
There are some good improvements here in architecture and implementation, which I think will make it very competitive in terms of performance and features with other Web service stacks.
I have looked at NetBeans profiler traces (obtained by using Japex configured for a simple Web service micro-benchmark called WSpex) of the old and new implementations, and I could see an immediate improvement in the stack of calls. It was also much easier to identify hotspots where performance can be improved. There is still work to do, but early results obtained using Japex indicate we are off to a very promising start.
One area in which I have been trying to help out is buffering infoset efficiently for replay. The XMLStreamBuffer project enables the buffering of infoset produced from stream-based XML APIs (like SAX or StAX) for replay using stream-based XML APIs.
XMLStreamBuffer is being used in JAX-WS to buffer the SOAP header blocks of a SOAP message. Each SOAP header block is marked in the buffer and can be replayed using an XMLStreamReader or easily bound to a JAXB object. This offers an efficient alternative (in terms of memory and processing performance) to buffering the header blocks using DOM.
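To make the buffer-then-replay idea concrete, here is a rough sketch that uses only the StAX event API in the JDK. It is not the XMLStreamBuffer API and makes no claim about how that project stores its data: the class and method names are made up, and a plain list of event objects is a much more naive representation than a purpose-built buffer.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical sketch of buffering a stream of XML events for later replay.
final class EventBufferSketch {

    // Reads the whole document once and keeps every event in memory.
    static List<XMLEvent> buffer(String xml) {
        try {
            XMLEventReader reader = XMLInputFactory.newInstance()
                .createXMLEventReader(new StringReader(xml));
            List<XMLEvent> events = new ArrayList<>();
            while (reader.hasNext()) {
                events.add(reader.nextEvent());
            }
            reader.close();
            return events;
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not buffer document", e);
        }
    }

    // Replays the buffered events without re-parsing, collecting element names.
    static List<String> replayElementNames(List<XMLEvent> events) {
        List<String> names = new ArrayList<>();
        for (XMLEvent event : events) {
            if (event.isStartElement()) {
                names.add(event.asStartElement().getName().getLocalPart());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<XMLEvent> events = buffer("<Header><To>a</To><Action>b</Action></Header>");
        // The buffer can be replayed any number of times.
        System.out.println(replayElementNames(events));
        System.out.println(replayElementNames(events));
    }
}
```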
Another area in JAX-WS where XMLStreamBuffer is currently being used is the processing of documents (e.g. WSDL, XSD, etc.) associated with services. Buffers of documents are created and consequently they may be efficiently processed concurrently.
I think this calls for another blog on XMLStreamBuffer explaining features and presenting some examples...
Posted at 01:01PM Feb 15, 2006 by Paul Sandoz in General | Comments[0]
Japex has moved
Japex, the performance monitoring framework, has found a new home and a cool logo for the new year.
It was previously part of the Fast Infoset project and Santiago moved it and created the nice new home.
Posted at 06:55PM Jan 02, 2006 by Paul Sandoz in General | Comments[0]
Prophet warning
Nick Cohen, a reporter at the Observer, writes some good pieces in his "Without prejudice" column, and the one for New Year is one of the best.
This being the new year and all, prophets are making themselves heard. So... a prophet should be judged:
- by the accuracy of their previous forecasts;
- by how well their beliefs reflect observable reality; and
- by whether they update their beliefs in response to new evidence.
Posted at 06:48PM Jan 02, 2006 by Paul Sandoz in General | Comments[0]
Rich Turner on Binary XML in WCF
Rich Turner, a product manager in Microsoft's Web Services Strategy team, writes:
"In our own work, we see WCF's text-XML encoder significantly outperform our current ASMX/WSE text-XML encoder engine. We also see the WCF BinaryXML serializer delivering hige performance and throughput improvements vs. serializing raw text-XML. Right now, the WCF BinaryXML format is proprietary because of the absence of a standard BinaryXML format. However, once the world agrees on one (or maybe more) BinaryXML formats, you can be sure that we or someone else will ship a compliant encoder for WCF.
Roll on, that day!"
Well, I think that day is here and now. Fast Infoset is a standard binary encoding of the XML Information Set that is jointly standardized by the ITU-T and ISO.
Maybe someone will implement Fast Infoset for WCF? Then JWSDP 1.6 and JAX-WS 2.0 clients/services could interoperate with WCF clients/services using a standard binary encoding of the XML Information Set.
As a start one could look at the open source Java-based implementation here.
Posted at 04:16PM Dec 08, 2005 by Paul Sandoz in General | Comments[0]
Fast Infoset and WCF's binary XML encoding
When browsing Microsoft's WCF APIs I came across some interesting information on the WCF "binary XML" format.
XmlReader is a pull-based API for processing an XML infoset. The same API can be used for processing XML documents or "binary XML" documents. How?
The static XmlReader.Create method can take as input a Stream of octets that is an XML document or a "binary XML" document. The documentation of this method states:
Here is the bit I find interesting. Fast Infoset specifies that the first two octets of a fast infoset document are 0xE0 and 0x00. There is only a difference of 1 between the two!! (for an 8-bit integer or for a 16-bit integer, most significant byte first).
Given that the two binary formats use different "magic numbers", it should be possible to integrate Fast Infoset into WCF without any conflicts :-)
The first two octets of a fast infoset document were chosen because they are different from the first two octets that can occur for an XML document encoded using a well-known character encoding scheme (see Appendix F of XML 1.0).
We have used the same type of factory mechanism, based on the first couple of octets, when prototyping solutions for the Java Web Services Developer Pack. For the final integration into JWSDP 1.6 we chose to rely on the MIME type instead.
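A factory of this kind only needs to peek at the start of the stream. The sketch below checks for the two octets mentioned above (0xE0 followed by 0x00) and treats everything else as "not Fast Infoset"; the class, enum and method names are made up, and it is not the code from the JWSDP prototypes or from WCF.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of choosing a parser from the first octets of a document.
final class FormatSniffer {

    enum Format { FAST_INFOSET, XML_OR_OTHER }

    // Looks only at the first two octets; a real factory would go on to
    // hand the stream to the matching parser.
    static Format sniff(byte[] head) {
        if (head.length >= 2
                && (head[0] & 0xFF) == 0xE0
                && (head[1] & 0xFF) == 0x00) {
            return Format.FAST_INFOSET;
        }
        return Format.XML_OR_OTHER;
    }

    public static void main(String[] args) {
        byte[] binary = {(byte) 0xE0, 0x00, 0x00, 0x01};
        byte[] text = "<?xml version=\"1.0\"?><a/>".getBytes(StandardCharsets.UTF_8);
        System.out.println(sniff(binary));
        System.out.println(sniff(text));
    }
}
```

Relying on the MIME type, as was done for JWSDP 1.6, avoids having to buffer or push back the sniffed octets at all.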
Also of note is that XmlReader and XmlWriter have specific methods to read/write binary data, like octets or integers. When using the text-based implementations, the data will be converted from/to characters in accordance with the lexical representations of data types specified by W3C XML Schema. But when using a binary-based implementation, such data could be encoded much more efficiently.
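A tiny illustration of the size difference for a single integer, assuming the text form is the decimal lexical representation encoded as UTF-8 and the binary form is a fixed 4-octet big-endian value. Neither is claimed to be the exact encoding used by WCF or Fast Infoset, and the class name is made up.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch comparing a lexical and a binary encoding of an int.
final class IntEncodingSketch {

    // Text form: the decimal digits (and sign) as UTF-8 octets.
    static byte[] asText(int value) {
        return Integer.toString(value).getBytes(StandardCharsets.UTF_8);
    }

    // Binary form: always four octets, most significant byte first.
    static byte[] asBinary(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    public static void main(String[] args) {
        int value = Integer.MAX_VALUE;
        System.out.println("text octets:   " + asText(value).length);
        System.out.println("binary octets: " + asBinary(value).length);
    }
}
```

The saving is not only in size: the binary form also skips the conversion between digits and the integer value on both the writing and the reading side.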
Having such methods on the XmlReader and XmlWriter is rather useful IMHO. Not only is it a very useful aid for developers, it also makes integration of specially optimized binary encodings of data quite easy while hiding the implementation details from the developer.
Posted at 02:12PM Nov 29, 2005 by Paul Sandoz in General | Comments[0]