Earthly Powers
XML to typed FI utility
I have just created a simple utility and UNIX shell script that take as input an XML schema and an XML document, and convert the XML document to a fast infoset document.
The XML schema is used to work out XS data types associated with lexical values (text content and attribute values) within the XML document. The lexical values are converted to binary data using a simple mapping between the XS data types and the Fast Infoset encoding algorithms and restricted alphabets.
To use it (on a UNIX system; warning: I have only tested on Solaris), download the latest fast infoset distribution here and unzip it. Set the FI_HOME and FI_UTILITIES_HOME environment variables to the unzipped location of the distribution and include ${FI_HOME}/bin in your path.
As an example, download the latest fast infoset source distribution here and unzip it in the ${FI_HOME} directory. Change directory to samples/data, then type:
xmltosaxtotypedfi schema/Content.xsd content.xml content.fi
This will convert the XML document:
<content xmlns="http://www.sun.com/xml/content">
<base64 value="AAAAAAAAAAAAAAAA">AAAAAAAAAAAAAAAA</base64>
<floats value="3.14159265 2.71828183">3.14159265 2.71828183 1.0</floats>
</content>
to a fast infoset document. The base64 lexical values will be converted to bytes (in this case all zeros) and the arrays of float lexical values will be converted to arrays of IEEE floats.
The octets of the FI document are as follows (using od -A x -tx1a):
0000000 e0 00 00 01 00 38 cd 1d 68 74 74 70 3a 2f 2f 77
` nul nul soh nul 8 M gs h t t p : / / w
0000010 77 77 2e 73 75 6e 2e 63 6f 6d 2f 78 6d 6c 2f 63
w w . s u n . c o m / x m l / c
0000020 6f 6e 74 65 6e 74 f0 3d 81 06 63 6f 6e 74 65 6e
o n t e n t p = soh ack c o n t e n
0000030 74 92 02 0a 20 20 20 20 7d 81 05 62 61 73 65 36
t dc2 stx lf sp sp sp sp } soh enq b a s e 6
0000040 34 78 04 76 61 6c 75 65 30 18 03 00 00 00 00 00
4 x eot v a l u e 0 can etx nul nul nul nul nul
0000050 00 00 00 00 00 00 00 f0 8c 06 09 00 00 00 00 00
nul nul nul nul nul nul nul p ff ack ht nul nul nul nul nul
0000060 00 00 00 00 00 00 00 f0 a0 7d 81 05 66 6c 6f 61
nul nul nul nul nul nul nul p sp } soh enq f l o a
0000070 74 73 00 30 67 40 49 0f db 40 2d f8 54 f0 8c 1a
t s nul 0 g @ I si [ @ - x T p ff sub
0000080 09 40 49 0f db 40 2d f8 54 3f 80 00 00 f0 90 0a
ht @ I si [ @ - x T ? nul nul nul p dle lf
0000090 ff
del
0000091
Notice that the lexical values of the base64 data and the arrays of floats no longer appear in the octets; only their binary forms do.
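To see where the typed values ended up, here is a small Python sketch that recomputes the binary forms by hand. It is not the utility's code and does not use the fast infoset library; it only assumes that a base64 lexical value is stored as its decoded octets and that each float is stored as a big-endian IEEE 754 single-precision value, which is consistent with the dump above.

```python
import base64
import struct

# Lexical values copied from the sample content.xml element content.
base64_text = "AAAAAAAAAAAAAAAA"
floats_text = "3.14159265 2.71828183 1.0"

# Assumption: the base64 lexical value is stored as its decoded octets.
raw = base64.b64decode(base64_text)

# Assumption: each float is stored as a big-endian IEEE 754 single.
values = [float(token) for token in floats_text.split()]
packed = struct.pack(">%df" % len(values), *values)


def hex_octets(data):
    """Format bytes the way od -tx1 shows them."""
    return " ".join("%02x" % octet for octet in data)


print(hex_octets(raw))
print(hex_octets(packed))
```

The first line printed is twelve zero octets, and the second is 40 49 0f db 40 2d f8 54 3f 80 00 00, the same run of octets that appears near the end of the dump for the content of the floats element.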
The fast infoset document can be converted back to an XML document by doing the following:
fitosaxtoxml content.fi
This writes the following to the standard output:
<content xmlns="http://www.sun.com/xml/content">
<base64 value="AAAAAAAAAAAAAAAA">AAAAAAAAAAAAAAAA</base64>
<floats value="3.1415927 2.7182817">3.1415927 2.7182817 1.0</floats>
</content>
Notice that the documents are not identical! The exact lexical values of the floats are not preserved because the conversion of some lexical values to IEEE float is lossy. Also notice that the FI parser will, by default, convert binary data to lexical values, thus ensuring that typed fast infoset documents can always be parsed. Further notice that you do not require the schema to parse the fast infoset document.
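The loss of precision can be reproduced outside of fast infoset. The following Python sketch (an illustration only, not part of the utility) rounds each lexical value to IEEE 754 single precision with the standard struct module and prints the result to 8 significant digits, which is an arbitrary choice that happens to be enough to show the difference here.

```python
import struct


def to_float32(text):
    """Parse a lexical value and round it to IEEE 754 single precision."""
    packed = struct.pack(">f", float(text))
    return struct.unpack(">f", packed)[0]


for lexical in ["3.14159265", "2.71828183"]:
    stored = to_float32(lexical)
    # Print 8 significant digits of the value that survives the round trip.
    print(lexical, "->", "%.8g" % stored)
```

This prints 3.1415927 and 2.7182817 for the two values, matching the output of fitosaxtoxml above. A value such as 1.0 is exactly representable as a float, so it survives the round trip unchanged.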
Posted at 01:40PM Oct 03, 2006 by Paul Sandoz in Fast Infoset | Comments[2]