Gopalan Suresh Raj
Web Cornucopia
Caveat Emptor
This is my personal weblog. The contents of this Weblog represent my personal opinion which may differ from the official views of my employer, Sun Microsystems, Inc. or any past employers. I do not speak for my employer or any past employers.
Saturday May 05, 2007
JBI/SOA Tips: Consider using a StAX-based parser to process huge XML datasets

Choose the right XML parser for your implementation. You have a choice of a DOM-based, a SAX-based, or a StAX-based XML parser.

Choose a DOM-based parser if you need to modify the XML document structure at runtime or traverse the XML tree multiple times. DOM is a tree-based parser, so it gives you access to the complete document. The downside is that the entire XML document has to be loaded into memory as a hierarchical object graph, which may not work if your data sets are large. DOM is therefore best suited to small or medium-sized XML datasets.
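As a rough illustration of the DOM trade-off described above, here is a minimal sketch using the JAXP DOM API that ships with the JDK. The class name DomSketch, the method markItems, and the "item" element and "seen" attribute are hypothetical names chosen for this example only.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class; not part of any library.
class DomSketch {

    // Loads the whole document into memory, modifies it, then walks it again.
    static int markItems(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // The complete tree is built here, before any application code runs.
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // First traversal: modify the tree in place.
        NodeList items = doc.getElementsByTagName("item");
        for (int i = 0; i < items.getLength(); i++) {
            ((Element) items.item(i)).setAttribute("seen", "true");
        }

        // Second traversal over the same in-memory tree: count what was marked.
        int marked = 0;
        NodeList again = doc.getElementsByTagName("item");
        for (int i = 0; i < again.getLength(); i++) {
            if ("true".equals(((Element) again.item(i)).getAttribute("seen"))) {
                marked++;
            }
        }
        return marked;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(markItems("<items><item/><item/><other/></items>"));
    }
}
```

Both the in-place modification and the repeated traversal are possible only because the full tree is resident in memory, which is also what makes this approach expensive for large inputs.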

When you have to deal with large documents with a limited amount of nested elements, and you only need a subset of the complete XML document at any time, consider using a SAX-based parser. In SAX-based parsing, the XML data is read sequentially and fragments of the document are pushed to application handler code as events. SAX is therefore an event-based push parser: the parser, not your application, controls the flow of parsing.
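To make the push model concrete, here is a minimal sketch using the JAXP SAX API from the JDK. The class name SaxSketch, the method countItems, and the "item" element name are hypothetical and exist only for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class; not part of any library.
class SaxSketch {

    // Counts <item> elements without ever building a tree.
    static int countItems(String xml) throws Exception {
        final int[] count = {0};

        // The parser calls back into this handler; the application only reacts.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                if ("item".equals(qName)) {
                    count[0]++;
                }
            }
        };

        // parse() drives the whole document from start to end.
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countItems("<items><item/><item/><other/></items>"));
    }
}
```

Note that once parse() is called, the handler has no clean way to pause or skip ahead; the usual workaround for stopping early is to throw an exception from a callback.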

If you need to process huge XML datasets, consider using a StAX-based parser. In StAX-based parsing, the application pulls data from the stream at its own convenience, so your code can filter, skip tags, or stop parsing at any point. Your application (not the parser) is in total control of parsing the data. The application iterates over the document with a stream reader and, much like SAX, only a small part of the XML document is held in memory at any point in time. Combining the application-driven control you get from DOM with the low memory footprint of SAX makes StAX-based parsers the best choice of the three for handling huge documents in a fast and efficient manner.
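The pull model can be sketched with the javax.xml.stream API (StAX) included in the JDK. The class name StaxSketch, the method firstItemText, and the "item" element name are hypothetical, and the sketch assumes the target element contains only text.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class; not part of any library.
class StaxSketch {

    // Returns the text of the first <item> element and stops reading there.
    static String firstItemText(String xml) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            // The application asks for the next event; nothing is pushed to it.
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "item".equals(reader.getLocalName())) {
                    // Stop as soon as we have what we need;
                    // the rest of the stream is never parsed.
                    return reader.getElementText();
                }
                // Every other event is simply skipped.
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstItemText(
                "<items><skip>x</skip><item>first</item><item>second</item></items>"));
    }
}
```

Because the loop belongs to the application, returning early is an ordinary return statement rather than an exception thrown out of a callback.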

Posted at 11:45PM May 05, 2007 by Gopalan Raj in A Tip a Day
