Sunday Sep 21, 2008
Monday Sep 15, 2008
Mural is an open source community that aims to build the infrastructure for Master Data Management (MDM) through sub-projects that cater to different aspects of the MDM life cycle.

Before going any further, let us be clear on what Master Data Management is all about. Products, Customers, Partners ... form the fundamental vectors of a Business Entity. It is extremely critical and fundamental for a business to be able to answer queries like
- How many unique employees work for the Company?
- Who are the Customers/Partners I am having relationships with and how are they related?
- Complete visibility into the Products
- For a Government, it manifests as Citizen services and complete visibility into Citizen information (Single Citizen View)
- For a HealthCare Network, it manifests as a Single Patient View, etc.

Master Data Management is a discipline backed by technology, tools and processes to provide answers to such queries.
Friday Aug 24, 2007
I have been doing a bad job with blogging of late and thought I would do some catching up. The reason, of course, was probably that I got absorbed in what our team was building here at the Sun Bangalore office. It all started when we realized that the extraction capabilities of the ETL Service Engine have applications in building a virtual view of heterogeneous data sources, and that this fits into the category of server-side data mashup. It felt really good at that point, and there has been no looking back since.
So, we built a service engine called the EDM Service Engine. Here are some useful links to explore:
Community corner talk at JavaOne2007
DataMashup demos showcased @ JavaOne2007
If you are interested in getting this demo up and running, you can get all of it here.
So what are we doing with it now?
We are building Datamashup as a RESTful web service, and we hope to bring it out soon.
Saturday Mar 17, 2007
ETLSE is a good place to learn more about this project.
Thursday Mar 08, 2007
We were recently discussing why ETL is a Service Engine and not a Binding Component. I would like to post some of the implicit assumptions behind that decision. First of all, let's see what the spec says about the definition of SE and BC.
• Service Engine (SE). SEs provide business logic and transformation services to other components, as well as consume such services. SEs can integrate Java-based applications (and other resources), or applications with available Java APIs.
• Binding Component (BC). BCs provide connectivity to services external to a JBI environment. This can involve communications protocols, or services provided by Enterprise Information Systems (EIS resources). BCs can integrate applications (and other resources) that use remote access technology that is not available directly in Java.
The statement that BCs provide connectivity to services external to a JBI environment gave rise to the argument that "ETL connects to external systems, so it should rightly be called a BC". Here is a detailed analysis of why it should be an SE and not a BC.
Reason1:
The ETL Service qualifies as an SE because it does a lot more than connectivity, and it fits well with the definition from the spec, as it provides Extraction, Transformation and Loading services. *Of course, it connects to external systems to do the job, but note that this is not the service it is offering.*
Reason2:
About ETL:
---------------
It's a data integration tool. More often than not, this kind of tool is associated with offline batch processing. The tool can extract data from heterogeneous sources such as databases, files and XML documents. When dealing with databases, the usual mechanism for extracting and loading data is JDBC; when dealing with files, we read the data directly from the files. We are still working on XML documents.
Note that the tool requires more than one protocol or transport in order to fetch data.
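To make the "more than one transport" point concrete, here is a minimal sketch of extracting from two kinds of sources into one common row format. It is written in Python purely for brevity and is not the ETL Service Engine's code: the in-memory SQLite connection stands in for a JDBC connection, and the table, column names and sample values are all made up for illustration.

```python
import csv
import io
import sqlite3


def extract_from_database(conn, query):
    """Extract rows over a database connection (a stand-in for JDBC)."""
    cursor = conn.execute(query)
    columns = [d[0] for d in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


def extract_from_file(text_stream):
    """Extract rows by reading a delimited file directly."""
    return [dict(row) for row in csv.DictReader(text_stream)]


# Hypothetical sample sources: one database table and one CSV file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "Asha"), (2, "Ravi")])
csv_source = io.StringIO("id,name\n3,Meena\n")

# Extraction: each source is reached through its own mechanism.
rows = (extract_from_database(conn, "SELECT id, name FROM customers ORDER BY id")
        + extract_from_file(csv_source))

# A tiny transformation step: CSV values arrive as text, so normalize ids.
for row in rows:
    row["id"] = int(row["id"])

print(len(rows), "rows extracted")
```

Both extractors return the same list-of-dictionaries shape, so everything downstream can ignore where a row came from.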
ETL as Service:
-----------------------
How do we expose such a capability as a service? The ETL service engine is a JBI (JSR 208) based service engine component. It exposes this capability as a service, which means that it can be part of any composite application.
ETL and JBI:
------------------
The ETL service engine component could talk to the external systems in two ways.
1. All access to the data can be done through a binding component such as the JDBC binding component or the file binding component. The data flow would then look like
[external-systems]<--------------->[jdbc-bc or file-bc]<--------------------->[ETL Service engine]
2. The ETL service engine can directly access data using JDBC or any other mechanism instead of going through a binding component.
[external-systems]<-------(more than one transport protocol)---------->[ETL Service engine]
The reason for taking the second approach is performance. More often than not, an ETL tool is involved in extracting and loading thousands, even millions, of rows. Any communication between JBI components is mediated through the NMR (Normalized Message Router). Note that in the first case the communication between the [BC] and the [SE] would look like
[jdbc-bc or file-bc]<-----------NMR---------->[ETL Service engine]
Imagine the amount of data the NMR has to handle when we extract and load millions of rows through it.
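A rough way to picture that cost is the toy simulation below. It is only a sketch under stated assumptions, not a measurement of any real JBI runtime: `MockRouter` is a hypothetical stand-in for the NMR, and it assumes one normalized XML message per row, whereas a real binding component might batch rows differently.

```python
import xml.etree.ElementTree as ET


class MockRouter:
    """Hypothetical stand-in for the NMR: counts messages and payload bytes."""

    def __init__(self):
        self.messages = 0
        self.bytes_routed = 0

    def send(self, row):
        # Normalize the row into an XML message before routing it.
        msg = ET.Element("row")
        for key, value in row.items():
            ET.SubElement(msg, key).text = str(value)
        payload = ET.tostring(msg)
        self.messages += 1
        self.bytes_routed += len(payload)
        # Denormalize again on the receiving side (values come back as text).
        return {child.tag: child.text for child in ET.fromstring(payload)}


def extract_via_router(source_rows, router):
    """Case 1: every row travels BC -> router -> SE."""
    return [router.send(row) for row in source_rows]


def extract_directly(source_rows):
    """Case 2: the engine reads the rows itself, with no routed messages."""
    return list(source_rows)


# Made-up sample data: 1000 rows instead of millions.
source = [{"id": i, "name": f"customer{i}"} for i in range(1000)]
router = MockRouter()
via_router = extract_via_router(source, router)
direct = extract_directly(source)

print("messages through router:", router.messages)
print("bytes through router:", router.bytes_routed)
```

In this sketch the routed path pays for one normalization, one message and one denormalization per row, while the direct path pays for none of them. That per-row overhead is what grows with the row count.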
Note that in case 2 the ETL engine is not trying to isolate the JBI environment from a particular protocol by providing normalization and denormalization from and to the protocol-specific format when it talks to the outside world, which is the actual job of a binding component. When it accesses the external systems, it is performing ETL operations, i.e. the E and L of ETL.
Acknowledgements: I would like to thank my colleague Sujit Biswas for Reason2.
Tuesday Dec 26, 2006
Hello World!!!!
I wanted to do something really exciting this New Year. So, here I am creating my first entry. At Sun, I work on the Java CAPS product development. I am also involved in the open-source project ETLSE. My current interests include understanding the way JBI evolves the Data Integration space. I intend to write more about it.
This blog copyright 2008 by srenga