
Sunday May 31, 2009

Master data is the business-context data describing the entities or objects involved in business transactions. Master Data Management (MDM) helps create complete, consolidated views of that information, and it supports customer data integration and product information management. Some of the entity domains where MDM is widely used are:
- Person (Customer, Patient, Subscriber, Citizen, Employee)
- Business (Client, Vendor, Supplier)
- Products (Parts, Products)

Let's take a case study where we need to consolidate customer data. French Bank B is being merged into American Bank A. The merged bank must find the common, or potentially common, customers in order to offer them certain advantages (such as free money orders if total deposits exceed a certain amount) and to report them to agencies to comply with government regulations (for example, those targeting black money hidden in foreign accounts). John Green married Mary Davidson when they were in France, and they moved to the USA after the marriage. They have a daughter, Christine. Note that Mary changed her last name after marriage. John also has a different address. John and Mary have had accounts in Bank B since their time in France, and they now also have accounts in Bank A. Christine has an account in Bank A. Here is how the MDM application will identify them:
- John is an assumed match and will be placed in the common customer list.
- Mary is in the same household and goes on the potential match list, to be reviewed by bank staff.
- Christine is in the same household.
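
The assumed-match versus potential-match split above can be pictured as two thresholds applied to a match weight. The sketch below is only an illustration: the class name, the threshold values and the weights are all made up for this example, and a real matching engine derives its weights from field-by-field comparisons. Household grouping (Christine's case) is not modeled here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: the thresholds and weights below are made-up values. */
final class MatchClassifier {
    enum Outcome { ASSUMED_MATCH, POTENTIAL_MATCH, NO_MATCH }

    // Hypothetical thresholds; a real project would tune these per domain.
    static final double MATCH_THRESHOLD = 0.90;
    static final double DUPLICATE_THRESHOLD = 0.60;

    /** Maps a match weight in [0, 1] to one of the three outcomes. */
    static Outcome classify(double weight) {
        if (weight >= MATCH_THRESHOLD) {
            return Outcome.ASSUMED_MATCH;   // merged automatically
        }
        if (weight >= DUPLICATE_THRESHOLD) {
            return Outcome.POTENTIAL_MATCH; // queued for manual review
        }
        return Outcome.NO_MATCH;
    }

    public static void main(String[] args) {
        // Hypothetical weights for the record pairs in the case study.
        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("John Green (Bank A) vs John Green (Bank B)", 0.95);
        weights.put("Mary Green (Bank A) vs Mary Davidson (Bank B)", 0.70);
        weights.forEach((pair, w) -> System.out.println(pair + " -> " + classify(w)));
    }
}
```

With these assumed numbers John lands in the assumed-match bucket, while Mary, whose last name differs between the two banks, falls between the thresholds and is left for a person to review.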

MDM market drivers:
  • Proliferation of applications with dispersed data across disparate systems
  • Increasing need to be customer-aware to deliver the right customer experience
  • Just in time business models require agility and integration across the enterprise
  • Ongoing mergers and acquisitions
  • Compliance requirements
  • Localized governmental requirements
  • Focus on data governance - data ownership and data quality

Sun's Mural project, an open source Master Data Management suite, is the first of its kind. It has the following major components:
Master Index Studio – Provides the capability to create any domain-specific master index, such as those described above
Data Integrator – Provides extract-transform-load (ETL) and supports a wide variety of data sources
Data Quality – Features matching, standardization, normalization, profiling, and cleansing capabilities
Data Mashup/Services – Provides server-side data mashup capability
Data Migrator – Provides the ability to migrate database objects across database instances

 

 

Thursday Jan 08, 2009

If you are not interested in the transaction history, a Master Index project can save a lot of database space by not storing the BLOB field that captures the delta for each transaction in the transaction table.

You can live without this delta if you never do an unmerge and never try to recreate transaction objects from the transaction history.

To achieve a no-history mode for Master Index, two things are needed. First, because MIDM shows the transaction history tab by default, we need to hide that tab. Second, because we are not interested in the delta, we should neither compute it nor save it to the table. Skipping the computation saves some CPU time, and skipping the write improves database performance and therefore overall performance.

To prevent calculation of the delta, we should bypass all the TransactionObject.setDelta(TransactionLog.getLogs("Enterprise", eo)) calls in TransactionMgrImpl.java. This has to be done for every public API in TransactionMgrImpl.

The transaction record is written to the transaction table in two steps. The first step writes everything except the delta, and the second step updates the same transaction record with the delta. This happens in the createTransactionObjectDB method of the particular DBAdapter implementation used by the project (Oracle, MySQL or SQLServer).
It is done in two steps because writing the BLOB back to the database requires additional processing. To optimize performance in no-history mode, we must check for that mode and skip the step that saves the delta.

However, simply deleting the BLOB field from the database could cause serious problems, because several methods in TransactionMgrImpl are public and all the basic Master Index functionality invokes these APIs. Whenever such an API is invoked to set the delta, or to recreate a transaction object from the transaction log, it will throw severe exceptions because it tries to access a BLOB that no longer exists.

Conclusion: No-history can be introduced as a project property, but it has to be handled carefully at each level of the implementation described above, so that we gain the performance and scalability benefits without breaking functionality.
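
A minimal sketch of the guard described above. The class, the noHistory flag and the step strings are all hypothetical stand-ins, not the Master Index API. The point is only the control flow: the first write always happens, while the delta computation and the second (BLOB) write are both skipped when the flag is set.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in for the transaction write path; not the Master Index API. */
final class NoHistoryWriter {
    private final boolean noHistory;                 // assumed project property
    final List<String> dbSteps = new ArrayList<>();  // records which writes would run

    NoHistoryWriter(boolean noHistory) {
        this.noHistory = noHistory;
    }

    /** deltaSupplier stands in for the expensive delta computation. */
    void writeTransaction(String txId, Supplier<byte[]> deltaSupplier) {
        // Step 1: always write the transaction row, without the delta.
        dbSteps.add("INSERT " + txId);
        if (noHistory) {
            return; // skip both the delta computation and the BLOB update
        }
        // The delta is computed only when history is kept.
        byte[] delta = deltaSupplier.get();
        // Step 2: update the same row with the BLOB.
        dbSteps.add("UPDATE DELTA " + txId + " (" + delta.length + " bytes)");
    }
}
```

Passing the computation in as a Supplier keeps the decision in one place, so the caller never pays for a delta that will not be stored.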

Monday Nov 10, 2008

An MDM client program uses programmatic login, together with the JNDI context settings, to obtain a MasterController handle.

The following parameter needs to be passed to the JVM:

-Djava.security.auth.login.config=PATH/appclientlogin.conf

where PATH is the fully qualified path to the directory containing appclientlogin.conf. You can get this file from your GlassFish installation. It should be shipped with your client code.

If you are using NetBeans 6.5 on Windows, go to the project properties and, under the "Run" settings, enter the following as the VM option:

-Djava.security.auth.login.config=C:\JavaCAPS6\appserver\lib\appclient\appclientlogin.conf

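import java.util.Hashtable;
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import com.sun.appserv.security.ProgrammaticLogin;
// Package assumed from the com.sun.mdm.index prefix; adjust to your generated project.
import com.sun.mdm.index.ejb.master.MasterControllerRemote;

public class MDMClient {

    private InitialContext ic;
    private MasterControllerRemote mc;

    public MDMClient() {
        init();
        mc = getMasterController();
    }

    // Point the JNDI context at the application server's ORB listener.
    public void init() {
        Hashtable<String, String> env = new Hashtable<String, String>();
        env.put("org.omg.CORBA.ORBInitialHost", "localhost");
        env.put("org.omg.CORBA.ORBInitialPort", "3700");
        try {
            ic = new InitialContext(env);
        } catch (NamingException e) {
            System.out.println("Can not load the initial context");
        }
    }

    // Log in as user "ui" against the "file" realm, then look up the remote EJB.
    public MasterControllerRemote getMasterController() {
        MasterControllerRemote mcr = null;
        try {
            ProgrammaticLogin programmaticLogin = new ProgrammaticLogin();
            // With errors=false a failed login is reported through the return value.
            Boolean login = programmaticLogin.login("ui", "ui", "file", false);
            if (login == null || !login.booleanValue()) {
                System.out.println("Programmatic login was rejected");
                return null;
            }
            mcr = (MasterControllerRemote) ic.lookup("ejb/PersonMasterController");
        } catch (NamingException ex) {
            Logger.getLogger(MDMClient.class.getName()).log(Level.SEVERE, null, ex);
        } catch (Exception ex) {
            System.out.println("Failed programmatic login or getting MC");
        }
        return mcr;
    }
}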
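
If passing a VM option is inconvenient, the same system property can usually be set from code, provided this happens before the first login attempt causes the JAAS configuration to be loaded. The helper below is a sketch under that assumption. LoginConfig and useLoginConfig are names invented for this example. Only the property name comes from the text above.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper; only the property name is taken from the post. */
final class LoginConfig {
    static final String PROPERTY = "java.security.auth.login.config";

    /**
     * Points the JVM at the given appclientlogin.conf.
     * Returns false, and leaves the property untouched, if the file does not exist.
     */
    static boolean useLoginConfig(String confPath) {
        Path conf = Paths.get(confPath).toAbsolutePath();
        if (!Files.isRegularFile(conf)) {
            return false;
        }
        System.setProperty(PROPERTY, conf.toString());
        return true;
    }
}
```

A client would call LoginConfig.useLoginConfig(...) as the first statement of main, before constructing MDMClient, and fall back to an error message if it returns false.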

Monday Oct 27, 2008

Question: Can we achieve nested transactions in EJB 3 using a single connection, while supporting different databases such as Oracle, MySQL and SQLServer?

Use case:

1. Start a stateless session bean.

2. Get a connection

3. Update table A. (Transaction 1)

4. Update table B. (Transaction 2)

5. Commit the Transaction 2.

6. Based on some logic commit or rollback Transaction 1.

7. Close connection.

Problem: In XA mode, will rolling back at step 6 force Transaction 2 to roll back as well?

A solution using an Oracle feature: PRAGMA AUTONOMOUS_TRANSACTION in Oracle makes a stored procedure or function run as an independent transaction, even if it is nested inside another transaction. An example is the sequence ID generator in MDM.

CREATE OR REPLACE FUNCTION SEQMGR (Seq_Name_In IN VARCHAR2, Chunk_Size_In IN INTEGER)
RETURN INTEGER
IS
    PRAGMA AUTONOMOUS_TRANSACTION;
    Count_out INTEGER := 0;
BEGIN
    Count_out := 0;
    UPDATE SBYN_SEQ_TABLE SET seq_count = seq_count+Chunk_Size_In WHERE seq_name = Seq_Name_In;
    SELECT seq_count-Chunk_Size_In INTO Count_out FROM SBYN_SEQ_TABLE WHERE seq_name = Seq_Name_In;
    COMMIT;
    RETURN Count_out;
EXCEPTION WHEN OTHERS THEN RETURN 0;
END SEQMGR;

Problem 2: MySQL does not support autonomous transactions. Other databases may lack this feature as well.

Possible Solution Exploration: The java.sql.Savepoint interface does not help. It would have helped if we needed the outer transaction to commit and the inner one to roll back, but here we need the reverse: the inner work committed even if the outer work is rolled back. Using two UserTransactions does not help either when both use the same connection source in XA mode.
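
To make the Savepoint limitation concrete, here is a small JDBC sketch. The class and method names are invented for this example, and the SQL strings are supplied by the caller. It uses only standard java.sql calls. Notice that the inner update can be undone on its own through the savepoint, but there is no call that commits it on its own: commit() always ends the whole transaction.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Savepoint;
import java.sql.Statement;

/** Illustrative sketch; names are hypothetical, the JDBC calls are standard. */
final class SavepointSketch {
    /**
     * Runs updateA (outer work) and updateB (inner work) on one connection.
     * If keepInner is false, only the inner work is rolled back before the commit.
     */
    static void updateBoth(Connection con, String updateA, String updateB, boolean keepInner)
            throws SQLException {
        con.setAutoCommit(false);
        try (Statement st = con.createStatement()) {
            st.executeUpdate(updateA);             // outer work (Transaction 1)
            Savepoint inner = con.setSavepoint();  // marks where the inner work starts
            st.executeUpdate(updateB);             // inner work (Transaction 2)
            if (!keepInner) {
                con.rollback(inner);               // undo only the inner work
            }
            con.commit();                          // commits whatever remains, as one unit
        } catch (SQLException e) {
            con.rollback();                        // on failure, undo everything
            throw e;
        }
    }
}
```

This covers "outer commits, inner rolls back", which is the opposite of what the use case asks for.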

Tuesday Oct 14, 2008

Condition: The maximum connection pool size is 10, and 10 threads are running, each invoking a different MasterControllerEJB API. Every one of these APIs fetches a connection from the connection pool, so all 10 connections are in use. If any one of these threads now needs a new sequence ID from the database, it needs another connection, which it cannot get because the application server pool is exhausted. That thread waits. But the code that gets the ID runs inside a synchronized block on a HashMap, so every other thread that needs an ID waits for the lock on the HashMap while still holding its own connection, and the lock is never released. So we are in a deadlock.

synchronized(hashmap) {

    get some value from the hashmap;

    if value meets some condition get a connection from the application server pool;

    use this connection to update some table in database and reload the hashmap;

}

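Solution: If you have n threads, always set your minimum connection pool size to n + 2 or higher (the maximum must then be at least as large), so that the thread holding the lock can always obtain the extra connection it needs.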
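
The arithmetic behind this rule can be checked with a toy model that treats the pool as a counting semaphore. This is only a sketch of the counting argument. It does not start real threads or talk to an application server, and the class and method names are made up for the example.

```java
import java.util.concurrent.Semaphore;

/** Toy model of the pool as a semaphore; names are hypothetical. */
final class PoolExhaustionDemo {
    /**
     * Each of the workers holds one connection, then one worker asks for an
     * extra connection for the sequence update. Returns true if that extra
     * connection is available without waiting.
     */
    static boolean extraConnectionAvailable(int poolSize, int workers) {
        Semaphore pool = new Semaphore(poolSize);
        if (!pool.tryAcquire(workers)) {
            return false; // the workers cannot even get one connection each
        }
        // This is the connection requested inside the synchronized block.
        return pool.tryAcquire();
    }

    public static void main(String[] args) {
        System.out.println(extraConnectionAvailable(10, 10)); // false
        System.out.println(extraConnectionAvailable(12, 10)); // true
    }
}
```

With a pool of 10 and 10 workers the extra request can never be satisfied, which is the situation described above. One spare connection is enough for the single lock holder, and n + 2 leaves a little margin.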

Exploring: java.util.concurrent.atomic.AtomicInteger

Thursday Jul 24, 2008

How to migrate from eView 5.1.3 to eView 6:

1. Create a Master Index project in NB6 via the MDM wizard.
2. Overwrite master.xml, object.xml, mefa.xml, query.xml, update.xml and validation.xml in the NB6 project with the corresponding files from the 5.x project.
    - edm.xml has changed considerably, since there is an entirely new EDM. We are working on the details for the upgrade.
    - There are some new options in master.xml regarding XA that should be taken into consideration when overwriting.
3. Ensure that the Master Index source classes referenced in these XML configuration files use the new package prefix: replace the 5.x class prefix "com.stc.eindex" with "com.sun.mdm.index".
4. If there are any plugin classes in 5.x, move them over to $Project-ejb/source-packages. Ensure the new Master Index class name prefix is used in these plugin classes.
5. Use the Sun Application Server JDBC connection pool configuration instead of the Oracle/SQLServer eWays. The connection pool configuration is the same as in 5.1.3.
6. Outbound Topic - In Sun Appserver, create a topic following the naming convention <ObjectName>Topic. At run time, outbound messages will be published to this topic.
7. 5.x OTD methods (invoked from BPEL/JCD) - Master Index project generation creates an <ObjectName>EJB web service. Use its methods from the old BPEL/JCD if they need to be invoked in a 5.x environment.
8. Matching - Replace the matchconfigfile.cfg with the 5.x matchconfigfile.cfg.
9. Standardization - If the standardization configuration was not modified, no changes are required. Otherwise, more details will be spelled out.
10. Generate and deploy to GlassFish.
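
The prefix replacement in step 3 is easy to script if there are many files. The sketch below is one possible way to do it, with assumptions stated up front: PrefixMigrator is a name invented for this example, the files are assumed to be UTF-8, and the class names after the prefix are assumed to be unchanged between versions, which is worth verifying against your own project.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for step 3; assumes UTF-8 configuration files. */
final class PrefixMigrator {
    static final String OLD_PREFIX = "com.stc.eindex";
    static final String NEW_PREFIX = "com.sun.mdm.index";

    /** Replaces every occurrence of the 5.x prefix with the new one. */
    static String rewrite(String xml) {
        return xml.replace(OLD_PREFIX, NEW_PREFIX);
    }

    /** Rewrites one file in place; returns true if the file was changed. */
    static boolean rewriteFile(Path file) {
        try {
            String before = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            String after = rewrite(before);
            if (after.equals(before)) {
                return false; // nothing to migrate in this file
            }
            Files.write(file, after.getBytes(StandardCharsets.UTF_8));
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Run it over the files copied in step 2 and over any plugin sources moved in step 4, and keep a backup of the originals in case a class was renamed rather than just moved.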

Additional Requirements:

1. Use the MI wizard to create a new Master Index project that mirrors the old eView project, e.g. "Person".
2. Copy the old edm.xml contents to midm.xml and change the 3 edm references (opening tag, schema file name, closing tag) to midm.

Before:

<edm xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="schema/EDM.xsd">
</edm>

After:

<midm xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="schema/midm.xsd">
</midm>

3. Bring up the MI Configuration Editor and make a change to force a "Save" action. The conversion occurs during the save.

4. Check what was converted:

<node-xxxx>        ->  <node> <name>xxxx</name>
<field-xxxx>       ->  <field> <name>xxxx</name>
(none)             ->  <dashboard> added
<eo-search>        ->  <record-details>
<create-eo>        ->  removed
<history>          ->  <transactions>
<matching-review>  ->  <duplicate-records> and <assumed-matches>
(none)             ->  <source-record> added
<reports>          ->  <reports> (with sub screens)
<audit-log>        ->  <audit-log> (with more attributes)

For more details see http://developers.sun.com/docs/javacaps/upgrading/jcapsupgrd.jcapsupgrd.html
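
The three edm-to-midm renames from step 2 of the additional requirements can also be done with a few string replacements. This is a sketch only: EdmToMidm is a name invented here, and it touches nothing but the root element and the schema file name. The element-level conversion in step 4 is still left to the save action of the MI Configuration Editor.

```java
/** Hypothetical helper for the three root-level renames; nothing else is converted. */
final class EdmToMidm {
    static String convertRoot(String edmXml) {
        return edmXml
                // 1. Opening tag; the lookahead keeps tags that merely start with "edm" intact.
                .replaceFirst("<edm(?=[\\s>])", "<midm")
                // 2. Schema file name in xsi:noNamespaceSchemaLocation.
                .replace("schema/EDM.xsd", "schema/midm.xsd")
                // 3. Closing tag.
                .replace("</edm>", "</midm>");
    }
}
```

After writing the result to midm.xml, open it in the editor and save as described in step 3.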

This blog copyright 2009 by parijatkar