Mark A. Basler's Weblog

20070813 Monday August 13, 2007

The second article in a series on Mashup Styles has just been released...

In the article "Mashup Styles, Part 2: Client-Side Mashups" you'll learn how a client-side mashup works, see an example of a client-side mashup in Pet Store (as in the previous article, this article refers to the Java Pet Store 2.0 application simply as Pet Store), and explore some of the considerations related to the client-side mashup style. In addition, you'll learn how to make services and content on your site available to others for use in client-side mashups.

The BluePrints Java Pet Store 2.0 Reference Application is used as an example, and its source code can be downloaded if you want to take a closer look.

If you want all the BluePrints source in a convenient development environment with ant scripts to facilitate the deployment and modification of the code, you can download the Java EE 5 SDK now and be up and running very quickly.

Hope this helps...

Thanks - Mark



Posted by basler Aug 13 2007, 10:20:45 AM PDT Permalink

20070516 Wednesday May 16, 2007

The first article in a series on Mashup Styles has just been released...

The article "Mashup Styles, Part 1: Server-Side Mashups" focuses on the server-side proxy style of mashup, where the client, usually a browser, communicates with a server-side proxy component that in turn communicates with the mashup service. The response from the mashup service is returned to the server-side proxy, where it can be validated and/or cached for better performance. Once the server-side proxy component has performed its function, the response is sent back to the client. This type of mashup helps work around the same-origin security constraint of XMLHttpRequest and mitigates the risk of a client-side mashup that works by dynamically creating script tags in the web page using a scripting language.
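
To make the proxy flow a little more concrete, here is a minimal sketch in plain Java. It is not the code from Pet Store or from the article: the class name, the validation rule and the use of a Function to stand in for the HTTP call to the mashup service are all assumptions made for illustration.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical server-side proxy that validates and caches mashup service responses. */
final class CachingMashupProxy {
    // Stands in for the real HTTP call to the remote mashup service (assumption).
    private final Function<String, String> remoteService;
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    CachingMashupProxy(Function<String, String> remoteService) {
        this.remoteService = remoteService;
    }

    /** Returns a cached response if there is one; otherwise calls the service, validates and caches. */
    String handle(String query) {
        String cached = cache.get(query);
        if (cached != null) {
            return cached;
        }
        String response = remoteService.apply(query);
        if (!isValid(response)) {
            throw new IllegalStateException("Rejected response from mashup service");
        }
        cache.put(query, response);
        return response;
    }

    /** Illustrative rule only: reject empty responses and anything carrying a script tag. */
    static boolean isValid(String response) {
        return response != null
                && !response.isEmpty()
                && !response.toLowerCase(Locale.ROOT).contains("<script");
    }
}
```

In a real application the Function would wrap an HTTP client call made from a servlet, and the cache would need an expiry policy; both are left out to keep the sketch short.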

The BluePrints Java Pet Store 2.0 reference application is used as an example, and its source code can be downloaded if you want to take a closer look.

If you want all the BluePrints source in a convenient development environment with ant scripts to facilitate the deployment and modification of the code, you can download the Java EE 5 SDK now and be up and running very quickly.

Hope this helps...

Thanks - Mark


Posted by basler May 16 2007, 09:50:00 AM PDT Permalink

20070322 Thursday March 22, 2007

Introduction to Petstore 2.0 article just released...

The article "Introducing the Java Pet Store 2.0 Application" has just been released on java.sun.com. The article walks you through page by page and explains some of the functionality that has been implemented and provides an overview of the approach that we took.

We have also hosted the Petstore 2.0 and the BluePrints Solutions Catalog reference applications live on developers.sun.com. This allows you to play with them before you download the source to see how it was actually coded. These applications and more are hosted as part of Sun's Web Developer Resource Center.

Take a few minutes and check them out...

Hope this helps...

Thanks - Mark

Posted by basler Mar 22 2007, 02:14:42 PM PDT Permalink

20070131 Wednesday January 31, 2007

Validation for Petstore 2.0's uploaded data through Ajax...

Recently I have been working on refactoring the error handling of the just-released Petstore 2.0 application. I started by updating the validation mechanism for items that a Seller can post for sale. The form is submitted using an Ajax call in multi-part MIME format. The Ajax file upload uses the Dojo library to submit the form, and I packaged all the functionality into an easy-to-use JSF 1.2 component available through the Java EE 5 version of the Java BluePrints Solutions Catalog.

The validation of the form has multiple facets which are as follows:


Client Side Validation

The first part was completed using typical web techniques by coding Javascript to check the form's data before it is submitted. If any of the required fields are missing or not in the proper format, a message is shown to the user and the form submission is terminated. I chose to show all the errors on the page at one time rather than one error at a time. I think this is a far better technique, because the user can try to fix all the errors at once instead of fixing one, resubmitting the form, and then finding out whether there are any others. Part of this validation included checking that the uploaded file has the proper suffix (.jpg, .gif or .png) and that the description field doesn't contain a script or link tag, for security. If Javascript files were permitted to be uploaded, or display fields were allowed to contain script elements, there would be a potential security hole that hackers could use to hijack the page or misuse your resources. Below is an example of the client-side validation message that could be presented to a user:


Petstore 2.0 Seller Client-Side Validation

Captchas Validation
The Captchas functionality was coded by Yutaka Yoshida. The Captchas validation was implemented using a servlet filter. The CaptchaValidateFilter checks the submitted Captchas value, which is sent in a cookie (so the multi-part MIME stream doesn't have to be touched), and validates it against a corresponding string that is set in the session when the Seller upload page is rendered. If the values match, ignoring case, the upload proceeds. If not, a status object is set with the error so it can be read by the client and shown to the user, and the upload is discontinued.


Upload Size Validation
For maintenance and security purposes, I put a limit of approximately 100K on the upload size. I didn't want users to be able to upload monstrous images, which could cause a maintenance problem in terms of disk space and put unnecessary load on the server, amounting to a form of denial-of-service attack. Since the upload is in multi-part MIME format, allowances have to be made for accompanying data and multi-part MIME overhead. I limited the overall upload to 150K, which should provide ample room for a 100K image. If the upload size is less than 150K, the upload proceeds. If not, a status object is set with the error so it can be read by the client and shown to the user, and the upload is discontinued. I added this validation check to the servlet filter so processing would stop as early as possible if the upload was too large.
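
The two filter checks described above (the Captchas comparison and the size limit) can be sketched without the servlet API as follows. This is only an illustration, not the CaptchaValidateFilter source: the class name, method signature and message strings are hypothetical, and a real filter would pull these values from the request's cookie, the session and the content length header.

```java
import java.util.Optional;

/** Hypothetical pre-checks a servlet filter could run before the multi-part body is parsed. */
final class UploadPrecheck {
    // 150K overall: room for a 100K image plus form data and multi-part MIME overhead.
    static final long MAX_UPLOAD_BYTES = 150L * 1024;

    /** Returns an error message for the client, or an empty Optional if the upload may proceed. */
    static Optional<String> check(String captchaFromCookie, String captchaInSession, long contentLength) {
        // Case-insensitive comparison of the submitted value with the one stored in the session.
        if (captchaFromCookie == null || captchaInSession == null
                || !captchaFromCookie.equalsIgnoreCase(captchaInSession)) {
            return Optional.of("Captcha value does not match");
        }
        // A negative length means the size is unknown, so it is rejected as well.
        if (contentLength < 0 || contentLength >= MAX_UPLOAD_BYTES) {
            return Optional.of("Upload size is unknown or exceeds the 150K limit");
        }
        return Optional.empty();
    }
}
```

Running both checks before the request body is read is what lets processing stop early; the returned message plays the role of the status object that the client reads.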


Server Side Validation
On the server side, the validation that was performed on the client is reimplemented. This is necessary to catch cases where users disable Javascript in their browser or try to hack a submission by sending the request directly to the server using some alternate method. The risk of submissions being sent by a browser without Javascript turned on is mitigated by performing the actual submission of the form using Javascript, but this preliminary defense can easily be overcome with other tools such as plug-ins. With server-side validation you really must show all the errors for the upload at one time rather than one error at a time. If you don't, the user must keep submitting the data and endure the round-trip delay to see the next error. This functionality was implemented in each entity's own class. For example, the Address class validates its own data, but since the Item class is a composite class, it validates its own data and calls the validation methods of any classes that it contains, like the Address class.
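
As a rough illustration of that composite approach, here is a minimal sketch. The classes below are hypothetical stand-ins, not the Petstore entity classes, and the field names and messages are assumptions. The point is that every check appends to one shared list, so all errors can be returned at once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical stand-in for an address entity that validates its own data. */
final class SketchAddress {
    final String street;
    final String city;

    SketchAddress(String street, String city) {
        this.street = street;
        this.city = city;
    }

    /** Appends every problem found instead of stopping at the first one. */
    void validate(List<String> errors) {
        if (street == null || street.trim().isEmpty()) errors.add("Street is required");
        if (city == null || city.trim().isEmpty()) errors.add("City is required");
    }
}

/** Hypothetical composite entity: checks its own fields, then delegates to its address. */
final class SketchItem {
    final String description;
    final String imageFileName;
    final SketchAddress address;

    SketchItem(String description, String imageFileName, SketchAddress address) {
        this.description = description;
        this.imageFileName = imageFileName;
        this.address = address;
    }

    List<String> validate() {
        List<String> errors = new ArrayList<>();
        String desc = description == null ? "" : description.toLowerCase(Locale.ROOT);
        if (desc.trim().isEmpty()) errors.add("Description is required");
        if (desc.contains("<script") || desc.contains("<link")) {
            errors.add("Description must not contain script or link tags");
        }
        String file = imageFileName == null ? "" : imageFileName.toLowerCase(Locale.ROOT);
        if (!(file.endsWith(".jpg") || file.endsWith(".gif") || file.endsWith(".png"))) {
            errors.add("Image must be a .jpg, .gif or .png file");
        }
        if (address == null) {
            errors.add("Address is required");
        } else {
            address.validate(errors); // the composite calls the contained class's validation
        }
        return errors;
    }
}
```

A simple substring test for script and link tags is only a first line of defense; it is shown here because it mirrors the checks described for the client side.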

This is just a quick introduction to the validation techniques that were used in the Petstore 2.0 Seller file upload page. We will be writing a full-blown article on the uploading of data that will cover these topics and more in greater depth and will also include coding samples.

To get notified of releases for any of the Java BluePrints projects, you can check the RSS feed located on blueprints.dev.java.net.

Hope this helps ...

Thanks - Mark



Posted by basler Jan 31 2007, 12:02:02 PM PST Permalink Comments [1]

20070124 Wednesday January 24, 2007

Adding Mashups like Digg to your page

Recently I added a mashup to Digg in the soon to be released Java Blueprint Solutions Catalog viewer.  It was as easy as formulating a URL with the appropriate query string that is dynamically populated as the user clicks on the different articles/examples that the viewer combines.  

Blueprints Solutions Catalog Viewer


When the Digg image is clicked, the required information is sent to digg.com in the query string of the request. The Digg site then parses the request and continues prompting the user. If the article exists, Digg will display it and prompt the user for comments (below). If it doesn't exist, the user can submit the article for others to Digg. For example, this Article URL will show something like the page below from Digg:

Digg mashup result page
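
Formulating such a URL mostly comes down to encoding each value before appending it to the query string. The sketch below shows the idea in Java. Note that the parameter names "url" and "title", and the base URL used in the usage note, are assumptions for illustration rather than Digg's documented interface, and the helper class is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper that builds the link for a delegated mashup. */
final class DelegatedMashupLink {
    /** Parameter names "url" and "title" are assumptions, not a documented Digg API. */
    static String build(String baseUrl, String articleUrl, String title) {
        try {
            // Encode each value so characters like '&' and '?' cannot break the query string.
            return baseUrl
                    + "?url=" + URLEncoder.encode(articleUrl, "UTF-8")
                    + "&title=" + URLEncoder.encode(title, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

For example, build("http://digg.com/submit", "http://example.com/a?b=1&c=2", "Pet Store 2.0") yields http://digg.com/submit?url=http%3A%2F%2Fexample.com%2Fa%3Fb%3D1%26c%3D2&title=Pet+Store+2.0, which can then be used as the target of the Digg image link.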



When creating mashups, there are different techniques to consider, such as client-side mashups, server-side mashups and delegated mashups. Each is suited to specific applications and is briefly described below:

Client-Side Mashup
A client-side mashup is where the integration/interaction with the mashup occurs in the client, like a browser. Google Maps is a great example of a client-side mashup. To use Google Maps you include Google's Javascript file in your web page and use the documented API to create a wide variety of interactive maps. Google also provides numerous examples to get your page prototyped quickly. The Blueprints Solutions Catalog wrapped the Google Maps functionality in an easy-to-use JSF 1.2 component that you can use by simply packaging the component jar with your application. This component jar is also used in the Petstore 2.0 reference application, so you can take a look at the component in action.

Server-Side Mashup
A server-side mashup has a server component that acts as a proxy to an exposed service. The server-side component(s) could add custom functionality to an exposed service, consolidate multiple exposed services or just pass the result through directly to the client for use. This type of mashup is used in the Petstore 2.0 reference application to retrieve the geocoding of an address for presentation on the Google Map and to retrieve an RSS feed for the news bar that is in the header of each page. The Blueprints Solutions Catalog wrapped the RSS bar into an easy-to-use JSF 1.2 component that you can use by simply packaging the component jar with your application.

Delegated Mashup
A delegated mashup is when information is forwarded to a site that then presents the result for the request. Besides Digg, many other sites use this form of interaction. PayPal is another example of a delegated mashup, where a request is submitted to the PayPal site, which then continues prompting the user. Some sites like PayPal even provide a parameter for a call-back URL in the query string of the request. Once PayPal finishes the transaction, a request is sent to the call-back URL with optional custom fields populated. The Blueprints Solutions Catalog wrapped the PayPal functionality in an easy-to-use JSF 1.2 component that you can use by simply packaging the component jar with your application. This component jar is also used in the Petstore 2.0 reference application, so you can take a look at the component in action.

This is just a quick introduction to the mashup techniques that were used in the Blueprints Solutions Catalog and the Petstore 2.0 reference applications. We will be writing a full-blown article on this topic that will cover these subjects and more in greater depth and will also include coding samples.

To get notified of releases for any of the Java BluePrints projects, you can check the RSS feed located on blueprints.dev.java.net.

Hope this helps ...

Thanks - Mark



Posted by basler Jan 24 2007, 02:35:44 PM PST Permalink

20061127 Monday November 27, 2006

The last articles in the series named Hands-On Java EE 5 with AJAX have just been released on java.sun.com

The last 2 articles in a 5 article series have just been released on java.sun.com. They target developers who are looking at retrofitting a legacy application with AJAX. The later articles introduce JSF components and customizations. The series basically shows how to limit the liability of introducing the latest technologies into an already existing application.


The five article series is named "Hands-On Java EE 5" and contains the following articles:

AJAX Design Strategies
The article "AJAX Design Strategies" talks about different design strategies on implementing AJAX functionality. This article is accompanied by the first articles in a five article series on "Hands-On Java EE 5"
The article is written by Ed Ort and Mark Basler


Creating an AJAX-Enabled Application, a Do-It-Yourself Approach
The article "Creating an AJAX-Enabled Application, a Do-It-Yourself Approach", the second in a five article series, talks about how to add AJAX functionality to a legacy application.
The article is written by Rick Palkovic and Mark Basler


Creating an AJAX-Enabled Application, a Toolkit Approach
The article "Creating an AJAX-Enabled Application, a Toolkit Approach", the third in the series, talks about how to add AJAX functionality to a legacy application using Dojo.
The article is written by Rick Palkovic and Mark Basler


Creating an Ajax-Enabled Application, a Component Approach
The article "Creating an Ajax-Enabled Application, a Component Approach", the fourth in the series, talks about an implementation where you create a custom JSF component to generate the JavaScript that executes the Ajax interactions on the client side. The JSF component is packaged with the application bundle and accessed directly. A Java Servlet on the server side responds to the Ajax request.
The article is written by Rick Palkovic and Mark Basler


Creating an Ajax-Enabled Application, a Phase Listener Approach
The article "Creating an Ajax-Enabled Application, a Phase Listener Approach", the fifth in the series, talks about an implementation where you expand the JavaServer Faces approach to include a phase listener component on the server that fulfills the client component's resource requirements.
The article is written by Rick Palkovic and Mark Basler


Hope this helps...

Thanks - Mark


Posted by basler Nov 27 2006, 10:58:33 AM PST Permalink

20061011 Wednesday October 11, 2006

The first articles in the series named Hands-On Java EE 5 with AJAX have just been released on java.sun.com

The first articles in a 6 article series have just been released on java.sun.com. They target developers who are looking at retrofitting a legacy application with AJAX. The later articles introduce JSF components and customizations. The series basically shows how to limit the liability of introducing the latest technologies into an already existing application.


The six article series is named "Hands-On Java EE 5" and currently contains the following articles:

AJAX Design Strategies
The article "AJAX Design Strategies" talks about different design strategies on implementing AJAX functionality. This article is accompanied by the first articles in a five article series on "Hands-On Java EE 5"
The article is written by Ed Ort and Mark Basler


Creating an AJAX-Enabled Application, a Do-It-Yourself Approach
The article "Creating an AJAX-Enabled Application, a Do-It-Yourself Approach", the first in a six article series, talks about how to add AJAX functionality to a legacy application.
The article is written by Rick Palkovic and Mark Basler


Creating an AJAX-Enabled Application, a Toolkit Approach
The article "Creating an AJAX-Enabled Application, a Toolkit Approach", the second in a six article series, talks about how to add AJAX functionality to a legacy application using Dojo.
The article is written by Rick Palkovic and Mark Basler


Future articles will be released in the upcoming weeks and will be accessible from the main "Hands-On Java EE 5" page.

Hope this helps...

Thanks - Mark


Posted by basler Oct 11 2006, 11:19:03 AM PDT Permalink Comments [4]

20060905 Tuesday September 05, 2006

Migrating your CVS server to Solaris 10

I have recently upgraded my server to Solaris 10.  I went to re-install my CVS pserver and ran into a quick snag.  Solaris 10 no longer manages services by editing the inetd.conf  file.

I tried to Google the problem, but got a lot of irrelevant hits. I eventually found that I had to use inetconv to convert the inetd.conf file to the new SMF format.

An edited excerpt from the inetd Solaris 10 man page is below:

In Solaris 10, services are no longer managed by editing the inetd configuration file, inetd.conf.  Instead, you use inetconv to convert the configuration file content into SMF format services, then manage these services using inetadm and svcadm. Once a service has been converted by inetconv, any changes to the legacy data in the inetd config file will not become effective. However, inetd does alert the administrator when it notices change in the configuration file.

More information can be found in BigAdmin's tech tip "Enabling TCP Wrappers in the Solaris 10 Operating System".

I did update the CVS twiki manual, but wanted to make sure others didn't have any problems, so I reposted the excerpt here.

Hope this helps - Mark

Posted by basler Sep 05 2006, 10:28:50 AM PDT Permalink

20060830 Wednesday August 30, 2006

Two articles about JavaServer Faces (JSF) and AJAX were recently published on java.sun.com...

Recently, two useful articles on JSF and AJAX have been released on java.sun.com to appeal to a developer who is experimenting with AJAX enabled JSF components.   They are as follows:


The articles present advanced techniques in a way that should be understandable to a developer who has limited experience developing JSF components.

More advanced write-ups on JSF, AJAX and the Java Persistence API can be found in the Java BluePrints Solutions Catalog.  There are also useful AJAX enabled JSF component libraries available for use in your own applications.

Hope this helps - Mark


Posted by basler Aug 30 2006, 09:39:52 AM PDT Permalink

20060818 Friday August 18, 2006

The Java BluePrints Team has just corrected an unintentional omission in the Java BluePrints Solutions Catalog Java EE 5

The Java BluePrints Team has just corrected an unintentional omission in the Java BluePrints Solutions Catalog Java EE 5 version. In the previous versions of the Java EE 5 Solutions Catalog, only the Java EE 5 components had their source code included. The J2EE 1.4 components binary library was included, but the source was accidentally omitted.

We have just created a second distribution that includes the J2EE 1.4 component's source and a simple ant build system, so the components can be built and the resultant jar can be used in your application or the jar can replace the J2EE 1.4 library that the Java EE 5 Solutions Catalog uses.

The second distribution entitled "bpcatalog-14-ea-0.6-installer.jar" includes the following components' source:


Future versions of the Solutions Catalog will include these bits.

Sorry for any inconvenience - Thanks - Mark

Posted by basler Aug 18 2006, 02:09:58 PM PDT Permalink

20060804 Friday August 04, 2006

The Java BluePrints Petstore 2.0 Reference Application that uses Web 2.0 technologies like AJAX, RSS and client-side/server-side mashups is now available for download...

The  Java BluePrints Petstore 2.0 Application is a reference application designed to illustrate how the Java EE 5 platform can be used with Web 2.0 technologies.  It shows real-world examples of AJAX, RSS and client-side/server-side mashups that can be used with Java EE 5 technologies like JavaServer Faces (JSF) and Java Persistence APIs.  The application comes with full source-code that utilizes the Google Maps service for location specific searches of pets, the PayPal service for purchases, an RSS feed as a data source, and much more.

Many of the AJAX enabled JSF components, like the Popup Balloon, FileUpload with Progress Bar, Auto-Complete, Rating, PayPal mashup, Google Map mashup, Yahoo GEO Code Mashup, and RSS Bar came from the Java BluePrints Solutions Catalog that can also be downloaded and used in your web applications.  The components and write-ups that the Solutions Catalog contains are described in more detail in a previous entry from my blog.

All that needs to be done to get started is to download and install Sun's Open Source Application Server, Glassfish and then download, configure, and deploy the Petstore 2.0 reference application.  This is a great opportunity to jump start you and your web application with Web 2.0 functionality with a minimal learning curve.

Below are some screen shots of the main pages in the application with brief descriptions of the pages' functionality.

Hope this helps - Thanks - Mark



Main Catalog Page

Catalog Screen

This screen shot is of the main catalog page that contains a large amount of functionality and is the center of the Petstore 2.0 reference application.  Some of the main functionality it includes is as follows:



PayPal Page

PayPal Mashup

This is a screen shot of the PayPal Developer sandbox page that is populated with the catalog item data from the Petstore's database.  This page is presented when the "PayPal" button is clicked on from the Petstore's Catalog Page.



Apache's Lucene Search Engine Page

Search Page

This page is the front-end search page that uses Apache's Lucene Search Engine on the server. This page searches the indexes and displays a list of results that match all the entered words in the search string. The AJAX Popup Balloon component that shows more detailed information is triggered when the user mouses over the name for more than a second. This page is also a lead-in to the Google Maps client-side mashup page. The user can select the items that they want shown on the map and, by entering an optional center-point address with a search area (in miles), can further restrict the map to show only the items in the specific area around the center point.

More information about our experiences using Apache's Lucene Search Engine can be found in a previous entry from my blog.



Google Map Search Page

Map Search Page

This page is the front-end map page that utilizes the database through the Java Persistence APIs to retrieve catalog items by category.  This page leads into the Google Maps client-side mashup page and displays the catalog items on the map for the selected category.  By entering an optional center-point address with a search area (in miles), the user can further restrict the category to show only the items in the specific area around the center point.



Google Map Mashup Page

Map Page

This screen shot is of the Google Maps client-side mashup page and shows the selected items, either from the Lucene Search page or the Map Search page.  Some of the main functionality it includes is as follows:


Seller Catalog Item Upload Page

Seller Page

This screen shot is of the second pane of the Seller Catalog Item Upload Page and shows the item's image being uploaded to the server while displaying the status of the upload to the user.  Some of the main functionality it includes is as follows:




Posted by basler Aug 04 2006, 01:51:56 PM PDT Permalink

20060717 Monday July 17, 2006

A new version of the Java Blueprints Solutions Catalog for JavaEE 5 has been released...


A new version of the Java Blueprints Solutions Catalog for JavaEE 5 has been released. This release features new and updated writeups on the approaches concerning JavaServer Faces, AJAX and the JavaEE 5 Persistence APIs. This release also contains JSF component libraries that you can use in your application.

The catalog contains the writeups below that present some solutions when using AJAX and Java EE technologies in web applications for the Java EE 5 platform and are as follows:


The catalog also contains writeups on the use of the  Java Persistence APIs which allows the management of persistence and object/relational mapping. The writeups below present some solutions when using Java Persistence APIs in applications for the Java EE 5 platform and are as follows:

The Java Blueprints Solutions Catalog for JavaEE 5 also contains component libraries you can use in your applications.  The JSF components that the libraries contain are as follows:

    Library 1: Can be used in both the J2EE 1.4 or the Java EE 5 platform
    Library 2: Can be used only on a Java EE 5 platform such as GlassFish.
The complete list of the components that are contained in the Java Blueprints Solution Catalog for JavaEE 5 is now kept on the blueprints.dev.java.net/ajax site.

You can download Glassfish and the source for the Java Blueprints Solutions Catalog for JavaEE 5 and run it for yourself.

Please keep in mind that we are still researching evolving technologies and will update these entries when our recommendations change.

Hope this helps - Thanks - Mark



Posted by basler Jul 17 2006, 10:08:40 AM PDT Permalink

20060424 Monday April 24, 2006

JSF 1.2 Checkbox in a dataTable populating a list of IDs ...


Recently, I have been working on an application that required a list of database primary keys (IDs) to be returned from a search, so the items that were selected could be shown in a Google Map. Returning a list of IDs to operate against is a common requirement and there are a number of ways this could be handled. The one thing that complicated the use of normal paradigms is that persistence entity beans were being used and I didn't want to introduce presentation implementation details in the bean. Since we are using EJB 3.0 and JSF 1.2, I used the Glassfish Open Source Application Server to run my tests.

I have read a lot of interesting forum postings where developers are trying to find a lightweight approach to collecting IDs to operate against when they are developing with JSF dataTables. These developers are familiar with the standard HTML checkbox String array approach and are looking for a similar approach using JSF. Some of the write-ups give the example use case where items are selected from a cart for deletion. Hopefully this investigation will help them and others.

There are a few different methodologies that can be used to resolve this problem. The following are the approaches that we investigated:

1) Wrap the entity bean with a class that also exposes the values ("SelectItem") for the checkbox.  Having all the values that are to be captured, represented in a bean is a standard approach when working with dataTables.  I didn't want to use this approach because it requires the introduction of a new wrapper object for each page that used entity beans to back the dataTable.

2) Bind the dataTable to a ManagedBean ("HTMLDataTable") so the children can be looped through to manually reconcile the items selected. This could be done, but all that was needed was a list of IDs. I thought this approach was overkill and I wanted to use something more lightweight.

3) Send the selected values to the managed bean using an HTML checkbox through a managed property. This is more in line with what I was looking for: a lightweight approach that I can use in other situations without introducing a new object or binding to a JSF component.

The code segments below show how to send a list of checked IDs to a managed bean and the associated setting in the faces-config.xml file. Note that the SearchBean is the entity bean used to hold the data being displayed.

search.jsp
 <h:form id="resultsForm">
    <h:dataTable id="results" border="1" value="#{SearchBean.hits}" var="item"
rendered="#{SearchBean.showResults}"        
style="border-style:double; width:600px; border-color:darkgreen">
                
...
<input type="checkbox" name="mapSelectedItems"
value="<h:outputText value='#{item.UID}'/>"/>    
...
<h:commandButton action="#{MapBean.findAllByIDs}" id="mapSubmit" type="submit"
value="Map Checked Item(s)" rendered="#{SearchBean.showResults}"/>
...


faces-config.xml
<managed-bean>
    <managed-bean-name>MapBean</managed-bean-name>
    <managed-bean-class>
        com.sun.javaee.blueprints.mapviewer.MapBean
    </managed-bean-class>
    <managed-bean-scope>request</managed-bean-scope>
    <managed-property>
        <property-name>items</property-name>
        <value>#{paramValues.mapSelectedItems}</value>
    </managed-property>
</managed-bean>



MapBean.java
// receives the checked IDs that faces-config.xml injects from #{paramValues.mapSelectedItems}
public void setItems(String[] items) {
    this.items=items;
}


The only problem I found with this approach is that the Managed Bean has to be in the request scope.  If you wanted to put the bean in the session, a ServletException would be thrown stating "The scope of the referenced object: #{paramValues.mapSelectedItems} is shorter than the referring object".


4) Since I wanted the MapBean to be in the session scope so the last map could be re-rendered, I ended up using a derivation of approach 3 and retrieved the IDs using the Expression Language Context to resolve a Value Expression. This allowed the MapBean to be in the session scope while the IDs are retrieved from the request. One thing to note is that since it is quite possible that the request values may not be present when the bean is accessed from other objects, some checking is required to keep the IDs available.

The code segments below show how to retrieve a list of checked IDs from within a managed bean and the associated setting in the faces-config.xml file. Note that the SearchBean is the entity bean used to hold the data being displayed.

search.jsp
<h:form id="resultsForm">
    <h:dataTable id="results" border="1" value="#{SearchBean.hits}" var="item"
rendered="#{SearchBean.showResults}"
style="border-style:double; width:600px;border-color:darkgreen">

...
<input type="checkbox" name="mapSelectedItems"
value="<h:outputText value='#{item.UID}'/>"/>    
            
...
<h:commandButton action="#{MapBean.findAllByIDs}" id="mapSubmit" type="submit"
value="Map Checked Item(s)" rendered="#{SearchBean.showResults}"/>
...

faces-config.xml
<managed-bean>
    <managed-bean-name>MapBean</managed-bean-name>
    <managed-bean-class>
        com.sun.javaee.blueprints.mapviewer.MapBean
    </managed-bean-class>
    <managed-bean-scope>session</managed-bean-scope>
</managed-bean>


MapBean.java
public String findAllByIDs() {
    // get selected items from search
    FacesContext context=FacesContext.getCurrentInstance();
    ValueExpression vex=context.getApplication().getExpressionFactory().
        createValueExpression(context.getELContext(),"#{paramValues.mapSelectedItems}", String[].class);

    String[] itemx=(String[])vex.getValue(context.getELContext());
    // since looking up values from request, make sure the values exist before replacing old values
    if(itemx != null) {
        itemIds=itemx;
    }
    ...
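
If you want to try the "keep the old IDs when the request has none" logic outside a JSF container, it can be isolated in plain Java. This is only a sketch: the SelectionHolder class is hypothetical, and the Map parameter is an assumption standing in for what #{paramValues} resolves to.

```java
import java.util.Map;

/** Hypothetical session-scoped holder that keeps the last selection between requests. */
final class SelectionHolder {
    private String[] itemIds = new String[0];

    /** paramValues stands in for the request parameter map (assumption). */
    String[] update(Map<String, String[]> paramValues) {
        String[] fromRequest = paramValues.get("mapSelectedItems");
        // Only replace the stored IDs when this request actually carried a selection.
        if (fromRequest != null) {
            itemIds = fromRequest.clone();
        }
        return itemIds.clone();
    }
}
```

A request that carries checked items replaces the stored list, while a later request without the parameter (for example, re-rendering the last map) gets the previous list back.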

Many thanks to Ed Burns for taking the time to investigate these scenarios with me. His time and effort are deeply appreciated.


Posted by basler Apr 24 2006, 10:51:23 AM PDT Permalink

20060315 Wednesday March 15, 2006

Javascript, AJAX, DHTML (Web 2.0) debugging tools...

When developing anything that has to do with Javascript like AJAX or DHTML web pages (Web 2.0, if you prefer), it pays to have good debugging tools.

A large number of plugins are available for the Firefox browser in addition to the extremely useful embedded Javascript console. The one I use to debug numerous problems is Live HTTP Header. It shows the actual HTTP interaction between the browser and a server like Glassfish. I have found it an invaluable resource in debugging browser/server interactions when developing complex AJAX web applications.

Also, the DOM Inspector that comes with the Firefox browser (you have to perform a custom install) is very useful for debugging the resultant DOM when developing advanced DHTML web pages.

I must credit the AJAX in Action book appendix on tools for these tips. Many thanks to Greg Murray for making a copy available :-). See his Javascript Recommendations for many useful tips.

One other tool that the book recommends, but that I have yet to use in any detail, is the Venkman Javascript Debugger.

Hope this helps to cut down on your debugging time - Mark



Posted by basler Mar 15 2006, 09:06:13 AM PST Permalink Comments [1]

20060310 Friday March 10, 2006

Lucene Search Engine, Web Crawlers and Tagging...

Recently, I have been working on adding search functionality to a soon-to-be-released JavaEE5 application that deploys on Glassfish.  After looking over the open source search engine options that were available for us to bundle with the application, we decided on using the Lucene search engine available from Apache. Lucene is a robust search engine that supplies APIs that enable the developer to design an indexing scheme to match their needs.  Its developers also follow the same methodology I strongly believe in: keep it simple (KISS).  Lucene doesn't supply the functionality that actually walks through your data to create the indexes or the functionality to capture the criteria to search/display the indexes, because its role is strictly to be a search engine. The Lucene Tutorial with the accompanying demo applications walks through common creation/search scenarios, depicting a straightforward methodology to write your own interfaces between your data and Lucene.

Web Crawlers:
A common desire is to have a web crawler or robot walk the web site, indexing relevant content so it can be searched later. There are some third party alternatives that can be used, like Apache's Nutch, which is built on top of Lucene, or WebSPHINX, which can be modified to store data into Lucene.  These and other third party open source solutions will help save a lot of time.  If you decide to go the web crawling route, you could try to write your own like I did using the JavaSE5 javax.swing.text.html.parser classes, but it is laborious and you will not be spending time focusing on the real problem of getting your data correctly indexed.  I did find the effort to be very educational, but it is not to be attempted by the faint of heart.
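To give a feel for the parser route, here is a minimal sketch of the link-harvesting step of a crawler built on the javax.swing.text.html.parser classes. The class name LinkExtractor, its method name and the sample page are made up for illustration; this is not the crawler I wrote, and a real one would also need URL resolution, a visited set, politeness delays and error handling.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

/** Hypothetical helper: collects the href values of anchor tags found in an HTML page. */
class LinkExtractor extends HTMLEditorKit.ParserCallback {
    private final List<String> links = new ArrayList<>();

    @Override
    public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        // The parser calls this for every opening tag; only anchors are of interest here.
        if (tag == HTML.Tag.A) {
            Object href = attrs.getAttribute(HTML.Attribute.HREF);
            if (href != null) {
                links.add(href.toString());
            }
        }
    }

    /** Parses the given HTML and returns the links in document order. */
    static List<String> extractLinks(Reader html) {
        LinkExtractor callback = new LinkExtractor();
        try {
            // 'true' tells the parser to ignore charset declarations inside the page.
            new ParserDelegator().parse(html, callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return callback.links;
    }

    public static void main(String[] args) {
        // Made-up sample page; a crawler would read this from an HTTP response instead.
        String page = "<html><body><a href=\"/catalog\">Catalog</a>"
                + "<a href=\"/cart\">Cart</a></body></html>";
        System.out.println(extractLinks(new StringReader(page)));
    }
}
```

Note that this only sees links that are present in the served HTML. Anything that is added to the page later by Javascript is invisible to it.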

Our Approach:
Considering our needs and the fact that our JavaEE5 application's web interface is largely based on Web 2.0, utilizing AJAX to present most page content, we decided to write our own interface that pulls data from the database to create the indexes.  We found that when serving page content using AJAX & DHTML, the web crawling paradigm becomes convoluted.  This is because it is hard for the crawler to know which content is associated with a specific item.  This is especially true when the content is retrieved through Javascript events that haven't been fired by the crawler, like a Javascript mouseover.  This problem can be mitigated by methodical use of the robots.txt file and meta tags in the HTML pages that are served, to give the web crawler the correct pages with the correct data.  Amazon uses this approach by including meta tags like "description" and "keywords" coupled with a restrictive robots.txt file to help companies like Google index their site correctly.
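To make the robots.txt part concrete, the following is a rough sketch of how a well-behaved crawler might interpret a restrictive file. It is deliberately simplified and only an assumption of how such a check could look: it honors only the Disallow lines in the "User-agent: *" group and ignores Allow rules, wildcards and agent-specific groups. The class name RobotsRules and the /cart/ and /checkout/ paths are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Simplified, hypothetical robots.txt reader: honors only Disallow rules for "User-agent: *". */
class RobotsRules {
    private final List<String> disallowedPrefixes = new ArrayList<>();

    RobotsRules(String robotsTxt) {
        boolean inWildcardGroup = false;
        for (String rawLine : robotsTxt.split("\\r?\\n")) {
            // Strip trailing comments and surrounding whitespace.
            int hash = rawLine.indexOf('#');
            String line = (hash >= 0 ? rawLine.substring(0, hash) : rawLine).trim();
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // blank or malformed line
            }
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (field.equals("user-agent")) {
                inWildcardGroup = value.equals("*");
            } else if (field.equals("disallow") && inWildcardGroup && !value.isEmpty()) {
                disallowedPrefixes.add(value);
            }
        }
    }

    /** Returns false when the path starts with any disallowed prefix. */
    boolean isAllowed(String path) {
        for (String prefix : disallowedPrefixes) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Made-up robots.txt that keeps crawlers away from the cart and checkout pages.
        String robotsTxt = "User-agent: *\nDisallow: /cart/\nDisallow: /checkout/\n";
        RobotsRules rules = new RobotsRules(robotsTxt);
        System.out.println(rules.isAllowed("/catalog/item?id=1"));
        System.out.println(rules.isAllowed("/cart/add"));
    }
}
```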

We are using meta tags to help external web crawlers index our site properly, but thought our own search results should be as accurate as possible for the items we are offering.  To store the data to be indexed, it was easy to use the new persistence APIs in JavaEE5 based on POJOs. With our development environment consisting of Netbeans 5.0 and the Glassfish AppServer, development went very smoothly.  All that was required to make the Lucene APIs available was to package the Lucene library with our application.  One note: be careful where you store your indexes.  If you store them under the deployed application directory, they will be removed when you redeploy/update your application.  We decided to store the indexes under the domain's lib directory (e.g. "/glassfish/domains/domain1/lib/indexDir"), which can be located in Glassfish using System.getProperty("com.sun.aas.instanceRoot") + "/lib/indexDir", but the location is totally up to you.
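A small sketch of resolving that directory is shown below. The com.sun.aas.instanceRoot property and the lib/indexDir location come from the paragraph above; the class name IndexLocation and the fallback to java.io.tmpdir when the property is missing (for example when running outside the app server) are my own assumptions for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper that locates (and creates) the index directory outside the deployed app. */
class IndexLocation {

    /** Resolves <instanceRoot>/lib/indexDir, creating the directory if it does not exist yet. */
    static Path resolveIndexDir() {
        String instanceRoot = System.getProperty("com.sun.aas.instanceRoot");
        if (instanceRoot == null) {
            // Assumption: outside Glassfish the property is not set, so fall back to the temp dir.
            instanceRoot = System.getProperty("java.io.tmpdir");
        }
        Path indexDir = Paths.get(instanceRoot, "lib", "indexDir");
        try {
            Files.createDirectories(indexDir);
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot create index directory " + indexDir, e);
        }
        return indexDir;
    }

    public static void main(String[] args) {
        System.out.println("Index directory: " + resolveIndexDir());
    }
}
```

Because the directory lives under the domain rather than under the deployed application, a redeploy leaves the index files alone.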

Tagging:
We also added the functionality to allow users to add their own custom tags to the items, so the items are also searchable by other users through those tags.  Tagging has become very popular and can be seen by browsing pioneering sites like del.icio.us and flickr.  We wanted the ability to weight the tags based on subsequent user clicks, so the tag information also had to be persisted in the database.

Updating Indexes in Lucene for Tagging:
One thing to keep in mind is that Lucene doesn't allow an indexed document to be updated in place; the specific entry has to be deleted and then re-created.  When adding a new tag to an item or updating a document's index, you have to be able to access all the data that was originally in the index before re-creating it.  This sounds straightforward, but there is one caveat.  If you index items using an approach that doesn't allow retrieval of all the data in the index, you will have to read the data from a persistent store so the index can be completely re-created.  You can get into this state when you create an org.apache.lucene.document.Field for the document's index utilizing the "UnStored" method or the "Text" method with a Reader.  When using these methods, the data can't be retrieved via the exposed APIs.  This really isn't a big deal once you factor it into your approach. Our tagging requirement came after the initial implementation was completed, and it caused some problems that forced us to re-think our indexing scheme.
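The delete-then-re-create flow can be illustrated with a toy model. To be clear, ToyIndex below is NOT Lucene's API, and ToyIndex, Item and TagUpdater are all hypothetical names; it is just a plain Java stand-in that mimics the two properties that matter here: entries can only be deleted and re-added, and unstored text is searchable but cannot be read back. That is why addTag goes to the persistent store (a Map playing the database) for the full record.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Toy stand-in for a search index (not Lucene): entries can only be deleted and re-added. */
class ToyIndex {
    private final Map<String, String> storedNames = new HashMap<>();     // retrievable
    private final Map<String, Set<String>> tokensById = new HashMap<>(); // searchable only

    void add(String id, String storedName, String unstoredText) {
        storedNames.put(id, storedName);
        Set<String> tokens = new HashSet<>();
        for (String token : (storedName + " " + unstoredText).toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        tokensById.put(id, tokens);
    }

    void delete(String id) {
        storedNames.remove(id);
        tokensById.remove(id);
    }

    /** Only the stored field can be read back; the unstored text cannot. */
    String storedName(String id) {
        return storedNames.get(id);
    }

    List<String> search(String term) {
        List<String> hits = new ArrayList<>();
        String needle = term.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, Set<String>> entry : tokensById.entrySet()) {
            if (entry.getValue().contains(needle)) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }
}

/** Hypothetical database row; the persistent store is the source of truth. */
class Item {
    final String id;
    final String name;
    final String description;
    final List<String> tags = new ArrayList<>();

    Item(String id, String name, String description) {
        this.id = id;
        this.name = name;
        this.description = description;
    }
}

class TagUpdater {
    static void indexItem(ToyIndex index, Item item) {
        // The description and tags are indexed as unstored text.
        index.add(item.id, item.name, item.description + " " + String.join(" ", item.tags));
    }

    static void addTag(ToyIndex index, Map<String, Item> database, String itemId, String tag) {
        Item item = database.get(itemId); // read the full record from the persistent store
        item.tags.add(tag);               // record the new tag in the store first
        index.delete(itemId);             // delete the old index entry...
        indexItem(index, item);           // ...and re-create it from the complete record
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        Map<String, Item> database = new HashMap<>();
        Item cat = new Item("1", "Tabby", "friendly house cat");
        database.put(cat.id, cat);
        indexItem(index, cat);
        addTag(index, database, "1", "fluffy");
        System.out.println(index.search("fluffy") + " " + index.search("friendly"));
    }
}
```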

Conclusion:
I found this task very educational in terms of the Web 2.0 impact on web crawlers and the general workings of the Lucene Search Engine.  I will be following up with a more detailed article including code samples once the application is released, but for now let me end this entry with some lessons learned.


- Don't write your own web crawler.  Many are available that can be altered to suit most purposes.  It may seem simple, but once you get into the task, you will realize it is not for the faint of heart.  Keep in mind that with Web 2.0, web crawlers are going to become even more complicated to design.

- Do use a robots.txt file to steer external search engines to the relevant content to be indexed.  You most likely don't want irrelevant data, like pages that perform cart functions, to be indexed.

- Do use meta tags on your relevant pages that are to be indexed (steered by robots.txt) so the search engine knows exactly what to index.  If you let the web crawler try to figure it out, there is a strong possibility that it will be wrong.  This is even more important if your site uses advanced Web 2.0 features to retrieve content.

- Completely work out your indexing scheme, including updates and tagging if applicable, before you finalize your design.  All the item's data to be indexed must be available to properly re-create the index in the event of an update.

- Don't store your indexes where they can be wiped out by an updated version of the application.  Also make sure that the domain's server.policy file grants the application read/write access on the directory you have chosen.



I hope this entry helps someone else save time in their development of an indexing approach - Good Luck - Mark


Posted by basler Mar 10 2006, 02:18:03 PM PST Permalink