
Sunday July 29, 2007

Checking HTML Links Across Helpsets

In "Link Checking and Iterating Through Folders", I described how one would need to iterate through folders to determine whether HTML links resolve correctly. However, that only covered links within a helpset. The NetBeans Platform also supports linking across helpsets: you may have one topic that references a topic defined in a different module within the same application. These links have their own syntax. For comparison, a standard HTML link goes like this:

<a href="../debug/breakpoint_about.html">About Debugging Java Applications</a>

Here, a link checker would need to go one folder up, then look for a folder called 'debug', and then look for a topic called 'breakpoint_about.html'. That's relatively simple, and the approach is described in the topic referenced above. However, a cross-helpset HTML link is more complex:

<a href="nbdocs://org.netbeans.modules.usersguide/org/netbeans/modules/usersguide/xml/xml_validate.html">Validating an XML Document</a>

Somehow, from that, one would need to extract the top-level folder of the module in which the xml_validate.html topic is found, as well as the folder "xml", if possible, and then the topic itself. That's quite a bit more work, especially since one would want the code to be generic, so that it works from any module to any other module. That means one cannot assume that the top-level folder, to which one must somehow traverse, is at a fixed location. Instead, one needs to go up a level, look for the name of the folder, match it against something from the link above, and, if the match fails, keep going up until it succeeds or there are no more levels left. Then, once the top-level folder (e.g., 'usersguide' or 'j2ee') has been found, one needs to look within that folder for the right subfolder containing the help topic.
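To make the steps above a bit more concrete, here is a rough sketch in Java. It is not the code of the link checker itself. The class and method names are made up for illustration, and it assumes that the last segment of the module name in the link (e.g., 'usersguide') is also the name of the module's top-level folder on disk, and that the link is well formed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical helper for illustration; not the link checker's actual code.
// Assumes every href passed in starts with "nbdocs://" and contains a module
// name followed by a slash and a topic path.
class CrossHelpsetResolver {

    static final String PREFIX = "nbdocs://";

    // The part between "nbdocs://" and the next slash,
    // e.g. "org.netbeans.modules.usersguide".
    static String moduleOf(String href) {
        String rest = href.substring(PREFIX.length());
        return rest.substring(0, rest.indexOf('/'));
    }

    // Everything after the module name,
    // e.g. "org/netbeans/modules/usersguide/xml/xml_validate.html".
    static String topicPathOf(String href) {
        String rest = href.substring(PREFIX.length());
        return rest.substring(rest.indexOf('/') + 1);
    }

    // Assumption: the last segment of the module name is the name of the
    // module's top-level folder, e.g. "usersguide".
    static String topFolderOf(String href) {
        String module = moduleOf(href);
        return module.substring(module.lastIndexOf('.') + 1);
    }

    // Goes up one level at a time from the linking topic. At each level it
    // looks for a folder with the expected name and, if one is there, searches
    // below it for a file whose path ends with the topic path from the link.
    static Optional<Path> resolve(Path linkingTopic, String href) {
        String topFolder = topFolderOf(href);
        String topicPath = topicPathOf(href);
        for (Path dir = linkingTopic.getParent(); dir != null; dir = dir.getParent()) {
            Path candidate = dir.resolve(topFolder);
            if (!Files.isDirectory(candidate)) {
                continue; // no match at this level, so go one level up
            }
            try (Stream<Path> files = Files.walk(candidate)) {
                Optional<Path> hit = files
                        .filter(Files::isRegularFile)
                        .filter(p -> p.endsWith(topicPath))
                        .findFirst();
                if (hit.isPresent()) {
                    return hit;
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return Optional.empty(); // ran out of levels: treat the link as broken
    }

    public static void main(String[] args) {
        String href = "nbdocs://org.netbeans.modules.usersguide"
                + "/org/netbeans/modules/usersguide/xml/xml_validate.html";
        System.out.println(moduleOf(href));
        System.out.println(topFolderOf(href));
        System.out.println(topicPathOf(href));
    }
}
```

An empty result from resolve would be the case in which the link gets flagged as broken. Searching the whole tree below the top-level folder is the simplest thing that could work here; a real implementation might want to be smarter about where it looks.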

I'm pretty sure that my solution doesn't work yet in all cases (need to try out more cases for at least a week before publishing the module), but it definitely works in some, as can be seen from this screenshot, which shows two different annotations, one for a broken cross-helpset link and one for a broken intra-helpset link, together with information in the form of hyperlinks in the Output window:

I made the annotations different (i.e., a slightly different color and a different icon) so that one can see at a glance whether the problematic link is cross-helpset or intra-helpset. Notice that the Output window shows that two cross-helpset links have been resolved correctly, while one cross-helpset link has been found to be problematic (pink annotation in the editor). In addition, there is one intra-helpset link that is problematic (red annotation in the editor).
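Choosing between the two annotations comes down to telling the two kinds of link apart. A minimal sketch of how that could be done in Java follows, assuming that anything starting with 'nbdocs:' is a cross-helpset link and that anything without a scheme is a relative, intra-helpset link. The names LinkKinds, Kind, classify, and resolveRelative are hypothetical and are not taken from the module.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not the link checker's actual code.
class LinkKinds {

    enum Kind { CROSS_HELPSET, INTRA_HELPSET, EXTERNAL }

    // Assumption: "nbdocs:" marks a cross-helpset link, any other scheme is
    // treated as external, and everything else is a relative link.
    static Kind classify(String href) {
        if (href.startsWith("nbdocs:")) {
            return Kind.CROSS_HELPSET;
        }
        if (href.contains("://") || href.startsWith("mailto:")) {
            return Kind.EXTERNAL;
        }
        return Kind.INTRA_HELPSET;
    }

    // Resolves a relative link against the folder of the topic that contains
    // it, e.g. "../debug/breakpoint_about.html" means: one folder up, then
    // into "debug". A trailing "#anchor" is dropped before resolving.
    static Path resolveRelative(Path topic, String href) {
        int hash = href.indexOf('#');
        String file = hash >= 0 ? href.substring(0, hash) : href;
        if (file.isEmpty()) {
            return topic; // a pure "#anchor" link points into the same topic
        }
        return topic.resolveSibling(file).normalize();
    }

    public static void main(String[] args) {
        Path topic = Paths.get("docs", "editor", "intro.html");
        String href = "../debug/breakpoint_about.html";
        System.out.println(classify(href));
        System.out.println(resolveRelative(topic, href));
    }
}
```

For an intra-helpset link, checking whether the resolved path exists (for example, with Files.exists) would then decide whether the red annotation is needed.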

Update. I updated the HTML Link Checker on the Plugin Portal, with the functionality described above, because thus far it works as expected. Go here to get it, currently at version 2.0.

Jul 29 2007, 03:09:35 AM PDT