Customizing the DITA Open Toolkit
This paper outlines a course given by Adena Frazier
of Suite Solutions--a course which is highly recommended for anyone who wants
to get the most out of the OT. This paper outlines the most
important processes, but it leaves out many of the details, tips, and debugging
notes that were included in the course. Note, too, that errors easily could
have crept in, and some details are bound to change for later versions of the
toolkit. (We used version 1.4.1.) So it makes a lot of sense to take
the course, even if you find the outline useful.
High-level Synopsis
- There are two main production paths: PDF(2) and xHTML
- There is also an "XPP" path for PDF generation that could be
worth examining for an organization like ours that already
licenses XPP.
- The xHTML path generates 3 help outputs, frames pages,
and individual xHTML pages.
- Each path has a different customization mechanism.
Day One covered some basics like DTDs, installing and running
the DITA Open Toolkit, ANT build files, and .ditaval files.
But it was mostly devoted to the XPath mechanisms for addressing
and selecting nodes (the crux of XSL transforms) and the XSL-FO
format (the crux of PDF generation).
Day Two focused on customizing PDF. Day Three listed the advantages
and disadvantages of the different varieties of xHTML, and
then focused on customizing xHTML.
Day Three also included a review of publishing alternatives,
including XPP, RoboHelp, WebWorks, and DITA_FMx, but didn't
go into detail on any of them.
PDF(2) Path
- These days, people use the "PDF2" plugin for the
DITA toolkit. The plugin was provided by Idiom.
- It replaces the old path ("PDF") which didn't work
all that well, and which has been deprecated.
- After installing, the plugin will be found in /demo rather
than /plugin. (The latter directory is relatively new, and
some things still work better in the old location.)
- The plugin produces XSL-FO, which is then rendered
by one of the PDF engines (Antenna House or RenderX).
- XSL-FO is an XML format that contains a combination
of content and formatting instructions. (When DITA
content goes through the plugin, formatting instructions
are added to create XSL-FO.)
- Antenna House makes a nice little tool that lets you
see the XSL-FO, make a change, and then see how it's
rendered. That makes it easier to define the XSL-FO
you want. Then you write the transform that produces it.
- With that system, you can have multiple customization
directories, and select the customization you want as
part of the production settings.
- To migrate to a new version of the OT, you copy the
customization directories, as well as the production
scripts. (You may also have to repeat the configuration
step that makes the customizations take effect.)
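To make the "content plus formatting instructions" idea concrete, here is a minimal, hand-written XSL-FO sketch. It is not output captured from the plugin; the page dimensions, font sizes, and text are placeholders chosen for illustration. It shows the kind of file you might edit in a preview tool before writing the transform that produces it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative XSL-FO only: values and text are assumptions, not plugin output -->
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <!-- Page geometry: one simple page master with a body region -->
  <fo:layout-master-set>
    <fo:simple-page-master master-name="page"
        page-height="11in" page-width="8.5in" margin="1in">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <!-- Content: each block carries its own formatting attributes -->
  <fo:page-sequence master-reference="page">
    <fo:flow flow-name="xsl-region-body">
      <fo:block font-size="18pt" font-weight="bold" space-after="12pt">
        Sample topic title
      </fo:block>
      <fo:block font-size="10pt">
        Sample paragraph text that came from the DITA source.
      </fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```

The text inside each fo:block is the content; the attributes on the block (font-size, font-weight, space-after) are the formatting instructions the transform adds.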
To customize PDF(2) output:
- Make a directory that models the existing one:
demo/
  fo/
    customization/
      yourCustomization/ <-- create in this location
- Copy the XSLT files you want to customize (preserving
the directory structure) and make your changes in the
copies.
You generally copy the whole file, because
if you change the point size for one heading, there are
probably a raft of dependent changes you'll want
to make. With the whole file to look at, it's easy
to find things and make changes.
The directory structure is really weird,
with attribute customization files far removed
from element customization files:
yourCustomization/fo/
  attr/custom.xsl -- attributes
  ...(many dirs)...
  xsl/custom.xsl -- elements
- Point to that directory using the argument
customization.dir in the (ANT or command line) production script
- Fix a bug in the "reflection" scripts that
affects titles and list elements. Those elements
generate the style to apply based on the depth
of the element. But in doing so, they hard-code
a path to the original templates, which prevents
customizations from taking effect.
To make that fix:
- Copy demo/fo/xsl/common-attr-set-reflection.xsl
- Look for occurrences of "../../cfg/"
- Modify the path so it goes to your customization directory from
the demo/fo/xsl directory.
- Adjust one other configuration file to
make the customizations take effect.
(Copy catalog.xml.orig to catalog.xml and
uncomment the line that corresponds to the XSL you've overridden.)
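As a rough sketch of what an attribute customization might contain, the stylesheet below overrides the font size of one attribute set. The attribute-set name "topic.title" and the 16pt value are assumptions made for illustration; check the file you copied for the names your version of the plugin actually defines.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch of yourCustomization/fo/attr/custom.xsl -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- "topic.title" is an assumed name; use the name found in the copied file.
       Attribute sets with the same name are merged, and the definition with
       higher precedence wins for any attribute defined in both. -->
  <xsl:attribute-set name="topic.title">
    <xsl:attribute name="font-size">16pt</xsl:attribute>
  </xsl:attribute-set>

</xsl:stylesheet>
```

When running from the command line, Ant's standard -D syntax sets a property, so passing something like -Dcustomization.dir=(path to yourCustomization) is one plausible way to supply the customization.dir argument mentioned above.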
XPP Path to PDF
- XPP is made by XyEnterprise. They say "it begins where
XSL-FO leaves off".
(It's entirely possible for that statement to be true.
Typography is nothing if not an idiosyncratic collection
of corner cases. It takes a lot of time to get those
cases coded up, and to get the bugs worked out. They've
been in business a long time. But Antenna House and
RenderX generate pretty darn good PDF, so it would be interesting to compare
the results, and the processes by which they're produced.)
- XPP is what we're using to generate documents from Solbook.
Our genHTML script uses XPP to generate PDF (somehow) from
Solbook files.
- XPP is a rendering engine. It does not process XSL-FO
files. Instead it processes its own proprietary format,
which predates XSL-FO.
- To generate that format, you get the XPP plugin for the
DITA toolkit, which produces it. (XyEnterprise would argue
that their intermediate format is better. Whether or not that's true, it
is certainly the case that it is different.)
- Since XPP requires its own input format, the Idiom plugin
is of no use for the XPP path. The Antenna House preview
program is therefore of no use, either. So this path
clearly loses flexibility, due to the loss of standardization
and the loss of a useful tool. What it gains with respect to output
quality has not yet been determined--but it's an option that could
certainly save money for any organization (like ours) that already uses
XPP--once we figure out how to use it.
- Customization mechanisms are not yet known. genHTML clearly
does a ton of customizations, but we don't know anything
about how they work.
XHTML Path
The explanation here represents my best understanding, to date.
It may well be revised when I've had a chance to play with it more.
- There are production-process arguments for the
things you almost always want to customize (page header,
page footer, file header, and CSS file).
- For other customizations, you write "plugins".
- Plugins aren't hard to write, and they're easy to migrate
to new versions of the OT. But if you want multiple
customizations, you need different copies of the OT,
each with its own plugin.
To customize xHTML output, use the instructions below. Hopefully, this picture
will help to clarify the relationships between the files:
Here's the process:
- Read this file to find out which XSL files are included, and to get the
ID of the extension:
xsl/dita2xhtml_template.xsl
The extension point is defined by this line:
<dita:extension id="dita.xsl.xhtml" ...
(Templates are copied by the integrator in the last
stage of the process. During the copy, your extensions
are substituted, using element IDs as a guide.)
- Almost all of the templates you would want to
extend are in:
xsl/xslhtml/dita2htmlImpl.xsl
Look for "STUBS FOR USER PROVIDED OVERRIDE EXTENSIONS"
Pick the thing you want under that. For example:
<xsl:template name="gen-user-header">
<!-- to customize: copy this to your override transform
and add the content you want. -->
<!-- it will be placed in the running heading section of the XHTML. -->
</xsl:template>
Note that the instructional comments seem to be slightly misleading here.
In reality, you add a "match" attribute and change the "name"
attribute to "mode". That was true for the example we had in class, at
least. Here's what we had:
<xsl:template match="/|node()|@*" mode="gen-user-header">
<div>...title and other stuff here...</div>
</xsl:template>
(The match expression says "the root, or any node (element,
text, and so on), or any attribute". It's just a given, mandated by the
fact that this file is being used to
insert the <div>...</div> content into a copy of dita2xhtml_template
when the integration transform runs.)
Overloaded Terminology Note:
_template.xsl and <xsl:template> both use the word "template",
but in two different ways.
Note:
Many transform templates contain matches like
this:
match="*[contains(@class,' topic/abstract ')...
That pattern looks for an item in the class attribute,
rather than matching an element name. A specialized
element will have multiple entries that name its parent
as well as itself:
class="- topic/topic myTopic/myTopic "
(The leading "- " and the trailing space ensure that every entry
is surrounded by spaces, which is what the pattern relies on.)
Transforms that operate on the generic topic class will
therefore work, unless a specific "myTopic" transform
has been defined.
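The class-matching idea in the note above can be sketched with two templates. The "myTopic" specialization, the div class names, and the priority value are all hypothetical; the sketch only illustrates how a generic rule keeps working for specialized elements until a more specific rule is supplied.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Generic rule: matches <topic> and anything specialized from it,
       because every such element lists " topic/topic " in its class value. -->
  <xsl:template match="*[contains(@class, ' topic/topic ')]">
    <div class="topic">
      <xsl:apply-templates/>
    </div>
  </xsl:template>

  <!-- Specific rule for a hypothetical "myTopic" specialization.
       Both patterns have the same default priority, so an explicit
       priority is given here to make this rule win when both match. -->
  <xsl:template match="*[contains(@class, ' myTopic/myTopic ')]" priority="1">
    <div class="myTopic">
      <xsl:apply-templates/>
    </div>
  </xsl:template>

</xsl:stylesheet>
```

Without the second template, a myTopic element would simply be handled by the generic topic rule.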
- Put the template extensions into a new file:
demo/
  yourCustomization/
    xsl/
      yourTransform.xsl <-- create
where the file looks like this:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
...template(s) here...
</xsl:stylesheet>
Note:
You typically only put the transforms you're
changing into the file, to keep things simple.
(CSS takes care of point sizes, so you don't need
to copy the whole file, the way you do in PDF.)
- Create the plugin.xml file that identifies your
extensions so they can be integrated into the
toolkit:
demo/
  yourCustomization/
    plugin.xml <-- create
    xsl/
      yourTransform.xsl
      ...
- Point to the extension file(s) from the plugin file:
<?xml version="1.0" encoding="UTF-8"?>
<plugin id="YourCompany.branding"> <--for example
  <feature extension="dita.xsl.xhtml" <--from step (a)
    value="xsl/yourTransform.xsl" <--from step (c)
    type="file"
  />
</plugin>
- Run the OT integrator build process from the top
directory:
ant -f integrator.xml
This step copies the xyz_template.xsl files, creating
the xyz.xsl files and substituting your extensions.
Note:
The fo/ plugin also has an integrator process.
But that's not the one you want.
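Putting the pieces of the process together, a complete yourTransform.xsl might look like the sketch below. It combines the stylesheet wrapper and the gen-user-header override shown earlier; the div class name and the header text are placeholders, not anything the toolkit requires.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch of demo/yourCustomization/xsl/yourTransform.xsl -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Override of the gen-user-header stub, using the match/mode form
       described above rather than a named template. -->
  <xsl:template match="/|node()|@*" mode="gen-user-header">
    <!-- Placeholder content (hypothetical): replace with your own header -->
    <div class="user-header">
      <p>Your Company Documentation</p>
    </div>
  </xsl:template>

</xsl:stylesheet>
```

Only the overridden template is included, in keeping with the advice to put just the transforms you are changing into the file.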