Words in Boxes

Nouns, verbs, and occasionally adjectives.

Wednesday, July 16, 2008

Documenting XSLT

Many XSLT developers (including me) are not programmers by formal CS training.  But because XSLT is a relatively small domain-specific language, this usually isn't a big deal.  In fact, it's almost an advantage, since XSLT's functional-programming weirdness is very different from the procedural languages that most programmers use.  And while we XSLT developers don't always conform to software development best practices, it usually doesn't matter, since most transforms are more doghouse than skyscraper.

Usually is the key word.  When you start to build anything larger than about 500 lines of code (or more than one physical file), the quick-and-dirty approach starts to become less quick and more dirty.  Those best practices start to look ... better.

That's why I'm really excited by a couple of projects.  Most "real" languages have a way to generate project documentation from comments embedded across multiple source documents.  XSLT does not.  That's the gap these projects attempt to fill.

The first is DOXSL, which is being developed by Jim Earley.  (Jim did some excellent consulting work for my employer a couple of years ago.)  He sums up the need really well:

... diagnosing XSLT is a real headache. Templates can be scattered through dozens of physical files linked together through a maze of imports and includes, and knowing exactly which template is firing for a particular context can be enigmatic.

This is really frustrating for developers who want to extend an XSLT stylesheet application or modify the output behavior for a few elements. Which templates, parameters, attribute-sets, or other core components do I override? What is the logical flow? Which file is this named template located in? To answer these kinds of questions, developers can waste hours digging into the code trying to understand the logic. Even with a good set of tools to search the countless number of stylesheets for a specific text string (e.g., a template named 'foo'), there is no guarantee that there will only be a single instance given XSLT's import precedence behavior.  Developers have to trace through the import and include stack to determine which template will be fired.

When you run DOXSL, it creates a report (HTML, DITA, or DocBook) of an application's stylesheets, templates, functions, and parameters.  The coolest bit is that the report tells you if any given template overrides or is overridden by another.  If you use DOXSL elements (instead of standard XML comments) to comment your code, those comments will also be in the report.

Even if you lack the discipline to comment your code with those elements, DOXSL is still useful.  I've already put it to use interpreting Arbortext Editor's Styler-to-XSL application.  I definitely recommend checking DOXSL out, and I look forward to future developments.
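To make the override problem concrete, here is a rough Python sketch of the kind of bookkeeping involved.  This is not how DOXSL works internally (I haven't checked); it's just an illustration.  It assumes every xsl:import and xsl:include href is a relative file path, and it only looks at named templates.  The function names collect_named_templates and duplicated_templates are made up for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from pathlib import Path

XSL = "{http://www.w3.org/1999/XSL/Transform}"


def collect_named_templates(entry, seen=None, found=None):
    """Follow xsl:import / xsl:include from `entry` and record which
    file(s) define each named template.

    Assumption: every href is a relative path on the local file system.
    """
    seen = set() if seen is None else seen
    found = defaultdict(list) if found is None else found

    path = Path(entry).resolve()
    if path in seen:  # don't read the same stylesheet twice
        return found
    seen.add(path)

    root = ET.parse(path).getroot()
    for child in root:  # top-level elements only
        if child.tag in (XSL + "import", XSL + "include"):
            collect_named_templates(path.parent / child.get("href"), seen, found)
        elif child.tag == XSL + "template" and child.get("name"):
            found[child.get("name")].append(path.name)
    return found


def duplicated_templates(entry):
    """Return only the template names that are defined in more than one place."""
    templates = collect_named_templates(entry)
    return {name: files for name, files in templates.items() if len(files) > 1}
```

Calling duplicated_templates("main.xsl") on a hypothetical main.xsl returns a dictionary that maps each template name defined more than once to the files that define it.  Working out which definition actually wins still means applying XSLT's import precedence rules, and match templates add priorities and modes on top of that, which is exactly the tedium a tool like DOXSL spares you.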

The second project is XSLTdoc.  I don't have enough experience with it to have a solid opinion, but it looks promising.  XSLTdoc is modeled closely on Javadoc.  Compared to DOXSL, the default output looks better, but that's nothing a bit of CSS work couldn't fix.  The big difference (and this is huge) is that it doesn't tell you when a template in one stylesheet is overridden by another in a different file.  Still, XSLTdoc is under active development, so it's worth checking back to see how it progresses.

I'm James Sulak, a software developer in Houston, Texas.

You can also find me on Twitter, or if you're curious, on my old-fashioned home page. If you want to contact me directly, you can e-mail comments@wordsinboxes.com.