Words in Boxes

Nouns, verbs, and occasionally adjectives.

Wednesday, February 11, 2009

Code Review: Commenting XSLT Regular Expressions

You learn a lot from reading other people's code.  For example, the other day I ran into a clever trick for commenting regular expressions in XSLT, in Jeni Tennison's XSpec code:

<xsl:variable name="attribute-regex" as="xs:string">
  <xsl:value-of>
    \s+
    (\S+)        <!-- 1: the name of the attribute -->
    \s*
    =
    \s*
    (       <!-- 2: the value of the attribute (with quotes) -->
      "([^"]*)"  <!-- 3: the value without quotes -->
      |
      '([^']*)'  <!-- 4: also the value without quotes -->
    )
  </xsl:value-of>
</xsl:variable>

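The trick is the <xsl:value-of /> instruction, which turns its contents into a single text node that the as="xs:string" declaration then converts to a string.  The XML comments are never part of that content, so they don't end up in the regex.  An especially nice thing about this method is that you can refer to other variables within the declaration: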

   (\S+)    <!-- 12: the name of the element being opened -->
   (        <!-- 13: the attributes of the element -->
     (      <!-- 14: wrapper for the attribute regex -->
       <xsl:value-of select="$attribute-regex" />  <!-- 15-18 attribute stuff -->
     )*
   )

Of course, to ignore all the extra white space in a regex constructed this way, you'll need to set the "x" flag in any <xsl:analyze-string />, replace(), matches(), or tokenize() call that uses it.
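To make the "x" flag concrete, here is a minimal sketch of a complete XSLT 2.0 stylesheet that uses the commented regex from above.  The $sample string, the "main" template name, and the <attributes /> and <attribute /> output elements are my own placeholders for illustration, not anything taken from XSpec.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="xml" indent="yes"/>

  <!-- The commented regex from the post, unchanged. -->
  <xsl:variable name="attribute-regex" as="xs:string">
    <xsl:value-of>
      \s+
      (\S+)        <!-- 1: the name of the attribute -->
      \s*
      =
      \s*
      (       <!-- 2: the value of the attribute (with quotes) -->
        "([^"]*)"  <!-- 3: the value without quotes -->
        |
        '([^']*)'  <!-- 4: also the value without quotes -->
      )
    </xsl:value-of>
  </xsl:variable>

  <!-- Hypothetical sample input: the part of a start tag after the element name. -->
  <xsl:variable name="sample" as="xs:string"
                select="' id=&quot;intro&quot; class=''note'''"/>

  <!-- Hypothetical entry point; run it as a named initial template. -->
  <xsl:template name="main">
    <!-- matches() takes the flags as its third argument. -->
    <attributes found="{matches($sample, $attribute-regex, 'x')}">
      <!-- The regex attribute is an attribute value template, hence the braces. -->
      <xsl:analyze-string select="$sample" regex="{$attribute-regex}" flags="x">
        <xsl:matching-substring>
          <!-- Only one of groups 3 and 4 takes part in a given match;
               the other is the empty string, so concatenating them is safe. -->
          <attribute name="{regex-group(1)}"
                     value="{regex-group(3)}{regex-group(4)}"/>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </attributes>
  </xsl:template>

</xsl:stylesheet>
```

If your processor supports named initial templates (with Saxon, for instance, the -it:main option), this should emit one <attribute /> element per name/value pair in $sample.  Leave out the "x" flag and the regex will most likely match nothing, because all the line breaks and indentation become part of the pattern.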

I'm James Sulak, a software developer in Houston, Texas.
