Words in Boxes

Nouns, verbs, and occasionally adjectives.

Thursday, January 08, 2009

What's Your Semantic Exit Strategy?

Quoting Cafe Con Leche quoting A List Apart:

We’ll start by posing the question: “why are we inventing these new elements?” A reasonable answer would be: “because HTML lacks semantic richness, and by adding these elements, we increase the semantic richness of HTML—that can’t be bad, can it?”

By adding these elements, we are addressing the need for greater semantic capability in HTML, but only within a narrow scope. No matter how many elements we bolt on, we will always think of more semantic goodness to add to HTML. And so, having added as many new elements as we like, we still won’t have solved the problem. We don’t need to add specific terms to the vocabulary of HTML, we need to add a mechanism that allows semantic richness to be added to a document as required. In technical terms, we need to make HTML extensible. HTML 5 proposes no mechanism for extensibility.

HTML 5, therefore, implements a feature that breaks a sizable percentage of current browsers, and doesn’t really allow us to add richer semantics to the language at all.

This is an important warning for anyone creating or maintaining a custom XML document schema. Creating a tight, semantically correct element for every type of information in your existing documents is easy (and dangerously seductive); creating a mechanism by which the schema's users themselves can gracefully expand the semantic tagging as documents grow is much harder, and ultimately much more important. Be as general as you can get away with, and as specific as you dare.
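
To make that concrete, here is a minimal sketch of one possible exit strategy, not a recommendation for any particular schema. It assumes a made-up vocabulary in which `term` is a specific element baked into the schema and `phrase` with a `role` attribute is the generic extension point. All of the element names, role values, and function names are hypothetical. The idea is that a processor handles the roles it knows about and falls back to plain text for the ones it doesn't, so authors can coin new roles without a schema change.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <term> is a specific element defined by the schema;
# <phrase role="..."> is a generic hook that authors can specialize themselves.
DOC = """
<para>See <term>XSLT</term>, published by
<phrase role="organization">W3C</phrase> and discussed in
<phrase role="court-case">Doe v. Roe</phrase>.</para>
"""

# Handlers for the roles this processor already understands (assumed names).
ROLE_HANDLERS = {
    "organization": lambda text: f"[org: {text}]",
}

def render_phrase(elem):
    """Render a <phrase>, degrading to plain text when the role is unknown."""
    handler = ROLE_HANDLERS.get(elem.get("role"))
    text = elem.text or ""
    return handler(text) if handler else text

def render(para):
    """Flatten a <para> to a single line of text."""
    parts = [para.text or ""]
    for child in para:
        if child.tag == "term":
            parts.append(f"*{child.text}*")
        elif child.tag == "phrase":
            parts.append(render_phrase(child))
        else:
            # Unknown elements also degrade to their text content.
            parts.append(child.text or "")
        parts.append(child.tail or "")
    # Collapse line breaks and runs of spaces left over from the source layout.
    return " ".join("".join(parts).split())

root = ET.fromstring(DOC)
rendered = render(root)
print(rendered)
```

The `court-case` role was never anticipated by the processor, yet the document still renders sensibly. Registering a handler later (for example, `ROLE_HANDLERS["court-case"] = ...`) adds richer behavior without touching the schema or the existing documents.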

I'm James Sulak, a software developer in Houston, Texas.

You can also find me on Twitter, or if you're curious, on my old-fashioned home page. If you want to contact me directly, you can e-mail comments@wordsinboxes.com.