Thursday, November 17, 2005

mustUnderstand/mustIgnore and Processing Models

David Megginson touches on a really, really important point about the mustUnderstand/mustIgnore model which, in my opinion, is equally applicable to *all* markup and indeed all "signs" created by humans for humans. It cuts to the heart of the extent to which we necessarily imbue the signs we create with a point of view...

Kirk: Scotty, I need an unambiguous, context-free representation of objective reality and I need it now!

Scotty: The markup cannot take it, sir. Every tag I create generates its own point of view of the universe. Cannot de-contextualise...argh!

It's like this. I look at some aspects of the world and I carve them up into pieces. I give those pieces names. Those names become tags/attributes in my XML documents. Now, is my model "correct"? It depends. It depends on two vital things.

1 - how I viewed the world at the time.

2 - the processing model I had in mind for the data.

Both of these are deeply personal and only "true" from the POV of a particular person/institution/business-process.

As Walter Perry points out regularly on xml-dev, the real value of XML is that it reduces the extent to which I force any one processing model onto others. This enables re-use and innovation in a way that, say, application sharing does not.

The price we pay for this freedom is that designers of XML languages need to find ways to communicate "processing expectations" or "processing models" separately from the data.

It is still the case today that the true meaning of a chunk of markup is dependent on what some application actually *does* with it. It is not in the data itself. For example, I can create RTF, XML, or CSV files that are completely valid per the markup requirements but "invalid" because they fail to meet the processing models of particular RTF/XML/CSV-aware applications.
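Here is a minimal sketch of that point in Python. Everything in it is made up for illustration: the data, the "invoicing" application and its rule that amounts must be numeric. The file is fine as far as CSV is concerned, yet one row is "invalid" the moment a particular processing model is applied to it.

```python
import csv
import io

# Hypothetical data: perfectly acceptable CSV as far as the syntax goes.
RAW = "name,amount\nwidget,12.50\ngadget,twelve\n"


def parse_rows(text):
    """Markup-level view: the csv module reads every row without complaint."""
    return list(csv.DictReader(io.StringIO(text)))


def rows_failing_invoice_model(rows):
    """Assumed processing model: an invoicing app that needs numeric amounts."""
    failures = []
    for row in rows:
        try:
            float(row["amount"])
        except ValueError:
            failures.append(row["name"])
    return failures


rows = parse_rows(RAW)
print(len(rows))                         # both rows parse
print(rows_failing_invoice_model(rows))  # names of rows this one app rejects
```

A different application, say one that only indexes the names, would accept the very same file without complaint.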

This is one reason why HTML as a Web concept (forget about markup for the moment) and XML as a Web concept are so different. With HTML the processing model was a happy family of 1, namely, "let's get this here content rendered nicely onto the screen". With XML the processing model is an extended family of size...infinity. Who knows what the next guy is going to do with the markup? Who knows what the next processing model will be? Who knows whether or not my segmentation of reality into tags/attributes/content will meet the requirements of the next guy?

To date, we have been feeling our way pragmatically on the latter point, the fit between my segmentation and the next guy's needs (we call it "conversion"), and largely ignoring the former, the processing model. The former is becoming much more important. mustIgnore and mustUnderstand will, I suspect, be the topic that begins a long conversation about processing models on the Web.
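To make the two rules concrete, here is a rough sketch in Python. The vocabulary, the element names and the mustUnderstand="true" attribute convention are all assumptions invented for this example, not any particular spec's syntax. The processor skips elements it does not know (mustIgnore) unless the element insists on being understood, in which case it refuses to go on.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary that this particular processor understands.
KNOWN = {"title", "body"}

# Hypothetical document: <priority> and <signature> are extensions.
DOC = """<message>
  <title>Hello</title>
  <priority>high</priority>
  <signature mustUnderstand="true">abc123</signature>
  <body>Some text</body>
</message>"""


class NotUnderstood(Exception):
    """Raised when an unknown element is flagged as mustUnderstand."""


def process(xml_text, known=KNOWN):
    """Collect the text of known child elements, applying both rules."""
    result = {}
    for child in ET.fromstring(xml_text):
        if child.tag in known:
            result[child.tag] = child.text
        elif child.get("mustUnderstand") == "true":
            # mustUnderstand: refuse to continue rather than guess.
            raise NotUnderstood(child.tag)
        # Otherwise mustIgnore: skip the unknown element silently.
    return result
```

With the default vocabulary, process(DOC) raises NotUnderstood because of the signature element. A processor that also knows "signature" gets through, and quietly drops priority along the way. Same data, two processing models, two outcomes.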

I would put microformats into this category too. What does a microformat actually do? It takes an existing tag set and tunnels stuff into it. But what it is really doing is taking an existing processing model (say, XHTML CRUD cycles) with its established toolset and parasiting[1] on top of it in a clever, very human, way. A way that side-steps a bunch of very hard questions about expressing processing models in all their complexity and multiplicity.
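A small sketch of the tunnelling, in Python. The page fragment and the extract_cards function are invented for this example; the class names vcard, fn and tel follow the hCard microformat. A browser renders the fragment as ordinary XHTML, while a second processing model reads contact data out of the class attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML fragment; a browser would simply render it.
PAGE = """<div>
  <p>Contact:</p>
  <div class="vcard">
    <span class="fn">Jane Example</span>
    <span class="tel">555-0100</span>
  </div>
</div>"""

# The hCard properties this sketch looks for (a small subset).
FIELDS = ("fn", "tel")


def extract_cards(xhtml_text):
    """Second processing model: read contact data from class attributes."""
    cards = []
    for element in ET.fromstring(xhtml_text).iter():
        if "vcard" not in element.get("class", "").split():
            continue
        card = {}
        for sub in element.iter():
            for name in sub.get("class", "").split():
                if name in FIELDS:
                    card[name] = (sub.text or "").strip()
        cards.append(card)
    return cards


print(extract_cards(PAGE))
```

Nothing about the XHTML toolset had to change for this to work, which is exactly the trick.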

[1] parasiting - Is that a word?
