Thursday, November 17, 2005

mustUnderstand/mustIgnore and Processing Models

David Megginson touches on a really, really important point about the mustUnderstand/mustIgnore model which, in my opinion, is equally applicable to *all* markup and indeed all "signs" created by humans for humans. It cuts to the heart of the extent to which we necessarily imbue the signs we create with a point of view...

Kirk: Scotty, I need an unambiguous, context-free representation of objective reality and I need it now!

Scotty: The markup cannot take it, sir. Every tag I create generates its own point of view of the universe. Cannot de-contextualise...argh!

It's like this. I look at some aspects of the world and I carve them up into pieces. I give those pieces names. Those names become tags/attributes in my XML documents. Now, is my model "correct"? It depends. It depends on two vital things.

1 - how I viewed the world at the time.

2 - the processing model I had in mind for the data.

Both of these are deeply personal and only "true" from the POV of a particular person/institution/business-process.

As Walter Perry points out regularly on xml-dev, the real value of XML is that it reduces the extent to which I force any one processing
model onto others. This enables re-use and innovation in a way that, say, application sharing does not.

The price we pay for this freedom is that designers of XML languages need to find ways to communicate "processing expectations" or
"processing models" separately from the data.

It is still the case today that the true meaning of a chunk of markup is dependent on what some application actually *does* with it. It is not in the data itself. For example, I can create RTF, XML, or CSV files that are completely valid per the markup requirements but "invalid"
because they fail to meet the particular processing models of rtf/xml/csv-aware applications.
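To make the valid-but-"invalid" point concrete, here is a minimal sketch in Python. The data, the column names, and the total_amount function are all made up for illustration. They stand in for some application's processing model and are not taken from any real tool.

```python
import csv
import io

# A perfectly well-formed CSV document (invented sample data).
DATA = "item,amount\nwidget,12\ngadget,a dozen\n"


def parse(text):
    """Parsing succeeds: the text meets the CSV markup rules."""
    return list(csv.DictReader(io.StringIO(text)))


def total_amount(rows):
    """A hypothetical application's processing model: amounts must be integers."""
    return sum(int(row["amount"]) for row in rows)


rows = parse(DATA)  # fine for any CSV-aware parser

try:
    total_amount(rows)
    accepted = True
except ValueError:
    # "a dozen" is valid CSV content but fails this application's expectations.
    accepted = False
```

Nothing in the file itself says that "amount" has to be a number. That expectation lives in the application, which is the whole point.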

This is one reason why HTML as a Web concept (forget about markup for the moment) and XML as a Web concept are so different. With HTML the processing model was a happy family of 1, namely, "let's get this here content rendered nicely onto the screen". With XML the processing
model is an extended family of size...infinity. Who knows what the next guy is going to do with the markup? Who knows what the next processing model will be? Who knows whether or not my segmentation of reality into tags/attributes/content will meet the requirements of the next guy?

To date, we have been feeling our way pragmatically on that last question, the segmentation one (we call it "conversion"), and largely ignoring the questions about processing models. Those are becoming much more important. mustIgnore and mustUnderstand will, I suspect, be the topic that begins a long conversation about processing models on the Web.
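For anyone who has not met the idea, a rough sketch of a mustIgnore/mustUnderstand rule might look like the following. It assumes a toy vocabulary (the KNOWN set), a made-up NotUnderstood exception, and an unqualified mustUnderstand="true" attribute loosely modeled on the one SOAP uses for header blocks. It is an illustration of the idea, not an implementation of any specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary that this particular processor understands.
KNOWN = {"title", "body"}


class NotUnderstood(Exception):
    """Raised when an unknown element insists on being understood."""


def process(xml_text):
    """Return the tags this processor handled, applying the two rules."""
    root = ET.fromstring(xml_text)
    handled = []
    for child in root:
        if child.tag in KNOWN:
            handled.append(child.tag)
        elif child.get("mustUnderstand") == "true":
            # mustUnderstand: refuse to carry on rather than guess.
            raise NotUnderstood(child.tag)
        # Otherwise mustIgnore: skip the unknown element silently.
    return handled
```

With this sketch, process('<doc><title/><extra/><body/></doc>') quietly skips the extra element, while the same document with mustUnderstand="true" on that element raises NotUnderstood. The author of the data gets to say something about the processing model without dictating all of it.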

I would put microformats into this category too. What does a microformat actually do? It takes an existing tag set and tunnels stuff into it. But what it is really doing is taking an existing processing model (say, XHTML CRUD cycles) with its established toolset and parasiting[1] on top of it in a clever, very human, way. A way that side-steps a bunch of very hard questions about expressing processing models in all their complexity and multiplicity.
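Here is a small sketch of the tunnelling, assuming hCard-style class names (vcard, fn, tel) and an invented XHTML fragment. The extract_card function is hypothetical. A browser renders the fragment as ordinary text, while a second processing model reads the class attributes as data.

```python
import xml.etree.ElementTree as ET

# Invented sample: ordinary XHTML with data tunnelled through class attributes.
XHTML = (
    '<div class="vcard">'
    '<span class="fn">Jane Doe</span> '
    '<span class="tel">555-0100</span>'
    '</div>'
)

FIELDS = ("fn", "tel")  # the only class names this sketch looks for


def extract_card(fragment):
    """Collect the text of elements whose class names one of FIELDS."""
    root = ET.fromstring(fragment)
    card = {}
    for element in root.iter():
        for name in element.get("class", "").split():
            if name in FIELDS:
                card[name] = (element.text or "").strip()
    return card


card = extract_card(XHTML)
```

The XHTML toolset never has to know that the second reading exists.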

[1] parasiting - Is that a word?

Wednesday, November 16, 2005

Open Office - The Movie(s) + a Rainbow

Open Office Training Videos. Free. The book/DVD they are abstracted from looks interesting too.

I've said it before and I'll say it again, go take a look at OpenOffice if you are in any way interested in looking at cheaper/better/cross-platform/truly open vehicles for creating/publishing/managing documents.

The ODF format is growing wings which is just great. Peripherally to the main implications this has, it ushers in what we SGML geeks were looking for back in 1987 (yes, 1987). Namely, a "rainbow DTD" to target during both up- and down-translations of structured content.

Sonny boy! Hand me that Zimmerframe.

Tuesday, November 15, 2005

Master Foo on Enterprise Data

    "Think of your centralized database applications as a set of large rocks. Their great strength is their solidity. Their great weakness is their lack of flexibility. Think of spreadsheets as the water that flows over and around the rocks. Their great strength is their flexibility. Their great weakness is their lack of solidity. The easiest route to the far side of a rock is to be like water and flow over or around it, rather than to change the nature of the rock." --
    Master Foo defines 'Enterprise Data'.