Wednesday, March 22, 2006

Humans first, machines second - Hi XML and Lo XML

I recommend reading all of this twice.
The future of structured content, in my opinion, is content tunnelled inside human-readable content. *Not* machine-readable content that can be converted/published for human readability.

Microformats have that worse-is-better feel to them. Reminds me of HTML when I first looked at it through the eyes of an SGML structure bigot.

Watch out for both ODF and XHTML having a big role to play in this new world.

Folks will increasingly stop creating new XML-based languages at the front-end. Instead, the XML-based semantics will be tunnelled into a small set of existing vocabularies, most notably XHTML, RSS/Atom and ODF.

Semantic Steganography is on a roll.

This isn't to say that structured content à la XML has lost. On the contrary. It has won, but has done so in a way that few would have predicted. To borrow a meme doing the rounds at the moment concerning REST, we will have hi-XML and lo-XML.

Hi-XML is classic XML-think, in which the custom schema is the center of all things and present everywhere in the toolchain, from the back end to the front end.

Lo-XML is semantic steganography: using existing kit at the front ends (author/edit and publish) and only using hi-XML at the back end for context-sensitive search, custom data processing etc.

It is generally quite simple to convert between hi-XML and lo-XML and vice versa. You just need to be clever in your use of attributes for element-type semantics and the use of ranked elements like h1, h2, h3 etc. to create hierarchy programmatically rather than explicitly.
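
To make the lo-to-hi direction concrete, here is a minimal Python sketch. Everything specific in it is an assumption made for illustration: the sample content, the convention that the class attribute carries the hi-XML element name, the synthetic "document" root and the "title" child element. The heading rank (h1, h2, ...) decides how deep each section nests.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical lo-XML: flat XHTML where the class attribute carries the
# element-type semantics and heading rank carries the hierarchy.
LO_XML = """
<div>
  <h1 class="manual">Widget Guide</h1>
  <p class="summary">How to use the widget.</p>
  <h2 class="chapter">Setup</h2>
  <p class="step">Unpack the box.</p>
  <p class="step">Plug it in.</p>
  <h2 class="chapter">Usage</h2>
  <p class="warning">Do not immerse in water.</p>
</div>
"""

HEADING = re.compile(r"^h([1-6])$")


def lo_to_hi(lo_root):
    """Turn flat, heading-ranked XHTML into nested custom XML."""
    hi_root = ET.Element("document")  # assumed wrapper element
    # Stack of (rank, element) for the currently open sections.
    # Rank 0 is the synthetic root, which is never popped.
    stack = [(0, hi_root)]
    for child in lo_root:
        # Assumption: class values are valid XML names; fall back to the tag.
        name = child.get("class", child.tag)
        match = HEADING.match(child.tag)
        if match:
            rank = int(match.group(1))
            # A heading closes every open section of the same or deeper rank.
            while stack[-1][0] >= rank:
                stack.pop()
            section = ET.SubElement(stack[-1][1], name)
            ET.SubElement(section, "title").text = child.text
            stack.append((rank, section))
        else:
            # Non-heading content belongs to the innermost open section.
            ET.SubElement(stack[-1][1], name).text = child.text
    return hi_root


hi = lo_to_hi(ET.fromstring(LO_XML.strip()))
print(ET.tostring(hi, encoding="unicode"))
```

Going the other way is the mirror image: walk the nested tree, emit a heading whose rank is the nesting depth, and write each element name back into a class attribute.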

Of course, what you do not get is grammar-based author/edit constraints, but in my experience these are greatly overrated.
