Tuesday, May 30, 2006

A mini-rant about document XML formats

Question: How many XML formats does the world need for documents?

Answer: N+1 where N is the number of document processing applications with functional differences that require persistent storage.

Why is this?: Because opaque binary file formats for documents (excepting sane non-proprietary binary files like zips etc.) are plain silly these days. Binary file formats: just say 'NO' and all of that.

Because applications differ in their data models whenever their feature sets differ functionally. This is all good. May a thousand flowers bloom, as long as they can all persist their data to disk in an XML notation that I can process without recourse to some vendor-specific API. APIs can be harmful.
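
To make "process without recourse to some vendor-specific API" concrete, here is a minimal sketch using nothing but Python's standard library XML parser. The markup and its element names (`document`, `title`, `para`, `em`) are invented for illustration and do not correspond to any real application's format.

```python
import xml.etree.ElementTree as ET

# Hypothetical document markup; the element names are made up for this sketch.
SAMPLE = """<document>
  <title>Mini-rant</title>
  <para>First paragraph.</para>
  <para>Second <em>emphasised</em> paragraph.</para>
</document>"""


def paragraph_texts(xml_text):
    """Return the plain text of every <para>, flattening inline markup."""
    root = ET.fromstring(xml_text)
    return ["".join(para.itertext()) for para in root.iter("para")]


paragraphs = paragraph_texts(SAMPLE)
```

The point is not the particular parser but that any generic XML tool will do: the application that wrote the file never has to be installed or called.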

So what is the N+1'th document format? The interchange format. A format rich enough to retain as much of a document-centric data model as the N major applications can sign up to, but not so rich that the N major applications have no room to differentiate themselves.

The N+1'th format must be completely open and vendor neutral, and everybody should have a say in how it evolves.

Why bother with all of this? Because the alternative is that one of the N becomes the N+1'th model too. A single application that owns both its own application space and the universal interoperability space for documents.

Not good. That is the world we have come from and we should not go back there.

Does it matter if M (where M < N) of these document models have standards-body-ratified XML schemas? Yes, because standards bodies should be operating in interoperability space, not application space.

If multiple "standards bodies" ratify multiple interop schemas for documents then you have yourself a confusing standards war. Not good. Unfortunately, I fear that is where we are headed :-(
