Saturday, April 24, 2004

Question: when to pull and when to push in a distributed system

E-mail clients use POP3 or IMAP4 protocols. Both are "pull" in the sense that the client connects and asks for stuff. When SMTP servers talk to each other, they "push" in the sense that one SMTP server says "Here is the stuff I have for you".

I'm interested in examples of push and pull semantics. I'm looking for patterns (if they exist) in the use of push/pull from original-producer to final-consumer in distributed data exchange.

My sense is that pull dominates the final-consumer hop, push dominates the original-producer hop, and both push and pull models are common in intermediate hops.
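To pin down what I mean by the two terms, here is a minimal Python sketch. Everything in it (PullSource, PushSource, relay) is a hypothetical name made up for illustration, not any real mail API. It models one path where the producer hop pushes and the consumer hop pulls.

```python
from collections import deque


class PullSource:
    """Holds messages until a consumer asks for them (POP3/IMAP style)."""

    def __init__(self):
        self._mailbox = deque()

    def deposit(self, message):
        self._mailbox.append(message)

    def fetch(self):
        # The consumer initiates the exchange and drains whatever is waiting.
        messages = list(self._mailbox)
        self._mailbox.clear()
        return messages


class PushSource:
    """Hands each message to the next hop as soon as it arrives (SMTP style)."""

    def __init__(self, deliver):
        self._deliver = deliver  # callable supplied by the receiving side

    def deposit(self, message):
        # The sender initiates the exchange.
        self._deliver(message)


def relay(messages):
    """Producer hop pushes into a mailbox; consumer hop pulls from it."""
    mailbox = PullSource()
    producer_hop = PushSource(mailbox.deposit)
    for message in messages:
        producer_hop.deposit(message)
    return mailbox.fetch()
```

The only real difference between the two classes is who initiates the transfer, which is the distinction I am asking about.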

Opinions? References?

More Relax NG

Some more questions from Greg on the subject of RelaxNG.

(1) Greg suggests that the readability is down to the non-XML syntax.

I believe that while this is definitely a contributing factor, it is far from being the whole story. I find the XML notation almost as easy to read but more difficult to write. So, I write in RNC and trang (a new English verb:-) to RNG.

Contrast the compact syntax of RelaxNG with, for example, Eric Wilde's compact XML syntax for W3C XML Schema - xscs.

As the comments to that article show, a new syntax does not address the fundamental complexity issues of W3C XML schema.

I'm quoting Eric here:

    "xml schema's complexity may be a problem for a number of (potential) xml schema users, but since we only wanted to create a new *syntax*, i think it's not appropriate to criticize the syntax for the complexity. any syntax that does not hide certain facets of xml schema has to expose xml schema's complexity."

Greg goes on to talk about the dual lexical/syntactic nature of XML and talks about cases where XML provides a convenience for programmers (who don't have to write the bottom layers of language parsers) at the expense of humans who have to read/write the stuff.

Greg asks "where to draw the line"? I wrote about this in RDF and other monkey wrenches where I used the phrase "semantic shadows" to refer to the idea of a human syntax and a separate, isomorphic, automatically generated XML representation for any language. Examples: N3 in the case of RDF, RNC in the case of RNG etc.

We can get the best of both worlds. For any syntax that requires human read/write grokability, create a human-oriented and a machine-oriented notation. Use *text* for the human-oriented notation and XML for the machine-oriented notation. The two notations must be isomorphic and automatically convertible from one notation to the other. Programmers needing to work with the notation work with the XML and thus avoid having to deal with the lower layers of language parsing. Humans work with the compact syntax.

Using some simple timestamp algorithms such as the one python uses to turn .py into .pyc would allow the human or the XML notations to be edited with background "compilation" to the other syntax happening as required.
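A rough Python sketch of that idea follows. It is a simplified variant that just compares file modification times (Python itself records the source's modification time inside the .pyc). The names needs_rebuild and sync are made up for illustration, and convert is a hypothetical callable standing in for whatever does the real translation, for example a wrapper around trang.

```python
import os


def needs_rebuild(source_path, derived_path):
    """Return True if derived_path is missing or older than source_path."""
    if not os.path.exists(derived_path):
        return True
    return os.path.getmtime(source_path) > os.path.getmtime(derived_path)


def sync(source_path, derived_path, convert):
    """Regenerate derived_path from source_path when it is stale.

    convert is a hypothetical callable that takes the source text and
    returns the text of the other notation.
    Returns True if a rebuild happened, False otherwise.
    """
    if not needs_rebuild(source_path, derived_path):
        return False
    with open(source_path, encoding="utf-8") as src:
        converted = convert(src.read())
    with open(derived_path, "w", encoding="utf-8") as dst:
        dst.write(converted)
    return True
```

Calling sync a second time with the paths swapped and the reverse converter would cover the case where the XML side was the one edited.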

So where to draw the line? I think it is worth looking at an isomorphic compact syntax when the density of repetitive "scaffolding" markup becomes an inhibitor to cognition.

Quoting Greg here:

    "Perhaps in such cases a 'dual' approach is an optmal one? (an underlying implementation always XML + a human/programmer-facing readable extra such as the RelaxNG's compact syntax?)"

Yes. Absolutely.

(2) Greg asks

    "The RelaxNG syntax seems to miss the inheritance mechanism. Would you see that as a limitation for designing the document-centric world?"

No. I would see it as a positive contribution to the document-centric world. As I mentioned in a recent post, linguistic determinism is alive and well in artificial languages. If a language provides idioms that allow you to think in OO terms, you will see things in OO terms.

Neither inheritance nor encapsulation of data/behavior, both espoused by OO, is a key idiom of document-centric integration.

Greg says

    "More specifically, I feel that OO-like modelling approch is an excellent one for specifying the relatively primitive building blocks of the documents (such as address, customer, and what have you). As such, it seems to be an excellent 'support technique'."

I disagree here. Even in simple constructs like address, customer and so on, you need the freedoms that XML provides to intermix structure with text, to nest structures and to make structures recursive.

XML frees you from the modelling strictures created by the false dualism between objects and containers and frees you from the horrors of flattening perfectly good business constructs to fit within the strictures of normalised database tables.

A good test for whether or not an application is taking a document-centric or a data-centric approach to data modelling is *mixed content*. Mixed content cuts to the heart of the document-centric worldview and is famously ugly when represented in an OO-like modelling approach.
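A small illustration, using Python's standard xml.etree.ElementTree. The address document below is invented for the example, and flatten is a hypothetical helper. The point is that the text runs between the child elements are part of the content, and an object with just street and city fields has nowhere natural to put them.

```python
import xml.etree.ElementTree as ET

# An invented address with text interleaved between child elements.
DOC = "<address>Flat 2, <street>Main Street</street>, <city>Dublin</city>.</address>"


def flatten(element):
    """Return the content in document order: text runs and (tag, text) pairs."""
    parts = []
    if element.text:
        parts.append(element.text)  # text before the first child
    for child in element:
        parts.append((child.tag, child.text))
        if child.tail:
            parts.append(child.tail)  # text following this child
    return parts


parts = flatten(ET.fromstring(DOC))
```

Here parts holds five items in order, three of which are bare text runs ("Flat 2, ", ", " and ".") that sit outside any element.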

Greg says:

    "The BNF-like approch seems to suit better the higher-level speficiation (such as an order document) that may more od a real syntax rather than just being a Java-like object/data structure."

I would argue that the grammar based approach works best when applied all the way down. I think switching paradigms from container types (like tables or hash tables) to "native types" (like record/object/integer) is both unhelpful and unnecessary.

Thanks for the comments, Greg.

Jython on the move

[via The daily Python URL]. A useful list of pointers to recent Jython articles. A Jython 2.2 release is targeted for early Summer. Yummy.

Wednesday, April 21, 2004

A service oriented approach to e-government architecture

My colleague Conor O'Reilly presented a paper entitled "A service oriented approach to e-government architecture" at XML Europe this week.

I will be presenting along similar lines at the FCW government CIO conference in Florida in May and also at an eGovernment conference in Dublin in June.

Why RelaxNG for document-centric integration

Greg Wdowiak asks about advantages of RelaxNG for document-centric integration.

Some quick thoughts.

1) In a document-centric system, both documents and their schemas are meant to be read by humans as well as machines. Relax NG, especially the compact syntax, is a joy to read. The *meaning* of the schema jumps off the page at you unaided by anything other than a text editor. For long lived notations, the ability to grok the grammar quickly and without fancy tools is very important.

W3C XML Schema, on the other hand, requires tools to create visualisations before you can even begin to comprehend the semantics of any reasonably sized schema.

2) Relax NG is based on a beautiful, clean formalism called Hedge Automata. Its expressive power makes RelaxNG (thanks to tools like trang) an excellent "pivot notation" for grammar based schemata.

3) Relax NG has a simple syntax with very few moving parts. W3C XML schema - with its multiplicity of overlapping types, multiple attribute declaration mechanisms etc. has, um, lots of moving parts. Excessive moving parts create interoperability problems. Interoperability of W3C XML schema implementations leaves a lot to be desired.

4) RelaxNG is an ISO standard (ISO/IEC 19757-2).

5) RelaxNG validation does not mess with your XML instance. No PSVI. Proper separation of concerns between validation and infoset augmentation.

6) RelaxNG "patterns" can be used as "non terminals" in BNF style grammars which makes schema fragment re-use very straightforward.

7) RelaxNG has an open-ended approach to data typing.

8) I've been a markup specialist every working day of my life for about 20 years and I only understand a subset of W3C XML schema. I have a pretty thick skin for complexity but I have my limits :-) Relax NG by contrast is SIMPLE.

9) W3C XML Schema attempts to take object oriented design concepts (such as inheritance) and apply them to grammars. This doesn't work well for document-centric applications. In fact, I would go so far as to say that the OO approach is at odds with document-centricity.

10) Edd Dumbill's closing keynote at XML Europe included the assertion that Microsoft use RelaxNG to create schemas and convert to XSD with Trang. If it's good enough for Microsoft, it's good enough for me:-)

I highly recommend James Clark's analysis of Relax NG versus W3C XML Schema here.


Thinking out loud about open outcry trading methods

e-trading in the e-pit.

Monday, April 19, 2004

Origins of reliable messaging

Where/when did the idea of reliable messaging first arise? MQSeries goes back to exBridge so that gets us to 1992/3. What about before then? Anybody got any pointers?

RelaxNG - quietly, with little fanfare, an unstoppable momentum gathers

Mike Fitzgerald on RelaxNG in OpenOffice. A couple of weeks ago in Cambridge I heard from Lou Burnard that RelaxNG has been adopted by the TEI.
As the world increasingly wakes up to the document-centric integration philosophy that is the only really new and interesting stuff in the SOA and Web Services shenanigans, the benefits of RelaxNG over W3C XML Schema will become increasingly apparent.