Monday, August 02, 2004

Handling big XML files

Uche Ogbuji has published a useful article on techniques for handling large XML docs.

The sadly neglected (by me) pyxie library provides "sparse trees". Simply put, you process an XML instance event-by-event until you come across the start of an element you would like to treat as a tree. Then you fire up the tree-builder purely for that sub-tree.
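
To make the sparse tree idea concrete, here is a minimal sketch using the standard library's xml.dom.pulldom rather than pyxie itself (this is not pyxie's API). The helper names sparse_subtrees and entry_ids, the sample document and the "entry" tag are all made up for illustration.

```python
import io
from xml.dom import pulldom

def sparse_subtrees(source, tag):
    """Stream events and build a DOM subtree only for elements named `tag`."""
    events = pulldom.parse(source)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == tag:
            # Fire up the tree-builder for this element only; the rest of
            # the document is never held in memory as a tree.
            events.expandNode(node)
            yield node

# Hypothetical sample document, used only for illustration.
SAMPLE = (b"<log><meta>x</meta>"
          b"<entry id='1'><msg>a</msg></entry>"
          b"<entry id='2'><msg>b</msg></entry></log>")

def entry_ids(data=SAMPLE):
    """Collect the id attribute of every <entry> subtree."""
    return [n.getAttribute("id") for n in sparse_subtrees(io.BytesIO(data), "entry")]
```

Each yielded node is an ordinary minidom element, so the usual tree calls (getElementsByTagName and friends) work on it while everything outside it is just a stream of events.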

The also sadly neglected (by me) xpipe open source XML pipelining architecture includes the concept of scatter/gather processing, which is very useful for handling large XML docs.
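
A rough sketch of what scatter/gather can look like in general, assuming the usual reading of the term: split the big document into small standalone chunks, process each chunk on its own, then stitch the results back together. This uses the standard library's xml.etree.ElementTree and is not xpipe's API; scatter, process, gather and run are hypothetical names, and the upper-casing step is just a stand-in for real work.

```python
import io
import xml.etree.ElementTree as ET

def scatter(source, tag):
    """Yield each <tag> element as a standalone serialized chunk."""
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            yield ET.tostring(elem)
            elem.clear()  # drop the element's content once it is serialized

def process(chunk):
    """Hypothetical per-chunk step: upper-case the chunk root's text."""
    elem = ET.fromstring(chunk)
    elem.text = (elem.text or "").upper()
    return elem

def gather(elements, root_tag="results"):
    """Reassemble the processed chunks under a single new root element."""
    root = ET.Element(root_tag)
    root.extend(list(elements))
    return root

def run(data, tag="item"):
    """Scatter, process each chunk independently, then gather."""
    chunks = scatter(io.BytesIO(data), tag)
    return ET.tostring(gather(process(c) for c in chunks))
```

Because every chunk is a small document in its own right, the process step never sees more than one record at a time, and the chunks could just as well be handed to separate workers.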
