Uche Ogbuji has published a useful article on techniques for handling large XML docs.
The sadly neglected (by me) pyxie library provides "sparse trees". Simply put, you process an XML instance event-by-event until you come across the start of an element you would like to treat as a tree. Then you fire up the tree-builder purely for that sub-tree.
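To make the sparse-tree idea concrete, here is a minimal sketch. It does not use pyxie's own API; it assumes the standard library's xml.dom.pulldom as a stand-in, since that module offers the same event-then-expand pattern. The helper names (sparse_subtrees, entry_messages) and the sample document are made up for illustration.

```python
import io
from xml.dom import pulldom

def sparse_subtrees(source, tag):
    """Stream through the document, building a DOM tree only for each <tag>."""
    events = pulldom.parse(source)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == tag:
            # Fire up the tree-builder for this element only; this consumes
            # the events up to and including its matching end tag.
            events.expandNode(node)
            yield node

# Illustrative sample data, not taken from any real feed.
SAMPLE = (b"<log><entry id='1'><msg>start</msg></entry><noise/>"
          b"<entry id='2'><msg>stop</msg></entry></log>")

def entry_messages(xml_bytes):
    """Collect (id, message text) pairs from every <entry> sub-tree."""
    out = []
    for entry in sparse_subtrees(io.BytesIO(xml_bytes), "entry"):
        msg = entry.getElementsByTagName("msg")[0]
        # Join all text nodes, since a parser may deliver text in pieces.
        text = "".join(n.data for n in msg.childNodes
                       if n.nodeType == n.TEXT_NODE)
        out.append((entry.getAttribute("id"), text))
    return out

if __name__ == "__main__":
    print(entry_messages(SAMPLE))
```

Everything outside the entry elements is seen only as passing events, so only one small sub-tree needs to be held as a tree at a time.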
The also sadly neglected (by me) xpipe open source XML pipelining architecture has the concept of scatter/gather processing, which is very useful for handling large XML docs.
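For a rough feel of scatter/gather, here is a small sketch of the general pattern: cut the big document into small standalone fragments, process each one on its own, then combine the results. This is not xpipe's component model, which is not shown here; it assumes the standard library's xml.etree.ElementTree, and the function names and the orders sample are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

def scatter(source, tag):
    """Yield each <tag> element as a small standalone XML byte string."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            elem.tail = None          # keep the fragment a clean document
            yield ET.tostring(elem)
            elem.clear()              # free the element's contents once handed off

def process(fragment):
    """Hypothetical per-fragment step: total the <amount> values in one order."""
    order = ET.fromstring(fragment)
    return sum(float(a.text) for a in order.iter("amount"))

def gather(results):
    """Combine the per-fragment results into one answer."""
    return sum(results)

# Illustrative sample data only.
ORDERS = (b"<orders><order><amount>1.5</amount><amount>2.5</amount></order>"
          b"<order><amount>4</amount></order></orders>")

def total(xml_bytes):
    return gather(process(f) for f in scatter(io.BytesIO(xml_bytes), "order"))

if __name__ == "__main__":
    print(total(ORDERS))
```

Because each fragment is a complete little document, the process step could just as well be handed to separate workers or separate pipeline stages; it runs in-line here only to keep the sketch short.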