Monday, September 06, 2010

History in the context of its creation

Tim O'Reilly's Twitter feed pointed me at this great piece on historiography.

I just love the 12 volume set of the evolution of a single Wikipedia entry. In KLISS we take a very historiographic approach to eDemocracy.

The primary difference in the way we do it to the Wikipedia model is that we record each change - delta - as a delta against the entire repository of content : not just against the record modified. Put another way, we don't version documents. We version repositories of documents.

In legislative systems, this is very important because of the dense inter-linkages between chunks of content. To fully preserve history in the context of its creation, you need to make sure that all references are "point-in-time" too. I.e. if you jump back into the history of some asset X and it references asset Y, you need to able to follow the link to Y and see what it looked at *at the same point in history*.

Obviously, this is only practical for repositories with plausible ACID semantics. I.e. each modification is a transaction. It would be great if the universe was structured in a way that allowed transaction boundaries for the Web as a whole but of course, that is not the way the Universe is at all :-) - And I'm not for a minute suggesting we should even try!

Having said that, versioning repositories is a darned sight more useful than versioning documents in many problem domains : certainly law and thus eDemocracy which is my primary interest i.e. facilitating access, transparency, legislative intent, e-Discovery, forensic accounting and the like.

The fact that supporting these functions entails a fantastic historical record - a record that future historians will likely make great use of - is, um, a happy accident of ths history we are currently writing.


Anonymous said...


I've always found the self modifying acts fascinating. Sections which repeal part of another act, which are themselves repealed by part of itself, yet the repeal continues to function!

Ie in victoria, australia, the abortion law reform act (alra) repealed part of the crimes act upon commencement. The sections which did this repealing, were themselves repealed 1 year later by a part of the alra. The section of the alra which carried out the repeal, repealed itself!

Probably very common, but came across this one todau.

Kill the Tom said...

What is the effect of versioning on storage space? If X is the size of a non-versioned repository and Y is the size of a versioned repository, what is the relationship of X to Y? Linear (Y=2X)? Quadratic (Y=X^2)?

Trying to imagine what the effect would be if the entire internet was versioned in this manner.

Sean said...

@Kill the Tom,

There is an impact on storage but it is not as pronounced as might initially be thought due to the space-efficiency of delta-oriented algorithms. See for example, the delta editor algorithms used in svn (