Sean McGrath

Wednesday, September 01, 2010

what does law.gov mean to you?

Herein is my response to the question: what does law.gov mean to you?

I am an IT architect and a builder of legislative systems more so than a direct legal publisher. Having said that, I have worked with most of the world's legal publishing entities at some time or other over the last twenty years. My current focus is creating legislative systems for legislatures, mostly in the U.S.A. The content our systems produce is then published by the legislatures themselves and also by third party publishers.

I am a technologist first and foremost. I recently started blogging about the KLISS eDemocracy system here in Kansas in the hope that the technical details I am blogging will help other technologists to understand the legislative domain better and thus help create a more informed tech community around one of the most important aspects of any democracy.

I agree with pretty much everything Ed Walters said about the AOL Moment that is currently happening in the legal publishing industry. I also agree with pretty much everything Carl Malamud says about the desirability of free, unfettered access to authenticated, machine readable primary legal materials in the context of the law.gov initiative.

For me however, the most interesting vista that law.gov opens up is the potential for the most significant event in the evolution of democracy since the funeral oration of Pericles 2400 years ago. For the first time in human history, we now have all the technological pieces we need to bring participation in the democratic process to levels not seen since ancient Greece when everyone could literally congregate in the same place. To quote Don Heiman, CITO for the Kansas State Legislature:

There are no longer any technical reasons why we cannot publish the public activities of a legislature in real-time, or have statute databases codified on the fly, or provide direct visibility of what the impact of a proposed modification to the law would look like before it gets voted on. No technical reason why we cannot allow citizens to not only observe, but also participate in the making of law *as it is being made* - not just see the results ex post facto.

It is a lot of work for sure but it is only work at this point. No new technology breakthroughs are required. What needs to happen next (and there are signs it is happening) is for the world of law and the world of software development to both come to the realization that they are both in the same business from content management and publishing perspectives. I really believe that law is source code in the sense that the disciplines and techniques that have been perfected in the software development world have a tremendous amount to offer those who manage corpora of legal texts.

I look forward to the day when we speak of, for example, "release 7.8a (Rev 456422) of the consolidated statutes of Tumbolia (MD5 checksum: d03730288a7f0278e36afc82f220ddab)."

I look forward to the day when we can jump into a time machine and look at Rev 674245 of the 2011 Legislative Biennium Corpus for Tumbolia in order to better understand the legislative intent of an amendatory bill.

I look forward to the day when we can look at the laws of Tumbolia, as they were at noon Wed, 20 Jan 2010 in order to present attorneys and the courts with a complete view of what the law said at the time some contested action took place.

I look forward to the day when we can detail edit-by-edit how the consolidated statutes of Tumbolia came to be what they are by starting with the Constitution of Tumbolia from 1899 and rolling forward changes to its statute from its session laws, step-by-step with all the rigor of an accounting audit trail of transaction ledgers.
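
To make the last few wishes concrete, here is a minimal sketch of a statute corpus kept as an append-only ledger of revisions. It is not how any real system does it; the class names, section numbers, session law titles and dates are all hypothetical, and it assumes revisions are committed in chronological order.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Revision:
    number: int
    timestamp: datetime
    source: str      # hypothetical: the instrument that made the change
    sections: dict   # full snapshot: section id -> text


@dataclass
class StatuteLedger:
    revisions: list = field(default_factory=list)

    def commit(self, timestamp, source, changes):
        """Roll the corpus forward by one edit; a value of None repeals a section."""
        sections = dict(self.revisions[-1].sections) if self.revisions else {}
        for section_id, text in changes.items():
            if text is None:
                sections.pop(section_id, None)
            else:
                sections[section_id] = text
        revision = Revision(len(self.revisions) + 1, timestamp, source, sections)
        self.revisions.append(revision)
        return revision

    def as_of(self, moment):
        """Return the latest revision committed at or before `moment`."""
        candidates = [r for r in self.revisions if r.timestamp <= moment]
        return candidates[-1] if candidates else None


def checksum(revision):
    """MD5 over the sections in sorted order, to identify a release."""
    digest = hashlib.md5()
    for section_id in sorted(revision.sections):
        digest.update(section_id.encode("utf-8") + b"\0")
        digest.update(revision.sections[section_id].encode("utf-8") + b"\0")
    return digest.hexdigest()


# Made-up history for the fictional Tumbolia.
ledger = StatuteLedger()
ledger.commit(datetime(1899, 1, 1), "Constitution of Tumbolia", {"1-101": "Original text."})
ledger.commit(datetime(2010, 1, 19), "Session Law 2010-1", {"1-101": "Amended text."})
ledger.commit(datetime(2010, 1, 21), "Session Law 2010-2", {"1-102": "New section."})

noon = ledger.as_of(datetime(2010, 1, 20, 12, 0))
print(noon.number, noon.source, checksum(noon))
```

Storing a full snapshot per revision is wasteful but keeps the sketch short; a real repository would more likely store deltas.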

I hope that the law.gov initiative heads in that direction. The http://legislation.gov.uk website clearly points the way for what is possible. Speaking as a technologist, I can say that we techies stand ready, willing and able to make this happen. Is the political will there to make it happen? Is the disruption of the status quo too much too soon for such a staid and contemplative field as law and law-making? I can answer neither of these questions but I sincerely hope the answers are "yes" and "no" respectively.

The biggest threat to any democracy is a disinterested electorate. In years to come, I hope law.gov will be seen as the catalyst that re-invigorated an entire generation to engage with the democratic process. A process that too many currently feel is beyond their realm of influence. We can change that now. For our sakes and the sakes of future generations, I hope we do.

Monday, August 30, 2010

It's all about the back end

David Eaves : Creating effective open government portals. Amen to that.

Here is the thing...most http://data.[whatever] websites are only as good as their ability to serve up fresh content. That oftentimes means that re-thinking back-end processes is required. Otherwise a one-off data dump happens to get things rolling but then...

Nothing kills a web-o-data project so ruthlessly as information latency.

Machine readable content - even more so than human readable content - must be current.

Monday, August 23, 2010

Normal people, normal spreadsheets and RDF

In a post about Gridworks Jeni says:

"Like a lot of spreadsheets created by normal people, who want to create something readable by human beings rather than computers, it has some extra lines at the top to explain what the spreadsheet contains..."

There is a terribly, terribly common pattern here and it has always surprised me that spreadsheet developers have never made row 1 and col 1 "special" for exactly this reason. I've lost count of the number of spreadsheets I've seen that have labels in row 1, labels in col 1 and data in the intersection cells.

Subject, predicate, object, anyone? :-) Where do all the triples go?
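
As a rough illustration of the pattern, here is a sketch that reads a grid with labels in row 1 and col 1 as a set of triples. The function name and the sample sheet are made up for the example; real spreadsheets with explanatory lines at the top would need those stripped first.

```python
def grid_to_triples(rows):
    """Turn a labelled grid into (subject, predicate, object) triples.

    rows[0] holds the column labels (predicates), the first cell of every
    later row holds the row label (subject), and each remaining non-empty
    cell is an object.
    """
    header = rows[0]
    triples = []
    for row in rows[1:]:
        subject = row[0]
        for predicate, value in zip(header[1:], row[1:]):
            if value != "":
                triples.append((subject, predicate, value))
    return triples


# Hypothetical sheet: labels in row 1 and col 1, data in the intersections.
sheet = [
    ["", "sponsor", "status"],
    ["hb-1", "Smith", "introduced"],
    ["sb-7", "", "in committee"],
]
for triple in grid_to_triples(sheet):
    print(triple)
```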

Monday, August 16, 2010

More on the KLISS workflow model

Last time in this KLISS series I introduced the KLISS approach to workflow and (hopefully) explained why workflow in legislative environments can get very complex indeed. I mentioned that the complexity can be tamed by zooming in on the fundamental features that all legislative workflows share. This post will concentrate on fleshing that assertion out some more.

Somebody once said that a business document, such as a form, is a workflow snapshotted at a point in time. I really like that idea but I do not think that a document alone can serve as a snapshot of the workflow in all but the simplest of cases. To do that, in my opinion, you need an extra item: a set of pigeon holes.

The pigeon holes I am talking about are not just storage shelves with some sort of alphabetic or thematic sorting system. I am talking about the kind of pigeon holes that have labels on them that indicate what state the documents in each hole are in. Some classic states for documents to be in (in a legislative environment) include:

- Awaiting introduction in the Senate
- Pending engrossment into the Statute
- Bills currently being processed in the Agriculture committee
- etc.

The power of the incredibly simple, time honored pigeon hole system is too often overlooked in our database centric digital world. The electronic equivalent of these pigeon holes is, of course, nothing more complex than the concept of a file-system folder. In truth, the electronic pigeon hole is generally more powerful than its physical analog because in the electronic world, folders can trivially contain other folders to any required depth. Moreover, electronic folders can have any required capacity.
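
A minimal sketch of the idea, assuming nothing more than an ordinary file system: the folder names and the bill file below are invented for illustration, and a temporary directory stands in for a real repository root.

```python
import shutil
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())  # throwaway root for the example

# Each pigeon hole is a folder whose name is the workflow state.
holes = [
    root / "senate" / "awaiting-introduction",
    root / "committees" / "agriculture",
]
for hole in holes:
    hole.mkdir(parents=True)  # folders nest to any required depth

bill = holes[0] / "sb-1.txt"
bill.write_text("Placeholder bill text.")


def move(document, hole):
    """Changing a document's workflow state is just a move between folders."""
    return Path(shutil.move(str(document), str(hole / document.name)))


bill = move(bill, holes[1])
print(bill.relative_to(root))
```

Note that the state lives in where the document is, not in a field on the document.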

Sadly, I have rather a lot of personal experience of how this simple-yet-powerful concept of recursive, expandable folders can be "pooh poohed" by folks who think that data cannot possibly be considered "managed" unless it is loaded into a database or otherwise constrained in terms of shape and volume. Oftentimes, said folks use the words "database" and "relational database" interchangeably. For such folks, the data model for a "record" is the center of the universe. Insofar as that record has workflow, the workflow is an attribute of the record – not a "place" where the record lives... This record-centric world view is oftentimes the beginning of a slippery slope in legislative informatics where designers find themselves tied up in knots trying to:

- create enough state variables – fields – in the tables to capture all possible workflow states
- capture all the business rules for workflow transitions in machine readable form
- shred the legislative content into pieces (often-times with XML) to fit into the non-recursive, tabular slots provided by relational databases
- re-assemble the shredded pieces to re-constitute working documents for publication

I do not subscribe to this record-centric model. It works incredibly well when record structures are simple, workflows are finite and record inter-dependencies are few. That is not the world we live in in legislative informatics. Legislative content is messy, hierarchical, time-oriented and often densely interlinked. Relational databases are just not a good fit either for the raw data or for the workflows that work on that raw data. Having said all that, I hasten to point out that ye-olde recursive folder structure on its own is not a perfect fit either. There are two main missing pieces.

Firstly, as I've said before, legislative informatics is all about how content changes over time and the audit trail that allows the passage through time to be accessed on demand. Out-of-the-box recursive file systems do not provide this today. (Aside: those with long memories may remember Digital Equipment Corporation's VMS operating system. It was the last mainstream operating system to transparently version files at the operating system level.)

Secondly, legislative informatics is heavily event-oriented, i.e. when an event happens, entire sets of subsequent events are kicked off, each of which is likely to create more events, which may in turn create more events... Out-of-the-box recursive file systems do not easily provide a way of triggering processing based on file-system transaction events. (Yes, you can do it at a very low level with device driver shenanigans and signals, but it's not for the faint of heart.)

To address these two short-comings of a classic folder structure for use as a workflow substrate, the KLISS model added two extra dimensions.

- Imagine a system of recursive pigeon holes that starts empty and then remembers all Create/Read/Update/Delete/Lock operations of pigeon holes and of the documents that flow through them
- Imagine a system of recursive pigeon holes in which each hole carries a complete history of everything that has ever passed through it (including other pigeon holes)
- Imagine a system of recursive pigeon holes in which each hole can trigger any required data processing at the point where new content arrives into it.

The first two items above are provided by the time-machine that I have previously talked about. The last one is what we call the Active Folder Framework in KLISS. The best way to explain it is perhaps by analogy with a workflow system realized with a good old fashioned set of physical pigeon holes. Consider this example:

A new bill is introduced in the House. The requested bill draft is acquired from the sponsor (or perhaps legislative council) and placed in the "introduced" pigeon-hole. This event kicks off the creation of an agenda item where the initial fate of the introduced bill will be discussed. That agenda item is lodged in the "pending agenda items" pigeon hole. Later, when the order of business gets to it, items from the "introduced" pigeon hole are taken out and considered. They may go back into that pigeon hole or be moved to pigeon holes specific to particular committees.

KLISS - and more generally the Legislative Enterprise Architecture that underlies it - operates like that. Workflow items - documents - are moved around named folders. Every move is audit-trailed in the time machine. Every time something is changed, events are fired so that downstream processes can update their internal views of what the pigeon-holes represent. In KLISS all the workflow folders are "active" in the sense that they are not just passive place-holders for work artifacts. Putting something into a folder triggers an event. Taking something out triggers an event etc. Moreover, the event processors have access to the pigeon-hole structure so that they can create new work artifacts and move them around, thus triggering more events. The event processors can even trigger the creation of new folders and new event processors!

The combination of (a) recursive named folders, (b) time machine audit trail and (c) event propagation covers a tremendous amount of ground. These are the three "pillars" on top of which, most of KLISS is built. Internally in Propylon we call them the Three pillars of Zen or TPOZ for short.
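
To show how the three pillars fit together, here is a toy in-memory sketch. It is emphatically not the KLISS implementation: the `Repository` class, its method names and the folder paths are all assumptions made for the example, nesting is represented only by path strings, and handlers are called synchronously for brevity whereas the real thing is asynchronous.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    folders: dict = field(default_factory=dict)   # folder path -> {document name: content}
    audit: list = field(default_factory=list)     # append-only (revision, op, folder, name)
    handlers: dict = field(default_factory=dict)  # folder path -> list of callables

    def make_folder(self, path):
        self.folders.setdefault(path, {})
        self._record("mkdir", path, None)

    def on_arrival(self, path, handler):
        """Make a folder 'active': run handler whenever content arrives in it."""
        self.handlers.setdefault(path, []).append(handler)

    def put(self, path, name, content):
        self.folders[path][name] = content
        self._record("put", path, name)
        for handler in self.handlers.get(path, []):
            handler(self, path, name)

    def take(self, path, name):
        content = self.folders[path].pop(name)
        self._record("take", path, name)
        return content

    def _record(self, op, path, name):
        self.audit.append((len(self.audit) + 1, op, path, name))


def schedule_agenda_item(repo, path, name):
    """Hypothetical event processor: an introduced bill spawns an agenda item."""
    repo.put("house/pending-agenda-items", f"agenda-for-{name}", f"Consider {name}")


repo = Repository()
repo.make_folder("house/introduced")
repo.make_folder("house/pending-agenda-items")
repo.on_arrival("house/introduced", schedule_agenda_item)
repo.put("house/introduced", "hb-1", "Bill draft text")

for entry in repo.audit:
    print(entry)
```

The handler receives the repository itself, which is what lets an event processor create new work artifacts and so trigger further events.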

At a business level, there are some very attractive upshots to this model.

- The abstraction that the end-users interact with is a very familiar one. Files in folders...All the time machine and event propagation machinery is transparent to end-users.
- Ad-hoc workflows can be very easily accommodated without custom programming. Just create some folders and shunt work through them. The audit-trail will continue to be rigorous and the event-propagation will continue to function even for workflows created on the fly by staff operating under pressure (i.e. the House has just suspended the rules and is now about to do X...)
- Automation can be added incrementally. i.e. if workflow step X is currently manual, the entire workflow can be put in place now and manual steps can be automated over time. The system as a whole operates on the basis that all active folder processing is asynchronous in nature. i.e. we assume that there is a non-deterministic delay for each workflow action. The net result of automating any given folder in KLISS is simply that its associated workflow steps get faster over time. Nothing else in the system changes.
- Workflows have autonomic characteristics. For example, an interface to a voting board may malfunction because of a network error. The result would be that an active folder (an automated workflow step) ceases to be active. No problem, simply revert to the manual processing of the electronic voting documents i.e. fill in the vote forms to create new vote items. Remember: the complete audit trail and event machinery is still working away under the hood. Everything else in the system will continue to function unaffected by the point-failure of one component.

Perhaps the most subtle aspect of the workflow model to grasp is the asynchronous nature of it all. I wrote earlier about naming things with rigid designators in KLISS and that is critical to workflow processing as is the consistency model. Each active folder processor works to its own concept of time, always referring to content in the system via point-in-time URLs that lock down – snapshot – the entire repository as it was at that moment in time. Events that happen in the repository are queued up for consumption by active folder processors. If a processor is slow or goes offline for an upgrade, no problem, the event messages are queued up to be processed whenever the active folder comes back on line.
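
A small sketch of that consistency model, under heavy simplifying assumptions: revisions are plain integers rather than point-in-time URLs, every revision is stored as a whole-repository snapshot, and the `Repo` and `Processor` names are invented for the example.

```python
from collections import deque


class Repo:
    def __init__(self):
        self.snapshots = [{}]   # snapshots[n] is the whole repository at revision n
        self.subscribers = []   # one event queue per processor

    def write(self, path, content):
        state = dict(self.snapshots[-1])
        state[path] = content
        self.snapshots.append(state)
        revision = len(self.snapshots) - 1
        for queue in self.subscribers:
            queue.append((revision, path))  # events wait here until consumed
        return revision

    def read(self, revision, path):
        """Point-in-time read: content as it was at the given revision."""
        return self.snapshots[revision].get(path)


class Processor:
    def __init__(self, repo):
        self.repo = repo
        self.inbox = deque()
        self.seen = []
        repo.subscribers.append(self.inbox)

    def drain(self):
        """Work through queued events, each against its own snapshot."""
        while self.inbox:
            revision, path = self.inbox.popleft()
            self.seen.append((revision, self.repo.read(revision, path)))


repo = Repo()
slow = Processor(repo)                    # "offline" until drain() is called
repo.write("bills/hb-1", "as introduced")
repo.write("bills/hb-1", "as amended")
slow.drain()                              # comes back online later
print(slow.seen)
```

Even though the processor only wakes up after the second write, the first event is still handled against the repository as it stood at revision 1.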

In summary, KLISS models workflows by extending the familiar pigeon hole abstraction with temporal and event-oriented dimensions. In terms of formalisms in systems theory, it is perhaps closest to Petri nets in which the "tokens" moving between states are information-carrying objects such as digital bill jackets or votes or explanatory memoranda.
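
For readers who have not met Petri nets, here is a bare-bones sketch in which the tokens are small dictionaries. The place and transition names are hypothetical, and the rule that a firing merges its input tokens into one output token is an assumption made to keep the example short, not part of the formalism.

```python
from dataclasses import dataclass, field


@dataclass
class PetriNet:
    places: dict = field(default_factory=dict)       # place name -> list of tokens
    transitions: dict = field(default_factory=dict)  # name -> (input places, output places)

    def add_place(self, name):
        self.places[name] = []

    def add_transition(self, name, inputs, outputs):
        self.transitions[name] = (inputs, outputs)

    def enabled(self, name):
        """A transition is enabled when every input place holds a token."""
        inputs, _ = self.transitions[name]
        return all(self.places[place] for place in inputs)

    def fire(self, name):
        if not self.enabled(name):
            raise ValueError(f"transition {name!r} is not enabled")
        inputs, outputs = self.transitions[name]
        merged = {}
        for place in inputs:
            merged.update(self.places[place].pop(0))  # consume one token per input
        for place in outputs:
            self.places[place].append(dict(merged))   # produce one token per output


net = PetriNet()
for place in ("in-committee", "votes-cast", "reported-out"):
    net.add_place(place)
net.add_transition("record-vote", ["in-committee", "votes-cast"], ["reported-out"])

net.places["in-committee"].append({"bill": "hb-1"})
print(net.enabled("record-vote"))   # the vote has not arrived yet
net.places["votes-cast"].append({"ayes": 7, "nays": 2})
net.fire("record-vote")
print(net.places["reported-out"])
```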

So far, pretty much everything I have discussed in this KLISS series has been server-side focused. The next few posts will be client side focused. Next up: author/edit sub-systems in legislative environments.

New office in Lawrence, Kansas

Well, today I did the paperwork for our new office in Lawrence, Kansas. We move in at the start of September. Looking forward to further establishing relationships with various KU schools: Engineering, Law etc.

It's all about the smudges

There is a profound issue underlying this article on Documentation capturing from a legal perspective.

Unless we find ways of preserving work-in-progress in our digital world, we will be the first major civilization to leave behind no traces of how the great intellectual works it produced came to be. No pentimenti for the visual arts. No Scribbledehobbles for the literary arts.

This is not just a tweedy humanities issue. Fastidious recording of how written works come to say what they say, needs to be a central concern of democracy. Without it, there is no transparency. Democracy and the rule of law cannot work without transparency. A corpus of law is a bit like a humongous novel but unlike literary novels, it never gets finished. It is always a work in progress.

Friday, August 13, 2010

Lefty Day

Today is lefty day. Thanks to James Tauber for reminding me.

Personally, I don't mind the right-oriented college desks or the right-oriented scissors or the right-oriented tin opener. What really irks me is not being able to walk into a music shop and pick up the guitars or the banjos or the mandolins...