Featured Post
These days, I mostly post my tech musings on Linkedin. https://www.linkedin.com/in/seanmcgrath/
Friday, July 09, 2010
Tuesday, July 06, 2010
The end of print for law?
Bob Berring muses on the future of print for law and references the Book of Kells and Newgrange...
In the magnificent Long Room in my alma mater, Trinity College Dublin, the Book of Kells is on display and shockingly legible. By that I mean that it is a lot more legible than the text in the WordStar files on the CP/M-based 8-inch floppies in my basement. Even if I could read them (which I can't) they wouldn't be "real" in the sense that the real files were on other floppies that were used to create replicas. In the digital world, no document is ever "real" in the way that the Book of Kells is real. Everything is a best-efforts replica of something which is itself a replica...all the way down to what you saw on the screen at the moment of content creation, intermediated by an operating system, then a software application, then a display device driver...This is deeply worrying stuff if you are trying to write down content for the ages: be it sacred texts or legal texts. I spend a goodly amount of my time these days thinking about this in the context of law, law.gov, data.gov and of course, the KLISS project.
It is fitting, I think, to ponder this stuff and how it relates to law in the Irish countryside, because the Irish played an instrumental role in the creation of copyright law many, many moons ago. Cooldrumman, the location of the battle, is close to my house in Sligo, Ireland.
Saturday, July 03, 2010
KLISS, law and eDemocracy
I am roughly halfway through my high-level description of KLISS and the Legislative Enterprise Architecture that underpins it. It is the eve of the 4th of July Independence Day celebrations as I write this. It seems like an appropriate moment to step back from the detail a little and look at the bigger picture.
As a specialist in legal informatics, I cannot help but think of this historic time in terms of America's foundational documents, without which, the great enterprise known as "democracy and the rule of law" would simply not be possible. Missing my homeland of Ireland as I do from time-to-time; sitting in my home in Lawrence, Kansas; I cannot help but be drawn to the involvement of some generally forgotten Irish people in the events of 1776.
The Dunlap Broadsides, the first printed copies of the Declaration of Independence, were produced by an Irishman, John Dunlap, in 1776. Of the eight foreign-born signatories of the declaration, three were Irish: James Smith, George Taylor and Matthew Thornton.
I cannot help but marvel at the fact that 27 of the original 200 or so copies still exist. So too, of course, does the *real* declaration in the form of the engrossed parchment prepared by Timothy Matlack. It was itself copied from the drafts produced by the founding fathers on (probably) hemp paper of some description.
If you have been following along in this KLISS series you will probably be sensing where I am going with this. The drafts, the engrossed version, the promulgated copies...establishing the relationships between these artifacts is critical to establishing the laws/regulations of the land. It is critical because there can be - and there often is - ambiguity and room for disagreement as to what the law actually means. Law is a very complicated business after all. As a society, we can find ways to deal with that complexity as long as there is no ambiguity as to what the law actually says in terms of the text of the language itself. Once we have that, at least we are all arguing (or zealously advocating) different takes on the same thing. If we start arguing for different takes on different things, chaos reigns.
In the case of the declaration, thankfully, we are in good shape. The Dunlap broadsides are unambiguously copies, not the original. The hemp drafts of Thomas Jefferson are "just" drafts (fantastically important for historical research but not the real thing from a legal perspective). The real thing is the engrossed parchment prepared by Timothy Matlack, and signed by each of the founding fathers. That is why, for example, debates about the accuracy of the Jefferson Memorial can be resolved. The placement of commas can be compared with the for-reference original: the parchment. As for whether or not Jefferson intended "inalienable" rather than "unalienable", the intent is something we can and should be able to argue over in a civil society as long as we can look at the engrossed version and see one or the other unambiguously present.
The ancient Romans seemed to understand the importance of non-ambiguity of legal text well. Although they had early forms of paper, knew how to write on animal skin and knew how to make clay tablets, they chose to "engross" their foundational legal text, the Twelve Tables, by engraving them on ivory: something that would withstand fire better than paper, is harder to tamper with than a clay tablet, is smudge resistant...
Removing ambiguities as to the for-reference original text of law is vital for another reason. Law, although it is not expressed mathematically or interpreted via formal logic, is very much based on mathematical concepts: induction, deduction, the law of the excluded middle, contravalence etc. In particular, it shares with mathematics the concept of axioms: foundational, self-evident truths from which further truths can be derived and against which assertions of truth can be tested.
Historical documents show that both Jefferson and Adams were familiar with Euclid's axioms, as was Abraham Lincoln. The Euclidean overtones in phrases like "We hold these truths to be self-evident" (Declaration of Independence) and "...dedicated to the proposition that all men are created equal." (Gettysburg Address) are striking indeed.
It is very easy to arrive at bad results in mathematics if your starting assumptions – the axioms – are wrong. So too in law. Law builds on itself just as mathematics builds on itself. It is accretive. Thanks to legal principles like stare decisis, interpretation of the law is itself accretive because caselaw builds on caselaw...any ambiguities that creep into the vast self-supported edifice of law are bad for the rule of law. (I hold that to be self-evident:-)
Looking back at the history of law and the history of democracy, I think we have reached an inflection point. Something *big* is about to happen I suspect. I am not sure what shape it will take but here are the drivers as I see them:
- The volume of law – including all the material used in adjudicating on and practicing law - is growing exponentially.
- In practice, because of the sheer volume (and some other reasons) the copies of law used in the practice of law and cited in court are often "owned" by commercial third parties who amass all the material into private repositories.
- Even if the text of legal materials is not owned/claimed by a commercial entity, the citation mechanisms can be. E.g. page numbers of case law publications or consolidations/re-statements of specific areas of law.
Now into this world, over the last two decades or so, comes the Internet and the Web in particular. It has so much to offer the world of law (and the world of democracy) that tensions between the "old world" and the new are mounting fast.
A quiet revolution is taking shape. Citizens are now armed with their knowledge of instantaneous publishing via blogging or Google Docs or Facebook. They are armed with knowledge of instantaneous search via Google or Bing. They are armed with knowledge of instantaneous revision with revision history via Wikipedia. They are armed with knowledge of hyperlinks for instantaneous follow-up of citations. They expect video to be instantly available on YouTube or blip.tv...When these citizens look at how laws/regulations are made today and how formal meetings are conducted today and how content that should be free (i.e. the laws/regulations of the land) is either hidden behind paywalls or only available in hard copy or buried deep inside large PDFs or 2 weeks out of date...
Something has got to give. Especially if you tell these citizens that they must abide by all these laws/regulations. Also, because they live in a participative democracy, they can get involved in shaping those laws and are entitled to free and unfettered access to the process of making law...The gulf between the feature-set of the Web-world for this sort of activity i.e. participation & publishing versus the existing "feature-set" of the status quo for law/regulation-making is so striking.
It seems to me that the world of law is somewhat like the worlds of news or music or of TV. For many years they fought against the Internet but have now finally started to embrace it. The Internet is an amazing force. So far the number of areas of human endeavor that have resisted its advances successfully stands at 0 and counting. I believe that the world of law/regulation-making is next up for a significant, world changing transition to the Web. It certainly is not as sexy as the world of music or sports news or TV shows but in a democracy, I cannot think of any one thing that is more important. I cannot think of anything that should be more free than the law and the ability to participate in its creation.
Although I am overwhelmingly positive in my outlook on what the Web will do for law and for democracy, there are some negatives. My primary concern is in the area of reference copies of the law. That concern I hope is evident from my opening remarks in this post. The reference copy of law is no longer etched onto ivory or engrossed onto animal skin. The sheer volume of law makes that impossible anyway. In recent decades acid-free paper, non-fugitive inks and master copies kept in safes in the offices of Secretaries of State have substituted.
Nowadays, many law-producing entities such as legislatures/parliaments, agencies, courts are moving away from having heads of state sign or initial vellum sheets towards treating electronic legal artifacts as authentic. This, quite frankly, scares me as I believe I know enough about technology to know all the possible ways in which digital data can be compromised between producer and consumer and can degrade over time. (I talked about some of them earlier in this KLISS series.)
The folks who are making this transition to digital are well intentioned and are seeking to take advantage of the Web to better serve their citizens. I'm all for that obviously. However, I do worry that the language of information technology creates incorrect assumptions in the minds of those not versed in the details of how digital machines actually work. A digital signature is really nothing like a real signature. An e-mail really is not like snail-mail at all because nothing ever gets sent. Everything is a copy – with all the issues that copies bring...The word "authentic" is so much more slippery in a digital world.
Having sounded that note of caution, let me end by saying I truly believe we live in profound times from the perspective of democracy. The Web can - and will - fundamentally change how we think about participative democracy and the process of making laws and regulations. We now have all the individual pieces of technology (I have mentioned most of them already in this KLISS series) we need. No new breakthrough algorithms or devices are required. We just need to assemble everything coherently. It is now a matter of design - not a matter of research.
We are on a fascinating road to a different world; we will get there via some disruptive technologies and disruptive memes. Not everyone will be best pleased but if the history of the Internet tells us anything it is that resistance - once all the stars are aligned - is futile. Better to be part of it rather than fight against it. Better to help shape it and drive it forward than simply react to it.
In KLISS, I have been lucky enough to contribute to an initiative that strives to fully embrace technology for the betterment of democracy and the transparent making of law that it depends on.
I look forward to doing my bit going forward to ensure that the compelling vision of KLISS is realized and sharing the design and our experiences with anybody who is interested in it.
Next up: The KLISS workflow model
Thursday, July 01, 2010
The Point-in-time issue. A stock exchange example
In a recent post, I talked about the importance of temporal decoupling and point-in-time stamping of data in our increasingly lightning-fast-yet-fundamentally-asynchronous world...
In that context, this post about the recent stock market flash crash is interesting.
Tuesday, June 29, 2010
Data models, data organization and why the search for the "correct" model is doomed
I have received some e-mails about my assertion that there is no such thing as the "correct" way to model anything in a computer system. I.e. no "pure" model that does not gain its correctness status via mere engineering concerns such as fitness-for-purpose.
My argument boils down to this:
- to model anything in software you need a human
- that human needs to carve up reality in some way in order to create a model. I.e. name things, classify things, link things to other things, distinguish causes and effects, distinguish entities from actions, declare some aspects of reality "unimportant", create a model boundary etc.
- no two humans carve up reality in exactly the same way as we are all unique creatures whose view of the world is influenced by our language, culture, experiences etc.
- therefore, no two models are likely to be exactly the same
- even if they appeared to be the same, there is no way to be sure because human language is lossy. I.e. there is no way to be sure that the model I have in my head is what I have communicated through language. As Wittgenstein said, some things cannot be said - they can only be shown. In Zen terms, our words are just fingers pointing at the moon.
The best book I have read on this subject - highly recommended - is Bill Kent's Data and Reality.
Kent looks at the world from a relational database perspective. A couple of articles from my scribenatorial past might be of interest. They look at the world from a - surprise - XML perspective:
Next up: KLISS, Law and eDemocracy.
Saturday, June 26, 2010
KLISS: Organizing legislative material in legislatures/parliaments
Last time in this KLISS series, I talked about the event model in KLISS. I also talked about how it works in concert with the "time machine" model to achieve information consistency in all the "views" of legislative information required for a functioning legislature/parliament. For example a bill statute view, a journal view, a calendar view, an amendment list view, a committee view etc...
I am using the word "view" here in a somewhat unusual way so I would like today to explain what I mean by it. Doing that will help set the scene for an explanation of how legislative/parliamentary assets are organized in the KLISS repository and how metadata-based search/retrieval over the repository works.
It goes without saying (but I need to say it in order to communicate that it need not be said (ain't language wonderful?)), that legislatures/parliaments produce and consume vast amounts of information, mostly in document form. What is the purpose of the documents? What are they for really? In my view, they serve as snapshot containers for the fundamental business process of legislatures/parliaments, which is the making of law. In other words, a document in a legislature is a business process, snapshotted, frozen at a point in time.
By now, if you have been reading along in this KLISS series, you will know that it is very much a document-centric architecture. The documents themselves, in all their presentation-entangled, semi-structured glory, are treated as the primary content. We create folders, and folders inside folders. We create documents with headings and headings inside headings and we put these into folders. We then blur the distinction between folder navigation (inter-document) and heading "outline" navigation (intra-document) so that the whole corpus can be conceptualized as a single hierarchical information store. The entire state of a legislature/parliament is, in KLISS, *itself* a document – albeit a very large one! Simply put, KLISS does not care about the distinction between a folder and a heading. They are both simply hierarchical container constructs.
In KLISS a "view" is simply a time-based snapshot generated from the enormous document that is the repository, seen at a point in time, in some required format. So, a PDF of a bill is such a snapshot view. So too is the HTML page of a committee report, a journal, a corpus of promulgated law etc. HTML, PDF, CSV, they are all the same in the KLISS information model. They are just views, taken at a point in time, out of the corpus as a whole.
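To make the folder/heading blurring concrete, here is a minimal Python sketch. It is an illustration only, not KLISS code: the `Container` class, the `resolve` method and the sample paths are all assumptions invented for this example. The point it shows is that one path walk can cross from folders into a document and on into its headings without ever asking which kind of container it is standing in.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    """A hierarchical container: stands in for a folder OR a heading (hypothetical)."""
    name: str
    text: str = ""
    children: dict = field(default_factory=dict)

    def add(self, child):
        """Attach a child container and return it, so calls can be chained."""
        self.children[child.name] = child
        return child

    def resolve(self, path):
        """Walk a slash-separated path; folders and headings are traversed identically."""
        node = self
        for part in path.strip("/").split("/"):
            if part:
                node = node.children[part]
        return node


# Hypothetical corpus: two "folders", then a "document", then a "heading" inside it.
root = Container("")
bills = root.add(Container("bills"))
session = bills.add(Container("2010"))
bill = session.add(Container("hb2001"))  # plays the role of a document
sec1 = bill.add(Container("section-1", text="Be it enacted..."))  # a heading in it

found = root.resolve("/bills/2010/hb2001/section-1")
print(found.text)
```

Nothing in `resolve` knows where the folder hierarchy ends and the document outline begins, which is the sense in which the whole corpus behaves like one very large document.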
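A rough Python sketch of that idea follows. Everything in it is assumed for illustration: the `HISTORY` log, the use of a revision number as the "point in time", and the two output formats. It is not how KLISS renders views; it only shows a view as a function of three things: which resource, as of when, in what format.

```python
import json
from html import escape

# Hypothetical revision log, ordered by revision: (revision, path, content).
HISTORY = [
    (1, "/bills/hb2001", "A bill concerning roads."),
    (2, "/bills/hb2002", "A bill concerning schools."),
    (3, "/bills/hb2001", "A bill concerning roads and bridges."),
]


def snapshot(history, revision):
    """Return the state of the corpus as it stood at the given revision."""
    state = {}
    for rev, path, content in history:
        if rev <= revision:
            state[path] = content  # later revisions overwrite earlier ones
    return state


def view(history, path, revision, fmt):
    """Render one resource, at one point in time, in one format."""
    content = snapshot(history, revision)[path]
    if fmt == "html":
        return f"<p>{escape(content)}</p>"
    if fmt == "json":
        return json.dumps({"path": path, "revision": revision, "content": content})
    raise ValueError(f"unsupported format: {fmt}")


print(view(HISTORY, "/bills/hb2001", 2, "html"))
print(view(HISTORY, "/bills/hb2001", 3, "json"))
```

The same bill yields different text at revision 2 and revision 3, and the format is just the last step of the rendering, not a different kind of thing.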
Earlier in this series I talked about how the web blurs the distinction between naming something to pick it out and performing a query to pick it out. KLISS takes advantage of that blurring in the creation of views. So much so that a consumer of a KLISS URI cannot tell if the resource being picked out is "really there" or the result of running a query against the repository.
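As a small illustration of that blurring, consider the following Python sketch. The `STORED` and `QUERIES` tables, the URIs and the `get` function are all hypothetical names made up for this example; the only point is that the caller gets text back either way and has no means of telling which path ran.

```python
# Hypothetical repository contents: URIs that correspond to stored documents.
STORED = {
    "/bills/hb2001": "A bill concerning roads.",
    "/bills/hb2002": "A bill concerning schools.",
}


def all_bills():
    """A query: assemble a listing from whatever bills are stored right now."""
    return "\n".join(STORED[uri] for uri in sorted(STORED) if uri.startswith("/bills/"))


# Hypothetical URIs that are answered by running a query instead of reading a document.
QUERIES = {"/views/all-bills": all_bills}


def get(uri):
    """Resolve a URI; the result looks the same whether it was stored or computed."""
    if uri in STORED:
        return STORED[uri]
    if uri in QUERIES:
        return QUERIES[uri]()
    raise KeyError(uri)


print(get("/bills/hb2001"))
print(get("/views/all-bills"))
```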
The hierarchical information model in KLISS has been strongly influenced by Herbert Simon and his essay The Architecture of Complexity. The view/query model is a sort of mashup of ideas from Bertrand Russell (proper nouns as query expressions) and Saul Kripke (rigid designators) combined with the Web Architecture of Sir Tim Berners-Lee.
The most trivial views over the KLISS repository are those that correspond to real bytes-on-the-disk documents. Bills are generally like that. So too are votes. So too are sections of statute. Another level of views are those generated dynamically by assembling documents into larger documents. Volumes of statute are like that. Journals are like that. Once assembled, these documents often go back into the repository as real bytes-on-the-disk documents. This creates a permanent record of the result of the assembly process but it also allows the assemblies to be themselves part of further assemblies. Permanent journals are like that. Final calendars are like that. Chronologies of statutes are like that.
Yet another level of views are those generated from the KLISS meta-data model...In KLISS, any document in the system can have any number of property/value pairs associated with it. When transactions are stored in the repository, these property/value pairs are loaded into a relational database behind the scenes. This relational database is used by the query subsystem to provide fast, ordered views over the repository. The sort of queries enabled are things like:
Give me all the bill amendments tabled between dates X and Y
Give me all the sponsors for all bills referred to the Agriculture committee last session
Give me all bills with the word "consolidation" in their long titles
How many enrolled bills have we so far this session?
etc.
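Queries of this kind can be sketched with Python's standard `sqlite3` module. The sketch below is a simplification under stated assumptions: it uses a single generic property/value table called `meta`, and the property names (`doctype`, `status`, `long_title`, `tabled_on`), the URIs and the sample rows are all invented for illustration. It is not the KLISS schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE meta (uri TEXT, revision INTEGER, prop TEXT, value TEXT)")

# Hypothetical property/value pairs, one row per pair.
rows = [
    ("/bills/hb2001", 1, "doctype", "bill"),
    ("/bills/hb2001", 1, "status", "enrolled"),
    ("/bills/hb2001", 1, "long_title", "An act concerning consolidation of road districts"),
    ("/bills/hb2002", 1, "doctype", "bill"),
    ("/bills/hb2002", 1, "status", "introduced"),
    ("/bills/hb2002", 1, "long_title", "An act concerning school finance"),
    ("/amendments/a1", 1, "doctype", "amendment"),
    ("/amendments/a1", 1, "tabled_on", "2010-03-02"),
    ("/amendments/a2", 1, "doctype", "amendment"),
    ("/amendments/a2", 1, "tabled_on", "2010-05-20"),
]
conn.executemany("INSERT INTO meta VALUES (?, ?, ?, ?)", rows)


def uris_matching(doctype, prop, condition, params):
    """URIs of the given doctype whose `prop` value satisfies a SQL condition on t.value."""
    sql = f"""
        SELECT d.uri FROM meta AS d
        JOIN meta AS t ON t.uri = d.uri AND t.revision = d.revision
        WHERE d.prop = 'doctype' AND d.value = ?
          AND t.prop = ? AND {condition}
        ORDER BY d.uri
    """
    return [row[0] for row in conn.execute(sql, (doctype, prop, *params))]


# "Give me all the bill amendments tabled between dates X and Y"
# (ISO dates stored as text compare correctly as strings).
tabled = uris_matching("amendment", "tabled_on", "t.value BETWEEN ? AND ?",
                       ("2010-01-01", "2010-03-31"))

# "Give me all bills with the word 'consolidation' in their long titles"
titled = uris_matching("bill", "long_title", "t.value LIKE ?", ("%consolidation%",))

# "How many enrolled bills have we so far this session?"
enrolled = len(uris_matching("bill", "status", "t.value = ?", ("enrolled",)))

print(tabled, titled, enrolled)
```

The `condition` fragment is supplied by the code itself, never by user input; only the values travel as bound parameters.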
At this point I need to point out that although we use a relational database as the meta-data indexer/query engine in KLISS, we do not use it relationally. This is by design. At this core level of the persistence model, we are not modeling relationships *between* documents. Other levels provide that function (we will get to them later on). Effectively what we do is utilize a star schema in which (URI+Revision Number) is the key used to join together all the metadata key/value pairs. The tabular structure of the meta-data fields is achieved via a meta-modeling trick in which the syntax of the field name indicates what table, what field and what field type should be used for the associated value. In the future, we expect that we will gravitate away from relational back-ends towards the non-relational stores that are thankfully, finally, beginning to become commonplace.
It is important to note that in KLISS, the meta-data database is not a normative source of information. The master copy of all data is, at all times, in the documents themselves. The metadata is stored in the documents themselves (the topic of an upcoming post). The database is constructed from the documents in order to serve search and retrieval needs. That is all. In fact, the database can be blown away and simply re-created by replaying the transactions from the KLISS time machine. I sometimes explain it by saying we use a database in the same way that a music collection application might use a database. Its purpose is to facilitate rapid slicing/dicing/viewing via meta-data.
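The post does not spell out the field-name syntax, so the following Python sketch has to invent one. It assumes a purely hypothetical convention of `table.field:type` (for example `bill.sponsor:text`) and shows how such a name could route a value to a table and column, with every table sharing the (URI, revision) key so that they can be joined on it. Treat the convention, the type map and the function names as assumptions, not as the KLISS design.

```python
import sqlite3

# Hypothetical mapping from the type suffix in a field name to a SQL column type.
SQL_TYPES = {"text": "TEXT", "int": "INTEGER", "date": "TEXT"}
KEY_COLUMNS = ("uri", "revision")


def parse_field_name(name):
    """Split an assumed 'table.field:type' name into (table, field, SQL type)."""
    table_and_field, _, type_name = name.partition(":")
    table, _, field = table_and_field.partition(".")
    valid = table.isidentifier() and field.isidentifier() and type_name in SQL_TYPES
    if not valid or field in KEY_COLUMNS:
        raise ValueError(f"malformed field name: {name!r}")
    return table, field, SQL_TYPES[type_name]


def store(conn, uri, revision, properties):
    """Route each property/value pair to the table and column its name encodes."""
    for name, value in properties.items():
        table, field, sql_type = parse_field_name(name)
        # Every table is keyed by (uri, revision), the join key named in the post.
        conn.execute(
            f'CREATE TABLE IF NOT EXISTS "{table}" '
            "(uri TEXT, revision INTEGER, PRIMARY KEY (uri, revision))"
        )
        columns = {row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')}
        if field not in columns:
            conn.execute(f'ALTER TABLE "{table}" ADD COLUMN "{field}" {sql_type}')
        conn.execute(
            f'INSERT OR IGNORE INTO "{table}" (uri, revision) VALUES (?, ?)',
            (uri, revision),
        )
        conn.execute(
            f'UPDATE "{table}" SET "{field}" = ? WHERE uri = ? AND revision = ?',
            (value, uri, revision),
        )


conn = sqlite3.connect(":memory:")
store(conn, "/bills/hb2001", 3, {
    "bill.sponsor:text": "Smith",
    "bill.fiscal_note_pages:int": 4,
    "committee.name:text": "Agriculture",
})
joined = conn.execute(
    "SELECT b.sponsor, b.fiscal_note_pages, c.name FROM bill AS b "
    "JOIN committee AS c ON c.uri = b.uri AND c.revision = b.revision"
).fetchone()
print(joined)
```

Because the schema is derived from the field names, adding a new kind of metadata needs no migration step; a new name simply brings its table or column into being the first time it is seen.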
This brings me to the most important point about how information is organized in KLISS. Lets step all the way back for a moment. Why do us humans organize stuff at all? We organize in order to find it again. In other words, organization is not the point of organization. Retrieval is the point of organization. Organization is something we do now, in anticipation of facilitating retrieval in the future. For most of human history, this has meant creating an organizational structure and packing stuff physically into that structure. Shoe closets, cities, pockets, airplanes, filing cabinets, filo-faxes, bookshelves, dewey decimal classification...
As David Weinberger explains in his book "Everything is Miscellaneous", there is no need for a single organizational structure for electronic information. A digital book does not need exactly one shelf on one wall, classified under one dominant heading. It can be on many shelfs, on may walls under many headings, in many ontologies, all at the same time. In fact, it can be exploded into pieces, mashed up with other books and represented in any order, in any format, any where and any time. Not only is this possible thanks to IT, it cannot be stopped. All known attempts – and their have been numerous – since the dawn of IT have failed to put the organization genie back in the bottle...
Having said that, the tyranny of the dominant decomposition appears, per Herbert Simon to be woven into the fabric of the universe. In order to store information – even electronically - we must *pick* at least some organizational structure to get us started. At the very least, things need to have names right? Ok. What form will those names take...Ten minutes into that train of thought and you have a decomposition on your hands. So what decomposition will be pick for our legislative/parliamentary materials? Do committees contain bills or do bills contain committees? Is a joint committee part of the house data model or part of the senate data model or both? Are bill drafts stored with the sponsor or with the drafter? Are committee reports part of the committee that created them or part of the bills they modify? etc. etc...One hour later, you are in a mereotopology induced coma. You keep searching for the perfect decomposition. If you are in luck, you conclude that there is no such thing as the perfect decomposition and you get on with your life. If you are unlucky, you get drafted into a committee that has to decide on the correct decomposition.
Fact of life: If there are N people in a group tasked with deciding an information model, there are exactly N, mutually incompatible models vying for dominance and each of the N participants is convinced that the other N-1 models are less correct than their own. Legislatures/parliaments provide and excellent example of this phenomenon. Fill a room with drafting attorneys, bill status clerks, journal clerks, committee secretaries, fiscal analysts and ask each of them to white-board their model of, for example bills, you will get as many models as there are people in the room.
That is why, in KLISS, by design, the information model – how it carves up into documents versus folders, paragraphs versus meta-data fields, queries versus bytes-on-the-disk does not really matter. Just pick one! There are many, many models that can work. Given a set of models that will work, there is generally no compelling reason to pick any particular one. In legislatures/parliaments – as in many other content-centric applications the word "correct" needs a pragmatic definition. In KLISS, we consider an information model to be "correct" if it supports the efficient, secure production of the required outputs with the required speed of production. That is essentially it. Everything else is secondary and much of it is just mereotopology.
Two more quick things before I wrap up for today. You may be thinking, "how can a single folder structure hope to meet the divergent needs of all the different stakeholders who likely have different models in their head for how the information should be structured?" The way KLISS does it is that we create synthetic folder structures – known as "virtual views" – over the physical folder structure. That allows us to create the illusion – on a role by role basis – that each group's preferred structure is the one the system uses :-)
As well as helping to create familiar folder structures on a role-by-role basis, virtual views also allow us to implement role based access control. Every role in the system uses a virtual view. Moreover, all event notifications use the virtual views and all attempted access to assets in the repository are filtered through the users virtual view - that includes all search results.
To sum up...KLISS uses a virtualized hierarchical information model combined with property/value pairs arranged in a star-schema fashion. Properties are indexed for fast retrieval and based on scalar data types that we leverage for query operators e.g. date expression evaluation, comparisons of money amounts etc. The metadata model is revision based and the repository transaction semantics guarantee that the metadata view is up to date with respect to the time machine view at all times. All event notifications use the virtual view names for assets.
You may be wondering, "is it possible to have a document with no content other than metadata?". The answer is "yes". That is exactly how we reify non-document concepts like committees, members, roles etc. into document form for storage in the time machine. Yes, in KLISS, *everything* is a document:-)
Next up: Data models, data organization and why the search for the "correct" model is doomed.
I am using the word "view" here in a somewhat unusual way, so today I would like to explain what I mean by it. Doing that will help set the scene for an explanation of how legislative/parliamentary assets are organized in the KLISS repository and how metadata-based search/retrieval over the repository works.
It goes without saying (but I need to say it in order to communicate that it need not be said (ain't language wonderful?)), that legislatures/parliaments produce and consume vast amounts of information, mostly in document form. What is the purpose of the documents? What are they for really? In my view, they serve as snapshot containers for the fundamental business process of legislatures/parliaments, which is the making of law. In other words, a document in a legislature is a business process, snapshotted, frozen at a point in time.
By now, if you have been reading along in this KLISS series, you will know that it is very much a document-centric architecture. The documents themselves, in all their presentation-entangled, semi-structured glory, are treated as the primary content. We create folders, and folders inside folders. We create documents with headings and headings inside headings and we put these into folders. We then blur the distinction between folder navigation (inter-document) and heading "outline" navigation (intra-document) so that the whole corpus can be conceptualized as a single hierarchical information store. The entire state of a legislature/parliament is, in KLISS, *itself* a document – albeit a very large one! Simply put, KLISS does not care about the distinction between a folder and a heading. They are both simply hierarchical container constructs.
In KLISS a "view" is simply a time-based snapshot generated from the enormous document that is the repository, seen at a point in time, in some required format. So, a PDF of a bill is such a snapshot view. So too is the HTML page of a committee report, a journal, a corpus of promulgated law etc. HTML, PDF, CSV: they are all the same in the KLISS information model. They are just views, taken at a point in time, out of the corpus as a whole.
Earlier in this series I talked about how the web blurs the distinction between naming something to pick it out and performing a query to pick it out. KLISS takes advantage of that blurring in the creation of views. So much so that a consumer of a KLISS URI cannot tell if the resource being picked out is "really there" or the result of running a query against the repository.
The hierarchical information model in KLISS has been strongly influenced by Herbert Simon and his essay The Architecture of Complexity. The view/query model is a sort of mashup of ideas from Bertrand Russell (proper nouns as query expressions) and Saul Kripke (rigid designators) combined with the Web Architecture of Sir Tim Berners-Lee.
The most trivial views over the KLISS repository are those that correspond to real bytes-on-the-disk documents. Bills are generally like that. So too are votes. So too are sections of statute. Another level of views are those generated dynamically by assembling documents into larger documents. Volumes of statute are like that. Journals are like that. Once assembled, these documents often go back into the repository as real bytes-on-the-disk documents. This creates a permanent record of the result of the assembly process but it also allows the assemblies themselves to be part of further assemblies. Permanent journals are like that. Final calendars are like that. Chronologies of statutes are like that.
Yet another level of views are those generated from the KLISS meta-data model...In KLISS, any document in the system can have any number of property/value pairs associated with it. When transactions are stored in the repository, these property/value pairs are loaded into a relational database behind the scenes. This relational database is used by the query subsystem to provide fast, ordered views over the repository. The sort of queries enabled are things like:
- Give me all the bill amendments tabled between dates X and Y
- Give me all the sponsors for all bills referred to the Agriculture committee last session
- Give me all bills with the word "consolidation" in their long titles
- How many enrolled bills have we so far this session?
- etc.
At this point I need to point out that although we use a relational database as the meta-data indexer/query engine in KLISS, we do not use it relationally. This is by design. At this core level of the persistence model, we are not modeling relationships *between* documents. Other levels provide that function (we will get to them later on). Effectively what we do is utilize a star schema in which (URI + revision number) is the key used to join together all the metadata key/value pairs. The tabular structure of the meta-data fields is achieved via a meta-modeling trick in which the syntax of the field name indicates what table, what field and what field type should be used for the associated value. In the future, we expect that we will gravitate away from relational back-ends towards the non-relational stores that are thankfully, finally, beginning to become commonplace.
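To make the star-schema idea concrete, here is a minimal sketch in Python using the standard `sqlite3` module. It is not the KLISS schema: the `name:type` field-name convention, the table names, the URIs and the property names are all assumptions invented for illustration. What it does show is the shape of the thing: one typed table per value type, every row keyed by (URI, revision), and a query that joins on that key.

```python
import sqlite3

# Hypothetical convention: a field named "<property>:<type>" is routed to the
# table (and column type) registered for <type>. Not the real KLISS syntax.
TYPE_TABLES = {
    "str": ("prop_str", "TEXT"),
    "int": ("prop_int", "INTEGER"),
    "date": ("prop_date", "TEXT"),  # ISO 8601 strings sort chronologically
}

def open_index():
    """Create an empty in-memory index with one table per value type."""
    db = sqlite3.connect(":memory:")
    for table, sql_type in TYPE_TABLES.values():
        db.execute(
            f"CREATE TABLE {table} (uri TEXT, rev INTEGER, name TEXT, value {sql_type})"
        )
    return db

def index_properties(db, uri, rev, props):
    """Load one document revision's property/value pairs into the typed tables."""
    for field, value in props.items():
        name, _, type_tag = field.partition(":")
        table, _ = TYPE_TABLES[type_tag]
        db.execute(f"INSERT INTO {table} VALUES (?, ?, ?, ?)", (uri, rev, name, value))

def amendments_tabled_between(db, start, end):
    """'Give me all the bill amendments tabled between dates X and Y'."""
    rows = db.execute(
        """SELECT d.uri, d.rev
             FROM prop_date d
             JOIN prop_str s ON s.uri = d.uri AND s.rev = d.rev
            WHERE s.name = 'doctype' AND s.value = 'amendment'
              AND d.name = 'tabled' AND d.value BETWEEN ? AND ?
            ORDER BY d.value""",
        (start, end),
    )
    return rows.fetchall()
```

Note that the only join is on (URI, revision): the tables say nothing about how one document relates to another, which is the sense in which the database is not being used relationally.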
It is important to note that in KLISS, the meta-data database is not a normative source of information. The master copy of all data is, at all times, in the documents themselves. The metadata is stored in the documents themselves (the topic of an upcoming post). The database is constructed from the documents in order to serve search and retrieval needs. That is all. In fact, the database can be blown away and simply re-created by replaying the transactions from the KLISS time machine. I sometimes explain it by saying we use a database in the same way that a music collection application might use a database. Its purpose is to facilitate rapid slicing/dicing/viewing via meta-data.
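The "blow it away and rebuild it" property can be sketched in a few lines of Python. The shape of a transaction here, a `(revision, uri, properties)` tuple, is purely an assumption for illustration; the point is only that the index is a pure function of the transaction history and holds nothing of its own.

```python
def rebuild_index(transactions):
    """Replay the transaction log in revision order and return a fresh index.

    `transactions` is a hypothetical iterable of (rev, uri, props) tuples,
    where `props` is the metadata carried by the document at that revision.
    """
    index = {}
    for rev, uri, props in sorted(transactions, key=lambda t: t[0]):
        index[(uri, rev)] = dict(props)
    return index

log = [
    (1, "/bills/hb2001", {"status": "introduced"}),
    (2, "/bills/hb2001", {"status": "referred"}),
]
index = rebuild_index(log)
index.clear()                 # lose the database entirely...
index = rebuild_index(log)    # ...and get it back by replaying the log
```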
This brings me to the most important point about how information is organized in KLISS. Let's step all the way back for a moment. Why do we humans organize stuff at all? We organize in order to find it again. In other words, organization is not the point of organization. Retrieval is the point of organization. Organization is something we do now, in anticipation of facilitating retrieval in the future. For most of human history, this has meant creating an organizational structure and packing stuff physically into that structure. Shoe closets, cities, pockets, airplanes, filing cabinets, filo-faxes, bookshelves, Dewey Decimal Classification...
As David Weinberger explains in his book "Everything is Miscellaneous", there is no need for a single organizational structure for electronic information. A digital book does not need exactly one shelf on one wall, classified under one dominant heading. It can be on many shelves, on many walls, under many headings, in many ontologies, all at the same time. In fact, it can be exploded into pieces, mashed up with other books and represented in any order, in any format, anywhere and any time. Not only is this possible thanks to IT, it cannot be stopped. All known attempts – and there have been numerous – since the dawn of IT have failed to put the organization genie back in the bottle...
Having said that, the tyranny of the dominant decomposition appears, per Herbert Simon, to be woven into the fabric of the universe. In order to store information – even electronically – we must *pick* at least some organizational structure to get us started. At the very least, things need to have names, right? Ok. What form will those names take...Ten minutes into that train of thought and you have a decomposition on your hands. So what decomposition will we pick for our legislative/parliamentary materials? Do committees contain bills or do bills contain committees? Is a joint committee part of the house data model or part of the senate data model or both? Are bill drafts stored with the sponsor or with the drafter? Are committee reports part of the committee that created them or part of the bills they modify? etc. etc...One hour later, you are in a mereotopology-induced coma. You keep searching for the perfect decomposition. If you are in luck, you conclude that there is no such thing as the perfect decomposition and you get on with your life. If you are unlucky, you get drafted into a committee that has to decide on the correct decomposition.
Fact of life: If there are N people in a group tasked with deciding an information model, there are exactly N mutually incompatible models vying for dominance and each of the N participants is convinced that the other N-1 models are less correct than their own. Legislatures/parliaments provide an excellent example of this phenomenon. Fill a room with drafting attorneys, bill status clerks, journal clerks, committee secretaries and fiscal analysts, ask each of them to white-board their model of, for example, bills, and you will get as many models as there are people in the room.
That is why, in KLISS, by design, the information model – how it carves up into documents versus folders, paragraphs versus meta-data fields, queries versus bytes-on-the-disk – does not really matter. Just pick one! There are many, many models that can work. Given a set of models that will work, there is generally no compelling reason to pick any particular one. In legislatures/parliaments – as in many other content-centric applications – the word "correct" needs a pragmatic definition. In KLISS, we consider an information model to be "correct" if it supports the efficient, secure production of the required outputs with the required speed of production. That is essentially it. Everything else is secondary and much of it is just mereotopology.
Two more quick things before I wrap up for today. You may be thinking, "how can a single folder structure hope to meet the divergent needs of all the different stakeholders who likely have different models in their head for how the information should be structured?" The way KLISS does it is that we create synthetic folder structures – known as "virtual views" – over the physical folder structure. That allows us to create the illusion – on a role-by-role basis – that each group's preferred structure is the one the system uses :-)
As well as helping to create familiar folder structures on a role-by-role basis, virtual views also allow us to implement role-based access control. Every role in the system uses a virtual view. Moreover, all event notifications use the virtual views and every attempted access to assets in the repository is filtered through the user's virtual view - that includes all search results.
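One way to picture a virtual view is as a per-role table that maps virtual folder prefixes onto physical ones. The Python sketch below is an assumption about how such a mapping could work, not a description of the KLISS implementation; the class name, the role and every path in it are invented. Anything that does not fall under one of the role's mounts simply does not exist as far as that role is concerned, which is where the access control falls out.

```python
class VirtualView:
    """A hypothetical per-role mapping of virtual folders onto physical folders."""

    def __init__(self, mounts):
        # mounts: virtual prefix -> physical prefix (illustrative paths only)
        self.mounts = dict(mounts)

    @staticmethod
    def _rebase(path, old_prefix, new_prefix):
        """Swap old_prefix for new_prefix, or return None if path is outside it."""
        if path == old_prefix or path.startswith(old_prefix + "/"):
            return new_prefix + path[len(old_prefix):]
        return None

    def to_physical(self, virtual_path):
        """Resolve a name the role sees into the real repository path."""
        for vprefix, pprefix in self.mounts.items():
            physical = self._rebase(virtual_path, vprefix, pprefix)
            if physical is not None:
                return physical
        return None  # outside the view: invisible to this role

    def to_virtual(self, physical_path):
        """Translate a repository path into the name this role knows it by."""
        for vprefix, pprefix in self.mounts.items():
            virtual = self._rebase(physical_path, pprefix, vprefix)
            if virtual is not None:
                return virtual
        return None

    def filter_results(self, physical_paths):
        """Drop hits the role may not see and rename the rest into its view."""
        renamed = (self.to_virtual(p) for p in physical_paths)
        return [v for v in renamed if v is not None]


journal_clerk = VirtualView({"/journal": "/chamber/house/journal"})
hits = ["/chamber/house/journal/2010-06-24.xml", "/drafts/confidential.xml"]
print(journal_clerk.filter_results(hits))  # -> ['/journal/2010-06-24.xml']
```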
To sum up...KLISS uses a virtualized hierarchical information model combined with property/value pairs arranged in a star-schema fashion. Properties are indexed for fast retrieval and based on scalar data types that we leverage for query operators e.g. date expression evaluation, comparisons of money amounts etc. The metadata model is revision based and the repository transaction semantics guarantee that the metadata view is up to date with respect to the time machine view at all times. All event notifications use the virtual view names for assets.
You may be wondering, "is it possible to have a document with no content other than metadata?". The answer is "yes". That is exactly how we reify non-document concepts like committees, members, roles etc. into document form for storage in the time machine. Yes, in KLISS, *everything* is a document:-)
Next up: Data models, data organization and why the search for the "correct" model is doomed.
Thursday, June 24, 2010
KLISS: The Eventing Model and the Consistency Model
Last time in this KLISS series (and the time before that), I concentrated on the concept of names for information assets. This seemingly peripheral concern is, in my view, critical in legislative informatics. I talked about how a well thought-out set of names, sitting on top of a "time machine" oriented persistence substrate, helps dramatically to meet many needs in legislatures/parliaments including rigorous citation and a rigorous transparency audit-trail. Happily, a name-oriented focus sits very nicely on top of the world wide web architecture and, in particular, sits nicely with RESTian system architectures. (If REST is new to you, you might be interested in starting with this article I wrote some years ago and the resources referenced at the end.)
In this installment on KLISS, I want to turn to the closely related concept of events and how it fits in with the time machine model in the KLISS architecture. When I say KLISS is a time machine I do not mean that KLISS sits there, recording what is happening in real time 24x7. The reason is that, for long stretches of time, nothing actually happens because nothing is going on inside the "black box" that is the legislature/parliament. When I say nothing is going on, I mean that nobody is doing anything. There are no actors acting inside the black box. Therefore, there is nothing to record into the time machine. We call this a quiescent system state. We are like Beckett, waiting for Godot...
Now, as soon as somebody *acts*, the time machine persistence layer captures the act itself as a transaction against the time machine. The act could be the introduction of a bill, the explanation of a vote, a point of personal privilege to be recorded in the journal, an update to the statute, a referral of a bill to a conference committee etc. Such acts rarely – if ever – stand alone. Picture an atom smasher. One "event" comes in and bang, many secondary events are triggered. These secondary events may trigger tertiary events and so on. Eventually, if there are no new primary events, the system quiesces again.
To ground what follows in a practical scenario, consider what happens when a new bill is introduced. Here is a representative series of events that might occur...
- A member in a chamber (the sponsor) gets permission to speak through the chair and announces the bill.
- The relevant chamber clerk "calls" for the bill from legislative council/revisors office/bills office.
- The event is recorded in the journal.
- The new bill is allocated a new identifier and added to bill status.
- Prints of the bill are called for, for each Member.
- PDF (and possibly HTML) versions are created and posted.
I am greatly generalizing and simplifying here, but I hope you can see how one event leads to a set of secondary events and how each of those secondary events may themselves produce more events.
At an IT architecture level, two main questions arise. Firstly, how do we arrange that all the interested entities get informed of the existence of a new event? Secondly, how do we arrange that the "views" of the state of the legislature are kept consistent across the various information assets that record the events, i.e. the bill status system, the PDF of the bill, the HTML of the bill, the PDF of the journal? KLISS achieves both using an asynchronous XML-based messaging backbone. Every time the time machine is changed, an event notification is sent out and all interested sub-systems have the chance to act as they wish. Any acts taken can themselves trigger *further events*, perhaps involving further transactions against the time machine.
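The cascade can be sketched as a toy, in-process notification loop in Python. This is only a stand-in for the real thing: KLISS uses an asynchronous XML messaging backbone, whereas here a `deque` plays the part of the message queue and plain dictionaries play the part of the XML messages. The event kinds, URIs and handler wiring are all hypothetical.

```python
from collections import deque

class Backbone:
    """Toy stand-in for the messaging backbone: commit, notify, cascade."""

    def __init__(self):
        self.subscribers = {}  # event kind -> list of handler callables
        self.queue = deque()   # notifications waiting to be delivered
        self.revision = 0      # revision number of the latest transaction
        self.log = []          # every event ever raised, in commit order

    def subscribe(self, kind, handler):
        self.subscribers.setdefault(kind, []).append(handler)

    def commit(self, kind, uri):
        """Record a transaction and queue a notification carrying its revision."""
        self.revision += 1
        event = {"kind": kind, "uri": uri, "rev": self.revision}
        self.log.append(event)
        self.queue.append(event)
        return event

    def run_until_quiescent(self):
        """Deliver notifications until nobody has anything left to do."""
        while self.queue:
            event = self.queue.popleft()
            for handler in self.subscribers.get(event["kind"], []):
                handler(self, event)  # a handler may commit, raising new events


bus = Backbone()
# Secondary events triggered by the introduction of a bill...
bus.subscribe("bill-introduced", lambda b, e: b.commit("journal-entry", "/journal/today"))
bus.subscribe("bill-introduced", lambda b, e: b.commit("status-updated", e["uri"] + "/status"))
# ...and a tertiary event triggered by one of the secondary ones.
bus.subscribe("status-updated", lambda b, e: b.commit("pdf-rendered", e["uri"] + ".pdf"))

bus.commit("bill-introduced", "/bills/hb2001")
bus.run_until_quiescent()
print([e["kind"] for e in bus.log])
# -> ['bill-introduced', 'journal-entry', 'status-updated', 'pdf-rendered']
```

One primary event goes in, three more come out, and the loop stops by itself once the queue is empty, which is the quiescent state described earlier.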
This takes place asynchronously. That is very important and I'd like to explain it as it is critical to the model. Classical "database-think" operates on ACID principles. This is problematic in legislatures/parliaments (in fact, I believe it is problematic in most document-centric domains) because to achieve overall information "consistency" I need to update bill status pages, generate PDFs of journals, convert bills to HTML and post them on websites, update the search indexes, push out the Twitter updates etc. There is simply too much to do for me to be able to lock the entire repository, update everything and then free the lock. Even if I could, it would create significant temporal coupling between sub-systems. Temporal coupling is, in general, bad news. What if one of my sub-systems (say PDF generation) is running slow because of load or maybe offline because of a fault? I cannot afford to wait around for it to become available. I cannot fail the transaction simply because some sub-system is not in a consistent state with respect to the rest of the sub-systems. What to do?
Remember when I talked about the time-machine repository and the fact that each change – each transaction – has a unique revision number? Remember how I mentioned that the URIs to retrieve content from the repository include the revision numbers? Well, every event sent out by KLISS includes the revision number of the transaction. That way, sub-systems that receive the event can look at the repository as it looked at that revision number, i.e. at the timestamp when the revision occurred. Think of an Automatic Teller Machine. You put in your card and ask for a balance on an account. Does the machine tell you the balance as it is *right now*? No. It tells you the balance as it stood the moment the query hit the ledgers. One millisecond later, a million dollars might have hit your account. Does that make the printout you got from the ATM incorrect? No, because the printout stipulates the timestamp at which the ledger query happened. Maybe it was two milliseconds ago. Maybe it was two years ago. It does not matter. The printout is correct because it is locked to a point in time by the timestamp printed on it.
KLISS works the same way: all bills, all journals, all bill statute pages, all aggregated publications, all hyperlinks...encode point-in-time information. All views of the time-machine basically say "I was run when the time-machine repository was at revision 1234. Everything you see on this page is correct as of revision 1234..."
This is critical because it removes a whole slew of otherwise very thorny problems. For example, what happens if I generate a page that tells me what bills are in committee X by looking into the time machine folder where the bills are stored? What if, 1 millisecond later, somebody moves the bill out of that committee? It doesn't matter because the first thing we do when generating any view of the repository is to find out what the current revision number is. Let's say it is revision 1234. All subsequent queries against the repository pass that revision number in. The view itself then displays a footer saying "Correct as of revision 1234 15:43, 2010010".
This model has a variety of names. Some call it idempotency and that is certainly part of it, i.e. given a URI with a revision number, KLISS will always, always, always return the same stream of bytes. It will never change. It is a classic candidate for a GET operation that has no side-effects on the information corpus. I prefer to use Werner Vogels' term "Eventually Consistent" to describe the model. KLISS allows individual sub-systems to update their views of the repository at their own speed. If all the events quiesce and all sub-systems are operational, then the complete vista of "views" over the repository contained in all the sub-systems will eventually also quiesce and be consistent with each other. During normal operation, it is to be expected that some sub-systems will be updated later than others but their views are never wrong – they are simply reflective of an older time-point. As well as Amazon's Werner Vogels, the writings of Pat Helland of Microsoft on this subject are worth reading. Bottom line: time is relative. You cannot really lock it down. Certainly for web-scale, distributed, federated systems there is no alternative but to embrace the relativity of time and work with it rather than fight against it. That is what KLISS does.
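Here is a small Python sketch of that "read the revision number once, then pin every query to it" discipline. The `TimeMachine` class is a deliberately naive assumption (it keeps a full copy of the repository state per revision, which no real system would do) and the folder names are invented; the part that matters is `committee_page`, which never asks the repository about "now" after its first line.

```python
class TimeMachine:
    """Naive revisioned store: history[rev] is the whole repository at rev."""

    def __init__(self):
        self.history = [{}]  # revision 0 is the empty repository

    @property
    def head(self):
        """The current (latest) revision number."""
        return len(self.history) - 1

    def commit(self, changes):
        """Apply {path: value} changes (None deletes) as one new revision."""
        state = dict(self.history[-1])
        for path, value in changes.items():
            if value is None:
                state.pop(path, None)
            else:
                state[path] = value
        self.history.append(state)
        return self.head

    def list_folder(self, folder, rev):
        """List the paths under `folder` as the repository stood at `rev`."""
        prefix = folder.rstrip("/") + "/"
        return sorted(p for p in self.history[rev] if p.startswith(prefix))


def committee_page(repo, folder, rev=None):
    """Build a view pinned to one revision and say so in the footer."""
    rev = repo.head if rev is None else rev  # find the revision first...
    bills = repo.list_folder(folder, rev)    # ...then pass it to every query
    return {"bills": bills, "footer": f"Correct as of revision {rev}"}


repo = TimeMachine()
repo.commit({"/committees/agriculture/hb2001": "bill text"})
page = committee_page(repo, "/committees/agriculture")
# A moment later the bill is moved to another committee:
repo.commit({"/committees/agriculture/hb2001": None,
             "/committees/judiciary/hb2001": "bill text"})
print(page["footer"])                                          # -> Correct as of revision 1
print(committee_page(repo, "/committees/agriculture", rev=1) == page)  # -> True
```

The page generated before the move is not wrong after the move. It is a correct statement about revision 1, and asking for revision 1 again will always reproduce it.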
One final point on temporal decoupling before I wrap up...KLISS uses both fire-and-forget and guaranteed-delivery messaging semantics. In English what that means is that a sub-system that may or may not be online, or may need to run slower than other sub-systems, never loses track of where the time machine is at. Messages generated for its attention are queued up and can be drawn down at as leisurely a pace as required. Sub-systems can be taken down for maintenance and spun back up. When they spin back up, any messages that they missed are sitting there queued up to be consumed whenever. This makes high availability of a system as large as KLISS significantly easier as there are very few reasons why the system would ever need to be off-line. Individual services may go off line but the core of KLISS itself just keeps on trucking... I think of it as Reed's End-to-End Argument applied at the application level. KLISS puts as little "smart" stuff in the center of the architecture as possible, leaving most of the customer-facing "smart" stuff out at the edges.
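The guaranteed-delivery half of this can be sketched as a message log plus one read cursor per subscriber. Again, this Python is an illustrative assumption rather than the KLISS messaging layer (which would need to persist both the log and the cursors); the class and subscriber names are invented.

```python
class DurableLog:
    """Hypothetical guaranteed-delivery queue: one log, a cursor per subscriber."""

    def __init__(self):
        self.messages = []  # every message ever published, in order
        self.cursors = {}   # subscriber name -> index of its next unread message

    def register(self, name):
        """Start tracking a subscriber from the current end of the log."""
        self.cursors.setdefault(name, len(self.messages))

    def publish(self, message):
        """Append a message; the publisher never waits for any subscriber."""
        self.messages.append(message)

    def draw_down(self, name, limit=None):
        """Hand a subscriber its backlog (or the next `limit` messages of it)."""
        start = self.cursors[name]
        end = len(self.messages)
        if limit is not None:
            end = min(start + limit, end)
        self.cursors[name] = end
        return self.messages[start:end]


log = DurableLog()
log.register("pdf-generator")
# The PDF generator is down for maintenance while three transactions commit:
for rev in (1, 2, 3):
    log.publish({"rev": rev})
# When it spins back up, nothing has been lost and it can go at its own pace:
print(log.draw_down("pdf-generator", limit=2))  # -> [{'rev': 1}, {'rev': 2}]
print(log.draw_down("pdf-generator"))           # -> [{'rev': 3}]
```

Because each message carries a revision number, a slow subscriber that works through its backlog produces views that are late but never wrong.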
By now, I hope you are beginning to see that we do not do content management in KLISS in the classical "static" model of simply storing stuff in a repository-of-the-now. In KLISS
One final point: the event-oriented model in KLISS can be usefully conceptualized in terms of a formalism known as Speech Acts. During analysis phases, I find it very useful to separate my illocutions from my perlocutions as it helps me see where secondary and indeed N-ary event cascades are likely to happen. If the concept of speech acts flips your switch (or aspirates your fricatives), you might be interested in this article on the subject.
Next up: Organizing legislative material in KLISS.
In this installment on KLISS, I want to turn to the closely related concept of events and how it fits in with the time machine model in the KLISS architecture. When I say KLISS is a time machine I do not mean that KLISS sits there, recording what is happening in real time 24x7. The reason being, that for long tracts of time, nothing actually happens because nothing is going on inside the "black box" that is the legislature/parliament. When I say nothing is going on, I mean that nobody is doing anything. There are no actors acting inside the black box. Therefore, there is nothing to record into the time machine. We call this a quiescent system state. We are like Beckett, waiting for Godot...
Now, as soon as somebody *acts*, the time machine persistence layer captures the act itself as a transaction against the time machine. The act could be the introduction of a bill, the explanation of a vote, a point of personal priviledge to be recorded in the journal, an update to the statute, a referral of a bill to a conference committee etc. Such acts rarely – if ever – stand alone. Picture an atom smasher. One "event" comes in and bang, many secondary events are triggered. These secondary events may trigger tertiary events and so on. Eventually, if there are no new primary events, the system quiesceses again.
To ground what follows in a practical scenario, consider what happens when a new bill is introduced. Here is a representative series of events that might occur...
- A member in a chamber (the sponsor) gets permission to speak through the chair and announces the bill.
- The relevant chamber clerk "calls" for the bill from legislative council/revisors office/bills office.
- The event is recorded in the journal.
- The new bill is allocated a new identifier and added to bill status.
- Prints of the bill are called for, for each Member.
- PDF (and possibly HTML) version are created and posted.
I am greatly generalizing and simplifying here, but I hope you can see how one event leads to a set of secondary events and how each of those secondary events may themselves produce more events.
At an IT architecture level, two main questions arise. Firstly, how do we arrange that all the interested entities get informed of the existence of a new event? Secondly, how do we arrange that the "views" of the state of the legislature are kept consistent across the various information assets that record the events? i.e. the the bill status system, the pdf of the bill, the HTML of the bill, the pdf of the journal. KLISS achieves both using an asynchronous XML-based messaging backbone. Every time the time machine is changed – an event notification is sent out and all interested sub-systems have the chance to act as they wish. Any acts taken can themselves trigger *further events* perhaps involving further transactions against the time machine.
This takes place asynchronously. That is very important and I'd like to explain it as it is critical to the model. Classical "database-think" operates on ACID principles. This is problematic in legislatures/parliaments (in fact, I believe it is problematic in most document-centric domains) because to achieve overall information "consistency" I need to update bill status pages, generate PDFs of journals, convert Bills to HTML and post them on websites, update the search indexes, push out the twitter updates etc. There is simply too much to do for me to be able to lock the entire repository, update everything and then free the lock. Even if I could, it would create significant temporal coupling between sub-systems. Temporal coupling is, in general, bad news. What if one of my sub-systems (say PDF generation) is running slow because of load or maybe offline because of a fault? I cannot afford to wait around for it to become available. I cannot fail the transaction simply because some sub-system is not in a consistent state with respect to the rest of the sub-systems. What to do?
Remember when I talked about the time-machine repository and the fact that each change – each transaction – has a unique revision number. Remember how I mentioned that the URIs to retrieve content from the repository include the revision numbers? Well, every event sent out by KLISS includes the revision number of the transaction. That way, sub-systems that receive the event can look at the repository as it looked at that revision number. i.e. at the timestamp when the revision occurred. Think of an Automatic Teller Machine. You put in your card and ask for a balance on an account. Does the machine tell you the balance as it is *right now*. No. It tells you the balance as it stood the moment the query hit the ledgers. One millisecond later, a million dollars might have hit your account. Does that make the printout you got from the ATM incorrect? No because the printout stipulates the timestamp that the ledger query happened. Maybe it was two milliseconds ago.Maybe it was two years ago. It does not matter. The printout is correct because it is locked to a point in time by the timestamp printed on it.
KLISS works the same way. All bills, all journals, all bill status pages, all aggregated publications, all hyperlinks...encode point-in-time information. All views of the time-machine basically say "I was run when the time-machine repository was at revision 1234. Everything you see on this page is correct as of revision 1234..."
This is critical because it removes a whole slew of otherwise very thorny problems. For example, what happens if I generate a page that tells me what bills are in committee X by looking into the time machine folder where the bills are stored? What if, 1 millisecond later, somebody moves a bill out of that committee? It doesn't matter because the first thing we do when generating any view of the repository is to find out what the current revision number is. Let's say it is revision 1234. All subsequent queries against the repository pass that revision number in. The view itself then displays a footer saying "Correct as of revision 1234 15:43, 2010010".
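To make the "pin the revision first" idea concrete, here is a minimal sketch in Python. It is not KLISS code. The class `TimeMachineRepo`, its methods, the `committees/X/...` folder layout and the function `bills_in_committee_view` are all hypothetical names invented for illustration, and the toy repository simply keeps a full copy of the content at every revision.

```python
class TimeMachineRepo:
    """Toy in-memory repository (hypothetical): every commit is a new revision."""

    def __init__(self):
        # Revision 0 is the empty repository.
        self._snapshots = [{}]

    def commit(self, changes):
        """Apply {path: content}; a content of None deletes the path.

        Returns the new revision number.
        """
        snapshot = dict(self._snapshots[-1])
        for path, content in changes.items():
            if content is None:
                snapshot.pop(path, None)
            else:
                snapshot[path] = content
        self._snapshots.append(snapshot)
        return len(self._snapshots) - 1

    def current_revision(self):
        return len(self._snapshots) - 1

    def list_folder(self, folder, revision):
        """List the paths under a folder as they stood at the given revision."""
        prefix = folder.rstrip("/") + "/"
        return sorted(p for p in self._snapshots[revision] if p.startswith(prefix))


def bills_in_committee_view(repo, committee):
    """Build a view pinned to one revision (assumed folder layout)."""
    # Step 1: find out the current revision number, once.
    revision = repo.current_revision()
    # Step 2: pass that revision into every subsequent query.
    bills = repo.list_folder(f"committees/{committee}", revision)
    return {
        "revision": revision,
        "bills": bills,
        "footer": f"Correct as of revision {revision}",
    }


repo = TimeMachineRepo()
repo.commit({"committees/X/hb1": "bill 1", "committees/X/hb2": "bill 2"})
view = bills_in_committee_view(repo, "X")
# A moment later somebody moves hb2 out of committee X.
repo.commit({"committees/X/hb2": None, "committees/Y/hb2": "bill 2"})
print(view["footer"], view["bills"])
```

The view built before the move still lists both bills, and that is fine: its footer says which revision it describes, and asking the repository the same question at that revision will always give the same answer.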
This model has a variety of names. Some call it idempotency and that is certainly part of it, i.e. given a URI with a revision number, KLISS will always, always, always return the same stream of bytes. It will never change. It is a classic candidate for a GET operation that has no side-effects on the information corpus. I prefer to use Werner Vogels' term "Eventually Consistent" to describe the model. KLISS allows individual sub-systems to update their views of the repository at their own speed. If all the events quiesce and all sub-systems are operational, then the complete vista of "views" over the repository contained in all the sub-systems will eventually also quiesce and be consistent with each other. During normal operation, it is to be expected that some sub-systems will be updated later than others but their views are never wrong – they are simply reflective of an older time-point. As well as Amazon's Werner Vogels, the writings of Pat Helland of Microsoft on this subject are worth reading. Bottom line: time is relative. You cannot really lock it down. Certainly for web-scale, distributed, federated systems there is no alternative but to embrace the relativity of time and work with it rather than fight against it. That is what KLISS does.
One final point on temporal decoupling before I wrap up...KLISS uses both fire-and-forget and guaranteed-delivery messaging semantics. In English, what that means is that a sub-system that may or may not be online, or may need to run slower than other sub-systems, never loses track of where the time machine is at. Messages generated for its attention are queued up and can be drawn down at as leisurely a pace as required. Sub-systems can be taken down for maintenance and spun back up. When they spin back up, any messages that they missed are sitting there queued up to be consumed whenever. This makes high availability of a system as large as KLISS significantly easier as there are very few reasons why the system would ever need to be offline. Individual services may go offline but the core of KLISS itself just keeps on trucking... I think of it as Reed's End-to-End Argument applied at the application level. KLISS puts as little "smart" stuff in the center of the architecture as possible, leaving most of the customer-facing "smart" stuff out at the edges.
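Here is a rough sketch of the queued, draw-down-at-your-own-pace idea in Python. Again, this is an illustration and not how KLISS is implemented. `EventBus` and `SubSystem` are hypothetical names, and the queues here live in memory, whereas a real guaranteed-delivery setup would keep them on persistent storage so that they survive restarts.

```python
from collections import deque


class EventBus:
    """Toy stand-in for guaranteed delivery: one queue per subscriber."""

    def __init__(self):
        self._queues = {}

    def subscribe(self, name):
        self._queues.setdefault(name, deque())

    def publish(self, revision, description):
        # Every event carries the revision number of the transaction.
        for queue in self._queues.values():
            queue.append({"revision": revision, "description": description})

    def drain(self, name, limit=None):
        """Hand over up to `limit` queued events, oldest first."""
        queue = self._queues[name]
        count = len(queue) if limit is None else min(limit, len(queue))
        return [queue.popleft() for _ in range(count)]


class SubSystem:
    """A consumer that catches up whenever it happens to be running."""

    def __init__(self, name, bus):
        self.name = name
        self.bus = bus
        self.seen_revision = 0
        bus.subscribe(name)

    def catch_up(self, limit=None):
        for event in self.bus.drain(self.name, limit):
            # A real sub-system would rebuild its view at this revision.
            self.seen_revision = event["revision"]
        return self.seen_revision


bus = EventBus()
search = SubSystem("search", bus)
pdf = SubSystem("pdf", bus)

for revision in (1, 2, 3):
    bus.publish(revision, f"change committed at revision {revision}")

search.catch_up()       # the search index keeps pace
pdf.catch_up(limit=1)   # PDF generation is slow today
print(search.seen_revision, pdf.seen_revision)
pdf.catch_up()          # later, it works through its backlog
print(search.seen_revision, pdf.seen_revision)
```

In between the two print calls the PDF sub-system is behind but not wrong: it reflects an older revision. Once it drains its queue, both sub-systems agree again, which is the "eventually consistent" behaviour described above.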
By now, I hope you are beginning to see that we do not do content management in KLISS in the classical "static" model of simply storing stuff in a repository-of-the-now. In KLISS:
- All content conforms to an enterprise information model. It is not just documents in folders. KLISS represents system actors and workflows and roles and committees etc. as *documents*.
- All content is part of the time-machine model. In KLISS moving a bill into a committee or changing the phone number of the speaker Pro Tem is precisely the same repository operation as updating a piece of statute.
- Any changes to the information model are timestamped and communicated via persistent, asynchronous messaging to all sub-systems, which can then use that timestamp to lock down time for their own interactions with the repository.
One final point: the event-oriented model in KLISS can be usefully conceptualized in terms of a formalism known as Speech Acts. During analysis phases, I find it very useful to separate my illocutions from my perlocutions as it helps me see where secondary and indeed N-ary event cascades are likely to happen. If the concept of speech acts flips your switch (or aspirates your fricatives), you might be interested in this article on the subject.
Next up: Organizing legislative material in KLISS.