Wednesday, September 21, 2011

Visual Tables and Meaning

John Lewis commented on my post about the thorny issue of visual tables and content semantics. John goes on to ask whether the tables can somehow be removed from legal or para-legal documents.

Unfortunately, I do not think so. The inter-weaving of pure content and semantics is too deep. Douglas Hofstadter's article in Scientific American about Knuth's METAFONT system is a great examination of how deep this problem really is. (The article is re-printed in Hofstadter's book Metamagical Themas. It does not appear to be online.) Hofstadter coined the term Ambigram to show how even simple typographic constructs can lead to interesting semantic ambiguities. I have seen some accidental ambigrams in legal texts over the years :-)

The field of mathematics has long struggled with this in its search for an executable representation of mathematical constructs. Some notations are so visual that it is difficult to see how their content could ever be usefully separated from their presentation. Penrose's tensor diagram notation is one example. In extreme cases, the presentation is the content. No wonder that TeX remains the weapon of choice for mathematicians :-)

And yet, the legal world manages to survive the ambiguities and contradictions in its corpus. How? Via what semiotics calls a dynamical interpretant: the Judiciary :-) It is a beautifully simple idea. If there is a doubt as to the meaning of a text, the Judiciary tells you what it means. The explications provided are then themselves captured in textual form, known as case law, and the case law becomes legally powerful thanks to stare decisis.

An analogy from software development is unit testing. The code is the code is the code but the true meaning of the code? The unit tests tell you that. The code “means” what the unit tests tell you it means. All else is just syntax. Case law is a bit like a unit test suite.
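
To make the analogy concrete, here is a minimal sketch in Python. The function `round_fee` and both tests are hypothetical, invented purely for illustration. The "statute" (the docstring) only says to round to the nearest cent, so it is silent on what happens to exactly half a cent. The test plays the role of case law and settles it.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_fee(amount: str) -> Decimal:
    """Round a fee to the nearest cent (the text says nothing about ties)."""
    return Decimal(amount).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def test_half_cent_rounds_up():
    # The "ruling": an amount ending in exactly half a cent rounds up.
    assert round_fee("2.345") == Decimal("2.35")


def test_exact_amount_is_unchanged():
    # An amount already in whole cents is left alone.
    assert round_fee("2.30") == Decimal("2.30")
```

Swap the rounding mode and the docstring would still read as true, but the first test would fail. In that sense the test, not the prose, carries the meaning.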

Is it possible to remove the tables completely from legal/para-legal documents? No, the meaning is just too subtly inter-twined with the presentation. Is it possible to remove the need for unit tests in software development? No, the meaning of source code is impossible to separate from its interpretant, an execution environment. A great way to see this is to look at static analysis tools and realize what it is about your code that static analysis can never tell you. Arguably, the limitations of static textual analysis were established by the great Alan Turing back in 1936 with the halting problem.
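
A small illustration of the gap between reading code and running it, offered as a sketch rather than a proof of anything. The function name `collatz_steps` is my own choice for the example. Whether the loop below terminates for every positive integer is the Collatz conjecture, which, as far as I know, remains open. So no tool that merely reads the text can currently tell you, for all inputs, that the function returns.

```python
def collatz_steps(n: int) -> int:
    """Count how many Collatz steps it takes for n to reach 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        # Halve even numbers; map odd numbers to 3n + 1.
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


print(collatz_steps(6))  # 6, 3, 10, 5, 16, 8, 4, 2, 1 -> prints 8
```

For any particular input, the interpretant (the execution environment) gives you the answer. For all inputs at once, the text alone does not.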

So, if the tables cannot come out, what to do? I believe the most promising approach is to use the interpretant and stare decisis to remove as much ambiguity as possible, i.e. legally/socially binding exposition on which parts of tabular material contribute to meaning and which parts do not. That way, computer system designers like me would have guidance as to what needs to be retained and what doesn't. Examples are things like fixed widths, tab leaders, vertical character alignments etc.

I honestly do not think it is possible to completely separate typesetting attributes into a nice binary "keep/optional" split, but we won't know until we get the stare decisis process kicked off and let the interpretants in the Judiciary do their thing.
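
As a rough sketch of what such guidance might look like to a system designer, here is one possible shape in Python. Everything in it is an assumption made for illustration: the names `Status`, `Ruling`, `status_of` and `must_retain`, the attribute labels, and the placeholder authorities, which do not refer to any real ruling. Because a clean binary split may not exist, the sketch has a third state, `UNDECIDED`, and treats anything not yet ruled optional as something to keep.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    KEEP = "keep"            # ruled to contribute to meaning
    OPTIONAL = "optional"    # ruled to be presentation only
    UNDECIDED = "undecided"  # no ruling yet


@dataclass(frozen=True)
class Ruling:
    attribute: str   # e.g. "tab_leader" (hypothetical label)
    status: Status
    authority: str   # placeholder, not a real citation


def status_of(attribute: str, rulings: list[Ruling]) -> Status:
    """Return the status from the latest ruling on an attribute, if any."""
    result = Status.UNDECIDED
    for ruling in rulings:  # later rulings override earlier ones
        if ruling.attribute == attribute:
            result = ruling.status
    return result


def must_retain(attribute: str, rulings: list[Ruling]) -> bool:
    """Conservative default: retain anything not explicitly ruled optional."""
    return status_of(attribute, rulings) is not Status.OPTIONAL


# Hypothetical rulings, for illustration only.
RULINGS = [
    Ruling("tab_leader", Status.OPTIONAL, "hypothetical ruling A"),
    Ruling("vertical_alignment", Status.KEEP, "hypothetical ruling B"),
]
```

With these made-up rulings, `must_retain("tab_leader", RULINGS)` is false, while `must_retain("fixed_width", RULINGS)` is true because nobody has ruled on fixed widths yet.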

I know of no jurisdiction that has attempted to grapple with this issue to date but it is becoming more and more pressing as the need for digital "authentic" legal materials grows and grows.

1 comment:

Lewis John said...

Hi Sean,

Another well documented and extremely helpful insight into something that one with a lack of knowledge would consider a trivial task, but which has actually remained an extremely difficult problem to address, and to address well, as part of a larger modelling approach to moving legislation towards an XML-based structure. Unfortunately for me, and as I feared, some of the points you make do not exactly make my life any easier, in fact quite the contrary! The Scottish Technical Standards I was referring to seem to (and quite overwhelmingly) contain a multitude of such extremely complex diagrams and tables as you mention, which in turn include all of the aforementioned fixed widths, tab leaders, vertical character alignments etc.

Changing my point of view here, I would argue then that the knock-on effect of your primary argument, which is that we cannot simply decouple content from presentation, text within tables from rendering, etc., is that people inaccurately infer or inconsistently interpret information from technically complex and inherently difficult to navigate documents such as legal texts and regulatory standards. There is no easy method that we can currently quantify for measuring the consistency of decision making at every level, whether it be local, state or further afield.

Surely if we cannot measure consistency across even a small decision making process of a larger work flow, then we cannot maintain this outlook that there is integrity within the planning, judicial, legal, or any other domains which are heavily reliant upon legal documentation bearing all the problems you highlight. It would seem to me that in the short-term, the importance of old tools such as stare decisis cannot and should not be underestimated. However, unfortunately this method of 'quality control', in my own experience, rarely happens outside the legal domain!

Thanks again for the posts.