Thursday, September 22, 2011

A coming epidemic of disabled programmers?

I am very glad to see that RSI (Repetitive Strain Injury) is the subject of a talk at PyCon Ireland.

The world is full of programmers in the 20-30 year old range, regularly pulling 18-hour coding sessions without giving it a second thought. Only taking breaks for restroom use, eating at their desks...

You can do that in your twenties and thirties but chances are, when you get into your forties, your hands and arms are going to start complaining.

As the saying goes, "I was that soldier". But, at least in my case I was mostly working with desktop machines and large-ish keyboards in my crazy coder days.

Compare today's high-octane coders. Weapon of choice is a laptop, often rested on the knees or propped precariously on a table along with 12 other laptops, hunched shoulders, knotted brows, bad light...

Not good. It *will* catch up with you. Speaking from personal experience, there is nothing more depressing than wanting to code and not being able to because your hands/arms are screaming at you. At one stage, I was reduced to tapping keys with the eraser tip of a pencil in order to get my e-mail.

Don't do what I did. Pay attention to RSI risks in your twenties/thirties and with luck you will never be visited with any problems.

Wednesday, September 21, 2011

Visual Tables and Meaning

John Lewis commented on my post about the thorny issue of visual tables and content semantics. John goes on to ask if the tables can be removed from legal or para-legal documents somehow.

I do not think so, unfortunately. The inter-weaving of pure content and semantics is too deep. Douglas Hofstadter's article in Scientific American about Knuth's Meta-Font system is a great examination of how deep this problem really is. (The article is re-printed in Hofstadter's book Metamagical Themas. It does not appear to be online.) Hofstadter coined the term Ambigram to show how even simple typographic constructs can lead to interesting semantic ambiguities. I have seen some accidental ambigrams in legal texts over the years :-)

The field of mathematics has long struggled with this in its search for an executable representation of mathematical constructs. Some notations are just so visual it's difficult to see how there could ever be a useful separation made between their content and their presentation. For example, Penrose's Tensor Diagram Notation. In extreme cases, the presentation is the content. No wonder that TeX remains the weapon of choice for mathematicians :-)

And yet, the legal world manages to survive the ambiguities and contradictions in its corpus. How? Via what is known in semiotics as a dynamical interpretant, namely the Judiciary :-) It is a beautifully simple idea. If there is a doubt as to the meaning of a text, the Judiciary tells you what it means. The explications provided are then themselves captured in textual form known as case law and the case law becomes legally powerful thanks to stare decisis.

An analogy from software development is unit testing. The code is the code is the code but the true meaning of the code? The unit tests tell you that. The code “means” what the unit tests tell you it means. All else is just syntax. Case law is a bit like a unit test suite.
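To make the analogy concrete, here is a minimal sketch in Python. The function name `round_fee` and the fee-rounding scenario are hypothetical, invented purely for illustration. Reading the word "round" alone does not tell you what happens at exactly one half (Python's built-in `round(2.5)` gives 2, for instance). The tests are what settle the question, much as a ruling settles the meaning of an ambiguous clause.

```python
import unittest
from decimal import Decimal, ROUND_HALF_UP


def round_fee(amount: Decimal) -> Decimal:
    """Round a fee to whole units (hypothetical helper, for illustration only)."""
    # The rounding mode is the "ambiguous clause": the name round_fee
    # does not say which way a tie at .5 should go.
    return amount.quantize(Decimal("1"), rounding=ROUND_HALF_UP)


class RoundFeeMeaning(unittest.TestCase):
    """These tests play the role of case law: they fix the interpretation."""

    def test_exact_half_rounds_up(self):
        self.assertEqual(round_fee(Decimal("2.5")), Decimal("3"))

    def test_below_half_rounds_down(self):
        self.assertEqual(round_fee(Decimal("2.4")), Decimal("2"))
```

Saved to a file, the tests can be run with `python -m unittest`. If someone later swaps in a different rounding mode, the code may still look perfectly reasonable, but the "precedent" in the test suite will say that its meaning has changed.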

Is it possible to remove the tables completely from legal/para-legal documents? No, the meaning is just too subtly inter-twined with the presentation. Is it possible to remove the need for unit tests in software development? No, the meaning of source code is impossible to separate from its interpretant – an execution environment. A great way to see this is to look at static analysis tools and realize what it is about your code that static analysis can never tell you. Arguably, the limitations of static textual analysis were established by the great Alan Turing back in 1936 with the halting problem.

So, if the tables cannot come out, what to do? I believe the most promising approach is to use the interpretant and stare decisis to remove as much ambiguity as possible, i.e. legally/socially binding exposition on what parts of tabular material contribute to meaning and what parts do not. That way, computer system designers like me would have guidance as to what needs to be retained and what doesn't. Examples are things like fixed widths, tab leaders, vertical character alignments etc.

I honestly do not think it is possible to completely separate typesetting attributes into a nice binary "keep/optional" split, but we won't know until we get the stare decisis process kicked off and let the interpretants in the Judiciary do their thing.

I know of no jurisdiction that has attempted to grapple with this issue to date but it is becoming more and more pressing as the need for digital "authentic" legal materials grows and grows.