Sean McGrath

Friday, October 14, 2011

Boooookkkkkkksssssssss

Having a soft spot for paper books may seem odd for somebody in my line of work but I absolutely *love* books - paper books.

Join that with an interest in legal publishing and, well, this pic from inside the LoC is like a picture of Disneyland to a child.

I had the great fortune to get a tour of the law library in the LoC a while back and it is an absolutely stunning place for a book/law nerd like me to visit. Amazing.

Thursday, October 13, 2011

Dennis Ritchie RIP

Wow. Today I learned that Dennis Ritchie has passed away. Very sad.

I met him once at a Unix conference in London when the Plan 9 operating system was being evangelized as the logical successor to Unix.

I still have my copy of "K and R" - the book properly known as The C Programming Language, written by Kernighan and Ritchie. A true classic and, as far as I know, the world's first "hello world" programming example.

Thursday, October 06, 2011

Steve Jobs remembered

My first paying job in IT was to implement a stock control system in Visicalc on an Apple ][ running the CP/M operating system. One of these. A bit later I worked on a Lisa and then a Fat Mac.

It was the Mac that started my love affair with text processing thanks to the Apple ImageWriter printer and then the amazing Apple LaserWriter.

One of my final year projects in Computer Science in Trinity College Dublin (1987) was a 3D wireframe teapot. I created it by sending PostScript directly to an Apple LaserWriter over its printer cable, using an Apple IIc as a terminal.

From the Apple II to the iPad. What an amazing progression.

RIP Steve Jobs

Thursday, September 22, 2011

A coming epidemic of disabled programmers?

I am very glad to see that RSI (Repetitive Strain Injury) is the subject of a talk at PyCon Ireland.

The world is full of programmers in the 20-30 year old range, regularly pulling 18 hour coding sessions without giving it a second thought. Only taking breaks for restroom use, eating at their desks...

You can do that in your twenties and thirties but chances are, when you get into your forties, your hands and arms are going to start complaining.

As the saying goes, "I was that soldier". But, at least in my case I was mostly working with desktop machines and large-ish keyboards in my crazy coder days.

Compare today's high-octane coders. Weapon of choice is a laptop, often rested on the knees or propped precariously on a table along with 12 other laptops, hunched shoulders, knotted brows, bad light...

Not good. It *will* catch up with you. Speaking from personal experience, there is nothing more depressing than wanting to code and not being able to because your hands/arms are screaming at you. At one stage, I was reduced to tapping keys with the eraser tip of a pencil in order to get my e-mail.

Don't do what I did. Pay attention to RSI risks in your twenties/thirties and with luck you will never be visited with any problems.

Wednesday, September 21, 2011

Visual Tables and Meaning

John Lewis commented on my post about the thorny issue of visual tables and content semantics. John goes on to ask if the tables can be removed from legal or para-legal documents somehow?

I do not think so, unfortunately. The inter-weaving of pure content and semantics is too deep. Douglas Hofstadter's article in Scientific American about Knuth's Meta-Font system is a great examination of how deep this problem really is. (The article is re-printed in Hofstadter's book Metamagical Themas. It does not appear to be online.) Hofstadter coined the term Ambigram to show how even simple typographic constructs can lead to interesting semantic ambiguities. I have seen some accidental ambigrams in legal texts over the years :-)

The field of mathematics has long struggled with this in its search for an executable representation of mathematical constructs. Some notations are just so visual it's difficult to see how there could ever be a useful separation made between their content and the presentation. For example, Penrose's Tensor Diagram Notation. In extreme cases, the presentation is the content. No wonder that TeX remains the weapon of choice for mathematicians :-)

And yet, the legal world manages to survive the ambiguities and contradictions in its corpus. How? Via what is known in semiotics as a dynamical interpretant known as the Judiciary :-) It is a beautifully simple idea. If there is a doubt as to the meaning of a text, the Judiciary tells you what it means. The explications provided are then themselves captured in textual form known as case law and the case law becomes legally powerful thanks to stare decisis.

An analogy from software development is unit testing. The code is the code is the code but the true meaning of the code? The unit tests tell you that. The code “means” what the unit tests tell you it means. All else is just syntax. Case law is a bit like a unit test suite.
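To make the analogy a little more concrete, here is a minimal Python sketch. The function name `round_fee` and the tie-breaking rule are hypothetical, chosen purely for illustration. Read on its own, "round a fee" could mean several things when the amount is exactly halfway; the tests play the role of case law and settle the question.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_fee(amount: str) -> Decimal:
    """Round a fee to whole units (hypothetical example function).

    The name alone does not say what happens to an amount ending in .5;
    that meaning is pinned down by the tests below.
    """
    return Decimal(amount).quantize(Decimal("1"), rounding=ROUND_HALF_UP)


def test_ties_round_up():
    # The "ruling": exactly x.5 rounds up, not to the nearest even number.
    assert round_fee("2.5") == Decimal("3")


def test_below_half_rounds_down():
    assert round_fee("2.4") == Decimal("2")


# Run the "case law" directly; a test runner such as pytest would also find these.
test_ties_round_up()
test_below_half_rounds_down()
```

If somebody later swaps in a different rounding mode, the source still reads as "round the fee" but the first test fails. The test, not the text, carried the meaning.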

Is it possible to remove the tables completely from legal/para-legal documents? No, the meaning is just too subtly inter-twined with the presentation. Is it possible to remove the need for unit tests in software development? No, the meaning of source code is impossible to separate from its interpretant – an execution environment. A great way to see this is to look at static analysis tools and realize what it is about your code that static analysis can never tell you. Arguably the limitations of static textual analysis were established by the great Alan Turing back in 1936 with the halting problem.

So, if the tables cannot come out, what to do? I believe the most promising approach is to use the interpretant and stare decisis to remove as much ambiguity as possible, i.e. legally/socially binding exposition on what parts of tabular material contribute to meaning and what parts do not. That way, computer system designers like me would have guidance as to what needs to be retained and what doesn't. Examples are things like fixed widths, tab leaders, vertical character alignments etc.

I honestly do not think it is possible to completely separate typesetting attributes into a nice binary "keep/optional" split but we won't know until we get the stare decisis process kicked off and let the interpretants in the Judiciary do their thing.

I know of no jurisdiction that has attempted to grapple with this issue to date but it is becoming more and more pressing as the need for digital "authentic" legal materials grows and grows.

Friday, September 02, 2011

On URIs and URNs: Every problem can be solved with another level of indirection...

John Sheridan is pondering URIs and URNs. I have pondered that a lot too. The idea of having another degree of independence between names (e.g. cites to legislation) and actual dereferenceable identifiers makes a lot of sense of course. We don't want to tie ourselves down to implementations or platforms or server hosts if we can avoid it. Especially if the goal is to have very long-lived identifiers.

However, a few things worry me about the standard "let's use URNs" reaction to long term identifiers.

1) URLs are already completely and utterly devoid of any direct connection to the underlying assets they point to. The number of levels of indirection present in your average resolution of a URL to a stream of bytes in RAM is already very large and many of them are under our control. I.e. we can change the mappings at will. The days of "static" IP addresses are long gone. So, the notion that URNs help because you can change the resolution process without touching the assets themselves doesn't sit well with me because I can do that with plain old URLs... At many levels from DNS to VLANs to NATting to HOSTS files to HTTP redirects etc. etc.

Given the plethora of mappings already present in the (URL->Resource Representation) resolution process, do we gain much adding another one in the form of a URN mapping?

2) URN schemes need resolvers and in many systems I have seen that use URNs, representations get served up with embedded hyperlinks. The embedded hyperlinks often use URLs to access the URN resolver. I.e. http://.../resolve_urn?urn=foo. But of course, in order to do that, the representation ends up creating a dependency on the URL that accesses the resolver :-) If I save that asset, I now have a rendering that is dependent on the URL of the resolver - despite the presence of the URNs in the asset. So, have I gained anything?

3) It is true of course that domains are rented not owned and this makes folks uneasy about long term reliance on a namespace that is not fully under their control. However, the world is now so utterly dependent on DNS that a fair amount of caselaw exists to protect entities against cybersquats and loss of access to DNS rental rights. Plus standing up your own DNS inside a firewalled environment is straightforward. Plus creating local mappings in a hosts file is very straightforward. And so on. Lots of options if you need to take control of the resolution process and re-map it.

4) Finally, a non-technical argument that also plays into my skepticism about URNs. They have been around forever. So have identifier schemes like DOI and SSN etc. The internet seems to have voted with its feet already and subsumed all these into URL based resolvers of various kinds. Witness the recent explosion in link shorteners. They map a URL to another URL... I just don't see market pressure out there for a different way to control de-referencing on the Internet.
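Point 2 above is perhaps easiest to see in code. What follows is only a rough Python sketch: the resolver host `resolver.example` and the URN `urn:example:act:2011:12` are made up for illustration and do not refer to any real system. It builds the kind of hyperlink a served representation might embed, then pulls it apart to show what a saved copy actually depends on.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical resolver endpoint; host and path are assumptions, not a real service.
RESOLVER_URL = "http://resolver.example/resolve_urn"


def link_for_urn(urn: str) -> str:
    """Build the hyperlink a representation would embed for a given URN."""
    return f"{RESOLVER_URL}?{urlencode({'urn': urn})}"


def dependencies(link: str) -> dict:
    """Split an embedded link into the parts a saved copy relies on."""
    parts = urlsplit(link)
    return {
        "scheme": parts.scheme,
        "host": parts.netloc,
        "path": parts.path,
        "urn": parse_qs(parts.query)["urn"][0],
    }


link = link_for_urn("urn:example:act:2011:12")  # made-up URN for illustration
deps = dependencies(link)
print(link)
print(deps)
```

The URN survives intact inside the query string, but the link only works for as long as the scheme, host and path of the resolver keep working. The saved asset has picked up a URL dependency anyway.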

All in all, with all the mappings already present and the malleability/configurability of same, I don't see the compelling rationale for adding another one in the form of URNs.

What am I missing?

Thursday, August 18, 2011

Looking forward to the GIS-Pro event

I will be doing the closing keynote at the URISA GIS-Pro event. I enjoy closing keynotes when - like this event - I get a chance to attend the whole event. That way I get to Zeit the Geist, so to speak, in my talk.