Previously: What is law? - Part 6
Last time we ended
with the question: “Given
a corpus of law at time T, how can we determine what it all means?”
There is a real risk
of disappearing down a philosophical rabbit hole about how meaning is
encoded in the corpus of law. Now I really like that particular
rabbit hole but I propose that we not go down it here. This whole area
is best perused, in my experience, with comfy chairs, time to kill
and a libation or two (semiotics, epistemology and mereotopology,
anyone?).
Instead, we will
simply state that, because the corpus of law is mostly written human
language, it inherits some fascinating and deep issues to do with how
written text establishes shared meaning, and move on. For our
purposes, we will imagine an infinitely patient person with infinite
stamina, armed with a normal adult's grasp of English, who is going to
read the corpus and explain it back to us, so that we computer people
can turn it into something else inside a computer system. The goal of
that “something else” is to capture the meaning while being easier
to work with inside a computer than a big collection of
“unstructured” documents.
This little
conceptual trick of employing a fantastic human to read the current corpus and
explain it all back to us allows us to split the problem of meaning
into two parts. The first part relates to how we could read it in
its current form and extract its meaning. The second part relates to
how we would encode the extracted meaning in something other than a
big collection of unstructured documents. Exploring this second question will, I
believe, help us tease out the issues in determining meaning in the
corpus of law in general, without getting bogged down in trying to
get machines to understand the current format (lots and lots of unstructured documents!) right off
the bat.
I hope that makes
sense? Basically, we are going to skip over how we would parse it all
out of its current myriad document forms into a human brain and instead look at
how we would extract it from said brain and store it again, but
in something more useful than a big collection of documents.
Assuming we can find a representation that is good enough, reading the current corpus should
be a one-off exercise because, as the corpus of law gets updated, we would update our bright, shiny new digital representation of the corpus
and never have to re-process all the documents again.
So what options do
we have for this digital knowledge representation? Surely there is something better than just unstructured
document text? Text, after all, is what you get if you use computers
as typewriters. Computers do also give us search, which is a
wonderful addition to typesetting, but understanding is a very
different thing again. In order to have machines understand the corpus of
law, we need a way to represent the knowledge present in the law, not just what words are present (search) or how the words look on the page (formatting).
This is the point
where some of you are likely hoping/expecting that I am about to
suggest some wonderful combination of XML and Lisp or some such that
will fit the bill as a legal corpus knowledge representation
alternative to documents... It would be great if that were possible,
but in my opinion, the textual/document-centric nature of a
significant part of the legal corpus is unavoidable for
reasons I will hopefully explain. Note that I said “significant
part”. There are absolutely components of the corpus that do not
have to be documents. In fact, some of the corpus has already
transitioned out of documents but, if anything, this has actually
increased the complexities of interpretation, of establishing meaning,
rather than reducing them. I will hopefully explain that too :-)
I think the best way
of explaining why I think some form of electronic documents is as
good as we can hope for, for large parts of the legal corpus, is to look at
the things that are not actually part of the corpus of documents at
all, but are key to how law actually works. In my opinion, these
things cannot be put into a computer at all.
What are these
mystical things? There are two of them. The first I call the Closed
World of Knowledge (CWoK) and the second I call the Unbounded Opinion
Requirement (UOR) of law.
We will look at CWoK
and UOR in Part 8.