Wednesday, January 06, 2016

The biggest IT changes in the last 5 years - Hash-Handled-Heisenfiles

I have taken to using a portmanteau phrase "Hash-Handled-Heisenfiles" to try to capture a web-centric phenomenon that appears to be changing one of the longest-standing metaphors in computing. Namely, the desktop concept of a "file".

In the original web, objects had the concept of "location" and this concept of location was very much tied to the concept of the object's "name".

Simply put, if I write "http://tumboliawinery.ie/stock.html", I am strongly suggesting a geographic location (Ireland, from the ".ie"), an enterprise in that geography ("Tumbolia Winery"), and finally a digital object that can be accessed there ("stock.html").
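To make the point concrete, here is a minimal sketch of how much "location" can be read straight out of that name. The function `describe_location` and its field names are made up for illustration; only `urllib.parse.urlsplit` is a real standard-library call.

```python
from urllib.parse import urlsplit


def describe_location(url: str) -> dict:
    """Pull the location hints out of a classic, location-style URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = host.split(".")
    return {
        # The last host label is the top-level domain, e.g. "ie" for Ireland.
        "country_hint": labels[-1],
        # The label before it usually names the enterprise.
        "enterprise_hint": labels[-2] if len(labels) >= 2 else "",
        # The last path segment names the digital object itself.
        "object_name": parts.path.rsplit("/", 1)[-1],
    }


info = describe_location("http://tumboliawinery.ie/stock.html")
print(info)
```

Everything the sketch extracts comes from the identifier alone, before a single byte of the object has been fetched.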

Along with the JavaScript-ification of everything, referenced in the last post, schemes for naming and locating digital objects are increasingly not based on the (RESTian) concepts underpinning the original Web.

On one end of the spectrum you have the well-established concept of UID or GUID as used in relational databases, Lotus Notes, etc. These identifiers are, by design, semantics-free. In other words, if you want to get insight into the object itself, what it means or what it is, you get the object via its opaque identifier and then look at its attributes. You can think of it as a faceted classification system of identity. Any attribute or combination of attributes from the object can serve as a form of name. Given enough attributes, the identifier gradually becomes unique - picking out a single object, as opposed to a set of objects. Another way to look at this is that in relational database paradigms, all identifiers that carry semantics are actually queries in disguise. (This area, naming things, is one of my, um, fixations.)
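The "queries in disguise" idea can be sketched in a few lines. This is only an illustration: the `Bottle` class, its attributes and the sample stock are all invented, and `select` is a hypothetical helper, not anything from a real database API.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Bottle:
    uid: str       # opaque, semantics-free identifier
    grape: str
    vintage: int
    region: str


def select(objects, **attrs):
    """Return every object whose attributes match all of the given facets."""
    return [o for o in objects
            if all(getattr(o, k) == v for k, v in attrs.items())]


# Invented sample data; the UIDs say nothing about the objects they identify.
stock = [
    Bottle(str(uuid.uuid4()), "Merlot", 2012, "Wicklow"),
    Bottle(str(uuid.uuid4()), "Merlot", 2013, "Wicklow"),
    Bottle(str(uuid.uuid4()), "Pinot Noir", 2012, "Cork"),
]

# One facet picks out a set; adding facets narrows it towards a single object.
print(len(select(stock, grape="Merlot")))
print(len(select(stock, grape="Merlot", vintage=2013)))
```

A "meaningful name" such as Merlot-2013 is really just the second query written down as a string.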

On the server side this is, in Web terms, an old phenomenon. Ever since the days of CGI gateway scripts, developers have been intercepting URLs and mapping them into queries, running behind the firewall, talking SQL-speak to the relational database.
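A rough sketch of that URL-to-query mapping, using Python's built-in `sqlite3` in place of whatever database sits behind the firewall. The `/stock` path, the `wine` parameter, the table layout and the `handle` function are all assumptions made up for this example.

```python
import sqlite3
from urllib.parse import parse_qs, urlsplit

# In-memory stand-in for the relational database behind the firewall.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (wine TEXT, vintage INTEGER, bottles INTEGER)")
conn.executemany(
    "INSERT INTO stock VALUES (?, ?, ?)",
    [("Merlot", 2012, 40), ("Merlot", 2013, 12), ("Pinot Noir", 2012, 7)],
)


def handle(url: str) -> list:
    """Intercept a URL and answer it with a SQL query, gateway-script style."""
    parts = urlsplit(url)
    if parts.path != "/stock":
        return []
    wine = parse_qs(parts.query).get("wine", [""])[0]
    # Parameter binding keeps the URL value out of the SQL text itself.
    return conn.execute(
        "SELECT wine, vintage, bottles FROM stock WHERE wine = ? ORDER BY vintage",
        (wine,),
    ).fetchall()


rows = handle("http://tumboliawinery.ie/stock?wine=Merlot")
print(rows)
```

The URL still looks like a location, but what actually picks out the data is the query it gets translated into.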

Well, this appears to be changing in that there is an alternative, non-relational notion of identifier that appears to be gaining a lot of traction. Namely, the idea of using the hashcode of a digital object as its opaque identifier. Why? Well, because once you do that, the opaque identifier can be independent of location. The object could be anywhere. In fact - and this is the key bit - it can be in many places at once. Hence Heisenfiles as a tip-o-the-hat to Heisenberg.

Your browser no longer necessarily needs to go to tumboliawinery.ie to get the stock.html object. Instead it can pick it up from wherever by basically saying "Hey. Has anybody out there got an object that hashes to X?".
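Here is a minimal sketch of that "has anybody got an object that hashes to X?" lookup. It assumes SHA-256 as the hash and models peers as simple in-memory dictionaries; the `Peer` class, the `fetch` function and the peer names are hypothetical and do not describe any particular real protocol.

```python
import hashlib


def content_id(data: bytes) -> str:
    """The identifier is just the hash of the bytes (SHA-256 assumed here)."""
    return hashlib.sha256(data).hexdigest()


class Peer:
    """A hypothetical holder of objects, keyed by their hash."""

    def __init__(self, name: str):
        self.name = name
        self._objects = {}

    def put(self, data: bytes) -> str:
        cid = content_id(data)
        self._objects[cid] = data
        return cid

    def get(self, cid: str):
        return self._objects.get(cid)


def fetch(cid: str, peers):
    """Ask each peer in turn; accept the first copy whose hash checks out."""
    for peer in peers:
        data = peer.get(cid)
        # Re-hashing means we do not need to trust where the copy came from.
        if data is not None and content_id(data) == cid:
            return data, peer.name
    return None


origin = Peer("tumboliawinery.ie")
cache = Peer("some-other-peer")
page = b"<html>stock list</html>"
cid = origin.put(page)
cache.put(page)  # the same object now lives in two places at once

result = fetch(cid, [cache, origin])
print(result)
```

Because the identifier is derived from the content, a copy from any peer can be checked against it, which is what lets the name stop caring about location.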

I think this is a profound change. Why now? I think it is a combination of things. HTML5 browsers and local storage. Identifiers disappearing into the JavaScript and out of URL space. The bizarre-but-powerful concept of hosting a web-server inside the client-side browser. The growing interest in all-things-blockchain, in particular smart contracts and Dapps.

All these things, I think, hint at a future where "file" and "location" are very distinct concepts and identifiers for file-like-objects are hash-values. Interesting times.







