Friday, January 08, 2016

More on hash codes as identifiers

I missed something very important in the recent post about using hash codes as identifiers. The hash-coding scheme make is possible to ask "the cloud" to return the byte-stream from wherever is the most convenient replica of the byte-stream, but it does something else too....

It allows an application to ask the cloud : "Hey. Can somebody give me the 87th 4k block of the digital object that hashes to X?"

This in turn means that an application can have a bunch of these data block requests going at any one time and data blocks can come back in any order from any number of potential sources.

So, P2P is alive and well and very likely to be embedded directly in the Web OS data management layer. See for example libp2p. Way, way down the stack of course, lies the TCP/UDP protocols which addresses similar things e.g. data packets that can be routed differently, may not arrive in order, may not arrive at all etc.

There does seem to be a pattern here...I have yet to find a way to put it into words but it does appear as if computing stacks have a tendency to provide overlapping feature sets.

How much of this duplication is down to accidents of history versus, say, universal truths about data communications, I do not know.

1 comment:

Sean McGrath said...

Oh, and one more thing....the concepts of accessing bytestreams based on their hashes is an old one. I remember learning about content-adressable storage while a comp. sci. undergrad in TCD in the early Eighties.