Friday, January 15, 2016

The biggest IT changes in the last 5 years - Dynamically Typed Data

Back in 2010, I believe it was, I started writing bits and pieces about NoSQL, and I remember a significant amount of push-back from RDB folks at the time.

Since then, I think it is fair to say that there has been a lot more activity in tools/techniques for unstructured/semi-structured/streamed data than for relational data.

"Unstructured" is a harsh word:-) Even if the only thing you know about a file is that it contains bytes, you *have* a data structure. Your data structure is the trivial one called "a stream of bytes". For some operations, such as data schlepping, that is all you need to know. For other operations, you will need a lot more.

The fact that different conceptualizations of the data are applicable for different operations is an old, old idea but one that is, I think, becoming more and more useful in the distributed, mashup-centric data world we increasingly live in. Jackson Structured Programming, for example, is based on the idea that for any given operation you can conceptualize the input data as a "structure" using formal grammar-type concepts. Two different tasks on exactly the same data may have utterly different grammars and that is fine.
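As a concrete sketch of the idea (in Python, with purely illustrative names and data), here is the same run of bytes conceptualized two ways: as a bare byte stream for one task, and as name/score records for another:

```python
import csv
import io

# The same bytes, conceptualized differently for two different tasks.
raw = b"alice,42\nbob,17\ncarol,99\n"

def byte_count(data: bytes) -> int:
    # Task 1: data schlepping. "A stream of bytes" is all the
    # structure this operation needs to know about.
    return len(data)

def high_scores(data: bytes, threshold: int) -> list:
    # Task 2: reporting. The same data, now conceptualized as
    # name,score records - a richer "grammar" imposed for this task only.
    rows = csv.reader(io.StringIO(data.decode("utf-8")))
    return [name for name, score in rows if int(score) >= threshold]

print(byte_count(raw))       # 25
print(high_scores(raw, 40))  # ['alice', 'carol']
```

Neither conceptualization is the "real" one; each is just the grammar that suits its task.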

The XML world has, over time, developed the same type of idea. Some very seasoned practitioners of my acquaintance make a point of never explicitly tying an XML data set to a single schema, preferring instead to create schemas as part of the data processing itself: schemas that model the data in a way that suits the task at hand.

I think there is an important pattern in there. I call it Dynamically Typed Data. Maybe there is an existing phrase for it that I don't know:-)

"Yes, but", I hear the voices say, "surely it is important to have a *base* schema that describes the data 'at rest'?"

I oscillate on that one:-) In the same way that I oscillate on the idea that static type checking is just one more unit test on dynamically typed code.

More thinking required.


Wednesday, January 13, 2016

The biggest IT changes in the last 5 years - Github

Github is a big, big deal. I don't think it is a big deal just because it is underpinned by Git (and thus "better" - by sheer popularity - than Mercurial/darcs etc.).

I think it is a big deal because the developers of Github realized that developers are a social network. Github has done the social aspects of coding so much better than the other offerings from Google, SourceForge etc. Social computing seems to gravitate towards a small number of thematic nodes: family - Facebook, business - LinkedIn, musicians - Bandcamp. In the same vein: coders - Github.

Concepts like pull requests certainly help to enable linkages between developers, but it is Github that gives all those dev-social interconnections a place to hang out in cyberspace.

Tuesday, January 12, 2016

The biggest IT changes in the last 5 years - Quicksand OS

Back around Windows 7 there was a big change in the concept of an operating system version. For many years prior to Windows 7, the Windows world at large had a concept of operating system that involved periods of OS quiescence that might last for years, punctuated by "Service Packs": periodic CD-ROM releases with big accumulations of fixes.

Many developers in the Windows ecosystem remember the days of "XP Service Pack 3", which stood out as one of those punctuation marks. "Let's start with a clean XP SP3 and go from there...."

Roll the clock forward to today and we have a far more real-time environment for updates/upgrades. Every day (or so it seems to me!) my Windows machine, my iPad, my Android phone and my Ubuntu machine either announce the availability of updates/upgrades or announce that they have been installed for me overnight.

Although the concept of OS version numbers still exists, as any developer/tester will tell you, strange things can happen as a result of the constant "upgrade" activity - especially given that browser environments these days are so closely twinned to the operating system that an upgrade to a browser can be tantamount to the installation of an old-school service pack.

This has resulted in a big jump in the complexity of application testing, as it is becoming increasingly impossible to "lock down" a client-side configuration to test against.

This is a big problem for internal IT teams in enterprises, and one that has no good solution that I can see. Clearly, true upgrades are good things, but it is terribly hard to be sure that an "upgrade" does not introduce collateral damage to installed applications.

Also, upgrades that are very desirable - such as security fixes - can get bundled with upgrades that you would prefer to defer, creating a Catch-22.

My sense is that this problem is contained rather than tamed at present because of the widespread use of browser front-ends. I.e. server-side dev-ops teams lock things down as best they can and control the update cycles as best they can, while living with the reality that they cannot lock down the client side.

However, as the "thin" client-side becomes "thick", this containment strategy becomes harder to implement.

Locking down client-OS images helps to a degree, but the OS vendors' strategies of constant updates do not sit well with a lock-down strategy. Plus, BYOD and VPN connections etc. limit what you can lock down anyway.

Monday, January 11, 2016

The biggest IT changes in the last 5 years - Client/Server-based Standalone Thick Clients

Yes, I know it sounds a bit oxymoronic. Either an application has a server component - distinct from the client - or it doesn't. How could an application design be both client/server and standalone thick client?

By embedding the server-side component inside the client side, you can essentially compile away the network layer. Inside, of course, all the code operates on the basis of client/server communication paradigms, but all of that happens without the classic distributed-computing fallacies potentially biting you. Ultimately, everything network-ish is on the loopback interface.
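As a minimal sketch of the pattern (Python standard library only; the handler and route names are illustrative, not from any real application), the "server" runs inside the same process and the "client" talks to it over ordinary HTTP, but only on loopback:

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class AppServer(BaseHTTPRequestHandler):
    """The 'server side', embedded in the application itself."""
    def do_GET(self):
        body = json.dumps({"status": "ok", "path": self.path}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the embedded server quiet

# Bind to the loopback interface only; port 0 lets the OS pick a free port.
server = HTTPServer(("127.0.0.1", 0), AppServer)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The 'client side' uses ordinary client/server HTTP calls,
# but nothing ever leaves the machine.
url = f"http://127.0.0.1:{server.server_port}/status"
with urllib.request.urlopen(url) as resp:
    result = json.load(resp)
print(result)  # {'status': 'ok', 'path': '/status'}

server.shutdown()
```

The full request/response machinery is exercised, so the design stays honestly client/server, yet the deployment is a single standalone process.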

I like this style much more than I like its mirror image, which is to design with thick client paradigms and then insert a network layer "transparently" by making the procedure calls turn into remote procedure calls.

The problem with the latter is that all the distributed-computing fallacies hold, and without designing for them, your application is ill-equipped to cope with them.

If we are swinging back towards a more client-side-centric UI model - and I believe we are - doing it with things like, say, Electron rather than going back to the traditional, say, Win32 + DCOM/CORBA/J2EE, makes sense now that we are all (mostly!) well familiar with the impossibility of wishing away the network:-)