Thursday, January 03, 2019

An alternative model of computer programming : Part 2


This is part two of a series of posts about an alternative model of computer programming that I have been mulling over for, oh, decades now. The first part is here: http://seanmcgrath.blogspot.com/2018/08/an-alternative-model-of-computer.html

The dominant conceptual model of computer programming is that it is computation, which in turn, of course, is a branch of mathematics. This is incredibly persuasive on many levels. George Boole's book An Investigation of the Laws of Thought sets out a powerful way of thinking about truth/falsity and conditional reasoning in ways that are purely numerical and thus mathematical and, well, really beautiful. Hence the phrase "Boolean logic". Further back in time still, we find al-Khwārizmī in the ninth century working out sequences of mathematical steps to perform complex calculations. Hence the word "algorithm". Further back again, in the time of the ancient Greeks, we find Euclid and Eratosthenes with their elegant algorithms for finding greatest common divisors and prime numbers respectively.

Pretty much every programming language on the planet has a suite of examples/demos that include these classic algorithms turned into math-like language. They all feature the “three musketeers” of most computer programming. Namely, assignment (e.g. y = f(x)), conditional logic (e.g. “if y greater than 0 do THIS otherwise THAT”) and branching (e.g. “goto END”).
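
To make the "three musketeers" concrete, here is a minimal Python sketch (my own illustration, not taken from any particular language's demo suite) of Euclid's greatest common divisor algorithm, in which essentially every line is an assignment, a conditional test, or a jump back to that test:

    def gcd(a, b):
        """Euclid's algorithm: repeatedly replace (a, b) with (b, a mod b)."""
        while b != 0:        # conditional logic: test b against zero
            a, b = b, a % b  # assignment: the whole algorithm is this one step
        return a             # the loop supplies the branching (jump back to the test)

    print(gcd(48, 36))       # prints 12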

These three concepts get dressed up in all sorts of fine clothes in different programming languages but, as Alan Turing showed in the Nineteen Thirties, you only need to be able to assign values and to "branch on 0" in order to be able to compute anything that is computable via a classical computer – a so-called Turing Machine. (This is significantly less than everything you might want to compute, but that is another topic for another day. For now, we will stick to classical computers as exemplified in the so-called Von Neumann Architecture and leave quantum computing aside.)

So what's the problem? Mathematics clearly maps very elegantly to expressing the logic and the calculations needed to get algorithms formalized for classical computers. And this mathematics maps very nicely onto today's mainstream programming languages.

Well, buried deep inside this beautiful mapping are some ugly truths that manifest themselves as soon as you go from written software to shipped software. To see these truths we will take a really small example of an algorithm. Here it is:
“Let y be the value of f(x)
If y is 0
then set x to 0
otherwise set x to 1”

The meaning of the logic here doesn't matter and it doesn't matter what f(x) actually calculates. All we need is something that has some assignments and some conditional logic such as the above snippet.

Now ask any programmer to code this in Python or C++ or Java and they will be done expressing the algorithm in their coding environment in a matter of minutes. It is mostly a question of adding the right "boilerplate" code around the edges, and finding whatever the correct syntax is in the chosen programming language for "if" and for "then" and for demarcating statements and expressing assignments etc.
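
For example, a direct Python rendering of the snippet above might look like the following (f is a stand-in, since the snippet deliberately leaves f(x) unspecified, and the starting value of x is invented just to make it runnable):

    def f(x):
        return x * 2   # stand-in: the snippet does not say what f calculates

    x = 5              # some starting value, invented for the example
    y = f(x)           # "Let y be the value of f(x)"
    if y == 0:         # "If y is 0"
        x = 0          # "then set x to 0"
    else:
        x = 1          # "otherwise set x to 1"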

But in order to ship the code to production, items such as error handling, reliability, scalability, predictability... – sometimes referred to as the "ilities" of programming – end up taking up a lot of time and a lot of coding. So much so that the coding for "ilities" that needs to surround shipped code is often many times larger than the lines of code required for the original, purely mathematical mapping into the programming language.

All of this ancillary code – itself liberally infused with its own assignments and conditional logic – becomes part of the total code that needs to be created to ship the product, and most of it needs to be managed for the lifetime of the core code itself. So now we have code for numeric overflows, function call timeouts, exception handlers etc. We have code for builds, running test scripts, shipping to production, monitoring, tracing... the list goes on and on.
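
As a rough sketch of what I mean (the retry count, delay and logger name below are all invented for illustration, not prescriptions), here is the same four-line algorithm once a few of the "ilities" start to wrap around it:

    import logging
    import time

    logger = logging.getLogger("example")   # logger name invented for the sketch

    MAX_RETRIES = 3                         # arbitrary values, just to be concrete
    RETRY_DELAY_SECONDS = 1.0

    def f(x):
        return x * 2                        # stand-in for the real calculation

    def compute(x):
        """The original four lines, now wrapped in retries, error handling and logging."""
        for attempt in range(1, MAX_RETRIES + 1):
            try:
                y = f(x)                    # "Let y be the value of f(x)"
                return 0 if y == 0 else 1   # the original conditional logic
            except Exception:
                logger.exception("f(%r) failed on attempt %d", x, attempt)
                time.sleep(RETRY_DELAY_SECONDS)
        raise RuntimeError("f({!r}) failed after {} attempts".format(x, MAX_RETRIES))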

The pure world of pure math rarely needs to have any of these as concerns because in math we say "let y = f(x)" without worrying whether f(x) will actually fall over and not give us an answer at all, or, perhaps worse, work fine for a year and then start getting slower and slower for some unknown reason.

This second layer of code – the code that surrounds the "pure" code – is very hard to quantify. It is very hard to explain to non-programmers how important it might prove to be and how much time it might take and, to make matters worse, it is very unusual to be able to say it is "done" in any formal sense. There are always loose ends. Error conditions that the code doesn't handle – either because they are believed to be highly unlikely or because there is an open-ended set of potential error scenarios and it is simply not possible to code for every conceivable eventuality.

Pure math is a land of zero computational latency. A land where calculations are independent of each other and cannot interfere with each other. A land where all communications pathways are 100% reliable. A land where numbers have infinite precision. A land of infinite storage capacity. A land where the power never dies... and so on.

All this is to make the point that, in my opinion, for all the appealing mapping from pure math to pure algorithms, actual computer programming involves adding many other layers to cater for the fact that the real world of shipped code is not a pure math "machine".

Next up. My favorite subject. Change with respect to time....


Friday, August 31, 2018

An alternative model of computer programming : Part 1

Today, I googled "How many programming languages are there?" and the first hit I got said, "256".

I giggled - as any programmer would when a power of two pops up in the wild like that. Of course, it is not possible to say exactly how many because new ones are invented almost every day and it really depends on how you define "language"...It is definitely in the hundreds at least.

It is probably in the thousands, if you rope in all the DSLs and all the macro-pre-processors-and-front-ends-that-spit-out-Java-or-C.

In this series of blog posts I am going to ask myself and then attempt to answer an odd question. Namely, "what if language is not the best starting point for thinking about computer programming?"

Before I get into the meat of that question, I will start with how I believe we got to the current state of affairs - the current programming-linguistic Tower of Babel - with its high learning curve to enter its hallowed walls, with all its power, and with the complexities that seem to be inevitable in accessing that power.

I believe we got here the day we decided that computing was best modelled with mathematics.

Friday, July 27, 2018

The day I found Python....

It was 21 years ago. 1997. I was at an SGML conference in Boston (http://xml.coverpages.org/xml97Highlights.html). It was the conference where the XML spec. was launched.

Back in those days I mostly coded in Perl and C++ but was dabbling in the dangerous territory known as "write your own programming language"...

On the way from my hotel to a restaurant one evening I took a shortcut and stumbled upon a bookshop. I don't walk past bookshops unless they are closed. This one was open.

I found the IT section and was scanning a shelf of Perl books. Perl, Perl, Perl, Perl, Python, Perl....

Wait! What?

A misfiled book.... Name seems familiar. Why? Ah, Henry Thomson. SGML Europe. Munich 1996. I attended Henry's talk where he showed some of his computational linguistics work. At first glance his screen looked like the OS had crashed, but after a little while I began to see that it was Emacs with command shell windows and the command line invocation of scripts, doing clever things with markup, in Python. Very productive setup fusing editor and command line...

I bought the mis-filed Python book in Boston that day and read it on the way home. By the time I landed in Dublin it was clear to me that Python was my programming future.  It gradually replaced all my Perl and C++ and today, well, Python is everywhere.




Monday, July 23, 2018

Thinking about Software Architecture & Design : Part 14

Of all the acronyms associated with software architecture and design, I suspect that CRUD (Create, Read/Report, Update, Delete) is the most problematic. It is commonly used as a very useful sanity check to ensure that every entity/object created in an architecture is understood in terms of the four fundamental operations: creating, reading, updating and deleting. However, it subtly suggests that the effort/TCO of these four operations is on a par for each of them.

In my experience the "U" operation – update – is the one where the most "gotchas" lurk. A create operation – by definition – is one per object/entity. Reads are typically harmless (ignoring some scaling issues for simplicity here). Deletes are one per object/entity, again by definition – more complex than reads generally, but not too bad. Updates, however, often account for the vast majority of operations performed on objects/entities: the vast majority of the life cycle is spent in updates. Not only that, but each update – by definition again – changes the object/entity, and in many architectures updates cascade, i.e. updates cause other updates. This is sometimes exponential as updates trigger further updates. It is also sometimes truly complex, in the sense that updates end up, through event cascades, causing further updates to the originally updated objects....
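
A minimal sketch of the kind of cascade I mean (the Order/Invoice names and the observer wiring are invented purely for illustration): one update fires a callback, which performs another update, which in a bigger system might fire further callbacks in turn.

    class Order:
        def __init__(self):
            self.total = 0
            self.observers = []              # callbacks to run whenever this order changes

        def update_total(self, new_total):
            self.total = new_total
            for callback in self.observers:  # the update cascades to whoever is listening
                callback(self)

    class Invoice:
        def __init__(self, order):
            self.amount_due = order.total
            order.observers.append(self.on_order_changed)

        def on_order_changed(self, order):
            self.amount_due = order.total    # this update could itself notify ledgers, reports...

    order = Order()
    invoice = Invoice(order)
    order.update_total(100)                  # one "U" operation...
    print(invoice.amount_due)                # ...has already caused a second update (prints 100)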

I am a big fan of the CRUD checklist to cover off gaps in architectures early on but I have learned through experience that dwelling on the Update use-cases and thinking through the update cascades can significantly reduce the total cost of ownership of many information architectures.

Monday, June 25, 2018

Thinking about Software Architecture & Design : Part 13


Harold Abelson, co-author of the seminal tome Structure and Interpretation of Computer Programs (SICP), said that "programs must be written for people to read, and only incidentally for machines to execute."

The importance of human-to-human communication over human-to-machine communication is even more pronounced in software architecture, where there is typically another layer or two of resolution before machines can interpret the required architecture.

Human-to-human communication is always fraught with potential for miscommunication, and the reasons for this run very deep indeed. Dig into this subject and it is easy to be amazed that anything can be communicated perfectly at all. It is a heady mix of linguistics, semiotics, epistemology and psychology. I have written before (for example, in the "What is Law?" series - http://seanmcgrath.blogspot.com/2017/06/what-is-law-part-14.html) about the first three of these, but here I want to talk about the fourth – psychology.

I had the good fortune many years ago to stumble upon the book Inevitable Illusions by Massimo Piattelli-Palmarini and it opened my mind to the idea that there are mental concepts we are all prone to develop, that are objectively incorrect – yet unavoidable. Think of your favorite optical illusion. At first you were amazed and incredulous. Then you read/discovered how it works. You proved to your own satisfaction that your eyes were deceiving you. And yet, every time you look at the optical illusion, your brain has another go at selling you on the illusion. You cannot switch it off. No amount of knowing how you are being deceived by your eyes will get your eyes to change their minds, so to speak.

I have learned over the years that some illusions about computing are so strong that it is often best to incorporate them into architectures rather than try to remove them. For example, there is the "send illusion". Most of the time when there is an arrow between A and B in a software architecture, there is a send illusion lurking. The reason being, it is not possible to send digital bits. They don't move through space. Instead they are replicated. Thus every implied "send" in an architecture can never be a truly simple send operation: it involves, at the very least, a copy followed by a delete.
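
One everyday place this surfaces is "moving" a file from one filesystem to another. A small sketch (the file paths are invented) in which the "send" is implemented as what it really is, a copy followed by a check and a delete:

    import os
    import shutil

    def send_file(src, dst):
        """A "send" that is honest about what it really is: copy, verify, then delete."""
        shutil.copy2(src, dst)                       # the bits are replicated, not moved
        if os.path.getsize(dst) != os.path.getsize(src):
            raise IOError("copy of %s to %s looks incomplete" % (src, dst))
        os.remove(src)                               # only now does the original go away

    # send_file("/tmp/report.pdf", "/mnt/share/report.pdf")   # paths invented for the example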

Another example is the idea of a finite limit to the complexity of business rules. This is the very (very!) appealing idea that, with enough refinement, it is possible to arrive at a full expression of the business rules that express some desirable computation. This is sometimes true (especially in textbooks), which adds to the power of the inevitable illusion. However, in many cases it is only true if you can freeze requirements – a tough proposition – and often it is impossible even then, for example in systems where there is a feedback loop between the business rules and the data, creating a sort of "fractal boundary" that the corpus of business rules can never fully cover.

I do not let these concerns stop me from using concepts like “send” and “business rule repository” in my architectures because I know how powerfully these concepts are locked into all our minds. However, I do try to conceptualize them as analogies and remain conscious of the tricks my mind plays with them. I then seek to ensure that the implementation addresses the unavoidable delta between the inevitable illusion in my head and the reality in the machine.


Thursday, June 14, 2018

Thinking about Software Architecture & Design : Part 12


The word “flexible” gets used a lot in software architecture & design. It tends to get used in a positive sense. That is, "flexibility" is mostly seen as a good thing to have in your architecture.

And yet, flexibility is very much a two-edged sword. Not enough of it, and your architecture can have difficulty dealing with the complexities that typify real-world situations. Too much of it, and your architecture can be too difficult to understand and maintain. The holy grail of flexibility, in my opinion, is captured in the adage that "simple things should be simple, and hard things should be possible".

Simple to say, hard to do. Take SQL for example, or XSLT or RPG...they all excel at making simple things simple in their domains and yet, can also be straitjackets when more complicated things come along. By “complicated” here I mean things that do not neatly fit into their conceptual models of algorithmics and data.

A classic approach to handling this is to allow such systems to be embedded in a Turing Complete programming language, e.g. SQL inside C#, XSLT inside Java etc. The Turing Completeness of the host programming language ensures that the "hard things are possible" while the core – now "embedded" – system ensures that the simple things are simple.
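
A sketch of that split using Python's built-in sqlite3 module (the table, columns and discount rule are invented for the example): the simple query stays simple in declarative SQL, while anything that does not fit SQL's conceptual model spills out into the Turing Complete host.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
    conn.executemany("INSERT INTO orders (customer, total) VALUES (?, ?)",
                     [("acme", 120.0), ("globex", 80.0), ("acme", 40.0)])

    # Simple things are simple: declarative SQL embedded in the host language.
    rows = conn.execute("SELECT customer, SUM(total) FROM orders GROUP BY customer")

    # Hard things are possible: arbitrary logic in the Turing Complete host.
    for customer, total in rows:
        discount = 0.1 if total > 100 else 0.0   # a rule that outgrew the SQL
        print(customer, total * (1 - discount))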

Unfortunately what tends to happen is that the complexity of the real world chips away at the split between simple and complex and, oftentimes, such hybrid systems evolve into pure Turing Complete hosts, i.e. over time, the embedded system for handling the simple cases is gradually eroded and then one day you wake up to find that it is all written in C# or Java or whatever and the originally embedded system is withering on the vine.

A similar phenomenon happens on the data side, where an architecture might initially be 98% "structured" fields, but over time the "unstructured" parts of its data model grow and grow to the point where the structured fields atrophy and all the mission-critical data migrates over to the unstructured side. This is why so many database-centric systems organically grow memo fields, blob fields or even complete, distinct document storage sub-systems over time, to handle all the data that does not fit neatly into the "boxes" of the structured fields.
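
A sketch of how this often ends up looking (the schema is invented for illustration): a few structured columns, plus a catch-all text column that quietly becomes the place where the interesting data lives.

    import json
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE customer (
            id    INTEGER PRIMARY KEY,
            name  TEXT,
            email TEXT,
            extra TEXT   -- the memo/blob field that grows to hold everything else
        )
    """)

    # Over time, more and more of the mission-critical data ends up in 'extra'.
    extra = {"segment": "enterprise", "renewal_notes": "prefers Q4 invoicing", "tags": ["vip"]}
    conn.execute("INSERT INTO customer (name, email, extra) VALUES (?, ?, ?)",
                 ("Acme Ltd", "ops@acme.example", json.dumps(extra)))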

Attempting to add flexibility to the structured data architecture tends to result in layers of abstraction that people have difficulty following. Layers of pointer indirection. Layers of subject/verb/object decomposition. Layers of relationship reification and so on....

This entropy growth does not happen overnight. The complexity of modelling the real world chips away at designs until at some point there is an inflection. Typically this point of inflection manifests in a desire to "simplify" or "clean up" a working system. This often results in a new architecture that incorporates the learnings from the existing system, and then the whole process repeats again. I have seen this iteration work at the level of decades, but in more recent years the trend appears to be towards shorter and shorter cycle times.

This cyclic revisiting of architectures begs the obvious teleological question about the end point of this cycle. Does it have an end? I suspect not because, in a Platonic sense, the ideal architecture can be contemplated but cannot be achieved in the real world.

Besides, even if it could be achieved, the ever-changing and monotonically increasing complexity of the real world ensures that a perfect model for time T can only be achieved at some future time-point T+N, by which time it is outdated and has been overtaken by the ever-shifting sands of reality.


So what is an architect to do if this is the case? I have come to the conclusion that it is very, very important to be careful about labelling anything as an immutable truth in an architecture. All the nouns, verbs, adjectives etc. that sound to you like "facts" of the real world will, at some point, bend under the weight of constant change and necessarily incomplete empirical knowledge.

The smaller the set of things you consider immutable facts, the more flexible your architecture will be. By all means, layer abstractions on top of this core layer. By all means add Turing Completeness into the behavioral side of the model. But treat all of these higher layers as fluid. It is not that they might need to change; it is that they will need to change. It is just a question of time.

Finally, there are occasions where the set of core facts in your model is the empty set! Better to work with this reality than fight against it, because entropy is the one immutable fact you can absolutely rely on. Possibly the only thing you can have at the core of your architecture and not worry about it being invalidated by the arrival of new knowledge or the passage of time.


Friday, June 01, 2018

Thinking about Software Architecture & Design : Part 11


It is said that there are really only seven basic storylines and that all stories can either fit inside them or be decomposed into some combination of the basic seven. There is the rags-to-riches story. The voyage and return story. The overcoming the monster story...and so on.

I suspect that something similar applies to Software Architecture & Design. When I was a much younger practitioner in this field, I remember a very active field with new methodologies/paradigms coming along on a regular basis. Thinkers such as Yourdon, de Marco, Jackson, Booch, Hoare, Dijkstra, Hohpe distilled the essence of most of the core architecture patterns we know of today.

In more recent years, attention appears to have moved away from the discovery/creation of new architecture patterns and architecture methodologies towards concerns closer to the construction aspects of software. There is an increasing emphasis on two-way flows in the creation of architectures – or perhaps circular flows would be a better description – i.e. iterating backwards from, for example, user stories to the abstractions required to support the user stories. Then perhaps a forward iteration, refactoring the abstractions to get coverage of the required user stories with fewer "moving parts", as discussed before.

There has also been a marked trend towards embracing the volatility of the IT landscape in the form of proceeding to software build phases with "good enough" architectures and the conscious decision to factor in the possibility of needing complete architecture re-writes over ever shorter time spans.

I suspect this is an area where real world physical architecture and software architecture fundamentally differ and the analogy breaks down. In the physical world, once the location of the highway is laid down and construction begins, a cascade of difficult-to-reverse events starts to occur in parallel with the construction of the highway. Housing estates and commercial areas pop up close to the highway. Urban infrastructure plans – perhaps looking decades into the future – are created predicated on the route of the highway and so on.

In software, there is often a similar amount of knock-on effects from architecture changes, but when the affected items are themselves primarily software, rearranging everything around a new architecture is more manageable. Still likely a significant challenge, but more doable, because software is, well, "softer" than real-world concrete, bricks and mortar.

My overall sense of where software architecture is today is that it revolves around the question : “how can we make it easier to fundamentally change the architecture in the future?” The fierce competitive landscape for software has combined with cloud computing to fuel this burning question.

Creating software solutions with very short (i.e. weeks) time horizons before they change again is now possible and increasingly commonplace. The concept of version number is becoming obsolete. Today's software solution may or may not be the same as the one you interacted with yesterday and it may, in fact, be based on an utterly different architecture under the hood than it was yesterday. Modern communications infrastructure, OS/device app stores, auto-updating applications, thin clients...all combine to create a very fluid environment for modern day software architectures to work in.

Are there new software patterns still emerging since the days of data flow and ER diagrams and OOAD? Are we just re-combining the seven basic architectures in a new meta-architecture which is concerned with architecture change rather than architecture itself? Sometimes I think so.

I also find myself wondering where we go next if that is the case. I can see one possible end point for this. An end point which I find tantalizing and surprising in equal measure. My training in software architecture – the formal parts and the decades of informal training since then – has been based on the idea that the fundamental job of the software architect is to create a digital model – a white box – of some part of the real world, such that the model meets a set of expectations in terms of its interaction with its users (which may be other digital models).

In modern day computing, this idea of the white box has an emerging alternative which I think of as the black box. If a machine could somehow be instructed to create the model that goes inside the box – based purely on an expression of its required interactions with the rest of the world – then you basically have the only architecture you will ever need for creating what goes into these boxes. The architecture that makes all the other architectures unnecessary if you like.

How could such a thing be constructed? A machine learning approach, based on lots and lots of input/output data? A quantum computing approach which tries an infinity of possible Turing machine configurations, all in parallel? Even if this is not possible today, could it be possible in the near future? Would the fact that boxes constructed this way would necessarily be black – beyond human comprehension at the control flow level – be a problem? Would the fact that we can never formally prove the behavior of the box be a problem? Perhaps not as much as might be initially thought, given the known limitations of formal proof methods for traditionally constructed systems. After all, we cannot even tell if a process will halt, regardless of how much access we have to its internal logic. Also, society seems to be in the process of inuring itself to the unexplainability of machine learning – that genie is already out of the bottle. I have written elsewhere (in the "what is law?" series - http://seanmcgrath.blogspot.com/2017/07/what-is-law-part-15.html) that we have the same "black box" problem with human decision making anyway.

To get to such a world, we would need much better mechanisms for formal specification. Perhaps the next generation of software architects will be focused on patterns for expressing the desired behavior of the box, not models for how the behavior itself can be achieved. A very knotty problem indeed but, if it can be achieved, radical re-arrangements of systems in the future could start and effectively stop with updating the black box specification, with no traditional analysis/design/construct/test/deploy cycle at all.