Friday, August 31, 2018
Today, I googled "How many programming languages are there?" and the first hit I got said, "256".
I giggled - as any programmer would when a power of two pops up in the wild like that. Of course, it is not possible to say exactly how many because new ones are invented almost every day and it really depends on how you define "language"... It is definitely in the hundreds at least.
It is probably in the thousands, if you rope in all the DSLs and all the macro-pre-processors-and-front-ends-that-spit-out-Java-or-C.
In this series of blog posts I am going to ask myself and then attempt to answer an odd question. Namely, "what if language is not the best starting point for thinking about computer programming?"
Before I get into the meat of that question, I will start with how I believe we got to the current state of affairs - the current programming-linguistic Tower of Babel - with the high learning curve required to enter its hallowed walls, with all its power, and with the complexities that seem to be inevitable in accessing that power.
I believe we got here the day we decided that computing was best modelled with mathematics.
Friday, July 27, 2018
The day I found Python....
It was 21 years ago. 1997. I was at an SGML conference in Boston (http://xml.coverpages.org/xml97Highlights.html). It was the conference where the XML spec. was launched.
Back in those days I mostly coded in Perl and C++ but was dabbling in the dangerous territory known as "write your own programming language"...
On the way from my hotel to a restaurant one evening I took a shortcut and stumbled upon a bookshop. I don't walk past bookshops unless they are closed. This one was open.
I found the IT section and was scanning a shelf of Perl books. Perl, Perl, Perl, Perl, Python, Perl....
Wait! What?
A misfiled book.... The name seems familiar. Why? Ah, Henry Thompson. SGML Europe, Munich, 1996. I attended Henry's talk where he showed some of his computational linguistics work. At first glance his screen looked like the OS had crashed, but after a little while I began to see that it was Emacs with command shell windows and the command-line invocation of scripts, doing clever things with markup, in Python. A very productive setup, fusing editor and command line...
I bought the mis-filed Python book in Boston that day and read it on the way home. By the time I landed in Dublin it was clear to me that Python was my programming future. It gradually replaced all my Perl and C++ and today, well, Python is everywhere.
Monday, July 23, 2018
Thinking about Software Architecture & Design : Part 14
Of all the acronyms associated with software architecture and design, I suspect that CRUD (Create, Read/Report, Update, Delete) is the most problematic. It is commonly used as a very useful sanity check to ensure that every entity/object created in an architecture is understood in terms of the four fundamental operations: creating, reading, updating and deleting. However, it subtly suggests that the effort/TCO of these four operations is on a par with each other.
In my experience the "U" operation – update – is the one where there are the most “gotchas” lurking. A create operation – by definition – is one per object/entity. Reads are typically harmless (ignoring some scaling issues for simplicity here). Deletes are one per object/entity, again by definition. More complex than reads generally but not too bad. Updates however, often account for the vast majority of operations performed on objects/entities. The vast majority of the life cycle is spent in updates. Not only that, but each update – by definition again – changes the object/entity and in many architectures updates cascade. i.e. updates cause other updates. This is sometimes exponential as updates trigger other updates. It is also sometimes truly complex in the sense that updates end up,through event cascades, causing further updates to the originally updated objects....
I am a big fan of the CRUD checklist to cover off gaps in architectures early on but I have learned through experience that dwelling on the Update use-cases and thinking through the update cascades can significantly reduce the total cost of ownership of many information architectures.
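To make the cascade concern concrete, here is a minimal Python sketch (with purely hypothetical entity names) of how a single logical update can fan out through dependent entities:

class Entity:
    def __init__(self, name):
        self.name = name
        self.dependants = []   # entities that must be updated when this one changes

    def update(self, depth=0):
        print("  " * depth + f"updating {self.name}")
        for dep in self.dependants:
            dep.update(depth + 1)   # the cascade: one update triggers others

order = Entity("order")
invoice = Entity("invoice")
ledger = Entity("ledger")
order.dependants.append(invoice)
invoice.dependants.append(ledger)

order.update()   # one logical update to "order" touches three entities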
In my experience the "U" operation – update – is the one where there are the most “gotchas” lurking. A create operation – by definition – is one per object/entity. Reads are typically harmless (ignoring some scaling issues for simplicity here). Deletes are one per object/entity, again by definition. More complex than reads generally but not too bad. Updates however, often account for the vast majority of operations performed on objects/entities. The vast majority of the life cycle is spent in updates. Not only that, but each update – by definition again – changes the object/entity and in many architectures updates cascade. i.e. updates cause other updates. This is sometimes exponential as updates trigger other updates. It is also sometimes truly complex in the sense that updates end up,through event cascades, causing further updates to the originally updated objects....
I am a big fan of the CRUD checklist to cover off gaps in architectures early on but I have learned through experience that dwelling on the Update use-cases and thinking through the update cascades can significantly reduce the total cost of ownership of many information architectures.
Monday, June 25, 2018
Thinking about Software Architecture & Design : Part 13
Harold Abelson, co-author of the seminal tome Structure and Interpretation of Computer Programs (SICP), said that “programs must be written for people to read, and only incidentally for machines to execute.” The importance of human-to-human communication over human-to-machine communication is even greater in software architecture, where there is typically another layer or two of resolution before machines can interpret the required architecture.
Human-to-human communication is always fraught with the potential for miscommunication and the reasons for this run very deep indeed. Dig into this subject and it is easy to be amazed that anything can be communicated perfectly at all. It is a heady mix of linguistics, semiotics, epistemology and psychology. I have written before (for example, in the “What is Law?” series - http://seanmcgrath.blogspot.com/2017/06/what-is-law-part-14.html) about the first three of these, but here I want to talk about the fourth – psychology.
I had the good fortune many years ago to stumble upon the book Inevitable Illusions by Massimo Piattelli-Palmarini and it opened my mind to the idea that there are mental concepts we are all prone to develop that are objectively incorrect – yet unavoidable. Think of your favorite optical illusion. At first you were amazed and incredulous. Then you read/discovered how it works. You proved to your own satisfaction that your eyes were deceiving you. And yet, every time you look at the optical illusion, your brain has another go at selling you on the illusion. You cannot switch it off. No amount of knowing how you are being deceived by your eyes will get your eyes to change their minds, so to speak.
I have learned over the years that some illusions about computing are so strong that it is often best to incorporate them into architectures rather than try to remove them. For example, there is the “send illusion”. Most of the time when there is an arrow between A and B in a software architecture, there is a send illusion lurking. The reason is that it is not possible to send digital bits. They don't move through space. Instead they are replicated. Thus every implied “send” in an architecture can never be a truly simple send operation: it involves, at the very least, a copy followed by a delete.
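A minimal Python sketch of that point, using temporary files purely for illustration: the “send” of a file is really a replicate followed by a delete, with failure modes lurking in between.

# The "send illusion" in miniature: moving bits is really copy-then-delete.
import os
import shutil
import tempfile

src_dir = tempfile.mkdtemp()
dst_dir = tempfile.mkdtemp()
src_path = os.path.join(src_dir, "message.txt")
with open(src_path, "w") as f:
    f.write("hello")

# "Send" the file from src to dst: the bits are replicated first...
shutil.copy2(src_path, os.path.join(dst_dir, "message.txt"))
# ...and only then is the original removed. A crash between these two steps
# leaves two copies (or, if the order is reversed, none) - exactly the failure
# modes every implied "send" arrow has to account for.
os.remove(src_path)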
Another example is the idea of a finite limit to the complexity of business rules. This is the very (very!) appealing idea that, with enough refinement, it is possible to arrive at a full expression of the business rules that express some desirable computation. This is sometimes true (especially in text books), which adds to the power of the inevitable illusion. However, in many cases it is only true if you can freeze requirements – a tough proposition – and often it is impossible even then. For example, in systems where there is a feedback loop between the business rules and the data, a sort of “fractal boundary” emerges that the corpus of business rules can never fully cover.
I do not let these concerns stop me from using concepts like “send” and “business rule repository” in my architectures because I know how powerfully these concepts are locked into all our minds. However, I do try to conceptualize them as analogies and remain conscious of the tricks my mind plays with them. I then seek to ensure that the implementation addresses the unavoidable delta between the inevitable illusion in my head and the reality in the machine.
Thursday, June 14, 2018
Thinking about Software Architecture & Design : Part 12
The word “flexible” gets used a lot in software architecture & design. It tends to get used in a positive sense. That is, "flexibility" is mostly seen as a good thing to have in your architecture.
And yet, flexibility is very much a two-edged sword. Not enough of it, and your architecture can have difficulty dealing with the complexities that typify real world situations. Too much of it, and your architecture can be too difficult to understand and maintain. The holy grail of flexibility, in my opinion, is captured in the adage that “simple things should be simple, and hard things should be possible.”
Simple to say, hard to do. Take SQL for example, or XSLT, or RPG...they all excel at making simple things simple in their domains and yet can also be straitjackets when more complicated things come along. By “complicated” here I mean things that do not neatly fit into their conceptual models of algorithmics and data.
A classic approach to handling this is to allow such systems to be embedded in a Turing-complete programming language, e.g. SQL inside C#, XSLT inside Java, etc. The Turing-completeness of the host programming language ensures that the “hard things are possible” while the core – now an “embedded system” – ensures that the simple things are simple.
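A minimal illustration of the pattern, using Python and its built-in sqlite3 module purely as an example of a host/embedded pair:

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 10.0), (2, 250.0), (3, 42.5)])

# Simple thing kept simple: declarative SQL.
total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]

# Hard thing made possible: arbitrary host-language logic that the embedded
# language does not express comfortably.
def risk_score(amount):
    return "high" if amount > 100 else "low"

scored = [(row_id, risk_score(amount))
          for row_id, amount in conn.execute("SELECT id, amount FROM orders")]
print(total, scored)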
Unfortunately, what tends to happen is that the complexity of the real world chips away at the split between simple and complex and, oftentimes, such hybrid systems evolve into Turing-complete hosts, i.e. over time the embedded system for handling the simple cases is gradually eroded, and then one day you wake up to find that it is all written in C# or Java or whatever, and the originally embedded system is withering on the vine.
A similar phenomenon happens on the data side, where an architecture might initially be 98% “structured” fields but, over time, the “unstructured” parts of its data model grow and grow to the point where the structured fields atrophy and all the mission critical data migrates over to the unstructured side. This is why so many database-centric systems organically grow memo fields, blob fields or even complete distinct document storage sub-systems over time, to handle all the data that does not fit neatly into the “boxes” of the structured fields.
Attempting to add flexibility to the structured data architecture tends to result in layers of abstraction that people have difficulty following. Layers of pointer indirection. Layers of subject/verb/object decomposition. Layers of relationship reification and so on....
This entropy growth does not happen overnight. The complexity of modelling the real world chips away at designs until at some point there is an inflection. Typically this point of inflection manifests in a desire to “simplify” or “clean up” a working system. This often results in a new architecture that incorporates the learnings from the existing system, and then the whole process repeats again. I have seen this iteration work at the level of decades, but in more recent years the trend appears to be towards shorter and shorter cycle times.
This cyclic revisiting of architectures raises the obvious teleological question about the end point of the cycle. Does it have an end? I suspect not because, in a Platonic sense, the ideal architecture can be contemplated but cannot be achieved in the real world.
Besides, even if it could be achieved, the ever-changing and monotonically increasing complexity of the real world ensures that a perfect model for time T can only be achieved at some future time-point T+N, by which time it is outdated and has been overtaken by the ever shifting sands of reality.
So what is an architect to do if this is the case? I have come to the conclusion that it is very, very important to be careful about labelling anything as an immutable truth in an architecture. All the nouns, verbs, adjectives etc. that sound to you like “facts” of the real world will, at some point, bend under the weight of constant change and necessarily incomplete empirical knowledge.
The smaller the set of things you consider immutable facts, the more flexible your architecture will be. By all means, layer abstractions on top of this core layer. By all means, add Turing Completeness into the behavioral side of the model. But treat all of these higher layers as fluid. It is not that they might need to change; it is that they will need to change. It is just a question of time.
Finally, there are occasions where the set of core facts in your model is the empty set! Better to work with this reality than fight against it, because entropy is the one immutable fact you can absolutely rely on. Possibly the only thing you can have at the core of your architecture and not worry about it being invalidated by the arrival of new knowledge or the passage of time.
Friday, June 01, 2018
Thinking about Software Architecture & Design : Part 11
It is said that there are really only seven basic storylines and that all stories can either fit inside them or be decomposed into some combination of the basic seven. There is the rags-to-riches story. The voyage and return story. The overcoming-the-monster story...and so on.
I suspect that something similar applies to Software Architecture & Design. When I was a much younger practitioner in this field, I remember it being a very active field, with new methodologies/paradigms coming along on a regular basis. Thinkers such as Yourdon, de Marco, Jackson, Booch, Hoare, Dijkstra and Hohpe distilled the essence of most of the core architecture patterns we know of today.
In more recent years, attention appears to have moved away from the discovery/creation of new architecture patterns and architecture methodologies towards concerns closer to the construction aspects of software. There is an increasing emphasis on two-way flows in the creation of architectures – or perhaps circular flows would be a better description – i.e. iterating backwards from, for example, user stories to the abstractions required to support those user stories, then perhaps a forward iteration refactoring the abstractions to get coverage of the required user stories with fewer “moving parts”, as discussed before.
There has also been a marked trend towards embracing the volatility of the IT landscape, in the form of proceeding to software build phases with “good enough” architectures and consciously factoring in the possibility of needing complete architecture re-writes in ever shorter time spans.
I suspect this is an area where real world physical architecture and software architecture fundamentally differ and the analogy breaks down. In the physical world, once the location of the highway is laid down and construction begins, a cascade of difficult-to-reverse events starts to occur in parallel with the construction of the highway. Housing estates and commercial areas pop up close to the highway. Urban infrastructure plans – perhaps looking decades into the future – are created predicated on the route of the highway, and so on.
In software, there is often a similar amount of knock-on effects from architecture changes, but when the affected items are themselves primarily software, rearranging everything around a new architecture is more manageable. Still likely a significant challenge, but more doable, because software is, well, “softer” than real world concrete, bricks and mortar.
My overall sense of where software architecture is today is that it revolves around the question: “how can we make it easier to fundamentally change the architecture in the future?” The fierce competitive landscape for software has combined with cloud computing to fuel this burning question.
Creating software solutions with very short (i.e. weeks) time horizons before they change again is now possible and increasingly commonplace. The concept of version number is becoming obsolete. Today's software solution may or may not be the same as the one you interacted with yesterday and it may, in fact, be based on an utterly different architecture under the hood than it was yesterday. Modern communications infrastructure, OS/device app stores, auto-updating applications, thin clients...all combine to create a very fluid environment for modern day software architectures to work in.
Are there new software patterns still emerging since the days of data flow and ER diagrams and OOAD? Are we just re-combining the seven basic architectures in a new meta-architecture which is concerned with architecture change rather than architecture itself? Sometimes I think so.
I also find myself wondering where we go next if that is the case. I can see one possible end point for this – an end point which I find tantalizing and surprising in equal measure. My training in software architecture – the formal parts and the decades of informal training since then – has been based on the idea that the fundamental job of the software architect is to create a digital model – a white box – of some part of the real world, such that the model meets a set of expectations in terms of its interaction with its users (which may be other digital models).
In modern day computing, this idea of the white box has an emerging alternative which I think of as the black box. If a machine could somehow be instructed to create the model that goes inside the box – based purely on an expression of its required interactions with the rest of the world – then you basically have the only architecture you will ever need for creating what goes into these boxes. The architecture that makes all the other architectures unnecessary, if you like.
How could such a thing be constructed? A machine learning approach, based on lots and lots of input/output data? A quantum computing approach which tries an infinity of possible Turing machine configurations, all in parallel? Even if this is not possible today, could it be possible in the near future? Would the fact that boxes constructed this way would be necessarily black – beyond human comprehension at the control flow level – be a problem? Would the fact that we can never formally prove the behavior of the box be a problem? Perhaps not as much as might initially be thought, given the known limitations of formal proof methods for traditionally constructed systems. After all, we cannot in general tell whether a process will halt, regardless of how much access we have to its internal logic. Also, society seems to be in the process of inuring itself to the unexplainability of machine learning – that genie is already out of the bottle. I have written elsewhere (in the “what is law?” series - http://seanmcgrath.blogspot.com/2017/07/what-is-law-part-15.html) that we have the same “black box” problem with human decision making anyway.
To get to such a world, we would need much better mechanisms for formal specification. Perhaps the next generation of software architects will be focused on patterns for expressing the desired behavior of the box, not models for how the behavior itself can be achieved. A very knotty problem indeed but, if it can be achieved, radical re-arrangements of systems in the future could start and effectively stop with updating the black box specification, with no traditional analysis/design/construct/test/deploy cycle at all.
Monday, May 28, 2018
Thinking about Software Architecture & Design : Part 10
Once the nouns and verbs I need in my architecture start to solidify, I look at organizing them across multiple dimensions. I tend to think of the noun/verb organization exercise in the physical terms of surface area and moving parts. By "surface area" I mean minimizing the sheer size of the model. I freely admit that page count is a crude-sounding measure for a software architecture, but I have found over the years that the total size of the document required to adequately explain the architecture is an excellent proxy for its total cost of ownership.
It is vital, for a good representation of a software architecture, that both the data side and the computation side are covered. I have seen many architectures where the data side is covered well but the computation side has many gaps. This is the infamous “and then magic happens” part of the software architecture world. It is most commonly seen when there is too much use of convenient real world analogies, i.e. thematic modules that just snap together like jigsaw/lego pieces, data layers that sit perfectly on top of each other like layers of a cake, objects that nest perfectly inside other objects like Russian Dolls etc.
When I have a document that I feel adequately reflects both the noun and the verb side of the architecture, I employ a variety of techniques to minimize its overall size. On the noun side, I can create type hierarchies to explore how nouns can be considered special cases of other nouns. I can create relational de-compositions to explore how partial nouns can be shared by other nouns. I will typically “jump levels” when I am doing this, i.e. I will switch between thinking of the nouns in purely abstract terms (“what is a widget, really?”) and thinking about them in physical terms (“how best to create/read/update/delete widgets?”). I think of it as working downwards towards implementation and upwards towards abstraction at the same time. It is head hurting at times, but in my experience it produces better practical results than the simpler step-wise refinement approach of moving incrementally downwards from abstraction to concrete implementation.
On the verb side, I tend to focus on the classic engineering concept of "moving parts". Just as in the physical world, it has been my experience that the smaller the number of independent moving parts in an architecture, the better. Giving a lot of thought to opportunities to reduce the total number of verbs required pays handsome dividends. I think of it in terms of combinatorics: what are the fundamental operators I need, from which all the other operators can be created by combination? Getting to this set of fundamental operators is almost like finding the architecture inside the architecture.
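A minimal Python sketch of the idea, with purely hypothetical verbs: two fundamental operators on a store, and the other verbs the architecture needs expressed as combinations of them rather than as new moving parts.

class Store:
    def __init__(self):
        self._data = {}

    # the fundamental operators
    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value

# composite verbs - no new moving parts, just combinations of get and put
def update(store, key, fn):
    store.put(key, fn(store.get(key)))

def copy(store, src, dst):
    store.put(dst, store.get(src))

s = Store()
s.put("counter", 1)
update(s, "counter", lambda v: v + 1)
copy(s, "counter", "counter_backup")
print(s.get("counter"), s.get("counter_backup"))   # 2 2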
I also think of verbs in terms of complexity generators. Here I am using the word “complexity” in the mathematical sense. Complexity is not a fundamentally bad thing! I would argue that all system behavior has a certain amount of complexity. The trick with complexity is to find ways to create the amount required, but in a way that allows you to be in control of it. The compounding of verbs is the workhorse for complexity generation. I think of data as a resource that undergoes transformation over time. Most computation – even the simplest assignment of the value Y to be the value Y + 1 – has an implicit time dimension. Assuming Y is a value that lives over a long period of time – i.e. is persisted in some storage system – then Y today is just the compounded result of the verbs applied to it from its date of creation.
There are two main things I watch for as I am looking into my verbs and how to compound them and apply them to my nouns. The first is to always include the ability to create an ad-hoc verb “by hand”, by which I mean always having the ability to edit the data in nouns using purely interactive means. This is especially important in systems where down-time for the creation of new algorithmic verbs is not an option.
The second is watching out for feedback/recursion in verbs. Nothing generates complexity faster than feedback/recursion and, when it is used, it must be used with great care. I have a poster on my wall of a fractal with its simple mathematical formula written underneath it. It is incredible that such bottomless complexity can be derived from such a harmless looking feedback loop. Used wisely, it can produce architectures capable of highly complex behaviors but with small surface areas and few moving parts. Used unwisely.....
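As an illustration of how much behavior a harmless looking feedback loop can generate, here is a minimal Python sketch of the Mandelbrot iteration (z goes to z*z + c) - the kind of one-line formula that fits under a fractal poster.

def escapes(c, max_iter=50):
    z = 0j
    for i in range(max_iter):
        z = z * z + c          # the entire "feedback loop"
        if abs(z) > 2:
            return i           # escaped: outside the set
    return None                # bounded: (probably) inside the set

# Sample a coarse grid of the complex plane and render it as text.
for im in range(12, -13, -2):
    row = ""
    for re in range(-40, 21):
        row += "#" if escapes(complex(re / 20, im / 10)) is None else "."
    print(row)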
Monday, May 21, 2018
Thinking about Software Architecture & Design : Part 9
I approach software architecture through the medium of human language. I do make liberal use of diagrams, but the diagrams serve as illustrators of what, to me, is always a linguistic conceptualization of a software architecture. In other words, my mental model is nouns and verbs and adjectives and adverbs. I look for nouns and verbs first. This is the dominant decomposition for me. What are the things that exist in the model? Then I look for what actions are performed on/by the things in the model. (Yes, the actions are also “things” after a fashion...)
This first level of decomposition is obviously very high level and yet, I find it very useful to pause at this level of detail and do a gap analysis. Basically what I do is I explain the model to myself in my head and look for missing nouns and verbs. Simple enough.
But then I ask myself how the data that lives in the digital nouns actually gets there in the first place. Most of the time when I do this, I find something missing in my architecture. There are only a finite number of ways data can get into a digital noun in the model. A user can enter it, an algorithm can compute it, or an integration point can supply it. If I cannot explain all the data in the model through the computation/input/integration decomposition, I am most likely missing something.
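A minimal Python sketch of that gap analysis, using a purely hypothetical “invoice” noun: every field should be explained by user input, computation or an integration point, and anything unexplained is flagged.

from enum import Enum

class Source(Enum):
    USER = "entered by a user"
    COMPUTED = "computed by an algorithm"
    INTEGRATION = "supplied by an integration point"

invoice_fields = {
    "customer_name": Source.USER,
    "total": Source.COMPUTED,
    "exchange_rate": Source.INTEGRATION,
    "approval_status": None,   # nobody has said where this comes from yet
}

gaps = [name for name, src in invoice_fields.items() if src is None]
print("unexplained fields:", gaps)   # ['approval_status']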
Another useful question I ask at this level of detail relates to where the data in the nouns goes outside the model. In most models, data flows out at some point to be of use i.e. it hits a screen or a printout or an outward bound integration point. Again, most of the time when I do this analysis, I find something missing – or something in the model that does not need to be there at all.
Getting your nouns and verbs straight is a great first step towards what will ultimately take the form of objects/records and methods/functions/procedures. It is also a great first step if you are taking a RESTian approach to architecture as the dividing line between noun-thinking and verb-thinking is the key difference between REST and RPC in my experience.
It is hard to avoid prematurely clustering nouns into types/classes, as our brains appear to be wired towards organizing things into hierarchies. I try to avoid it because I find that as soon as I start thinking hierarchically, I close off the part of my brain that is open to alternative hierarchical decompositions. In my experience, the set of factors that steer an architecture towards one hierarchy instead of another are practical ones, unrelated to “pure” data modelling, i.e. concerns related to organizational boundaries, integration points, cognitive biases etc.
Take the time to explore as many noun/verb decompositions as you can because as soon as you pick one and start to refine the model, it becomes increasingly hard to think “outside the box” of your own architecture.
Friday, May 18, 2018
Thinking about Software Architecture & Design : Part 8
It is common practice to communicate software architectures using diagrams, but most diagrams, in my experience, are at best rough analogies of the architecture rather than faithful representations of it.
All analogies break down at some point. That is why we call them “analogies”. It is a good idea to understand where your analogies break down and find ways to compensate.
In my own architecture work, the main breakdown point for diagrams is that architectures in my head are more like movies than static pictures. In my mind's eye, I tend to see data flowing. I tend to see behaviors – both human and algorithmic – as animated actors buzzing around a 3D space, doing things, producing and consuming new data. I see data flowing in, data flowing out, data staying put but changing shape over time. I see feedback loops where data flows out but then comes back in again. I see the impact of time in a number of different dimensions. I see how it relates to the execution paths of the system. I see how it impacts the evolution of the system as requirements change. I see how it impacts the dependencies of the system that are outside of my control, e.g. operating systems etc.
Any static two dimensional picture, or set of pictures, that I take of this architecture necessarily leaves a lot of information behind. I liken it to taking a photo of a large city at 40,000 feet and then trying to explain all that is going on in that city through that static photograph. I can take photos from different angles and that will help but, at the end of the day, what I would really like is a movable camera and the ability to walk/fly around the “city” as a way of communicating what is going on in it, and how it is architected to function. Some day...
A useful rule of thumb is that most boxes, arrows, straight lines and layered constructions in software architecture diagrams are just rough analogies. Boxes separating, say, organizations in a diagram, or software modules or business processes, are rarely so clean in reality. A one way arrow from X to Y is probably in reality a two way data flow and it probably has a non-zero failure rate. A straight line separating, say, “valid” from “invalid” data records probably has a sizable grey area in the middle for data records that fuzzily sit in between validity and invalidity. And so on.
None of this is in any way meant to suggest that we stop using diagrams to communicate and think about architectures. Rather, my goal here is just to suggest that, until we have better tools for communicating what architectures really are, we all bear in mind the limited ability of static 2D diagrams to accurately reflect them.
Thursday, May 10, 2018
Thinking about Software Architecture & Design : Part 7
The temptation to focus a lot of energy on the one killer diagram that captures the essence of your architecture is strong. How many hours have I spent in Visio/Powerpoint/draw.io on “the diagram”? More than I would like to admit to.
Typically, I see architectures that have the “main diagram” and then a series of detail diagrams hidden away, for use by implementation and design teams. The “main diagram” is the one likely to go into the stakeholder presentation deck.
This can work fine when there are not many stakeholders and organizational boundaries are not too hard to traverse. But as the number of stakeholders grows, the power of the single architectural view diminishes. Sometimes, in order to be applicable to all stakeholders, the diagram becomes so generic that it really says very little, i.e. the classic three-tiered architecture or the classic hub-and-spoke or the peer-to-peer network. Such diagrams run the risk of not being memorable to any of the stakeholders, making it difficult for them to get invested in the architecture.
Other times, the diagram focuses on one particular “view”, perhaps by putting one particular stakeholder role in the center of the diagram, with the roles of the other stakeholders surrounding the one in the middle.
This approach can be problematic in my experience. Even if you take great pains to point out that there is no implied hierarchy of importance in the arrangement of the diagram, the role(s) in the middle of the diagram will be seen as more important. It is a sub-conscious assessment. We cannot help it. The only exception I know of is when flow-order is explicit in the diagram, but even then whatever is in the middle of the diagram draws our attention.
In most architectures there are “asks” of the stakeholders. The best way to achieve these “asks”, in my experience, is to ensure that each stakeholder gets their own architecture picture, one that has their role in the center of the diagram, with all other roles surrounding their part in the big picture.
So, for N stakeholders there are N "main views" - not just one. All compatible ways of looking at the same thing. All designed to make it easier for each stakeholder to answer the “what does this mean for me?” question, which is always there – even if it is not explicitly stated.
Yes, it is a pain to manage N diagrams, but you probably have them anyway – in the appendices most likely, for the attention of the design and implementation phase. My suggestion is to take them out of the appendices and put them into the stakeholder slide deck.
I typically present two diagrams to each stakeholder group. Slide one is the diagram that applies to all stakeholders. Slide two is for the stakeholder group I am presenting to. As I move around the different stakeholder meetings, I swap out slide number two.
Tuesday, May 08, 2018
Thinking about Software Architecture & Design : Part 6
Abstractions are a two-edged sword in software architecture. Their power must be weighed against their propensity to ask too much from stakeholders, who may not have the time or inclination to fully internalize them. Unfortunately, most abstractions require internalization for appreciation of their power.
In mathematics, consider the power of Euler's equation and contrast it with the effort involved in understanding what its simple looking component symbols represent. In music, consider the power of the Grand Staff to represent musical compositions and contrast that with the effort required to understand what its simple looking symbols represent.
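For concreteness, the former is Euler's formula and the identity it yields at x = π – a handful of symbols carrying a great deal of internalized meaning:

e^{ix} = \cos x + i \sin x, which at x = \pi becomes e^{i\pi} + 1 = 0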
Both of these abstractions are very demanding and very powerful. It is not uncommon in software architecture for the practitioner community to enjoy and be comfortable with constantly internalizing new abstractions. However, in my experience, a roomful of software architects is not representative of a roomful of stakeholders in general.
Before you release your killer abstractions from the whiteboard into Powerpoint, try them out on a friendly audience of non-specialists first.
Wednesday, May 02, 2018
Thinking about Software Architecture & Design : Part 5
Most architectures will have users at some level or other. Most architectures will also have organisational boundaries that need to be crossed during information flows.
Each interaction with a user and each transition of an organizational boundary is an “ask”. i.e. the system is asking for the cooperation of some external entity. Users are typically being asked to cooperate by entering information into systems. Parties on the far end of integration points are typically being asked to cooperate by turning around information requests or initiating information transfers.
It is a worthwhile exercise, while creating an architecture, to tabulate all the “asks” and identify those that do not have associated benefits for those who are performing the asks.
Any entities interacting with the system that are giving more than they are receiving are likely to be the most problematic to motivate to use the new system. In my experience, looking for ways to address this at architecture time can be very effective and sometimes very easy. The further down the road you get towards implementation, the harder it is to address motivational imbalances.
If you don't have an answer to the “what is in it for me?” question, for each user interaction and each integration point interaction, your architecture will face avoidable headwinds both in implementation and in operation.
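A minimal Python sketch of such a tabulation, with purely hypothetical entries, flagging the asks that offer nothing in return:

asks = [
    {"who": "warehouse clerk", "ask": "scan every item twice",
     "benefit": None},                     # all give, no get - a likely headwind
    {"who": "finance system", "ask": "answer nightly balance queries",
     "benefit": "fewer manual reconciliation requests"},
    {"who": "customer", "ask": "confirm email address",
     "benefit": "order status notifications"},
]

for entry in asks:
    if entry["benefit"] is None:
        print(f'No answer to "what is in it for me?" for: {entry["who"]}')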
Friday, April 27, 2018
Thinking about Software Architecture & Design : Part 4
Any new IT system will necessarily sit inside a larger context. If that context includes the new system having its own identity – from “the new system” to “Project Unity” - it will be anthropomorphised by its users. This can be good, bad or neutral for the success of the new IT system.
It does not matter what form the naming takes e.g. “Project Horizon”, “Bob's System”, “The new HR system” or even a visual identity such as "the new button", or even a tactile identity such as “the new panel under the desk at reception”. In all cases the new IT system may be treated by existing team members in much the same way as a new team member would be treated.
New systems get “sized up”, so to speak, by their users. Attributes such as “fast”, “unreliable”, “inflexible” or even “moody” might be applied to the new system. These may be factually based, or biased, depending on the stance the community of users adopts towards the new system arriving into their team.
One particularly troublesome possibility is that the new system may be seen as a causal factor in events unrelated to it. “X used to work fine before the new system came along....” The opposite can also happen, i.e. the new system gets plaudits for events it had no hand or part in. Causality versus correlation can be a tricky distinction to navigate.
Takeaway: sometimes the human tendency towards the anthropomorphic can be used to your advantage. If you suspect the opposite may be true for your new system, it can be useful to purposely avoid elaborate naming and dramatic rollout events which can exacerbate anthropomorphisation.
Sometimes, new systems are best rolled out with little or no fanfare in a “business as usual” mode. Sometimes it is not possible to avoid big bang system switchover events but if it is at all possible to adopt a phased approach to deployment, and transition slowly, I would recommend it for many reasons, one of which is the sort of team dynamics alluded to here.
As AI/Robotics advances, I think this will become even more important in the years ahead.
Tuesday, April 24, 2018
Thinking about Software Architecture & Design : Part 3
In software architecture and design we have some pretty deep theories that guide us on our way. We know how to watch out for exponential run times, undetectable synchronisation deadlocks, lost update avoidance etc. We have Petri nets, state charts, entity/attribute diagrams, polymorphic object models, statistical queuing models, QA/QC confidence intervals....the list goes on....
...and yet, in my experience, the success of a software architecture & design project tends to revolve around an aspect of the problem domain that is not addressed by any of the above. I call it the experiential delta.
Simply put, the experiential delta is a measure of how different the “to be” system appears to be, to those who will interact with it – experience it – day to day.
A system can have a very high architecture delta but a low experiential delta and be many orders of magnitude easier to get into production than a system with low architecture delta but a high experiential delta.
It pays to know what type of experiential delta your “to be” solution represents. If it has high experiential delta, it pays to put that issue front and center in your planning. Such projects tend to be primarily process change challenges with some IT attached, as opposed to being IT projects with some process change attached.
In my experience, many large IT projects that fail do not fail for IT reasons per se. They fail for process change reasons, but get labeled as IT failures after the fact. The real source of failure in some cases is a failure to realize the importance of process change and the need to get process change experts into the room fast, as soon as the size of the experiential delta crystallizes.
Indeed, in some situations it is best to lead the project as a process change project, not an IT project at all. Doing so has a wonderful way of focusing attention on the true determinant of success. The Petri nets will look after themselves.
Thursday, April 19, 2018
Thinking about Software Architecture & Design : Part 2
Technological volatility is, in my experience, the most commonly overlooked factor in software architecture and design. We have decades worth of methodologies and best practice guides that help us deal with fundamental aspects of architecture and design such as reference data management, mapping data flows, modelling processes, capturing input/output invariants, selecting between synchronous and asynchronous inter-process communication methods...the list goes on.
And yet, time and again, I have seen software architectures that are only a few years old, that need to be fundamentally revisited. Not because of any significant breakthrough in software architecture & design techniques, but because technological volatility has moved the goal posts, so to speak, on the architecture.
Practical architectures (outside those in pure math, such as Turing Machines) cannot exist in a technological vacuum. They necessarily take into account what is going on in the IT world in general. In a world without full text search indexes, document management architectures are necessarily different. In a world without client side processing capability, UI architectures are necessarily different. In a world without always-on connectivity.....and so on.
When I look back at IT volatility over my career – back to the early Eighties – there is a clear pattern in the volatility. Namely, that volatility increases the closer you get to the end-users' points of interaction with IT systems. Dumb “green screens”, bit-mapped graphics, personal desktop GUIs, tablets, smart phones, voice activation, haptic user interfaces...
Many of the generational leaps represented by these innovations have had profound implications for the software architectures that leverage them. It is not possible – in my experience – to abstract away user interface volatility and treat it as a pluggable layer on top of the main architecture. End-user technologies have a way of imposing themselves deeply inside architectures: for example, necessitating an event-oriented/multi-threaded approach to data processing in order to make it possible to create responsive GUIs, or responding synchronously to data queries as opposed to batch processing.
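For example, a minimal Python sketch (a worker thread and a queue standing in for whatever eventing machinery the real architecture would use) of handing a slow data operation off so that the user-facing loop stays responsive instead of blocking:

import queue
import threading
import time

results = queue.Queue()

def slow_query(request_id):
    time.sleep(1.0)                       # stand-in for a slow data operation
    results.put((request_id, "answer"))

threading.Thread(target=slow_query, args=(1,), daemon=True).start()

# The event loop keeps servicing the user while the query runs elsewhere.
for tick in range(15):
    try:
        request_id, answer = results.get_nowait()
        print(f"request {request_id} answered: {answer}")
        break
    except queue.Empty:
        print("ui still responsive, tick", tick)
        time.sleep(0.1)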
The main takeaway is this: creating good software architectures pays dividends, but those dividends are much more likely to be significant in the parts of the architecture furthest away from the end-user interactions, i.e. inside the data modelling, inside discrete data processing components etc. They are least likely to appear in areas such as GUI frameworks, client side processing models or end user application programming environments.
In fact, volatility is sometimes so intense, that it makes more sense to not spend time abstracting the end-user aspects of the architecture at all. i.e. sometimes it makes more sense to make a conscious decision to re-do the architecture if/when the next big upheaval comes on the client side and trust that large components of the back-end will remain fully applicable post-upheaval.
That way, your applications will not be as likely to be considered “dated” or “old school” in the eyes of the users, even though you are keeping much of the original back-end architecture from generation to generation.
In general, software architecture thinking time is more profitably spent in the back-end than in the front-end. There is rarely a clean line that separates these, so a certain amount of volatility on the back-end is inevitable, but it is manageable compared to the volatility that will be visited upon your front-end architectures.
Volatility exists everywhere of course. For example, at the moment serverless computing models are having profound implications on "server side" architectures. Not because of end-user concerns - end-users do not know or care about these things - but because of the volatility in the economics of cloud computing.
If history is anything to go by, it could be another decade or more before something comes along like serverless computing, that profoundly impacts back-end architectures. Yet in the next decade we are likely to see dozens of major changes in client side computing. Talking to our cars, waving at our heating systems, installing apps subcutaneously etc.
Friday, April 13, 2018
Thinking about Software Architecture & Design : Part 1
This series of posts will contain some thoughts on software architecture and design. Things I have learned over the decades spent doing it to date. Things I think about but have not got good answers for. Some will be very specific - "If X happens, best to do Y straight away.", some will be philosophical "what exactly is X anyway?", some will be humorous, some tragic, some cautionary...hopefully some will be useful. Anyway, here goes...
The problem of problems
Some "problems" are not really problems at all. By this I mean that sometimes, it is simply the way a “problem” is phrased that leads you to think that the problem is real and needs to be solved. Other times, re-phrasing the problem leads to a functionally equivalent but much more easily solved problem.
Another way to think about this is to recognize that human language itself is always biased towards a particular world view (that is why translating one human language into another is so tricky. It is not a simple mapping of one world view to another).
Simply changing the language used to describe a “problem” can sometimes result in changing (but never removing!) the bias. And sometimes, this new biased position leads more readily to a solution.
I think I first came across this idea in the book "How to Solve It" by the mathematician George Polya. Later on, I found echoes of it in the work of the philosopher Ludwig Wittgenstein. He was fond of saying (at least in his early work) that there are no real philosophical problems – only puzzles – caused by human language.
Clearing away the fog of human language - says Wittgenstein - can show a problem to be not a problem at all. I also found this idea in the books of Edward de Bono whose concepts of “lateral thinking" often leverage the idea of changing the language in which a problem is couched as a way of changing view-point and finding innovative solutions.
One example De Bono gives is a problem related to a factory polluting water in a river. If you focus on the factory as a producer of dirty water, your problem is oriented around the dirty water. It is the dirty water output that needs to be addressed. However, if the factory also consumes fresh water, then the problem can be re-cast in terms of a pre-factory input problem, i.e. make the factory place its water intake downstream of its own discharge point, thus incentivizing the factory not to pollute the river. Looked at another way, the factory itself becomes a regulator, obviating or at least significantly reducing the need for extra entities in the regulation process.
In more recent years I have seen the same idea lurking in Buddhist philosophy, in the form of our own attitudes towards a situation being a key determinant in our conceptualization of that situation as either good, bad or neutral. I sometimes like to think of software systems as "observers" of the world in this Buddhist philosophy sense. Admittedly these artificial observers are looking at the world through more restricted sense organs than humans, but they are observers nonetheless.
Designing a software architecture is essentially baking in a bias as to how "the world" is observed by a nascent software system. As architects/designers we transfer our necessarily biased conceptualization of the to-be system into code with a view to giving life to a new observer in the world - a largely autonomous software system.
Thinking long and hard about the conceptualization of the problem can pay big dividends early on in software architecture. As soon as the key abstractions take linguistic form in your head i.e. concepts start to take the form of nouns, verbs, adjectives etc., the problem statement is baked in, so to speak.
For example, imagine a scenario where two entities, A and B, need to exchange information. Information needs to flow from A to B reliably. Does it matter if I think of A sending information to B or think of B as querying information from A? After all, the net result is the same, right? The information gets to B from A, right?
Turns out it matters a lot. The bias in the word "send" is that it carries with it the notion of physical movement. If I send you a postcard in the mail, the postcard moves. There is one postcard. It moves from 1) I have it to 2) in transit to 3) you have it (maybe).
If we try to implement this "send" in software, it can get very tricky indeed to fully emulate what happens in a real world "send" - especially if we stipulate guaranteed once-and-only-once delivery. Digital "sends" are never actually sends. They are always replications - or, more typically, a replicate followed by a delete.
If instead of this send-centric approach, we focus on B as the active party in the information flow - querying for information from A and simply re-requesting it if, for some reason, it does not arrive - then we have a radically different software architecture. An architecture that is much easier to implement in many scenarios. (Compare the retry-centric architecture of many HTTP-based systems with, say, reliable message exchange protocols.)
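To make the contrast concrete, here is a minimal sketch - in Python, using the requests library, with a hypothetical URL and retry policy - of the query-oriented approach. B simply asks A for the information and asks again if it does not get it; none of the delivery-guarantee machinery of a "send" appears anywhere.

    import time
    import requests

    # B pulls the information from A and retries on failure.
    # The URL, attempt count and delay are illustrative assumptions.
    def fetch_from_a(url="https://a.example.com/orders/latest", attempts=5, delay=2.0):
        for attempt in range(1, attempts + 1):
            try:
                response = requests.get(url, timeout=10)
                response.raise_for_status()
                return response.json()      # B now has the information
            except requests.RequestException:
                if attempt == attempts:
                    raise                   # give up after the final attempt
                time.sleep(delay)           # wait, then simply ask again

    if __name__ == "__main__":
        print(fetch_from_a())

Notice that "reliable delivery" falls out of B's willingness to re-ask, not from any guarantee A has to make about its sends.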
So what happened here? We simply substituted one way of expressing the business need - a send-oriented conceptualization, with a query-oriented conceptualization, and the "problem" changed utterly before our very eyes.
Takeaway: the language in which a problem is expressed is often already a software architecture. It may or may not be a good version 1 of the architecture to work from. It contains many assumptions. Many biases. And that is true regardless of whether the expression is linguistic or visual.
It often pays to tease out those assumptions in order to see if a functionally equivalent re-expression of the problem is a better starting point for your software architecture.
Friday, February 23, 2018
What is a document - Part 7
Previously: What is a document? - Part 6
The word “document”
is, like the word “database”, simple on the outside and complex on
the inside.
Most of us carry around pragmatically fuzzy definitions
of these in our heads. Since the early days of personal computers
there have been software suites/bundles available that have included
distinct tools to manage “documents” and “databases”,
treating them as different types of information object. The first such package
I used was called SMART running on an IBM PC XT machine in the late
Eighties. It had a 10MB hard disk. Today, that is hardly enough to store a single document, but I digress...
I have used many other Office Suites since then, most of which have
withered on the vine in enterprise computing, with the notable
exception of Microsoft Office. I find it interesting that of the
words typically associated with office suites, namely, “database”,
“word processor”, “presentation”, and “spreadsheet” the
two that are today most tightly bound to Microsoft office are
“spreadsheet” and “presentation” to the point where “Excel”
and “Powerpoint” have become generic terms for “spreadsheet”
and “presentation” respectively. I also think it is interesting that Excel has become the de-facto heart of Microsoft Office in the business community, with Word/Access/Powerpoint being of secondary importance as "must haves" in office environments, but again I digress...
In trying to chip away
at the problem of defining a “document” I think it is useful to
imagine having the full Microsoft office suite at your disposal and
asking the question “when should I reach for Word instead of one of
the other icons when entering text?” The system I worked on in the Nineties, mentioned
previously in this series, required a mix of classic field-type
information along with unstructured paragraphs/tables/bulleted lists.
If I were entering that text into a computer today with Microsoft
Office at my disposal, would I reach for the word processor icon or
the database icon?
I would reach for the
Word icon. Why? Well, because there are a variety of techniques I can
use in Word to enter/tag field-type textual information and many techniques
for entering unstructured paragraphs/tables/bulleted lists. The
opposite is not true. Databases tend to excel (no pun intended) at
field-type information but be limited in their support for
unstructured paragraphs/tables/bulleted lists – often relegating
the latter to “blob” fields that are second-class citizens in the
database schema.
Moreover, these days, the tools available for
post-processing Word's .docx file format make it much easier than
ever before to extract classic “structured XML” from Word
documents while retaining the vital familiarity and ease of use for the
authors/editors I mentioned previously.
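As a rough illustration of how approachable that post-processing has become, here is a minimal sketch that treats a .docx file as what it is - a zip archive - and pulls the paragraph text out of word/document.xml using nothing but the Python standard library. The file name is a placeholder, and a real pipeline would of course do far more enrichment than this.

    import zipfile
    import xml.etree.ElementTree as ET

    # The WordprocessingML namespace used inside word/document.xml
    W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

    def docx_paragraphs(path):
        """Yield the plain text of each paragraph in a .docx file."""
        with zipfile.ZipFile(path) as z:
            root = ET.fromstring(z.read("word/document.xml"))
        for para in root.iter(W + "p"):
            # Join all the text runs (w:t elements) in the paragraph
            yield "".join(t.text or "" for t in para.iter(W + "t"))

    if __name__ == "__main__":
        for text in docx_paragraphs("some_document.docx"):  # placeholder file name
            print(text)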
Are there exceptions?
Absolutely. There are always exceptions. However, if your data
structure necessarily contains a non-trivial amount of unstructured
or semi-structured textual content and if your author/edit community wants to
think about the content in document/word-processor terms, I believe
today's version of Word with its docx file format is generally
speaking a much better starting point than any database front-end
or spreadsheet front-end or web-browser front-end or any structured XML editing tool front-end.
Yes, it can get messy to do the post-processing of the data but given a choice between a solution
architecture that guarantees me beautifully clean data at the
back-end but an author/edit community who hate it, versus a solution
architecture that involves extra content enrichment work at the back
end but happy author/edit users, I have learned to favor the latter every time.
Note I did not start there! I was on the opposite side of this for many,
many years, thinking that structured author/edit tools enforcing structure at the front-end were the way to go. I built a few beautiful structured systems that
ultimately failed to thrive because the author/edit user community
wanted something that did not “beep” as they worked on content. When writing my own
books for Prentice-Hall (books on SGML and XML - of all things!), I myself wanted something that did not beep!
Which brings me
(finally!) to my answer to the question “What is a document?”. My
answer is that a document is a textual information artifact where the
final structure of the artifact itself is only obvious after
it has been created/modified and thus requires an author/edit user
experience that gets out of the way of the user's creative
processes until the user decides to impose structure – if they
decide to impose a structure at all.
There is no guaranteed schema validity other than
that most generic of schemas that splits text into flows, paragraphs,
words, glyphs etc and allows users to combine content and presentation as they see fit.
On top of that low level structure, anything goes – at least until the
point where the user has decided that the changes to the information
artifact are “finished”. At the point where the intellectual work has been done figuring out what the document should say and how it should say it, it is completely fine - and generally very useful - to be able to validate against higher level, semantic structures such as "chapter", "statute", "washing machine data sheet" etc.
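In practice that amounts to making higher-level validation a separate, author-invoked step rather than something the editor enforces keystroke by keystroke. A minimal sketch of that idea, assuming the lxml library and a RELAX NG schema file for the "finished" structure (both are assumptions for illustration):

    from lxml import etree

    def validate_when_finished(xml_path, schema_path="chapter.rng"):
        """Run only when the author declares the document finished.

        Until then, anything that parses as well-formed XML is acceptable;
        the higher level, semantic rules (chapter, statute, data sheet...)
        are not imposed during the creative phase.
        """
        doc = etree.parse(xml_path)
        schema = etree.RelaxNG(etree.parse(schema_path))
        if schema.validate(doc):
            return []                              # ready to publish
        return [str(err) for err in schema.error_log]

    # Example: problems = validate_when_finished("draft_chapter.xml")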
Friday, February 09, 2018
What is a document? - part 6
Previously: What is a document? - Part 5.
By the late Nineties, I
was knee deep in the world of XML and the world of Python, loving the
way that these two amazing tools allowed tremendous amounts of
automation to be brought to traditionally labor intensive document
processing/publishing tasks. This was boom time in electronic publishing and every new year brought with it a new output format to target: Microsoft Multimedia Viewer, Windows Help, Folio Views, Lotus Notes and a whole host of proprietary formats we worked on for clients. Back then, HTML was just another output for us to target. Little did we know back then that it would eclipse all the others.
Just about twenty years
ago now - in the fall of 1998 - I co-presented a tutorial on XML at the International
Python Conference in Houston, Texas. [1]. At that same conference, I
presented a paper on high volume XML processing with Python [2]. Back
in those days, we had some of the biggest corpora of XML anywhere in
the world, here in Ireland. Up to the early/mid noughties, I did a lot of conference
presentations and became associated with the concept of XML
processing pipelines[3].
Then a very interesting
thing happened. We began to find ourselves working more and more in
environments where domain experts – not data taggers or software
developers – needed to create and update XML documents. Around this
time I was also writing books on markup languages for Prentice
Hall[4] and had the opportunity to put “the shoe on the other foot”
so to speak, and see things from an author's perspective.
It was then that I
experienced what I now consider to be a profound truth of the vast
majority of documents in the world - something that gets to the heart of what a document actually is and which distinguishes it from other forms of digital information. Namely, that documents are typically
very “structured” when they are finished but are highly
unstructured when they are being created or in the midst of update
cycles.
I increasingly found
myself frustrated with XML authoring tools that would force me to
work on my document contents in a certain order and beep at me unless my documents were
“structured” at all times. I confess there were
many times when I abandoned structured editors for my own author/edit
work with XML and worked in the free-flowing world of the Emacs text editor or
in word processors with the tags plainly visible as raw text.
I began to appreciate that the ability to easily
create/update content is a requirement that must be met if the value
propositions of structured documents are to be realized, in most
cases. There is little value in a beautifully structured, immensely
powerful back-end system for processing terabytes of documents coming
in from domain experts unless said domain experts are happy to work
with the author/edit tools.
For a while, I believed
it was possible to get to something that authors would like, by customizing
the XML editing front-ends. However, I found that over and over
again, two things started happening, often in parallel. Firstly, the
document schemas became less and less structured so as to accommodate
the variations in real-world documents and also to avoid “beeping”
at authors where possible. Secondly, no amount of GUI customization seemed to be enough for the authors to feel comfortable with the XML
editors.
“Why can't it work
like Word?” was a phrase that began to pop up more and more in conversations with authors. For
quite some time, while Word's file format was not XML-based, I would
look for alternatives that would be Word-like in terms of the
end-user experience, but with file formats I could process with custom
code on the back end.
For quite a few years,
StarOffice/OpenOffice/LibreOffice fitted the bill and we have had a
lot of success with it. Moreover, it allowed for levels of
customization and degrees of business-rule validation that XML
schema-based approaches cannot touch. We learned many techniques and
tricks over the years to guide authors in the creation of structured
content without being obtrusive and interrupting their authoring flow. In
particular, we learned to think about document validation as a
function that the authors themselves have control over. They get to
decide when their content should be checked for structural and
business rules – not the software.
Fast forward to today.
Sun Microsystems is no more. OpenOffice/LibreOffice do not appear to
be gaining the traction in the enterprise that I suspected they would
a decade ago. Google's office suite – ditto. Native, browser-based
document editing (such as W3C's Amaya [5])
does not appear to be getting traction either....
All
the while, the familiar domain expert/author's mantra rings in my ears
“Why can't it work like Word?”
As
of 2018, this is a more interesting question than it has ever been in
my opinion. That is where we will turn next.
Friday, January 26, 2018
What is a document? - Part 5
Previously: What is a document? - part 4.
In the early Nineties,
I found myself tasked with the development of a digital guide to
third level education in Ireland. The digital product was to be an
add-on to a book based product, created in conjunction with the author of the book. The organization of the book was very
regular. Each third level course had a set of attributes such as
entry level qualifications, duration, accrediting institution,
physical location of the campus, fees and so on. All neatly laid out, one page per course, with some free-flowing narrative at the bottom of each page. The goals of the
digital product were to allow prospective students to search based on
different criteria such as cost ranges, course duration and so on.
Step number one was
getting the information from the paper book into a computer and it is in
this innocuous sounding step that things got very interesting. The
most obvious approach - it seemed to me at the time - was to create a programmable database – in something like
Clipper (a database programming language that was very popular with
PC developers at the time). Tabular databases were perfect for 90%
of the data – the “structured” parts such as dates, numbers,
short strings of text. However, the tabular databases had no good way
of dealing with the free-flowing narrative text that accompanied each
course in the book. It had paragraphs, bulleted lists, bold/italics and underline...
An alternative approach
would be to start with a word-processor – as opposed to a database
– as it would make handling the free-flowing text (and associated
formatting, bold/italic, bulleted lists etc.) easy. However, the word
processor approach did not make it at all easy to process the
“structured” parts in the way I wanted to (in many cases, the word processors of the day stored information in encrypted formats too).
My target output was a free
viewer that came with Windows 3.1 known as Windows Help. If I could
make the content programmable, I reasoned, I could automatically generate all
sorts of different views of the data as Windows Help files and ship
the floppy disk without needing to write my own viewer. (I
know this sounds bizarre now but remember this work predated the
concept of a generic web browser by a few years!)
I felt I was facing a
major fork in the road in the project. By going with a database, some
things were going to be very easy but some very awkward. By going
with a document instead...same thing. Some things easy, some very
awkward. I trawled around in my head for something that might have
the attributes of a database AND of a document at the same time.
As
luck would have it, I had a Byte Magazine from 1992 on a shelf. It had an article by Jon Udell that
talked about SGML - Standard Generalized Markup Language. It triggered memories of a brief encounter I had had
with SGML back in Trinity College when Dr. David Abrahamson had referenced it
in his compiler design course, back in 1986. Back then, SGML was not
yet an ISO standard (it became ISO 8879 later in 1986). I remember in those days hearing about
“tagging" and how an SGML parser could enforce structure – any
structure you liked – on text – in a similar way to how programming
language parsers enforce structure on, say, Pascal source code.
I remember thinking
“surely if SGML can deal with the hierarchical structures like you
typically find in programming languages, it can deal with the
simpler, flatter structures you get in tabular databases?”. If it could, I reasoned, then surely I could get the best of both worlds: my own data format that had what I needed from database approaches but also what I needed from document approaches to data modelling.
I found – somehow
(this is all pre-internet remember. No Googling for me in those days.) – an address in Switzerland that I could
send some money to in the form of a money order, to get a 3.5 inch floppy back by return
post, with an SGML parser on it called ArcSGML. I also found out about
an upcoming gathering in Switzerland of SGML enthusiasts. A colleague, Neville Bagnall
went over and came back with all sorts of invaluable information
about this new thing (to us) called generalized markup.
We set to work in earnest. We
created our first ever SGML data model. Used ArcSGML to ensure we
were getting the structure and consistency we wanted in our source data. We set about
inventing tags for things like “paragraph”, “bold”,
“cross-reference” as well as the simpler field-like tags such as
“location”, “duration” etc. We set about looking at ways to
process the resultant SGML file. The output from ArcSGML was not very
useful for processing, but we soon found out about another SGML parser called SGMLS
by Englishman James Clark. We got our hands on it and having taken one look at
the ESIS format it produced, we fell in love with it. Now we had a
tool that could validate the structure of our document/database and
feed us a clean stream of data to process downstream in our own software.
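For readers who never met ESIS: sgmls emitted the parsed document as a line-oriented event stream - roughly, "(" for a start tag, "-" for character data, ")" for an end tag, "A" for an attribute. That made downstream processing almost trivial in any language. Here is a small, deliberately simplified Python sketch of the kind of thing it enabled; the tag names are invented for the course guide example and this is not a full ESIS parser.

    # A simplified slice of ESIS-style output for one (invented) course record.
    esis_lines = [
        "(COURSE",
        "(TITLE",
        "-Computer Science",
        ")TITLE",
        "(DURATION",
        "-4 years",
        ")DURATION",
        ")COURSE",
    ]

    def esis_events(lines):
        """Turn ESIS lines into (event, value) pairs - simplified for illustration."""
        for line in lines:
            code, rest = line[0], line[1:]
            if code == "(":
                yield ("start", rest)
            elif code == ")":
                yield ("end", rest)
            elif code == "-":
                yield ("data", rest)

    # Downstream processing: collect field-like elements into a record.
    record, current = {}, None
    for event, value in esis_events(esis_lines):
        if event == "start":
            current = value
        elif event == "data" and current:
            record[current] = value
    print(record)   # {'TITLE': 'Computer Science', 'DURATION': '4 years'}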
Back then C++ was our
weapon of choice. Over time our code turned into a toolkit of SGML
processing components called IDM (Intelligent Document Manager) which
we applied to numerous projects in what became known as the
“electronic publishing era”. Things changed very rapidly in those
days. The floppy disks gave way to the CD-ROMs. We transitioned from
Windows Help files to another Microsoft product called Microsoft
Multimedia Viewer. Soon the number of “viewers” for electronic
books exploded and we were working on Windows Help, Multimedia
Viewer, Folio Views, Lotus Notes to name but four.
As the number of
distinct outputs we needed to generate grew, so too did the value of our investment in getting
up to speed with SGML. We could maintain a single source of content
but generate multiple output formats from it, each leveraging the
capabilities of the target viewer in a way that made them look and
feel like they had been authored directly in each tool as opposed to
programmatically generated for them.
My concept of a
“document” changed completely over this period. I began to see
how formatting – and content – could be separated from each other. I began to see how in so
doing, a single data model could be used to manage content that is
tabular (like a classic tabular database) as well as content that is
irregular, hierarchical, even recursive. Moreover, I could see how
keeping the formatting out of the core content made it possible to
generate a variety of different formatting “views” of the same
content.
It would be many years
later that the limitations of this approach became apparent to me.
Back then, I thought it was a completely free lunch. I was a fully paid-up convert to the concept of
generalized markup and machine readable, machine validatable
documents. As luck would have it, this coincided with the emergence
of a significant market for SGML and SGML technologies. Soon I was
knee deep in SGML parsers, SGML programming languages, authoring systems,
storage systems and was developing more and more of our own tools,
first in C++, then Perl, then Python.
The next big transition
in my thinking about documents came when I needed to factor
non-technical authors into my thinking. This is where I will turn
next. What is a document? - Part 6.
Monday, January 15, 2018
What is a document? - part 4
Previous: What is a document - Part 3.
In the late Eighties, I
had access to an IBM PC XT machine that had Wordperfect 5.1[1]
installed on it. Wordperfect was both
intimidating and powerful. Intimidating because when it booted, it
completely cleared the PC screen and unless you knew the function keys
(or had the sought-after function key overlay [2]) you were left to
your own devices to figure out how to use it.
It was also very
powerful for its day. It could wrap words automatically (a big deal!). It could redline/strikeout text which made
it very popular with lawyers working with contracts. It could also
split its screen in two, giving you a normal view of the document on
top and a so-called “reveal codes” view on the bottom. In the
“reveal codes” area you could see the tags/markers used for
formatting the text. Not only that, but you could choose to modify
the text/formatting from either window.
This idea that a
document could have two “faces” so to speak and that you could
move between them made a lasting impression on me. Every other
DOS-based word processor I came across seemed to me to be variations
on the themes I had first seen in Wordperfect e.g. Wordstar,
Multimate and later Microsoft Word for DOS. I was aware of the
existence of IBM Displaywriter but did not have access to it. (The
significance of IBM in all this document technology stuff only became
apparent to me later.)
The next big "aha moment" for me came with the arrival of a plug-in board for IBM PCs
called the Hercules Graphics Card[3]. Using this card in conjunction
with the Ventura Publisher[4] on DRI's GEM graphics environment [5] dramatically expanded
the extent to which documents could be formatted - both on screen and on the resultant paper. Multiple fonts,
multiple columns, complex tables, equations etc. Furthermore, the
on-screen representation mirrored the final printed output closely in
what is now universally known as WYSIWYG.
Shortly after that, I
found myself with access to an Apple Lisa [6] and then an Apple Fat
Mac 512 with Aldus (later Adobe) Pagemaker [7] and an Apple
Laserwriter[8]. My personal computing world split into two.
Databases, spreadsheets etc. revolved around IBM PCs and PC
compatibles such as Compaq, Apricot etc. Document processing and
Desktop Publishing revolved around Apple Macs and Laser Printers.
I became
intoxicated/obsessed with the notion that the formatting of documents
could be pushed further and further by adding more and more powerful
markup into the text. I got myself a copy of The Postscript Language Tutorial and
Cookbook by Adobe[9] and started to write Postscript programs by
hand.
I found that the
original Apple Laserwriter had a 25-pin RS-232 port. I had access to
an Altos multi-terminal machine [10]. It had some text-only
applications on it. A spreadsheet from Microsoft called – wait for
it – Multiplan (long before Excel) – running on a variant of –
again, wait for it – Unix called Microsoft Xenix [11].
Well, I soldered up a serial
cable that allowed me to connect the Altos terminal directly to the
Apple Laserwriter. I found I could literally type in Postscript
command at the terminal window and get pages to print out. I could
make the Apple Laserwriter do things that I could not make it do via Aldus
Pagemaker by taking directly to its Postscript engine.
Looking back on it now,
this was as far down the rabbit hole of “documents as computer
programs” as I ever went. Later I would discover TeX and find it
in many ways easier to work with than programming Postscript directly. My career
started to take me into computer graphics rather than document
publishing. For a few years I was much more concerned with Bezier
Curves and Bitblits[12] using a Texas Instruments TMS 34010[13] to
generate realtime displays of financial futures time-series analysis (a field known as technical analysis in the world of financial trading
[14]).
It would be some years
before I came back to the world of documents and when I did, my
approach route back caused me to revisit my “documents as programs”
world view from the ground up.
It all started with a
database program for the PC called dBase by Ashton Tate[15]. Starting from the perspective of a database made all the difference to my world view. More on
that, next time.
Tuesday, January 02, 2018
What is a document? - Part 3
Previously : What is a document? - part 2.
Back in 1983, I
interacted with computers in three main ways. First, I had access to
a cantankerous digital logic board [1] which allowed me to play around with boolean logic
via physical wires and switches.
Second, I had access to a Rockwell 6502
machine with 1k of RAM (that's 1 kilobyte) which had a callus-forming keyboard and a single line (not single monitor –
single line) LED display called an Aim 65[2]. Third, at home I had a
Sinclair ZX80 [3] which I could hook up to a black and white TV set and get a whopping
256 x 192 pixel display.
Back then, I had a
fascination with the idea of printing stuff out from a computer. An early indication – that I
completely blanked on at the time – that I was genetically predisposed to an
interest in typesetting/publishing. The Aim 65 printed to a cash
register roll which was not terribly exciting (another early indicator that I
blanked on at the time). The ZX80 did not have a printer at all...home printing was not a thing back in 1984. In 1984 however, the Powers That Be in TCD gave us second year computer science newbies rationed access to a Vax 11/780, with glorious Adm3a[4] terminals.
In a small basement
terminal room on Pearse St, in Dublin, there was a clutch of these terminals and we would eagerly stumble down the stairs at the appointed times, to get at them. Beating time in
the corner of that terminal room, most days, was a huge, noisy dot matrix printer[5],
endlessly chewing boxes of green/white striped continuous computer
paper. I would stare at it as it worked. I found it
particularly fascinating that it could create bold text by the clever
trick of backing up the print head and re-doing text with a fresh
layer of ink.
We had access to a
basic e-mail system on the Vax. One day, I received an e-mail from a
classmate (sender lost in the mists of time) in which one of the
words was drawn to the screen twice in quick succession as the text
scrolled on the screen (these were 300 baud terminals - the text appeared character by character, line by line, from top to bottom). Fascinated by this, I printed out the e-mail, and found that
the twice-drawn word ended up in bold on paper.
"What magic is this?", I thought. By looking
under the hood of the text file, I found that the highlighted word – I believe it was the word “party” – came out in bold because five control characters
(Control-H characters [6]) had been placed right after the word. When displayed
on screen, the ADM3a terminal drew the word, then backed up 5 spaces because of the Control-H's,
then drew the word again. When printed, the printer did the same but
because ink is cumulative, the word came out in bold. Ha!
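The whole trick fits in a line of code. A minimal reconstruction in Python of what that e-mail effectively contained:

    import sys

    def overstrike_bold(word):
        """The old printer trick: draw the word, back up, draw it again."""
        return word + "\b" * len(word) + word

    line = "You are invited to a " + overstrike_bold("party") + " on Friday.\n"

    # On screen the word is simply drawn twice in place (as on the ADM3a);
    # sent to an old dot matrix printer, the second pass lays down a fresh
    # layer of ink and the word comes out in bold.
    sys.stdout.write(line)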
Looking back on it,
this was the moment when it occurred to me that text files could be
more than simply text. They could also include instructions
and these instructions could do all sorts of interesting
things to a document when it was printed/displayed...As luck would have it, I also had access to a wide-carriage
Epson FX80[7] dot matrix printer through a part-time programming job
I had while in college.
Taking the Number 51 bus to
college from Clondalkin in the mornings, I read the Epson FX-80 manual from cover to cover. Armed
with a photocopy of the “escape codes”[8] page, I was soon a dab hand
at getting text to print out in bold, condensed, strike-through,
different font sizes...
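From memory - and the manual is the final word here - the FX-80's codes for emphasized and condensed text looked roughly like the constants below. A small Python sketch of the kind of "littering" I was doing; the printer device path is a placeholder and the codes should be checked against the real ESC/P documentation.

    ESC = b"\x1b"

    EMPHASIZED_ON  = ESC + b"E"   # emphasized (bold) on  - ESC E
    EMPHASIZED_OFF = ESC + b"F"   # emphasized (bold) off - ESC F
    CONDENSED_ON   = b"\x0f"      # SI  - condensed printing on
    CONDENSED_OFF  = b"\x12"      # DC2 - condensed printing off

    line = (
        b"Results for 1985: " +
        EMPHASIZED_ON + b"a record year" + EMPHASIZED_OFF +
        b" (details " + CONDENSED_ON + b"in small print" + CONDENSED_OFF + b")\r\n"
    )

    with open("/dev/lp0", "wb") as printer:   # placeholder device path
        printer.write(line)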
After a while, my Epson FX-80 explorations ran out of steam. I basically ran out of codes to play with. There
was a finite set of them to choose from. Also, it became very
apparent to me that littering my text files with these codes was an
ugly and error prone way to get nice print outs. I began to search for a better way. The “better way”
for me had two related parts. By day, on the Vax 11/780 I found out
about a program called Runoff[9]. And by night I found out about a
word-processor called Wordstar[10].
Using Runoff, I did not have to embed, say, Epson FX80 codes into my text files; instead I could embed
more abstract commands that the program would then translate to
printer-specific commands when needed. I remember using “.br” to create a
line break (ring any bells, HTML people?). “.bp” began a new
page, “.ad” right-aligned text, etc.
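The conceptual leap - from embedding device codes to embedding abstract commands that something else translates per device - is easy to sketch. This is not Runoff itself, just an illustrative Python toy mapping a couple of dot commands onto different sets of device codes:

    # Toy translator: abstract dot commands -> device-specific control codes.
    # The command set and the code tables are illustrative, not Runoff's real ones.
    EPSON_CODES = {".bp": b"\x0c",      # begin page -> form feed
                   ".br": b"\r\n"}      # line break -> carriage return + line feed

    PLAIN_TEXT  = {".bp": b"\n\n",      # a dumber device: just leave a blank line
                   ".br": b"\n"}

    def render(lines, codes):
        out = bytearray()
        for line in lines:
            if line.startswith("."):                 # an abstract command
                out += codes.get(line, b"")
            else:                                    # ordinary text
                out += line.encode("ascii") + codes[".br"]
        return bytes(out)

    source = ["Third Level Courses", ".br", "Entry requirements...", ".bp"]
    print(render(source, EPSON_CODES))
    print(render(source, PLAIN_TEXT))

Swap the code table and the same source renders for a different device - which is the whole point.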
Using Wordstar on an
Apple II machine running CP/M (I forgot to mention I had access to
one of them also...I wrote my first ever spreadsheet in Visicalc on
this machine, but that is another story.) I could do something
similar. I could add in control codes for formatting and it would
translate for the current printer as required.
So far, everything I
was using to mess around with documents was based on visible coding systems, i.e. the coding added to the
documents was always visible on the screen interspersed with the
text. So far also, the codes
added to the documents were all control codes, i.e. imperative
instructions about how a document should be formatted.
The
significance of this fact only became clear to me later but before we
get there, I need to say a few words about my early time with
Wordperfect on an IBM PC XT, about my first encounter with a pixel-based
user interface – it was called GEM [11] and ran on top of DOS on
IBM PCs – and about an early desktop publishing system called Ventura Publisher
from Ventura Software which ran on GEM. I also need to say a little
about the hernia-generating Apple Lisa[12] that I once had to carry up a spiral staircase.
Oh, and the mind blowing
moment I first used Aldus Pagemaker[13] on a Fat Mac 512k[14] to
produce a two columned sales brochure on an Apple Laserwriter[15] and
discovered the joys of Postscript.
Next : What is a document? - Part 4.
[1] Similar to this
http://microtechcorporation.net/mtc/wp-content/uploads/2016/07/scmes7.jpg
[5] Similar to this
http://bit.ly/2CFhue9