Sunday, January 24, 2021

Machinery

Over the years, I’ve had a lot of conversations about design, architecture, and what is unique about higher-level approaches to structuring large codebases. What follows is a synthesis of those discussions, though it is fairly dense and probably hard going for most readers.

One way of looking at code is to treat a piece of it as a small, independent machine. You feed some stuff into the machine, a computation whirls along, and then some stuff comes out the other side.

That holds together in the sense that a Turing machine is a discrete formal system with a finite set of states and rules, so we can take larger behaviors and partition them down into a set of finite pieces, each of which is discrete in its behavior. The halting problem looks like a counterexample, but its undecidability is tied to the infinitely long tape. Place a fixed bound on the tape size and the resulting machine has only finitely many possible configurations, so the halting problem goes away.
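To make that a little more concrete, here is a tiny sketch (in Python, purely for illustration; the driver, the step function, and the bound are all invented for the example) of what bounding a computation buys us: with a hard limit on the number of steps, every run finishes with a definite answer, so the question of whether it ever stops is no longer open.

```python
# A minimal sketch of the bounded-computation idea: instead of asking whether
# an arbitrary program halts (undecidable in general), we cap the number of
# steps, so every run is guaranteed to finish with a definite outcome.

def run_bounded(step, state, max_steps=10_000):
    """Drive a step function until it reports completion or the bound is hit.

    step(state) -> (done, new_state). With a fixed bound, the loop always
    terminates: the result is either the final state or an explicit
    'bound exceeded' outcome.
    """
    for _ in range(max_steps):
        done, state = step(state)
        if done:
            return ("halted", state)
    return ("bound exceeded", state)


# Example: a countdown as a trivially finite 'machine'.
result = run_bounded(lambda n: (n == 0, max(n - 1, 0)), 5)
print(result)  # ('halted', 0)
```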

In a practical sense, that doesn’t matter much. We don’t want fixed limits on some types of computations, such as graphical user interfaces, certain kinds of searching, or command input, but those exceptions are rare enough that we can separate out the infinite computations from the finite ones.

With that distinction, we can see a large computer system as mostly an interconnected network of these finite computations (sometimes wrapped by an infinite one), and from that viewpoint, we can treat the computations themselves as black boxes.
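As a loose sketch of that shape, with made-up handlers: the outer loop stands in for the potentially endless part (a GUI, a command prompt), while everything it calls is a plain finite function that takes input and returns output.

```python
# A loose sketch of 'finite machines wrapped by an infinite computation'.
# The handlers are invented for illustration; each one is a black box that
# takes input and returns output, while only the outer loop is open-ended.

def parse(line):              # finite machine: text in, tokens out
    return line.split()

def evaluate(tokens):         # finite machine: tokens in, a result out
    return len(tokens)

def command_loop(source):
    # The potentially infinite wrapper; each pass only invokes finite machines.
    for line in source:       # stand-in for reading user input indefinitely
        if line == "quit":
            break
        print(evaluate(parse(line)))

command_loop(["add two numbers", "count these words please", "quit"])
```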

So, we have some code that, as a machine, produces a specific output. We can build on that by having more code that uses a bunch of these machines to compute a slightly more complex output. With this, we can see the whole system as a fairly large collection of machines, some decomposable, some not. If we wanted to create a new system with lots of different high-level machines that calculate related but different outputs, it’s entirely possible to break this down into a list of all of the underlying machinery. There would be multiple ways of decomposing the higher-level problems, but there would always be some near-absolute minimum size for the overall collection and for each machine within it.
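A toy example of what that composition looks like, with machines invented for the purpose: the lower-level machines get reused by several higher-level ones, and the decomposition bottoms out in a fairly small set of shared pieces.

```python
# Machines built on machines: each function is a black box with fixed inputs
# and outputs, and the higher-level 'machines' are compositions of lower ones.

def normalize(text):                 # low-level machine
    return text.strip().lower()

def tokenize(text):                  # low-level machine
    return text.split()

def word_count(text):                # higher-level machine, reuses the two above
    return len(tokenize(normalize(text)))

def distinct_word_count(text):       # a related output, sharing the same machinery
    return len(set(tokenize(normalize(text))))

print(word_count("  The quick brown fox  "))           # 4
print(distinct_word_count("the THE quick quick fox"))  # 3
```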

While that may at first appear pointless, what it tells us is that it is possible to reduce the work needed to build the system. More usefully, recreating the same machine underneath different higher-level machines is an obvious waste of effort. More to the point, there is some ‘natural’ decomposition of the system itself that minimizes the overall composite machinery.

But that’s not exactly true. We might have two machines that take the same input and return the same output, but differ internally in the tradeoffs they make, such as time vs. space. That tradeoff space is multi-dimensional, so there are an infinite number of ways to project it down to a single implementation.
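For instance (a hypothetical pair, not drawn from anything above), these two machines have identical inputs and outputs but sit at different points in the time-versus-space tradeoff:

```python
# Two interchangeable 'machines' with the same inputs and outputs but a
# different internal time-vs-space tradeoff.

from functools import lru_cache

def fib_iterative(n):
    # Minimal space: two running values, recomputed from scratch on every call.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

@lru_cache(maxsize=None)
def fib_cached(n):
    # Trades memory for speed: repeated calls reuse previously stored results.
    return n if n < 2 else fib_cached(n - 1) + fib_cached(n - 2)

assert fib_iterative(20) == fib_cached(20) == 6765
```

From the outside they are the same machine; the difference only shows up in how they spend time and memory internally.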

So, if we stop just above the internals of each machine, and the computations are encapsulated away from infinite tapes, it does seem as if we can specify a rather exact minimum. From a practical perspective, it doesn’t really matter whether this is strictly correct; what does matter is that we can identify the lower-level machines and, while working, avoid duplicating them. Above that, we can identify the next level and avoid duplication there as well. We can do this at each level, for all compositions, in order to ensure that the work we are doing is necessary and not wasted.

The really big question is how to find such a decomposition. It turns out that the inputs and outputs of each machine are a huge clue. In most systems, the different data types utilized are finite: there are a fixed number of primitive types and a fairly fixed number of structural arrangements, also known as data structures. So what distinguishes the machines from each other are the permutations of their inputs and outputs, and these belong to a fixed set. It then becomes easy to see whether a machine is currently missing, and also whether one machine is a proper subset of another. The third case is an intersection between two, but at worst we can treat that as three machines: the shared part and the two remainders. That applies all of the way up the hierarchy of machine composition, so if we utilize that understanding we have a means of normalizing the machinery.
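Here is a small toy of that bookkeeping, with names and types invented for the example: each machine is keyed by its input and output sets, which makes exact duplicates visible immediately and turns subset questions into simple set comparisons.

```python
# A toy registry that organizes 'machines' by their input/output signatures.

from collections import defaultdict

registry = defaultdict(list)   # (inputs, outputs) -> machine names

def register(name, inputs, outputs):
    key = (frozenset(inputs), frozenset(outputs))
    if registry[key]:
        # Same signature already exists: a candidate duplicate machine.
        print(f"{name} duplicates {registry[key]}")
    registry[key].append(name)

def subsets_of(inputs, outputs):
    # Machines whose signature is contained within the given one.
    want = (frozenset(inputs), frozenset(outputs))
    return [names for (ins, outs), names in registry.items()
            if ins <= want[0] and outs <= want[1] and (ins, outs) != want]

register("price_report",  {"Order", "Customer"}, {"Report"})
register("order_total",   {"Order"},             {"Money"})
register("price_report2", {"Order", "Customer"}, {"Report"})  # flagged as a duplicate
print(subsets_of({"Order", "Customer"}, {"Report", "Money"}))
```

The real bookkeeping in a large system would be richer than this, but the organizing key is the same: the signatures belong to a fixed set, so gaps, duplicates, and overlaps are all detectable.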

Now, that does open the door to having to renormalize the whole set again whenever we add new machines, but it seems more likely that the worst case is restricted to a specific branch or subtree of the hierarchy. To avoid that as a recurring problem, when we add new inputs or outputs, we make sure the machines that utilize them are complete. That is, additions at each level always include a bunch of new machines, even if they aren’t being used yet. That attempts to minimize the growth at different points in time along with minimizing the total underlying work. Building in the expectation that the work will always continue to expand adds a bit more to each iteration, but that cost is eventually recovered through fewer renormalizations.

So, we have some idea of the minimum amount of work, at least above the level of the internal tradeoffs, and some idea of how to minimize the work across the entire lifespan. We can even partition different parts of this hierarchy, bounded by a set of levels, from each other, so that we can overlay a structure on top with specific properties. That structure can be understood as an ‘architecture’, and those properties can be organizational, or specifically related to behavior, such as security or comprehension.

So, seeing the system as a finite set of machines leads to ways to optimize parts of its construction, and it also helps to place bounds on the work needed to ensure that all of the parts work together as specified. What’s left undefined in this perspective are the tradeoffs made inside the machines, and any higher-level performance properties assigned at different levels.
