Saturday, March 5, 2022

Piecewise Construction

One of the popular ways to build software is to start by focusing on a smaller piece of a larger problem. You isolate that piece, code it up, and then release it into the wild for people to use.

While that might seem like the most obvious way to gradually build up large systems, it is actually probably the worst way to do it.

There are two basic problems. The first is that you end up building the same stuff over and over again, resulting in a lot of extra and unnecessary work. The second is that people usually choose the easiest pieces as the starting point, leaving the bigger and often more important problems to either never get solved or to show up way too late.

If you look at a big system and follow a feature’s code all the way down to its persistence, you will find that for similar entry points, the differences are small. A GUI screen that shows a list of users, for example, is likely 90% similar to one that shows some other domain data. It’s basically all the same. The code differs at the top, in the way the screen is wired, and at the bottom, in the schema, but the rest of it is doing the exact same thing as at least one other piece of code in the system. The same is true for any data import or export code. There are variations needed for each, but if you isolate them, they are always small.
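As a rough sketch of what that shared middle might look like (all the names and shapes here are hypothetical, not taken from any particular framework), the generic part of a list screen can be written once, with only the thin wiring at the top and the schema binding at the bottom varying per domain:

    // A minimal sketch of factoring out the shared "middle" of a list screen.
    // All names here are hypothetical stand-ins, not a real framework API.

    interface Column<T> {
      header: string;
      render: (row: T) => string;
    }

    // The generic middle: fetching, layout, and rendering written once.
    async function renderListScreen<T>(
      title: string,
      fetchRows: () => Promise<T[]>, // the "bottom": varies with the schema
      columns: Column<T>[],          // the "top": varies with the wiring
    ): Promise<string> {
      const rows = await fetchRows();
      const head = columns.map(c => c.header).join(" | ");
      const body = rows
        .map(row => columns.map(c => c.render(row)).join(" | "))
        .join("\n");
      return `${title}\n${head}\n${body}`;
    }

    // Only the thin edges differ per domain:
    interface User { name: string; email: string; }

    const userScreen = () =>
      renderListScreen<User>(
        "Users",
        async () => [{ name: "Ada", email: "ada@example.com" }], // stand-in for a real query
        [
          { header: "Name", render: u => u.name },
          { header: "Email", render: u => u.email },
        ],
      );

A screen for any other domain entity reuses the same middle; only the columns and the query change.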

Modern software construction is all about rushing through the work, and the fastest way, most people believe, to get a new piece constructed is to ignore most of what is already there. Writing new code is far faster than reading old code. Reading old code is hard. High turnover in development projects just makes that worse. So, the pieces are often written as new, detached silos, one after another.

The problem with that is that the pieces are rarely completely independent at a domain or technical level, so the newer pieces will need to do similar things to existing pieces, but slightly differently, which causes a lot of issues. These mismatches are sometimes very difficult to find because the pieces can all have different styles, conventions, idioms, and even paradigms. To really spot the problem would require a deep understanding of at least two pieces, if not more. But that means reading even more code. So, people usually just patch over the mismatches with junk code in a few places, instead of fixing them.
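To make the mismatch concrete (a purely hypothetical example): two silos each write their own version of the same small routine, and the versions quietly disagree:

    // Hypothetical: two silos each wrote their own "normalize a customer name".

    // Silo A trims and lowercases.
    function normalizeNameA(name: string): string {
      return name.trim().toLowerCase();
    }

    // Silo B does the same, but also collapses interior whitespace.
    function normalizeNameB(name: string): string {
      return name.trim().toLowerCase().replace(/\s+/g, " ");
    }

    // The same input now has two different "canonical" forms, so records
    // match in one silo and silently fail to match in the other.
    const a = normalizeNameA("Jane   Doe"); // "jane   doe"
    const b = normalizeNameB("Jane   Doe"); // "jane doe"

Neither version is wrong on its own; the bug only exists between the silos, which is exactly why it takes an understanding of both to spot it.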

If you accept that, say, 90% of the code is duplicated in at least one other place in the codebase, then only 10% of what was written was actually unique, which means at a minimum you are doing 10x the necessary work to get the system built. If the redundancy is greater, the excess work is far higher. On top of that, testing is proportional, so if you write 10x more code, you should do 10x more testing. The less you test, the more bugs get released into the wild, and when those blow up they will disrupt the work on the next round of pieces. So it is getting worse, not better.
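To put a rough number on it (a back-of-the-envelope model, assuming duplication is the only source of excess work): if d is the fraction of written code that duplicates something already in the codebase, then

    \text{work multiplier} = \frac{1}{1 - d}, \qquad d = 0.9 \Rightarrow 10\times, \qquad d = 0.95 \Rightarrow 20\times

The multiplier climbs steeply as the redundancy grows, and the testing load scales right along with it.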

There is a simple and obvious measure to indicate when this is a significant problem with an existing development project. If the construction technique is extending the system, then later additions will get easier to accomplish. If it is piecewise, the ramp-up time for new programmers will be tiny, but each and every new piece will take longer and longer to get done, partly because of the redundancies, but also because of the increasing disruptions from bugs that have already leaked out.

For programmers, you can tell which type of project it is when you start working on the codebase, but management cannot use the opinion of any new programmer to correctly inform them, because they can’t tell whether it really is a mess of piecewise construction or the programmer just doesn’t like reading other programmers’ code. So, it’s better for management to gauge progress using programmers with longer experience on the project. Given some work similar to earlier work, does it take more or less time to get it accomplished? Is that trend changing? Because it is difficult to do that in practice, it usually is avoided too.
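One way to make that gauge concrete (a minimal sketch; the numbers and the unit of work are made up): record how long each similar piece of work took, in order, and fit a simple trend line to see whether delivery is slowing down:

    // Minimal sketch: is the time to deliver similar pieces trending up?
    // Input: hours spent on each successive similar piece of work.

    function trendSlope(durations: number[]): number {
      const n = durations.length;
      const xs = durations.map((_, i) => i);
      const meanX = xs.reduce((s, x) => s + x, 0) / n;
      const meanY = durations.reduce((s, y) => s + y, 0) / n;
      let num = 0;
      let den = 0;
      for (let i = 0; i < n; i++) {
        num += (xs[i] - meanX) * (durations[i] - meanY);
        den += (xs[i] - meanX) ** 2;
      }
      return num / den; // least-squares slope: extra hours per successive piece
    }

    // Hypothetical history: each new screen took longer than the last.
    const hoursPerScreen = [40, 44, 52, 61, 75];
    console.log(trendSlope(hoursPerScreen)); // 8.7: roughly 9 extra hours per piece

A persistently positive slope on similar work is the piecewise signature; on a system built by extension, the slope should be flat or negative.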

The other problem is usually worse. Most systems these days are attempted rewrites of earlier systems that are often built on older technologies. Those earlier systems, though, are often themselves rewrites of preceding systems.

Going at a piecewise rewrite means tackling the easy and obvious pieces first, then gradually working up to at least the scope of the earlier project. But midway through, in order to justify the expenditure, the rewrite faces a lot of pressure to go live. It does so with a reduced scope compared to the earlier system, and many of the hardest problems are still left to be dealt with. And because it is piecewise, the work is getting harder and messier with each new release.

So, it is set up for failure even before the work has started.

The longer trend is that over generations of systems, the hard problems tend to keep getting ignored, and they grow worse. Often they get packaged up into different “systems” themselves, and so the problems repeat. Suddenly there are lots and lots of systems, each more trivial than the last, most of which are incomplete, and the lines between them are not there for sane architectural reasons, but rather as an accident of the means of construction.

It’s odd, in that the growing availability of frameworks and libraries should have meant less time and better results. But that is rarely the case. When programmers had to write more of the codebase themselves, the systems were obviously cruder, but they had to manage their time better, and they learned strong skills to help them avoid redundancies. A release cycle would be measured in years, but as that decreased to months and then to weeks, piecewise coding tended to dominate. You can pound out a new low-quality screen in a couple of weeks if you just redo everything, but that code won’t mesh well with the rest of the system, and most of that time was actually wasted. So, with constant short releases, it is far less risky to just use brute force and not worry about the longer-term problems, even if it is obvious that they are getting worse.

Realistically, the faster a system comes out of the gate initially, the more likely it is that the code will be assembled with piecewise construction. This was supposed to be better than the older, slower, more thoughtful approaches that sometimes got stuck in analysis paralysis, or even released stuff that deviated too far from being useful, but because it is just another extreme, the results aren’t any better. An endless cycle of rewriting earlier failed piecewise construction projects with more piecewise construction is just doing the same thing over and over again while expecting the results to change.

1 comment:

  1. It would have been great to include more discussion of the alternatives to piecewise construction and how you think it should be done effectively. One factor I've seen that can push developers towards costly piecewise construction is when a team is put under pressure to deliver a bunch of new related features, and in the absence of shared fundamental data structures (structs/classes/types/schema) everybody goes off on their own and implements the same things in slightly different ways. If there's good rapport on the team (and/or effective management), there will be frequent rationalisation steps where we merge our duplicated work and simplify it. If not, then we'll just carry on building stuff on top of an arbitrarily redundant hierarchy of structures, despite the cost of navigating, understanding and fixing things increasing on a daily basis. I've seen that effect just a couple of times over the years, where you can practically sense the overhead and see the decrease in productivity (alongside an increase in working hours to try to compensate). It rarely ends well.


Thanks for the Feedback!