Answering “How Long Will This Take?” and Other Impossible Questions

Whenever we decide where to focus, work out the best way to implement something, or evaluate costs on a project, we hear the same questions: “How much will this cost?” “When will it be done?” Or even technical ones, like “How many records can this process per second?”

We need to make decisions and decide where to invest, so these are critical questions. But that doesn’t make them easy. It seems like there are always too many variables to even begin finding an answer.

A Counter-Question

One technique that’s helped me find some traction is to turn the question around. Rather than asking, “How long will it take?” I instead ask, “If it took two months, would that be too long?”

This ends up being a useful prompt for several reasons. First, having a constraint can be helpful. It keeps the discussion grounded: rather than imagining all the amazing things you could build, you immediately refocus on things that fit within the proposed time or budget. You can now talk about whether or not those things will work.

Second, picking a reference point makes it much easier to make a go/no-go decision. Almost any cost or schedule you can imagine will fall into one of three categories: too big, totally doable, or maybe. This is a vast improvement over, “I have no idea.”
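To make the three buckets concrete, here is a minimal sketch. The function name, the use of weeks as the unit, and the idea of carrying a rough low/high range are all assumptions made for illustration, not part of any real tool.

```python
# Hypothetical helper: bucket a rough estimate range against a reference point.
def triage(low_weeks, high_weeks, limit_weeks):
    """Return which of the three categories a rough range falls into."""
    if high_weeks <= limit_weeks:
        return "totally doable"   # even the pessimistic guess fits
    if low_weeks > limit_weeks:
        return "too big"          # even the optimistic guess does not fit
    return "maybe"                # the range straddles the reference point

# "If it took two months, would that be too long?" (about 8 weeks)
print(triage(2, 4, 8))
print(triage(6, 12, 8))
print(triage(20, 40, 8))
```

The point is not the code itself but that any guess, however rough, becomes comparable once there is a reference point to compare it against.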

A Real-Life Example

Here’s a recent example from my current project: We had a slow query and had to decide how much to invest in fixing it. There were many possibilities, ranging from adding an index all the way to things like Bloom filters, CDNs, and distributed caches. We wondered how far we should go.

At this point in the project, we don’t really know what the usage pattern is, but we were still able to make useful estimates. For example, the customer’s network has roughly 50,000 users. We know they won’t all be using the new software, so we cut that to “maybe half.” We also don’t know how often they need to use it, but we estimated maybe 50 postings per user per year.

One thing was certain: These numbers are wrong. We don’t know whether we’re high or low, but we know they’re wrong. That doesn’t mean they’re not useful, though. Roughly 25,000 users at 50 postings each gives us data growth on the order of a million records per year. Or maybe two million. Or only half a million.

We also know it’s much less than 100 million. And much more than just 100. This helped bound the problem so we could discuss what sort of investment makes sense today.
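The arithmetic behind that estimate fits in a few lines. This is only a sketch: the inputs are the rough guesses from above, the variable and function names are made up for illustration, and the factor-of-two error band is an assumption chosen to roughly match the “two million, or only half a million” range.

```python
# Back-of-envelope estimate of yearly record growth.
# Every input is a guess; the 2x error band is an added assumption.

NETWORK_USERS = 50_000      # roughly, the size of the customer's network
ADOPTION = 0.5              # "maybe half" will use the new software
POSTINGS_PER_USER = 50      # guessed postings per user per year

def records_per_year(users, adoption, postings_per_user):
    """Multiply the guesses together to get records created per year."""
    return users * adoption * postings_per_user

estimate = records_per_year(NETWORK_USERS, ADOPTION, POSTINGS_PER_USER)
low, high = estimate / 2, estimate * 2   # assumed: off by up to 2x either way

print(f"estimate: {estimate:,.0f} records/year")
print(f"range:    {low:,.0f} to {high:,.0f}")

# Sanity-check against the outer bounds: far more than 100, far less than 100 million.
assert 100 < low and high < 100_000_000
```

With these inputs the estimate comes out to 1.25 million records per year, with a range of 625,000 to 2.5 million. Changing any one guess by a factor of two still leaves the answer comfortably inside the outer bounds, which is what makes a wrong number useful.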

Based on these numbers, I know the data will grow quickly enough to need some attention soon after deployment, but it’ll be years before it outgrows “normal” tooling.

There are always questions you aren’t going to be able to answer. But even if you have many unknown variables, plugging in a made-up number can help you avoid analysis paralysis. Having some idea of the very rough bounds of your problem can help you start to make progress toward your goal.