Forecasting: do we really need faster horses?

This week an RFP landed in my mailbox, the likes of which I have seen countless times before. It was such a clear example of both the flaws in the typical RFP process and the flaws of the traditional statistical forecasting engines that I decided to write about it. It was an RFP for a demand forecasting system.

For those who do not know, RFP is the acronym for Request For Proposal. In the selection of an enterprise software system, a company will go through an extensive process of determining what their needs are and then selecting a vendor that can address those needs in a software solution.

There is usually a very clear boundary between those two steps: the RFP is the output of the first and kick-starts the second when it is sent to a broad range of vendors in the form of a questionnaire. The problem with the RFP approach is twofold: both the form and the process are flawed. The form is a very long list of feature-functions to which the vendor must respond item by item, stating whether their system provides each one or not.

In the best case, the process does not involve vendors until after the feature-function list is completed; in the worst case it allows one vendor to stack the deck in their favor by setting requirements that only they can provide in the exact form stated.

In this latest RFP, the prospect is trying to replace SAP APO since its forecast accuracy is so awful that it is hurting their business. The astonishing thing is that their RFP reads like a technical description of APO. They are asking for the same flawed functionality they are trying to get away from, just branded differently! Immediately, a famous quote, frequently attributed to Henry Ford, came to mind:

"If I had asked people what they wanted, they would have said faster horses."

Until people knew of the concept of an automobile, a faster horse and carriage would have been the best they could conceive to aim for. There is a time and place to ignore the customer and build what you think is best.

That time is when you are introducing a revolutionary game-changing innovation, such as the automobile (or in Henry Ford's case, a mass-produced automobile accessible to the mainstream consumer) or maybe an iPad.

The demand forecasting system market is by now very mature. Most forecasting systems have grown over the decades and plugged functional gaps that had previously differentiated their competitors, to the point where there is no real differentiator left. In my honest opinion, these systems became commodities perhaps as far back as two decades ago.

Even so, most companies will readily accept that the forecast accuracy these systems provide them is sorely inadequate. And most academics are quick to point out that forecasting has a theoretical ceiling above which further accuracy improvement is not possible. This is tantamount to claiming that you cannot travel faster because horses simply cannot run faster than they do.

What is required to advance the field of demand forecasting is to think outside the box.

And what is required to get a more accurate demand forecast for your business is to open your selection process to innovation.

On the last count, the RFP is certainly not the vehicle to do so. It is a case of the customer requiring the car manufacturer to use a special kind of horseshoe, when clearly the car has surpassed the limitations of horseshoes. If the car manufacturer answers that they do not need horseshoes, they are disqualified from the selection process. The customer has just shot themselves in the foot by excluding the very provider that could solve their problems better and more economically. This is unfortunately the current state of demand forecasting systems, and has been for well over a decade. The only real improvement to be expected is through true innovation.

On the first count, what is the box in demand forecasting that constrains everyone's thinking? What are the self-imposed bonds that nobody seems willing or able to break? Let me list out the four main ones:

1. Restricting the input to the historical demand series itself, in a single granularity and of a single type.

2. Throwing ever more forecasting algorithms at the problem, then trying to fit the best one to each demand series.

3. Requiring manual adjustments to demand quantities to improve the forecast.

4. Confusing forecasting with predicting: outputting exact numbers rather than ranges of values, each with a probability of occurring.

Point 1 is the root cause of most issues in forecasting systems. Yes, the academics are correct that IF this is the approach THEN you will hit a theoretical ceiling of maximum achievable accuracy pretty quickly. Except they never state the "IF" part, since they are themselves unaware that their thinking is limited by this assumption. It should be no surprise that demand modeling engines which are not based on this paradigm consistently and significantly outperform the theoretical ceiling. It is not just that the forecasting algorithm is different; the INPUT is different. To break through the glass ceiling, different inputs are needed, in different granularities and of different types.
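To make the contrast concrete, here is a minimal Python sketch, with entirely made-up data and coefficients, of the idea that richer inputs rather than a cleverer algorithm break the ceiling: a history-only projection versus a simple least-squares fit on demand drivers such as price and promotions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (invented) weekly data: demand here is driven by price and
# promotions, not just by its own history.
weeks = 104
price = rng.uniform(8.0, 12.0, weeks)
promo = rng.integers(0, 2, weeks)
demand = 200 - 12 * price + 45 * promo + rng.normal(0, 5, weeks)

train, test = slice(0, 90), slice(90, weeks)

# History-only "forecast": project the training-period average forward.
naive = np.full(weeks - 90, demand[train].mean())

# Driver-based forecast: least-squares fit on price and promotion flags.
X = np.column_stack([np.ones(weeks), price, promo])
coef, *_ = np.linalg.lstsq(X[train], demand[train], rcond=None)
driven = X[test] @ coef

mae = lambda f: np.mean(np.abs(f - demand[test]))
print(f"history-only MAE: {mae(naive):.1f}")
print(f"driver-based MAE: {mae(driven):.1f}")
```

With the drivers in hand, the error drops to roughly the noise floor of the (made-up) data, which no amount of algorithm-swapping on the history alone could reach.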

Point 2 is a case of battling symptoms. A vendor or academic community realizes that an algorithm is flawed, so comes up with a new one. This new one is sometimes better, sometimes worse than the original, and neither is very good in quite a large range of scenarios. So more and more algorithms are created. Surprisingly, none of these new algorithms are fundamentally different from their flawed predecessors. Yet no one acknowledges the elephant in the room and questions whether the general approach, and its basic premises, are valid. To make things worse, they then use a fundamentally flawed approach (known as the "expert system" or "expert selection") to pick between outputs of fundamentally flawed forecasting algorithms.
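For illustration, the "best-fit" selection loop criticized here can be sketched in a few lines of Python; the candidate models and the back-testing rule are deliberately simplistic stand-ins, not any vendor's actual implementation.

```python
import numpy as np

# Deliberately simplistic stand-in "algorithms"; real engines carry dozens.
def moving_average(history, horizon, window=4):
    return np.full(horizon, np.mean(history[-window:]))

def exponential_smoothing(history, horizon, alpha=0.3):
    level = history[0]
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return np.full(horizon, level)

def naive_last(history, horizon):
    return np.full(horizon, history[-1])

CANDIDATES = [moving_average, exponential_smoothing, naive_last]

def best_fit_forecast(series, horizon=4, holdout=8):
    """Back-test every candidate on a holdout window and keep the winner,
    the 'expert selection' loop described in the text above."""
    train, test = series[:-holdout], series[-holdout:]
    errors = {f: np.mean(np.abs(f(train, holdout) - test)) for f in CANDIDATES}
    winner = min(errors, key=errors.get)
    return winner.__name__, winner(series, horizon)

rng = np.random.default_rng(1)
series = 100 + rng.normal(0, 10, 52)  # made-up weekly demand
name, forecast = best_fit_forecast(series)
print(name, forecast.round(1))
```

However many candidates are added to the list, every one of them consumes the same single demand history, which is precisely why the loop plateaus.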

Point 3 is another Band-Aid. The automatically generated forecast is not accurate enough, so subject matter experts need to make adjustments to the forecast. There are two main problems with this approach (other than, again, ignoring the root cause). The first is that this process introduces significant bias into the forecast. The automated forecast may have been very wrong, but at least it will usually have about zero bias. Second, this approach never allows the automated forecast to get fundamentally better, since it can never learn from past mistakes. It may apply both historic and future adjustments to a baseline history and a baseline forecast, but it has no means of determining why those adjustments were needed, and so it can never improve its predictions of the future. This means that forecast accuracy will not improve incrementally over time, and it means forecasting will always be a highly manual, and thus non-scalable, process.
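The bias argument can be made concrete with a toy calculation (all numbers invented for illustration): a statistical forecast that misses by a lot but in both directions, versus the same forecast nudged systematically upward by a human.

```python
import numpy as np

# Invented actuals and forecasts for ten periods: a statistical forecast that
# scatters around the truth, and a human-adjusted one nudged upward, e.g. by
# sales optimism.
actuals     = np.array([100, 120,  90, 110, 105,  95, 130,  85, 115, 100])
statistical = np.array([115, 100, 105,  95, 120,  80, 115, 100, 100, 115])
adjusted    = statistical + 12  # a systematic upward nudge

def mae(f):  # average size of the miss, regardless of direction
    return np.mean(np.abs(f - actuals))

def bias(f):  # signed average miss; near zero means misses cancel out
    return np.mean(f - actuals)

print(f"statistical: MAE={mae(statistical):.1f}, bias={bias(statistical):+.1f}")
print(f"adjusted:    MAE={mae(adjusted):.1f}, bias={bias(adjusted):+.1f}")
```

In this toy case the adjustments leave the error no better while the bias balloons, which is exactly the trade the adjusted forecast quietly makes.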

Point 4 has many negative consequences. The most important, in my opinion, is that exact numbers give a false reassurance about the reliability of the forecast. This is somewhat recognized in conversations and meetings, where the forecast error may be discussed, but the actual planning calculations performed by planning systems take the single-value forecast as the absolute truth, with total disregard for the uncertainty around it. These consuming systems, even when they generate feasible or even optimal plans on paper, in reality produce plans that are not even executable by the time they are published. Of course, the blame for that does not lie solely with the forecasting system providing garbage input, since most planning systems are not capable of accepting stochastic input even if it were made available. But getting a real forecast, not a mere prediction, would be a very important first step.

The combined effect is that the traditional multi-algorithm forecasting engines generate horrible forecast accuracy and require large manual effort to achieve even that. As a reference, against APO our demand modeling engine usually reduces forecast error by at least 50%, and frequently by 80% to 90%. And although APO seems to be the worst, it is certainly not in a league of its own: the other multi-algorithm engines appear to be only marginally better, judging by how much improvement can be made. For some reason, in situations where humans heavily adjust the forecast, the gap seems to be larger, not smaller. I am not sure whether this is because more adjustments occur in environments where forecasting is more difficult, or because humans focus only on a few key items and neglect all the others; we have not measured the forecast value-add consistently enough in those benchmarks to be conclusive. Either way, the huge gap cannot be explained by simply being better at doing the same thing. It can only be explained by doing it completely differently, which in turn proves the existence of a fundamental flaw inherent in the multi-algorithm approach.
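What a real forecast, as opposed to a mere prediction, might look like can be sketched with a toy Python example (the demand distribution is invented): instead of one exact number, the forecast is a set of outcomes with probabilities, from which a downstream calculation can read off, for instance, a 95% service-level stocking quantity.

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented example: 10,000 plausible demand outcomes for one item and period,
# drawn from a skewed distribution the way real demand often behaves.
outcomes = rng.gamma(shape=2.0, scale=50.0, size=10_000)

point_forecast = outcomes.mean()  # the single "exact number"
quantiles = np.quantile(outcomes, [0.05, 0.50, 0.95])

print(f"point forecast:       {point_forecast:.0f}")
print(f"5% / 50% / 95% bands: {quantiles.round(0)}")

# A downstream decision made from the distribution instead of the point:
# stock to a 95% service level rather than to the mean.
service_level_stock = np.quantile(outcomes, 0.95)
print(f"stock for 95% service: {service_level_stock:.0f} "
      f"(vs {point_forecast:.0f} if the point value were taken as truth)")
```

The skew is the whole story: a planning run fed only the mean would systematically understock, while the distribution makes the risk trade-off explicit.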

Circling back to the RFP that triggered this piece: out of some 200 questions, almost 150 dealt with multiple algorithms and the enormous manual effort those algorithms impose on planners. How does one respond that you truly do not need all those manual functionalities, without being disqualified? Shouldn't the customer instead be determining how much value they can drive with a new system, rather than whether you can mirror the processes forced onto them by the limitations of the old legacy systems?

Instead, the customer is left with a list favoring an older inferior system. This is neither in the best interest of the customer, nor of the industry as a whole, since it stymies innovation.