John Boyd, The OODA Loop, and Near Real-Time Analytics

The following is used by permission and first appeared on the Teradata EMEA Blog. Teradata is a TIBCO partner and a leading provider of analytic data platforms, applications, and services to results-oriented IT and business leaders who seek smarter, faster, and more informed answers to their most essential questions.

You have probably never heard of the late John Boyd, although as Robert Coram’s recent book explains, he was—and continues to be—one of the most influential military strategists of our time.

Before his ascent to Clausewitzian greatness, Boyd was a fighter pilot. I always wanted to be a fighter pilot—until I worked out that my eyesight was too poor and my reflexes too slow to make the grade—and so I am a voracious consumer of aviation literature. But even if you didn’t grow up dreaming of strapping into a fast jet, I hope to persuade you that you should pay some attention to Boyd.

Fighter pilots were once mostly an impulsive, happy-go-lucky bunch, not much inclined to the study of their craft. Boyd was arguably the first fighter pilot of the jet age to systematically analyse and describe Air Combat Manoeuvring (ACM)—what in a bygone and perhaps more romantic age we used to call “dog-fighting.”

As significant as this early work was to the development of modern aerial warfare, had it been the sum total of Boyd’s contribution to military theory he would have only footnotes to his name, not a whole book. Instead, Boyd went on to develop the Energy-Manoeuvrability (E-M) theory of aerial combat. The application of E-M theory revolutionized the design of fighter aircraft in much the same way that Boyd’s earlier work had revolutionized ACM tactics, by emphasising “agility” over considerations of “bigger-higher-faster-farther.” The F-15 and F-16 aircraft, in particular, look and fly the way they do because of John Boyd’s work.

E-M theory quantified and explained why the USAF and USN had had such a tough time in the skies over Vietnam. (Stay with me here—we’ll leave aviation history, conflict, and grisly military metaphor behind us soon.) U.S. military aircraft had become progressively bigger, heavier, and less manoeuvrable since the late 1940s. If the F-86 Sabre that the USAF fought the Korean War with was a racing car, the F-4 Phantom and most of the other jets that the USAF and USN fought the Vietnam War with were monster trucks. In the Korean War, the USAF had (or at any rate believed that it had) an “exchange ratio”—the delicate military euphemism for “kill rate”—of 10:1. In 1967, towards the height of the war in Vietnam, the USAF exchange ratio fell below parity. As Coram tells it: “When the war finally ended, one Air Force pilot would be an ace [a fighter pilot with five confirmed aerial kills]. North Vietnam would have 16.”

As impressive an achievement as E-M theory was and is, Boyd was never one to rest on his laurels; he realized that, conversely, it could not adequately explain why the USAF had apparently scored so well in the Korean War. Beautiful and capable aircraft though the F-86 Sabre undoubtedly was, E-M theory demonstrated that—aerodynamically at least—it was slightly inferior to its principal adversary during the Korean conflict, the MiG-15. Not only that, but many of the MiG-15 drivers had been highly trained and experienced pilots. How was it, then, that the USAF and the F-86 Sabre had come to so totally dominate the skies over Korea? It was this question that led Boyd to his greatest intellectual achievement: the OODA loop. OODA is an acronym for Observe, Orient, Decide, Act.

Initially, at least, Boyd was mostly concerned with the question of how fighter pilots make decisions in combat. But his theory is increasingly being adopted as a model of how we should approach decision making in general, particularly in competitive situations. And Gartner’s Roy Schulte is advocating the OODA loop as a model for how we should think about near real-time decision making.

Boyd’s model for human decision-making is that we observe that something has happened; orient ourselves by placing that observation in the context of the other things we know about the situation; decide on a course of action; and then act on it. The process is a loop because we continue to gather new observations—and make new decisions—as the situation unfolds.

A discussion of just why this model has had such a profound impact on military strategy—and is increasingly influencing business strategy—is beyond the scope of this blog (the Coram book is excellent, and I highly recommend it). But for completeness, let’s briefly return to the skies over Korea and the fortunately only-occasionally-hot Cold War. The MiG-15’s poor visibility and heavy controls meant that its luckless pilots literally couldn’t see trouble coming—and couldn’t get out of its way quickly enough, even when they had realized that was what they needed to do. Two critical design flaws had robbed MiG-15 pilots of the ability to observe and to act, so that the F-86 pilots could get inside the decision cycles of their adversaries. And when the other guy OODAs faster than you can OODA, the game is up.

The reason I agree with Schulte that the OODA loop is an interesting model for Enterprise Analytics is that I think it may give us a way to classify the different types of near real-time analytics that are commonly employed in organizations today. We talk a lot about “low latency analytics,” “near real-time analytics,” “business activity monitoring,” and “operational intelligence,” but we use these labels interchangeably to describe use cases as varied as inbound marketing in retail finance and preventative maintenance in transportation. Can we take a leaf out of Boyd’s book and more systematically analyze and describe the field?

I am no John Boyd, but let’s try. Before we do, let’s establish a guiding principle: we want to act as quickly as possible—and remember, it may not be enough to OODA; we may need to OODA faster than the other guy—and we should push as much of the decision-making process “upstream,” to a point as close to the event and action as we can. So, if we can use a real-time system like a Complex Event Processing (CEP) platform, we should; and we will use a “downstream” system like the Data Warehouse only where we need to in order to be sure that we make the best decision that we can, or to detect the otherwise undetectable.

If we need to make a decision in times measured in microseconds or even nanoseconds, then we certainly don’t have time to involve a human being—and even the fastest Analytical RDBMS (Teradata, obviously!) can’t help. We are in real-time territory here; we will need to define rules that determine how the system should automatically respond to a range of input conditions (observations and events); and the system itself will be based on special-purpose hardware (if we need to be really quick), a CEP platform, or both. If we are smart, we will still capture the events, decisions, and outcomes in the Data Warehouse for subsequent analysis to enhance our business rules. But any analysis we do there will take place some time after the fact.
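The fully automated, rules-driven pattern can be sketched in a few lines. This is a minimal illustration of the idea, not a real CEP engine: the event fields, thresholds, and actions are all hypothetical.

```python
# Minimal sketch of rules-based automated response, in the spirit of a CEP
# platform: no human in the loop, just conditions mapped to actions.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Event:
    symbol: str
    price: float

# Each rule pairs a condition with an automatic action. The thresholds and
# action strings here are invented purely for illustration.
Rule = tuple[Callable[[Event], bool], Callable[[Event], str]]

rules: list[Rule] = [
    (lambda e: e.price > 100.0, lambda e: f"SELL {e.symbol}"),
    (lambda e: e.price < 20.0,  lambda e: f"BUY {e.symbol}"),
]

def respond(event: Event) -> list[str]:
    """Apply every matching rule and return the actions taken."""
    return [act(event) for cond, act in rules if cond(event)]

actions = respond(Event("ACME", 120.0))
```

In a real deployment, the events, decisions, and outcomes produced here would also be logged to the Data Warehouse so the rules themselves can be refined after the fact.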

At the other end of the spectrum, we have situations where we want a human actor to make the final decision. Possibly the prototypical example is the too-many-passengers-not-enough-seats-on-the-plane use case. If there are six of us at the gate and only five seats on the plane, the system should certainly make a recommendation about who gets left behind—but the gate agent should probably make the final decision, based at least in part on intangibles and information likely to be available only to him or her. If you have more frequent flyer miles than me, but I am diabetic and am running low on insulin, then my medical condition should probably trump your gold card. We are now in near real-time territory; the system recommendation will need to be based on analysis of who has the most frequent flyer miles, who has the highest projected lifetime value, which of us has already and/or most recently been failed by the airline, etc. This means that the system will be dependent on the detailed historical data found in the Integrated Data Warehouse, even though the event that we are reacting to may well be detected externally. And because even the ablest humans are generally slower than a well-engineered computer system, it is likely that the human actor will be the rate-limiting factor, rather than the technology.

And then we have scenarios that fall between these two extremes. We have use cases—for example, web-site personalization—where both the observation (Customer XYZ is returning to the website) and the orientation (Customer XYZ has recently been browsing holidays to Maui) can take place upstream (and so should take place upstream, according to our guiding principle), but where we may need to fetch a pre-computed next best action (perhaps based on an analysis of detailed historical data about Customer XYZ’s previous browsing and purchase behavior) from the data warehouse at run time. Let’s call this Type 1 Business Activity Monitoring and observe in passing that it may be the CEP system that “closes the loop,” the Data Warehouse, or both—through, for example, the trigger-based invocation of a Stored Procedure (SP) or External User-Defined Function (UDF).
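The Type 1 pattern can be sketched as follows. Assume a nightly batch job in the warehouse has already scored each customer; at run time, the upstream system only needs a fast lookup. The customer IDs, the action names, and the dict standing in for a warehouse query are all illustrative assumptions.

```python
# Sketch of Type 1 BAM: observation and orientation happen upstream; only
# the pre-computed "next best action" is fetched at run time.

# Stand-in for a table of scores computed in the Data Warehouse overnight.
# A real system would query the warehouse, not an in-memory dict.
next_best_action = {
    "XYZ": "offer_maui_hotel_bundle",   # derived from browsing history
    "ABC": "offer_loyalty_upgrade",
}

def on_site_visit(customer_id: str) -> str:
    # The event (customer returns) is detected upstream; this lookup is the
    # only step that depends on the warehouse-derived data.
    return next_best_action.get(customer_id, "show_default_homepage")
```

The key design point is that the expensive analysis of detailed history runs offline, so the run-time path stays fast enough for a page load.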

We have situations where the observation can take place upstream, but where proper orientation will require contextual information that can only be found in the Data Warehouse. For example, if I go overdrawn on my current account, this clearly represents a business event that my bank may need to respond to. But to understand whether this is a significant event, my bank should probably understand whether I often go overdrawn at this point in the month; whether I generally repay the debt promptly; and what my current debt-to-equity ratio is across all of my holdings. Let’s call this Type 2 Business Activity Monitoring and note that “closing the loop” may be fully automated and managed by the CEP and/or Data Warehouse systems, or semi-automated so that the machines look after the straightforward decisions, while the edge cases are escalated to a human with more subtle powers of discretion. (Maybe I do generally repay promptly, but my debt-to-equity ratio has been worsening for some time and I recently stopped paying off my credit card balance in full each month.) Clearly, decision latency will increase whenever we get a human actor involved, so the extent to which this is possible and/or desirable is very application specific. When the clock is ticking, as it is when I am standing at an ATM, only a fully-automated decision will do; if I am applying online for a car loan, we may have a little longer to make and communicate a decision, and human intervention and oversight becomes possible…at least for some of the decisions, some of the time.
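The overdraft example can be sketched as a simple escalation policy: the machine decides the clear-cut cases and routes the edge cases to a human. The context fields and thresholds below are invented for illustration; they are not a real credit policy, and in practice the context would be assembled from a warehouse query rather than passed in as a dict.

```python
# Sketch of Type 2 BAM: the overdraft event arrives from upstream, but
# orientation needs history that lives in the Data Warehouse.

def handle_overdraft(context: dict) -> str:
    """Decide automatically where history is clear-cut; escalate edge cases."""
    # Routine case: an occasional overdraft from a prompt repayer.
    if context["overdrafts_last_12m"] <= 2 and context["repays_promptly"]:
        return "auto_approve_buffer"
    # Subtle case: a worsening trend needs human judgement.
    if context["debt_trend"] == "worsening":
        return "escalate_to_human"
    return "send_warning_letter"

decision = handle_overdraft({
    "overdrafts_last_12m": 1,
    "repays_promptly": True,
    "debt_trend": "stable",
})
```

Note how the latency trade-off from the text shows up directly: the first two branches can respond in milliseconds, while the escalation branch deliberately accepts a slower, human-paced loop.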

Finally, we have situations where even the detection of the event will require us to analyze temporal data, integrated across channels. For example: if the significant event that we care about is when B, C, and D don’t happen 10 minutes after A; or if we want to understand whether a high/low sensor reading is really anomalous, based on pattern matching of similar incidents recorded in the detailed historical data in the Data Warehouse. Lots of preventative maintenance applications work this way and closing the loop frequently—albeit not necessarily—involves a human actor balancing the different risks inherent in the situation. Let’s call this Type 3 Business Activity Monitoring.
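The “B, C, and D don’t happen within 10 minutes of A” example above can be sketched as a check over a timestamped event log. The event names and the 10-minute window come from the text; the log format (minutes as floats paired with event names) is a hypothetical simplification of what would really be a temporal query against the warehouse.

```python
# Sketch of Type 3 BAM: the significant "event" is an absence -- expected
# follow-up events failing to arrive within a window after event A.

WINDOW_MINUTES = 10.0

def missing_followups(log: list[tuple[float, str]], a_time: float) -> set[str]:
    """Return which of B, C, D did NOT occur within the window after A."""
    expected = {"B", "C", "D"}
    seen = {name for t, name in log
            if a_time < t <= a_time + WINDOW_MINUTES and name in expected}
    return expected - seen

# A at t=0; B and D follow in time, but C arrives too late and is flagged.
log = [(0.0, "A"), (3.0, "B"), (12.0, "C"), (7.5, "D")]
missing = missing_followups(log, a_time=0.0)
```

Detecting a non-event like this is exactly the kind of check that needs integrated, temporal data: no single upstream message says “C is missing,” so the pattern only emerges from the history as a whole.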

So, applying the OODA loop to low-latency analytics—and associated Enterprise Architecture—gives us five different styles of analysis and associated design patterns, as illustrated in the figures below.

What I think is most interesting about using the OODA loop to classify near real-time use cases in this way is that it reveals just how important it is to integrate the CEP and the Data Warehouse platforms. In isolation, neither technology can address more than two of these five patterns; combined solutions can cover the entire problem space.

All of which is why I am particularly excited that we have recently relaunched our partnership with TIBCO. TIBCO is, of course, about much more than just Complex Event Processing; in particular, the Spotfire analytics and visualization tools are excellent and well integrated with both Teradata and Teradata-Aster. But the combination of the TIBCO analytic and CEP technology and the Teradata RDBMS technology is a very powerful one for near real-time analytics. I hope that you don’t plan to shoot your competition down in flames—but if you can OODA faster than they can OODA, you should nevertheless have a distinct competitive advantage.

6 Comments

Steve

Ahh, but then there is the concept of ruse or feint, which after showing your hand makes you start back at square one in the cycle; another great point is the variable. Going through the process too quickly subjects one to risk. Risk is misinterpretation. Risk is just plain getting it wrong. Do that and you lose every time. The problem with these six sigma types is that they see the world in the form of mathematical equations and strive relentlessly to eliminate what they call error, which is that which is incalculable or variable. Embracing the variable to one’s advantage is the difference between success and failure. Take the example of the 300 brave Spartans taking on the great Persian army. Terrain was the one sole variable, and by all accounts those 300 inflicted irreversible loss on their opponents. OODA? Take it with a grain of salt. It’s conceptual and, although it has applicability in ground operations, it has been and continues to be overemphasized and incorrectly applied.

Quote: “we should push as much of the decision making process ‘upstream’ to a point as close to the event and action as we can.”

Reflection and caution have their place too. Push deciding and acting too far upstream and you may be responding to mere noise or to an event contrived by a clever opponent.

For instance, on D-Day, there were thousands of Allied ships bound for Normandy. We could not hide that from German radar. But we did contrive bogus radar signals that were so carefully timed, they looked like a large fleet heading for Calais. That fake data helped keep German tanks away from the Normandy beaches until it was too late.