The stock market as a single, very big piece of multithreaded software

Ars takes a look at the recent SEC post mortem on the May 6 Flash Crash, and …

On May 6, 2010, I got a phone call from a good friend of mine who day trades. "Are you watching the market?" he asked. "I've been keeping an eye on it," I said. "It's down about 200 points." I had Bloomberg's site open in a browser window on one of my monitors, and it looked to be a fairly typical down day. "No," he replied, "it's down like 1,000. I wouldn't be calling you if it were down 200 points."

My friend knows that I follow high-frequency trading (HFT), and it was immediately apparent to him and everyone in the market that day that the machines were doing something really screwy, hence the phone call. But by the time I learned that all hell had just broken loose, the drama was mostly over. In the space of a few minutes, the complex, highly interconnected, tightly coupled computer system that we quaintly refer to as a "market" had crashed and then recovered. The Flash Crash (of 2:45pm) wasn't quite a kernel panic, but the SEC's recently released postmortem inadvertently suggests that it may as well have been.

My former academic advisor, theologian Harvey Cox, once famously argued that, to a student of religion like himself, the market looked an awful lot like a deity. In that vein, I'll suggest that, to this student of computer science, the market as described in the SEC report looks an awful lot like a giant, multithreaded software application. And on May 6, the market did what every piece of multithreaded software eventually does in response to just the wrong mix of execution conditions and inputs: it crashed.

The end of auditing: the data deluge makes the SEC obsolete

The first lesson of the SEC report is "be careful what you wish for."

The computerization of the market was supposed to bring about a revolution in transparency. Unlike the market of an earlier era, where humans executed trades by talking to (and shouting at) one another, the electronic communication networks (ECNs) that emerged in the late 70s logged every detail of every trade for later auditing. No more "he said, she said" when resolving a dispute or ferreting out fraud—just go to the tape. But then came the flood.

After a solid decade of moving almost all trading activity onto electronic systems (the NYSE floor is just there for show at this point), the market generates so much data that it's nearly impossible for a mere governmental agency like the SEC to analyze it. There are literally tens of thousands of quotes per second in hundreds of thousands of symbols across multiple electronic exchanges; the SEC would need the brains and computing power of the NSA to even begin to do a credible job of crunching this many numbers for a post mortem.

The SEC acknowledges in the report the massive problem that this amount of data poses. While the agency did what number crunching it could, it also had to rely heavily on interviews with market participants. These participants each have their own uncrunchably huge mountains of data, which have led them to their own conclusions about what went on between about 2:30 and 3:00pm that day.

Again, the shift to ECNs was supposed to prevent exactly this problem by giving regulators complete transparency into every last detail of every trade. (And again, be careful what you wish for.)

The amount of data isn't just a problem for regulators. Much of the report details how the systems of the market participants were themselves overwhelmed in real time by the sudden surge of digital information. Processing began to slow, queues filled, backlogs developed, and machines were eventually pulled offline as the humans intervened and tried to sort out possible data integrity issues.

Beyond the challenges of reconstructing events, the traders also use some subset of the data firehose that the market's machines throw off today as input to train the algorithms that will run the market tomorrow. So at some point, we'll wake up and realize that it's really turtles, er, machines all the way down. Put that in your bong and smoke it, Keanu.

Don't blow on it or it might fall over

The second lesson of the SEC report is that the market is fairly fragile, which is about what you'd expect from a giant, multithreaded computer that has been brought online, piecemeal, with no oversight. The wrong input at the wrong moment could trigger a race condition, or a deadlock, a livelock, or some other concurrency hazard that brings it all down.
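For readers who haven't met the first of those hazards, here is a minimal Python sketch of a race condition in its simplest form, the lost update. The interleaving is written out by hand rather than left to a thread scheduler, so the bad outcome is reproducible. The `position` variable and the order sizes are invented for illustration and have nothing to do with any real trading system.

```python
def lost_update():
    """Replay one unlucky interleaving of two workers sharing a counter."""
    shared = {"position": 0}

    # Both workers read the shared value before either one writes it back.
    seen_by_a = shared["position"]
    seen_by_b = shared["position"]

    # Each worker adds its own 100 contracts to the value it saw.
    shared["position"] = seen_by_a + 100
    shared["position"] = seen_by_b + 100  # overwrites worker A's update

    return shared["position"]


print(lost_update())  # 100, although 200 contracts were added in total
```

Each worker behaves correctly on its own; the error only exists in the interaction between them.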

The report fingers a large derivatives trade by an unnamed fund (which we now know is Waddell & Reed) as the spark that caused the conflagration. A clumsily executed selling algorithm began dumping a type of futures contract called the E-mini into an already volatile market, and a group of HFTs absorbed the initial wave of sell orders and began passing them back and forth rapidly, buying and selling the contracts amongst themselves without any one HFT ever intending to build a net long or short position.

(Now, to explain why a bunch of algorithms would execute a burst of trades with each other that were intended to have no real net effect on anyone's position would require an explanation of liquidity rebates and market making. That's too deep in the weeds for this post, but here's my nutshell explanation: the market-making HFTs get paid by exchanges and dark pools to provide liquidity in certain stocks, but what many of them actually provide are mere quotes—tens of thousands of quotes per second, without any real money behind them. The only real purpose of this is to give the appearance of liquidity in a stock so that the algo can get paid for doing its tiny part to keep the market liquid and orderly.)

So when the HFTs got hold of the futures contracts that the Waddell & Reed algorithm was selling, they would've first looked for legit buyers to flip them to. And, not finding any legit buyers, they just started rapidly buying and selling amongst themselves to at least pick up some rebate money. The poor, stupid Waddell & Reed sell algorithm saw all of this buy/sell activity suddenly spring up, and it mistook the spike for real, active interest in its giant batch of futures contracts. So it (logically) capitalized on that interest by increasing the pace of its sales; it wanted to do all its selling while the market was (seemingly) hot.

But the sell algorithm was mistaken—that HFT-induced spike in trading activity did not reflect any sort of genuine market appetite for the futures contracts it wanted to sell. It was just the machines playing "pass the potato," almost certainly for the purpose of generating tiny rebate profits.
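To make that feedback loop concrete, here is a toy Python sketch. It assumes a seller that sizes each order as a fixed fraction of the volume it saw in the previous interval, and a pool of HFTs that re-trade every absorbed contract several times over. The participation rate, the churn multiplier, the baseline volume, and all the names are hypothetical; none of them come from the SEC report.

```python
def simulate_feedback(steps, participation=0.1, churn=8, base_volume=1000.0):
    """Toy model: a seller targets a fraction of last interval's volume,
    while HFTs pass each absorbed contract around `churn` times."""
    observed_volume = base_volume  # volume the seller saw last interval
    sell_sizes = []
    for _ in range(steps):
        # The seller keys only on volume, not on price or on who is trading.
        sell = participation * observed_volume
        sell_sizes.append(sell)
        # HFTs absorb the sale, then re-trade it among themselves.
        hot_potato = churn * sell
        # Next interval's observed volume: genuine flow + the sale + the churn.
        observed_volume = base_volume + sell + hot_potato
    return sell_sizes


with_churn = simulate_feedback(200)
without_churn = simulate_feedback(200, churn=0)
print(round(with_churn[-1]), round(without_churn[-1]))
```

With these made-up parameters, the seller's order size climbs from 100 toward 1,000 contracts per interval, against roughly 111 when churn is set to zero. In this model, once participation * (1 + churn) reaches 1, the loop never settles at all.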

The SEC describes these shenanigans in two different, widely quoted sections:

Still lacking sufficient demand from fundamental buyers or cross-market arbitrageurs, HFTs began to quickly buy and then resell contracts to each other—generating a “hot-potato” volume effect as the same positions were rapidly passed back and forth. Between 2:45:13 and 2:45:27, HFTs traded over 27,000 contracts, which accounted for about 49 percent of the total trading volume, while buying only about 200 additional contracts net...

...Furthermore, 16 (out of over 15,000) trading accounts that were classified as HFTs traded over 1,455,000 contracts on May 6, which comprised almost a third of the total daily trading volume. Yet, net holdings of HFTs fluctuated around zero so rapidly that they rarely held more than 3,000 contracts long or short on that day. Moreover, compared to the three days prior to May 6, there was an unusually high level of “hot potato” trading volume—due to repeated buying and selling of contracts—among the HFTs, especially during the period between 2:41 p.m. and 2:45 p.m. Specifically, between 2:45:13 and 2:45:27, HFTs traded over 27,000 contracts, which accounted for about 49 percent of the total trading volume, while buying only about 200 additional contracts net.
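The figures in those passages are worth a moment of arithmetic. The short Python sketch below uses only the numbers quoted above and treats "over 27,000" and "about 49 percent" as exact, so the results are rough estimates rather than figures from the report.

```python
hft_gross = 27_000        # contracts traded by HFTs, 2:45:13 to 2:45:27
hft_share = 0.49          # HFT fraction of total volume in that window
hft_net = 200             # net contracts bought by HFTs in that window
window_seconds = 27 - 13  # length of the window in seconds

total_volume = hft_gross / hft_share   # implied volume from all participants
net_to_gross = hft_net / hft_gross     # how much of the churn "stuck"
hft_rate = hft_gross / window_seconds  # HFT contracts traded per second

print(f"implied total volume: {total_volume:,.0f} contracts")
print(f"net as share of gross: {net_to_gross:.2%}")
print(f"HFT trading rate: {hft_rate:,.0f} contracts per second")
```

In other words, for every 1,000 contracts the HFTs traded in those 14 seconds, only about seven represented any change in what they actually held.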

At some point, the HFTs realized that they had built up a net long position in the E-mini, so they started dumping it. This HFT selling combined with the still-ongoing selling of the Waddell & Reed algo to create an exceptional amount of sell pressure in that contract. With all that selling and no buyers to absorb it, the contract's price began dropping like a stone.

Meanwhile, because the derivatives market is linked to the underlying equities market by any number of instruments (e.g., ETFs) and strategies (e.g., cross-market arbitrage, hedging), the selling pressure in the E-mini quickly translated into selling pressure in the equity indices, and that's when the real party started.

The algorithms that buy and sell stocks in the equities market were using the previous few minutes' action in the derivatives market as inputs to guide their trading, and when they registered the giant sell-off described above, some of them had safety controls that told them to stop trading so that the humans could take a look to see what was wrong. And as these algorithms pulled out of the market, the market got more illiquid and prices dropped faster.

Other algorithms saw the precipitous price declines across the derivative and equity markets, and assumed that some type of cataclysmic event had happened; so they, too, quit trading. Still others couldn't handle the sudden surge in market data coming their way, so they checked out. Others didn't trust the data they were getting, so they stopped trading until the apparent technical errors could be sorted. The net effect was that all across the market, the machines began either selling or shutting off, all at the same time. With all the buyers suddenly gone from the market, prices immediately tanked, all the way to zero in some cases.
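The withdrawal cascade can be sketched as a toy Python model. It assumes each buying algorithm has its own pause threshold (a percentage drop from the starting price), that selling pressure is constant, and that price impact per step grows as the number of active buyers shrinks. The thresholds, the impact rule, and the function name are all hypothetical and are not taken from the SEC report.

```python
def simulate_withdrawal(thresholds, sell_pressure=1.0, start_price=100.0, steps=20):
    """Toy cascade: each buyer pauses once the drop from the start price
    reaches its own threshold; fewer buyers means a bigger drop per step."""
    price = start_price
    active = list(thresholds)  # per-algorithm pause thresholds, in percent
    history = []               # (price, active buyers) after each step
    for _ in range(steps):
        if not active:
            # Nobody is left to bid, so the price in this model falls to zero.
            history.append((0.0, 0))
            break
        # Constant selling moves the price more as liquidity thins out.
        price = max(price - sell_pressure * len(thresholds) / len(active), 0.0)
        drop_pct = 100.0 * (start_price - price) / start_price
        # Any buyer whose threshold has been reached steps aside.
        active = [t for t in active if t > drop_pct]
        history.append((price, len(active)))
    return history


for price, buyers in simulate_withdrawal([2, 3, 4, 6, 10]):
    print(f"price={price:7.2f}  active buyers={buyers}")
```

Every individual pause is a sensible reaction to the drop that algorithm has just seen, yet each one makes the next drop larger, which is the locally rational, globally irrational pattern in miniature.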

Once everyone realized that the world hadn't ended, and that the massive sell-off was globally irrational (despite how locally rational it had seemed to the individual algorithms), they jumped back into the pool as quickly as they had gotten out. And the market magically levitated right back up.

A giant, multithreaded computer

The final, and perhaps most important, lesson of the SEC report, at least for a computer person like myself, is that the market is behaving as one giant multithreaded software application.

Now, to be a single multithreaded app, as opposed to an unrelated collection of multithreaded apps, the different threads must somehow interact with one another. In other words, the threads must share and jointly modify some kind of state.

What state do the various apps and algorithms that run on Wall Street's machines share? At the very least, every part of the market shares the quote feed, and some parts are even more tightly coupled than that. But let's focus on the quotes.

The price of, say, AAPL at any given moment is a numerical value that represents the output of one set of concurrently running processes, and it also acts as the input for another set of processes. AAPL, then, is one of many hundreds of thousands of global variables that the market-as-software uses for message-passing among its billions of simultaneously running threads. Does it really matter that those threads are running on separate machines at different institutions?
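One way to picture that claim is a toy Python sketch in which a single quote is the shared state and every trader is a thread that reads it and then moves it. The `SharedQuote` class, the starting price, and the trade sizes are invented for illustration. The lock is there to keep the toy deterministic, not as a claim about how any exchange works.

```python
import threading


class SharedQuote:
    """One symbol's last price in cents, standing in for a 'global variable'."""

    def __init__(self, price_cents):
        self._lock = threading.Lock()
        self.price_cents = price_cents
        self.updates = 0

    def trade(self, delta_cents):
        # Read-modify-write on shared state; the lock makes it atomic.
        with self._lock:
            self.price_cents += delta_cents
            self.updates += 1

    def read(self):
        with self._lock:
            return self.price_cents


def run_market(n_threads=8, trades_per_thread=1000):
    quote = SharedQuote(price_cents=25_000)  # arbitrary starting value

    def trader(direction):
        # Each trade takes the current price as input and outputs a new one.
        for _ in range(trades_per_thread):
            quote.trade(direction * 25)

    # Even-numbered threads push the price up, odd-numbered ones push it down.
    threads = [
        threading.Thread(target=trader, args=(1 if i % 2 == 0 else -1,))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return quote


quote = run_market()
print(quote.updates, quote.read())  # 8000 updates, price back at 25000
```

Nothing in the sketch cares whether the threads live in one process or on machines at different institutions; what makes them one program is that they all read and write the same value.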

I throw this perspective out there for open discussion by the Ars community, and I'd be quite happy if someone could demonstrate that I'm crazy. Because if I'm not crazy, and the market really is essentially an enormous piece of multithreaded software, then I'm not entirely sure what kind of rabbit hole we've all gone down. I hope the computing field's experiences with large, multithreaded, distributed software (e.g., cloud computing) aren't indicative of what we're all in for.