Science and technology

Supercomputers

Game on

“MONEYBALL”, the book and subsequent film of the same name, put a spotlight on the role statistics play in professional baseball. The story depicts how the Oakland A's figured out new ways to use historical data about player performance to assemble a winning team (despite a relatively small budget).

“Moneyball” focused on the A's 2002 season, however, and so did not scrutinise what most people think of today as big data. The number crunchers who looked at player statistics to make decisions back then, for example, worked on regular PCs.

But 95% of all data collected over the 140-year history of major league baseball (MLB) has been generated in the last five years, according to Sean Lahman, a baseball expert and journalist. With new troves of information, teams can make decisions entirely different from those central to “Moneyball”.

Consequently, one MLB team has invested in a Cray supercomputer, according to Pete Ungaro, the company’s chief executive officer. The team, which declines to be named, exemplifies the kind of organisation that, five years ago, most people would not have dreamed would need, or even want, a supercomputer, he says.

The team obtained one both because the machine can analyse enormous quantities of data and because it can process them quickly. Other technologies, such as cloud computing, could wade leisurely through information, helping managers make choices during the off-season (concerning which players to add to the roster, for example). A supercomputer, by contrast, can process data in time to affect decisions during play, explains Mr Ungaro. Cray's Urika appliance, launched two years ago, is specifically designed to help users interpret data in unusual ways.

It is aimed at a new breed of supercomputer user, drawn from the variety of organisations collecting mounds of data these days, such as online retailers, mobile game developers and fitness tracker makers.

Historically, the market for these machines was dictated by the whims of governments. Companies like Cray, IBM and Fujitsu would profit from political decisions—the agreement by numerous governments to stop testing nuclear weapons by actually setting off bombs, for example. Many subsequently invested in supercomputers to simulate the impact of nuclear weapons.

In times of austerity, however, the supercomputer makers struggled to stay afloat. The end of the Cold War saw government defence budgets slashed, which hit the supercomputer market. Its revenue hovered around $2 billion a year in the second half of the 1980s, but fell to just $400 million a year in the early 1990s, according to Steve Conway, an analyst at market research firm IDC.

Vendors now hope that the recent boom in data collection might drive sustained demand for supercomputers long into the future. IDC predicts that the supercomputer segment will grow by 30% from 2012 to 2017. Cray seems to be benefiting from the trend: it recently reported annual revenue that topped $500 million, and expects closer to $600 million this year.

Mr Ungaro reckons that the MLB team in question is among a group of early supercomputer-adopters. He was not, however, able to name many similar examples. The Institute for Systems Biology used a Urika appliance to find areas where existing drugs could be used in novel ways to fight disease. The Institute does not own a Urika, though; it used one of Cray's as part of a scheme encouraging groups to work with the advanced machines. In 2011, PayPal bought a supercomputer from Cray competitor Silicon Graphics International in order to analyse transactional data in real time. This was part of an effort to detect fraudulent purchases before credit cards were charged.

Whether the appetite of firms, researchers and teams for big data can sustain the supercomputer industry remains to be seen. The situation may instead be akin to the 1907 and 1908 baseball seasons for Cray and other makers—the only years the Chicago Cubs managed to win the World Series.

Readers' comments

“The situation may instead be akin to the 1907 and 1908 baseball seasons for Cray and other makers—the only years the Chicago Cubs managed to win the World Series.”

And wouldn't it be gloriously symmetrical if the team that has bought the Cray turns out to be the Cubs? And if they use the results to finally(!) win another World Series. (Hey, if the Red Sox managed it at last, anything is possible.)

I don't see why people would buy these when it's so simple to lease computing power from a larger company that owns the actual hardware. Reminds me of how airlines moved from owning their planes to using operating leases. Why have a fixed asset on your books?

I suppose there are some efficiency, customisation, and security enhancements possible when you have the equipment in-house. For financial services I can see this...but for baseball?

Also, how well do these big data algorithms work when everyone starts using them? Will their real-time effects be nil? Or, with a proper economic incentive in place, will this race spur the development of better algorithms until we get to thinking computers?

1. they couldn't possibly have enough data to require a computer that has high running costs and needs an army to run it

2. if they do, then they only need it for short periods of time - so renting is indicated

3. it's simply not possible that in a game of so many random or unknowable variables, extra computer power above what a standard computer will provide will make anything more than a tiny difference to the team's chances of winning

4. all the historical data should be pre-processed - not processed when the results are needed