Microsoft Research sets new record by sorting 1,401GB of data in 60 seconds

While everyone fawns over the hot new phone or tablet coming out every other day, Microsoft Research is always plugging away doing some real computer science. The big news around Redmond this week is that a team of Microsoft researchers has broken the record for data sorting speed by a huge margin. Granted, Yahoo! held the record previously, but a win is a win.

The nine-person team at Microsoft Research was able to sort 1,401GB of data in just 60 seconds. This was done with the MinuteSort benchmark, which, as its name suggests, is a test that measures how much data a system can sort in one minute. Microsoft used a new distributed computing system dubbed Flat Datacenter Storage to accelerate data handling.

Because Microsoft tied together multiple scalable systems in its record run, the combined processing rate of 2GB per second on each of its 250 machines resulted in a nearly three-fold improvement over Yahoo’s old record (around 500GB). Each machine also has a further 2GB per second of bandwidth free for output.
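To put those figures in perspective, here is a back-of-the-envelope sketch in Python. It uses only the numbers quoted in this article (1,401GB, 60 seconds, 250 machines, and roughly 500GB for Yahoo’s old record); the function and variable names are made up for illustration, and the Yahoo figure is approximate, so treat the results as rough.

```python
# Rough arithmetic on the figures quoted in the article.
# All names here are illustrative; the Yahoo! figure is approximate.

def sustained_rate_gb_per_s(gigabytes_sorted: float, seconds: float) -> float:
    """Average end-to-end sort rate over the benchmark window."""
    return gigabytes_sorted / seconds


def improvement_factor(new_gb: float, old_gb: float) -> float:
    """How many times larger the new record is than the old one."""
    return new_gb / old_gb


MICROSOFT_GB = 1401    # data sorted in the record run
YAHOO_GB = 500         # previous record, "around 500GB" per the article
WINDOW_SECONDS = 60    # MinuteSort gives every system one minute
MACHINES = 250         # machines in Microsoft's run

rate = sustained_rate_gb_per_s(MICROSOFT_GB, WINDOW_SECONDS)
per_machine = rate / MACHINES
factor = improvement_factor(MICROSOFT_GB, YAHOO_GB)

print(f"Cluster-wide sorted rate: {rate:.2f} GB/s")
print(f"Average per machine:      {per_machine * 1000:.1f} MB/s")
print(f"Versus the old record:    {factor:.1f}x")
```

The per-machine average of sorted output is not directly comparable to the 2GB per second figure above, since a distributed sort has to read, move, and write each byte more than once before it counts as sorted.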

Flat Datacenter Storage isn’t just for making Yahoo! feel bad, though. Microsoft believes that Bing could use the technology to improve its performance. Further off, the company thinks Flat Datacenter Storage could accelerate machine learning to make software more personal to you. Pattern detection, image recognition, and almost any other task that deals with large data sets could be improved with it.

These “big data” problems have become easier to tackle over time with technologies like Google’s MapReduce and the open-source Hadoop, but this new Microsoft breakthrough pushes sorting performance further still.