Facebook Open-Sources Presto Engine

Facebook has open-sourced some interesting in-house code in the past, such as Flashcache for the Linux kernel, the Folly C++ library, and the HipHop Virtual Machine. The latest Linux-compatible open-source release from Facebook is Presto, their tool for querying petabytes of data.

Presto is a distributed SQL query engine developed in-house at Facebook, where it is used for scouring the social network's 300+ petabytes of data. Facebook uses Hadoop clusters, but Hive and other existing open-source tools didn't provide the low-latency results the company wanted, so a team set out to develop Presto.

Interestingly, this low-latency distributed query engine is implemented in Java but is able to avoid typical issues of Java code by using carefully optimized code and generating some of its own bytecode. Presto supports multiple back-ends and has been in development for the past year. According to Facebook, the open-source tool is already 10x better than Hive/MapReduce in terms of CPU efficiency and latency for most of the company's queries. Most of ANSI SQL is supported by the engine.

Facebook made the source code for Presto publicly available today. Details on the project can be found at Facebook.com, while the code can be found on GitHub.

Michael Larabel is the principal author of Phoronix.com and founded the web-site in 2004 with a focus on enriching the Linux hardware experience and being the largest web-site devoted to Linux hardware reviews, particularly for products relevant to Linux gamers and enthusiasts but also commonly reviewing servers/workstations and embedded Linux devices. Michael has written more than 10,000 articles covering the state of Linux hardware support, Linux performance, graphics hardware drivers, and other topics. Michael is also the lead developer of the Phoronix Test Suite, Phoromatic, and OpenBenchmarking.org automated testing software. He can be followed via Twitter and Google+ or contacted via MichaelLarabel.com.