In this video from SC16, Eric Fox from Intel describes how Kx software used a 1.2 billion record database of New York City taxi cab ride data to demonstrate what the Intel Xeon Phi processor could mean for distributed big data processing.

With adoption of big data solutions accelerating across many industries, the need for powerful data-crunching solutions capable of handling massive data sets in near real time has never been greater. But the complexity and high costs of architecting and maintaining streaming analytics solutions often make it difficult to get new projects off the ground. That’s part of the reason Kx, a leading provider of high-volume, high-performance databases and real-time analytics solutions, is always interested in exploring how new technologies may help it push the boundaries of streaming analytics performance and efficiency. The Intel® Xeon Phi™ processor is a case in point. At SC16 in Salt Lake City, Kx used a 1.2 billion record database of New York City taxi cab ride data to demonstrate what the Intel Xeon Phi processor could mean for distributed big data processing. And the potential cost/performance implications were quite promising.

A singular focus on fast, efficient data processing

Kx was founded to overcome the limitations of traditional databases in dealing with rapidly escalating volumes of data. As a go-to provider of trading application databases, analytics, and regulatory reporting solutions for the financial services market for more than 20 years, Kx has been developing big data solutions since before the term was popularized. Simon Garland, chief customer officer at Kx, said that the company’s technology is particularly well suited for time series data, which is increasingly used in the Internet of Things (IoT) and across the utilities and telecom industries, in addition to finance.

The kdb+ database, which includes a built-in programming language called q, can provide high-precision timestamping down to the nanosecond. “Really being a very high performance database is the sort of raison d’etre of our technology, and so we’re always looking for new technology that can help us take advantage of any aspect of that. In particular, we are very interested in the combination of high-performance memory and devices like the Xeon Phi,” explained Glenn Wright, Systems Architect at Kx. Wright noted that kdb+ is also very adept at combining high volumes of in-memory time series data with vast stores of on-disk data in a parallel fashion for near real-time analysis. Given the database’s ability to efficiently exploit memory as well as architectures with many threads or cores, the Kx team was eager to test the potential of kdb+ on Intel® Xeon Phi™ processors.

Early access, promising results

Given the often complementary nature of their technologies, Kx and Intel have a close working relationship that goes back nearly a decade, to when Intel released its first multi-core processors. More recently, Kx worked with Intel’s new vector instructions, which its array programming language q exploits to achieve profound performance improvements for its customers. “We’re interesting to Intel because our technology works on massively parallel architectures with no custom parallel coding required. If someone has a machine with 100’s or even thousands of cores in it, we run the code in parallel under the covers and the users see enormous speedups straight away,” explained Garland.

When Intel began talking about the Knights Landing project (the codename for Xeon Phi at the time), the Kx team agreed that Knights Landing looked promising on paper, but was also well aware that capabilities don’t always manifest as expected. So when the Kx team was given an early opportunity to test new Intel Xeon Phi processors, it jumped at the chance. “Long story short, not only was initial testing positive, we were able to use our existing kdb+ implementation and run it on the new platform with only trivial changes,” said Garland. “In addition, the memory performance and compute attributes of the package were very good. So given that, we scratched our heads and asked ourselves what we could take a look at to articulate this.” That’s when the team saw an opportunity in analyzing publicly available data about taxi rides in New York City from the New York City Taxi and Limousine Commission.

Big data analysis in a New York minute

The Kx development team’s idea was to put together a proof of concept for streaming analytics on a standard Intel Xeon Phi architecture using the 1.2 billion records (roughly 200GB) of taxi data that represent all of the taxi trips taken in New York City between 2005 and 2015. “The game was to show, in what I would call real time, that we could perform fairly open-ended analytical queries on this data. So in terms of just baselining it, we took four of the classical queries made against this data and synthesized that,” explained Garland. “So, for example, we were able to run a query that expressed the number of taxi trips taken by the types of taxi vendors. When all was said and done, we got millisecond response times on these queries. If you were to compare that to a more traditional architecture and software solution doing this, that query may have taken many minutes or in some cases even hours,” added Garland.
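
To make the example query concrete, the sketch below counts trips per vendor over a handful of rows. It is written in Python purely for illustration and is not the q code Kx ran; the vendor labels, the row layout, and the function name `trips_per_vendor` are all hypothetical.

```python
from collections import Counter

# Hypothetical sample rows: (vendor_id, passenger_count).
# The real data set holds about 1.2 billion rows; these values are made up.
trips = [
    ("vendor_a", 1),
    ("vendor_b", 2),
    ("vendor_a", 1),
    ("vendor_c", 3),
    ("vendor_b", 1),
    ("vendor_a", 4),
]

def trips_per_vendor(rows):
    """Count trips grouped by vendor id (a group-by with a count aggregate)."""
    return Counter(vendor for vendor, _ in rows)

counts = trips_per_vendor(trips)
for vendor, n in sorted(counts.items()):
    print(vendor, n)
```

A group-by with a count is a single pass over one column, which is why this kind of query benefits so directly from fast memory and many threads.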

Millisecond complex query response times on a standard platform

Garland said that the port of the binary-compatible kdb+ database to the Intel Xeon Phi platform was easy, requiring only tiny changes. The demo team was able to exploit all of the Intel Xeon Phi execution cores (the standard architecture has 256 threads) and to spread queries across all of the cores. “The real dividend for us was that we were also able to exploit the internal very fast memory that is found on these systems. We had four servers with 64 cores per server and four threads per core. That’s what gives us the extremely fast execution times,” said Garland. The demo team also demonstrated that it could easily distribute queries to each server and scale perfectly. “I think the reason we were able to get such good numbers was because we were able to very efficiently get the 200GB of data into extremely fast memory and then throw essentially 1000 execution threads against it,” added Garland.
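
The thread arithmetic and the distribute-then-combine pattern described here can be sketched as follows. This is a minimal Python illustration under stated assumptions, not Kx's implementation: the four servers are simulated with a local thread pool, the data is a made-up list of vendor labels, and the helper names (`partition`, `partial_count`, `distributed_count`) are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Figures quoted in the article.
SERVERS = 4
CORES_PER_SERVER = 64
THREADS_PER_CORE = 4

threads_per_server = CORES_PER_SERVER * THREADS_PER_CORE  # 256
total_threads = SERVERS * threads_per_server              # 1024, "essentially 1000"

def partition(rows, n):
    """Split rows into n round-robin partitions, one per simulated server."""
    return [rows[i::n] for i in range(n)]

def partial_count(rows):
    """Work done on one server: count vendor occurrences in its partition."""
    return Counter(rows)

def distributed_count(rows, n_workers):
    """Scatter partitions to workers, then merge the partial counts."""
    parts = partition(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_count, parts))
    total = Counter()
    for p in partials:
        total.update(p)  # Counter.update adds counts rather than replacing them
    return total

# Made-up vendor labels standing in for one column of the taxi data.
vendors = ["vendor_a", "vendor_b", "vendor_a", "vendor_c",
           "vendor_b", "vendor_a", "vendor_a"]
result = distributed_count(vendors, SERVERS)
print(total_threads, dict(result))
```

Counts merge by simple addition, so each server can work on its own partition without coordinating with the others until the final combine step.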

Seeing new possibilities

Garland noted that there are some very promising use cases for running kdb+ on Intel Xeon Phi processors that are worth exploring. He highlighted options trading in finance and line-item reconciliations in retail as two possibilities. “For options trading, firms could use the multicore Intel Xeon Phi architecture to perform on demand stock analysis using a large number of records and tables much faster,” explained Garland. “In retail, reconciling line items per order per store may sound trivial on the surface, but it traditionally has been very difficult to do in near real time. To perform near real time analysis, you need very fast memory on many, many cores that are as close to their data as possible. We recently demonstrated this at a retail customer site using Intel Xeon Phi achieving results that were whole orders of magnitude better than the customer was used to seeing,” he added.

Story contributed by Sean Thielen, a Portland, Oregon based technology writer.
