Kappa

What is Kappa?

Kappa is a software framework that makes it easy to get the best performance from data processing by using relatively inexpensive, highly parallel CPU and GPU hardware. Kappa works with your current programming languages and data sources. Kappa is ready to improve your TCO by implementing the production workloads of clusters of hundreds of servers on just a few servers.

The Kappa framework provides practical implementations of massively parallel processing (MPP) using CUDA GPUs, OpenMP, and partitioned data-flow scheduled processing. It allows processing to be specified using SQL and index component notation for dynamic scaling. The Kappa framework passes data sets (or subsets of them) between processing kernels and into and out of data sets. The Kappa framework provides Apache Portable Runtime (APR) database driver SQL connections to retrieve data fields from any database source.
For data in a database in star schema format or in an OLTP schema, you can use the Kappa framework for high speed, massively parallel processing of the data. The data is transferred in binary form using an extension of the Apache Portable Runtime parameter specification that also specifies the data structure layout for CUDA or OpenMP data structures. If your data is not in a database, you can use existing C, C++, Perl, or Python libraries within the Kappa framework to access the data.

The Kappa framework provides for:
* transferring with as few processing steps as possible,
* (optional) primary key fields used to specify record selection for reading and updating,
* dimension and measure field support,
* the ability to normalize nonnumeric discrete dimension fields,
* and the ability to use database fields to split a task for parallel transfer and computation.
Together, these provide scalable, efficient, and high-speed transfer and processing, and full, efficient usage of database server, bandwidth, and CPU and GPU capacity.
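The last point, splitting a task on a database field for parallel transfer, can be sketched in Python. This is a generic illustration rather than Kappa's API: the key column is hypothetical and the per-range fetch is simulated, standing in for a per-worker SQL query.

```python
from concurrent.futures import ThreadPoolExecutor

def key_ranges(min_key, max_key, parts):
    """Split [min_key, max_key] into contiguous ranges, one per worker."""
    step = (max_key - min_key + 1) // parts
    bounds = [min_key + i * step for i in range(parts)] + [max_key + 1]
    return list(zip(bounds[:-1], bounds[1:]))

def fetch_range(lo, hi):
    # Stand-in for e.g. "SELECT ... WHERE id >= lo AND id < hi";
    # a real worker would run this on its own database connection.
    return list(range(lo, hi))

ranges = key_ranges(0, 99, parts=4)          # four contiguous key ranges
with ThreadPoolExecutor(max_workers=4) as pool:
    chunks = list(pool.map(lambda r: fetch_range(*r), ranges))

rows = [row for chunk in chunks for row in chunk]   # recombined result
```

Each worker transfers and processes its own key range, which is what allows the database server, network bandwidth, and compute capacity to be used concurrently.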

The Kappa framework uses producer/consumer data flow scheduling which maps well to database transactional processing. The data flow scheduling can be indexed and scaled using data from SQL operations or CPU or GPU calculations. GPU kernel launches can be dynamically sized from SQL or other data sources. The data flow scheduling is declarative using SQL and index component (tensor) notation. The data flow scheduling can be specified once and then automatically sized by the data contained in the database data set. The data flow scheduling can be compiled into shared libraries for distribution.
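Producer/consumer data-flow scheduling can be illustrated with a minimal sketch (this is not Kappa's actual scheduler): each step declares the values it consumes and the value it produces, and a step becomes runnable as soon as all of its inputs are available.

```python
def dataflow_run(steps):
    """steps: name -> (consumes, produces, fn). Run each step once every
    value it consumes has been produced (producer/consumer ordering)."""
    ready = {}                # produced value name -> value
    pending = dict(steps)
    order = []
    while pending:
        runnable = [n for n, (consumes, _, _) in pending.items()
                    if all(c in ready for c in consumes)]
        if not runnable:
            raise RuntimeError("cyclic or unsatisfiable data flow")
        for name in runnable:
            consumes, produces, fn = pending.pop(name)
            ready[produces] = fn(*[ready[c] for c in consumes])
            order.append(name)
    return ready, order

# Hypothetical three-step flow: two independent producers, one consumer.
steps = {
    "load_a": ((), "a", lambda: [1, 2, 3]),
    "load_b": ((), "b", lambda: [10, 20, 30]),
    "combine": (("a", "b"), "c", lambda a, b: [x + y for x, y in zip(a, b)]),
}
values, order = dataflow_run(steps)
```

In a real scheduler the runnable steps in each pass can execute concurrently, which is what maps this model so naturally onto overlapping CPU and GPU kernels.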

The Kappa framework currently has bindings for C/C++, SQL, Perl, Python, Ruby, Lua, and .Net. Java (and other language) bindings are available but not yet tested. Languages with bindings can be mixed together within a single processing task to implement different steps of the processing.

Currently, the KappaAPRDBD driver is available for use of the Apache Portable Runtime database drivers. Using APR database drivers means the drivers are accessible, supported by a community, and able to work with the data sources you wish to use. The APR pgsql DBD driver is robust, performs well, and is available for most platforms.

Parallel Programming

Parallel programming is required of developers today if we want to make full use of current CPU and GPU hardware to provide software solutions to problems. The problems you are solving, and the (parallel) algorithms that implement them, are complex enough without also having to focus on memory allocation, data transfer, host best practices, and occupancy (maximal usage) of CPUs and GPUs, to name a few. Having to commit to a GPU versus a CPU solution for each task step, before development even starts, can also hamper development. The Kappa library was created to provide the best possible solution for these distracting issues so that software developers can focus on providing solutions to the problems they are given.

How to make parallel computing easier:

Provide language bindings for C++, .Net, Perl, Python, Ruby, and other languages so that you have access and control from your program or web stack.
Support implementing algorithm steps and IO in your choice and mixture of C/C++ (OpenMP), CUDA C++, .Net, Perl, Python, and other languages.

Easy data and modules (kernel functions)

Allow simple declaration of the Values, Variables, and Modules you need and the flow of data between these algorithm steps (CPU and GPU kernel functions).
Easily use CUDA source files (automatic compile if necessary and GPU specific JIT compile) to load CUDA kernels.
Easily load shared libraries and call functions (C/C++ kernels).
Make the management and movement of memory and data transparent and safe, and make best practices the default (while allowing access to all settings and options, and interoperability with other CUDA APIs).
Allow getting values and indices from configuration files, GPU and kernel attributes, SQL data sources, kernel functions, and calculations.
Provide full SQL OLAP primary key handling, dimensions, measures, dimension normalization with binary data handling and high speed performance.
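The shared-library point above can be illustrated with Python's `ctypes`, the generic mechanism for loading a shared library and calling a C function from a host language (this shows the mechanism, not Kappa's loader): declare the C function's signature, then call it directly.

```python
import ctypes
import ctypes.util

# Load the C math library; path resolution is platform-specific,
# so fall back to the current process's symbols if lookup fails.
libm = ctypes.CDLL(ctypes.util.find_library("m") or None)

# Declare the C signature of double cos(double) before calling it.
libm.cos.argtypes = [ctypes.c_double]
libm.cos.restype = ctypes.c_double

result = libm.cos(0.0)
```

Declaring `argtypes` and `restype` up front is what keeps the foreign call safe; the same pattern applies to any C/C++ kernel exported from a shared library.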

Specification that scales to fit the data and the runtime hardware for full utilization of CPU and GPU resources

Provide different index labels for different components on the flow of data.
Use the values and indices to expand the index labels into separate, parallel data flows, to size kernel launches, to provide parameters to kernel functions and SQL statements, to cancel data flows (and database transactions), to select different modules and kernels, and so on.
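Expanding index labels into separate parallel flows can be sketched generically (Kappa's index component notation is not reproduced here): given per-index sizes discovered at runtime, a single step specification expands into one concrete work item per index combination.

```python
from itertools import product

def expand(spec, sizes):
    """Expand a step spec with index labels into one concrete work item
    per combination of index values. The sizes would typically come from
    SQL results, configuration, or earlier kernel output."""
    labels = sorted(sizes)
    return [{**spec, "indices": dict(zip(labels, combo))}
            for combo in product(*(range(sizes[l]) for l in labels))]

# Hypothetical: index extents discovered at runtime, e.g. via SELECT COUNT.
items = expand({"kernel": "score"}, {"i": 2, "j": 3})
```

Each resulting item can become its own data flow or kernel launch, so the same specification automatically scales with the data.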

Truly parallel–not just spots and regions

Have the Kappa scheduler execute your task’s algorithm steps in the proper order with the full amount of concurrent parallel execution and speed.
Only synchronize and block parallel execution to the minimal extent necessary for validity of individual data items.
Let longer running CPU or GPU kernels overlap other kernels to the extent that data flow and CPU and GPU resources allow it.

Adaptable and extendible

And let all of these specifications be automatically translated to C++ projects that create shared libraries that contain and load these specifications.
And if this is not enough, make it easy to expand with new functionality and keywords while maintaining backwards ABI compatibility.

Plus things you should be able to expect

Plus easy installation, quick start guides, rapid development and testing, a user guide, a full reference manual, and MIT License examples.
Finally, let everybody try it for free with enough functionality, even in the free version, to be a development tool that provides features not available anywhere else.

(This is not some manifesto; this is a portion of the Kappa Library manifest.)

Parallel computing made easier

The best way to understand what Kappa does is to try it. These Quick Start Guides should help you understand the basic functionality of Kappa:

Kappa Library Overview

The primary goal of Kappa is to allow for the creation of sophisticated, powerful, and complex processing that retains simple and easy-to-use interfaces. Kappa provides for creating processes with dynamic sizing, scheduling, and interactive execution of C and CUDA kernels to process data efficiently using the available resources.

Kappa provides a library for creating processes to use combinations of CPUs and a GPU for tasks. Within a single host program process, a Kappa process can be created for each CUDA GPU—using all GPUs. Each Kappa process can use all of the multiprocessors of each GPU, share all of the CPUs of the host system, have its own separate namespace, and have its own separate CUDA context.
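The one-process-per-GPU layout can be sketched generically (the device count and per-device work are simulated; this is not Kappa's API): within a single host program, one worker is created per GPU, each with its own namespace, while all workers share the host's CPUs.

```python
import threading

def make_worker(device_id, results):
    def run():
        # Each worker stands in for a Kappa process bound to one GPU:
        # its own namespace (a local dict here), its own simulated context.
        namespace = {"device": device_id, "partial": device_id * 10}
        results[device_id] = namespace["partial"]
    return threading.Thread(target=run)

device_count = 3                       # simulated; a real host would query CUDA
results = [None] * device_count
workers = [make_worker(d, results) for d in range(device_count)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

Because each worker keeps its own namespace and context, the per-GPU work stays isolated even though everything runs inside one host process.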

The Kappa library provides:

access to the GPU and instantiated kernel properties at runtime

nvcc and CUDA JIT compilation loading

fully concurrent C++, C++ OpenMP, and CUDA kernel execution

easy integration with existing libraries and data formats using either C, C++, or Perl libraries

dynamic sizing of data and kernel invocation

dynamic scheduling and cancellation of the execution of related steps

process level functional blocks (named or anonymous subroutines and named functions) built from dynamically scheduled execution of mixtures of C++ and CUDA kernels

subject domain extensibility

for performance, execution is scheduled, not interpreted

The Kappa library was designed to especially ease use by providing, for example: