Why JAQL?

By definition, JAQL (Query Language for JSON) is a powerful query language that provides an abstraction for running MapReduce jobs on a Hadoop system. It can be thought of as a middle layer, lying above storage and below Business Intelligence, that performs efficient transformational analytics (transforms, grouping, aggregation and so on) on large structured and semi-structured data sets.

Another way to look at it is as an integration point between various analytics platforms and various data sources. Whether your data resides in NoSQL databases or operational relational sources, JAQL has I/O adapters that let you connect to them. After performing the basic transformations, it lets you pass the results to higher-level application components such as BigInsights Text Analytics, Machine Learning or BigSheets.

From yet another perspective, JAQL can be viewed as a programming language with support for functions, modules and flow control. Apart from the core JAQL operators (such as transform and aggregate), SQL can be used to execute queries, which enhances the expressivity of the language with constructs such as joins and subqueries.

Why another language when the analogous Pig and Hive already exist? JAQL combines the best of both worlds and also borrows the best features of SQL, XQuery and others, but three main factors differentiate it.

1) Flexible data model - JAQL can work on data with no schema, a partial schema or, of course, a rigid schema. For example, you can define a set of student records like this:

jaql> a = [{ id:12, marks:97.50, pass:true, msg:"Hello World"},{ id:13, marks:7.50, pass:false, msg:"Better Luck"}];

And access a field of the first record like this to get its msg printed on the JAQL console:

jaql> a[0].msg;
"Hello World"

This flexibility comes in handy when the data is dynamic and continuously evolving, whether unstructured, semi-structured or structured. Examples include financial data, log data, medical data and web crawl data.
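
To show how the core operators behave on loosely structured data, here is a minimal sketch built on the same kind of student records. The variable name students and the third record (which deliberately has no msg field) are assumptions added for illustration; they are not from any JAQL documentation.

```jaql
// Sample records, repeated here so the snippet stands alone.
// The third record is hypothetical and intentionally omits the msg field.
students = [
  { id: 12, marks: 97.50, pass: true,  msg: "Hello World" },
  { id: 13, marks: 7.50,  pass: false, msg: "Better Luck" },
  { id: 14, marks: 61.00, pass: true }
];

// Keep only the passing students and project two fields.
students -> filter $.pass -> transform { id: $.id, marks: $.marks };

// Group by the pass flag and compute the average marks of each group.
students -> group by p = $.pass into { pass: p, avgMarks: avg($[*].marks) };
```

Neither query declares a schema up front, so the record with the missing field flows through the same pipeline as the others.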

2) Reusability and Modularity - JAQL lets you encapsulate lines of code into functions, and further group a set of functions related to a particular application or functionality into a module. Along these lines, JAQL ships with several out-of-the-box modules, such as the Avro module, HBase module, JDBC module, Netezza module and R module.
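
As a rough illustration of the function side of this, the sketch below wraps a filter in a reusable function. The names atLeast, recs, threshold and students are hypothetical and chosen only for this example; packaging such functions into a module is not shown.

```jaql
// Hypothetical helper: keep the records whose marks reach a threshold.
atLeast = fn(recs, threshold) (
  recs -> filter $.marks >= threshold
);

// Sample input, defined here so the snippet stands alone.
students = [
  { id: 12, marks: 97.50, pass: true,  msg: "Hello World" },
  { id: 13, marks: 7.50,  pass: false, msg: "Better Luck" }
];

// Reuse the same function with different thresholds.
atLeast(students, 50);
atLeast(students, 5);
```

Because a function is an ordinary value assigned to a variable, the same helper can be applied to any array of records that carries a marks field.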

3) Physical Transparency - This feature of JAQL allows you to mix high-level declarative queries with low-level expressions, giving the developer precise control over optimizations.

More info on this can be found at the following link: http://publib.boulder.ibm.com/infocenter/bigins/v1r3/topic/com.ibm.swg.im.infosphere.biginsights.doc/doc/t_analyze_bd_jaql.html