There have been several requests to standardise the SQL that GaianDB uses to query its logical tables. There is in fact a very simple way of getting around the problem using Derby views, e.g.

CREATE VIEW V_LT0 AS SELECT * FROM NEW com.ibm.db2j.GaianTable('LT0') T

Now the logical table 'LT0' can be queried through the view 'V_LT0', i.e. the following two statements are equivalent:

SELECT * FROM NEW com.ibm.db2j.GaianTable('LT0') T

SELECT * FROM V_LT0

The only drawback is that you need a separate view for every combination of arguments used with the logical table, e.g. one more view if you want a result that includes all columns of the logical table and also columns describing where in the network the data came from.
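As a rough sketch of what such an extra view might look like: the view name V_LT0_PROV is made up for illustration, and the second argument string 'with_provenance' is an assumption about how the provenance columns are requested, so check it against your GaianDB release before relying on it.

```sql
-- Hypothetical view name V_LT0_PROV; the 'with_provenance' argument is assumed,
-- not taken from the text above.
CREATE VIEW V_LT0_PROV AS SELECT * FROM NEW com.ibm.db2j.GaianTable('LT0', 'with_provenance') T

-- The logical table plus its provenance columns can then be queried through the view.
SELECT * FROM V_LT0_PROV
```

The plain view V_LT0 stays as it is; each further combination of arguments would get its own view defined in the same way.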

We have recently been focusing on improving performance and scalability...

Take a look at these visualisation graphs for GaianDB networks of up to 520 nodes! These were running on 13 blades, each with 4 logical CPUs.

http://manyeyes.alphaworks.ibm.com/manyeyes/visualizations?q=gaian

The next release of GaianDB (1.02), expected this month, will contain many optimisations. In particular, we have looked at minimising the CPU and memory consumption of GaianDB nodes. We have also improved the node discovery and connection algorithm to minimise the network diameter, and hence query times across the network from any node.

For reference, simple query times across a network of 52 nodes were initially taking 89ms *before* optimisations...

I have been working with the Gaian Database recently to demonstrate its scalability.

In the tests I grew a database cluster with over a thousand Gaian Database nodes and measured the time it took to query across these thousand nodes, and fetch over a million rows of data. I also tested the impact on speed of executing multiple queries at the same time.

I will include more detailed postings on each of these three cases, but the high level results are as follows:

Query Time – We are able to query all 1000 nodes in about 1/8 second. The results show that the query time grows logarithmically with the number of nodes. In other words, as you add more and more databases, the increase in query time slows down, providing excellent scaling. The way that a Gaian Network is grown from individual nodes automatically ensures this behaviour.

Fetch Time – We are able to fetch 1 million rows of data in under 5 seconds. The fetch time is proportional to the amount of data returned, so if you fetch twice the data it takes twice as long, regardless of which of the 1000 nodes the data resides in. The Gaian Database actively pre-fetches the data from all the nodes to achieve this scalability.

Concurrent Queries – I injected queries from up to 40 nodes at the same time. The Gaian Database handled these queries robustly, with a modest increase in query time due to running out of available processor time on our test platform.

There have been a number of changes to the Gaian code to achieve these results. A new release will be delivered to Alphaworks soon.

Check out the following link for a visualization of 1250 Gaian Database nodes in a network:

Further, if the nested query contains a distributed table or query, it may only reference the provenance, explain or constant columns if these are renamed within the nested query. This is because otherwise they would be seen as potential duplicates with the columns defined by the outer GaianQuery().
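To illustrate the renaming rule, here is a hedged sketch. It assumes that GDB_NODE is one of the provenance columns, that 'with_provenance' is the argument that enables them, and that the logical table has a column called COL1; all three are assumptions made for the example, as is the alias INNER_NODE.

```sql
-- Assumed names: GDB_NODE (provenance column), 'with_provenance' (argument),
-- COL1 (a column of LT0), INNER_NODE (alias chosen for this example).
-- The inner provenance column is renamed so that it cannot clash with the
-- provenance columns added by the outer GaianQuery().
SELECT * FROM NEW com.ibm.db2j.GaianQuery(
  'SELECT COL1, GDB_NODE AS INNER_NODE FROM NEW com.ibm.db2j.GaianTable(''LT0'', ''with_provenance'') T',
  'with_provenance') Q
```

Note that the single quotes inside the nested query string are doubled, as usual for SQL string literals.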