If this is your first visit, be sure to
check out the FAQ by clicking the
link above. You may have to register
before you can post: click the register link above to proceed. To start viewing messages,
select the forum that you want to visit from the selection below.

Welcome to the new and improved VoltDB Community Forum. Due to recent upgrades, all users must click here to update their passwords to access the site.

voltQueueSQL and voltExecuteSQL

Please explain why it is necessary or desirable to queue multiple SQL statements using voltQueueSQL before executing them in one batch using voltExecuteSQL? Does voltExecuteSQL implicitly commit all the work performed by the batch of SQL statements?

Desirable.

In a single-partition transaction, transactions often take tens of microseconds. The context switch between the JVM running the procedure code and the C++ execution code is one of the most expensive parts. That's not to say it's slow, it's just that the rest is really really fast. Batching SQL statements reduces the number of context switches and makes the procedure faster.

In a multi-partition statement, each batch represents a network round trip. Reducing the number of batches is obviously important. Additionally, we're working especially hard at making single-batch procedures faster (one-shot), as they seem to be a common case.

Thank You

In a single-partition transaction, transactions often take tens of microseconds. The context switch between the JVM running the procedure code and the C++ execution code is one of the most expensive parts. That's not to say it's slow, it's just that the rest is really really fast. Batching SQL statements reduces the number of context switches and makes the procedure faster.

In a multi-partition statement, each batch represents a network round trip. Reducing the number of batches is obviously important. Additionally, we're working especially hard at making single-batch procedures faster (one-shot), as they seem to be a common case.

C++

In a single-partition transaction, transactions often take tens of microseconds. The context switch between the JVM running the procedure code and the C++ execution code is one of the most expensive parts. That's not to say it's slow, it's just that the rest is really really fast. Batching SQL statements reduces the number of context switches and makes the procedure faster.

In a multi-partition statement, each batch represents a network round trip. Reducing the number of batches is obviously important. Additionally, we're working especially hard at making single-batch procedures faster (one-shot), as they seem to be a common case.

Language separation

Originally Posted by henning

What part of VoltDB is written in Java and C++ respectively?

Thanks,
Henning

The storage layer and SQL execution is written in C++. The transaction manager, networking and messaging is written in Java. There is an interface between C++ and Java implemented with Java JNI for production. For testing only, this interface is also implemented over socket IPC serialization, in which case the C++ layers and Java layers are run in separate processes. (Having a C++-only program makes using some verification tools easier.)

How VoltQueueSQL works??? How many VOltQueueSQL can we have in a single program????
DO every VoltQueueSQL should have an associated VoltExecuteSQL????

If in a client application i have
myApp.callProcedure("insert", "English", "Hello", "World");
myApp.callProcedure("insert", "French", "Bonjour", "Monde");
myApp.callProcedure("insert", "Spanish", "Hola", "Mundo");
myApp.callProcedure("insert", "Danish", "Hej", "Verden");
myApp.callProcedure("insert", "Italian", "Ciao", "Mondo");
does this means that in one database connection all the statements will be executed??? How in this example queue comes into picture.

I tried an example with 3 VoltQueueSQL and 1 VoltExecuteSQL , i observed that if i am using more than 2 VoltQueueSQL than it gives me an error for the StrngIndexOutofBound.

VoltQueueSQL and VoltExecuteSQL are VoltDB stored procedure semantics, you don't use these in your client program, I think you are confusing these. Your client program invokes the stored procedure (using callProcedure), in your case its called 'insert' once it has an open connection estabished with the cluster. The stored procedure 'insert' runs on the cluster (server side) and executes queries using VoltQueueSQL and VoltExecuteSQL.

From the manual, some other key points that relate to your questions are:

#5. To perform database queries, you queue SQL statements (specifying both the SQL statement and the variables to use) using the voltQueueSQL method.
#6. Once you queue all of the SQL statements you want to perform, use voltExecuteSQL to execute the statements in the queue.
#7. Each statement returns its results in a VoltTable structure. Because the queue can contain multiple queries, voltExecuteSQL returns an array of VoltTable structures, one array element for each query.

As Phil pointed the key difference is that queueing SQL and executing them with one voltExecuteSQL is a server side concept that is useful for processing multi partition activity very efficiently. For example in he hello world example you had referenced in your post, each of the callProcedure statements are a full round trip between the client and VoltDB server(s). But the advantage is that you can send them off asynchronously. You can instead convert it into a single call to, say, a stored procedure called InsertManyGreetings that takes three String arrays of equal size. This stored procedure in turn can queue multiple SQLs with parameters extracted out of the arrays and then executed in one shot by calling the voltExecuteSQL. This would distribute the SQLs queue accordingly to all the partitions to execute.

In general, your inserts should be made single partitioned and asynchronously, since this would be extremely fast and queueing multiple SQL statements should be reserved to use where you need to run a set of SQLs and none of them are dependent on any other of the set to run successfully. When you are queueing multiple SQLs also make sure to call the voltExecuteSQL with a boolean true as a parameter. This indicates to all the partitions that the work is done and they can continue to serve the single partition requests.