While connecting to my application (running on a Jetty server), I am able to connect and run queries over Teiid when I use the SocketConnection (URL format: jdbc:teiid:Portfolio@mm://localhost:31000). But when I try to create an EmbeddedConnection instead (using jdbc:teiid:Portfolio) inside the same JVM, I am not able to create a connection in my setup. The driver.connect() call fails with two kinds of exceptions.

I have used "EmbeddedServer.getDriver().connect(...)" before and seen that work fine for creation of an Embedded connection.

But from the looks of the code in the Teiid JDBC driver, it seemed to me that, based on the URL, Teiid would either create a new LocalProfile object (when using the standard JDBC construct) or pass the object created during Embedded server startup (when using embeddedServer.getDriver().connect()).

In my current use case I want to create an Embedded conn using the JDBC driver from a standard code like the one below.

Is there no way to create a local/embedded (non-socket) connection when connecting through the JDBC driver using the standard JDBC API construct?

The issue is getting access to the EmbeddedServer in the VM. Since embedded does not use any kind of injection framework like Guice or Spring, one must provide access to it. In the WildFly environment, we use JNDI services to get to the EmbeddedServer instance.

So, to make it work in your environment, you can devise a way to provide the EmbeddedServer, and supply it through the TeiidDriver.setProfile(...) method. I did the exact same thing here. [1]

Thanks for that clarification Ramesh. I'll see how I can wire such a setup to pass on EmbeddedServer instance at the right place.

One more question, since I am going down the embedded path: is there a difference in the further code flow between a Socket connection and an embedded connection? (Especially in how multi-threading comes into the picture while processing queries and returning results.)

Do you see any obvious reasons due to which I might see noticeable performance gains/reductions while running queries? (apart from the transport layer considerations)

No, once you get the JDBC connection, access is the same. However, in the embedded case there are special cases where you can use the calling thread to do more processing than just waiting around. Also, there are ways to pass the security around from an existing context.

Thanks for the response earlier on. I was able to get things running by somehow providing the driver from the Embedded server instance to my client end of the app.

> there are special cases where you can use the calling thread to do more processing than just waiting around.

I see that the useCallingThread variable is set to true by default. What's the exact idea behind this flag's functionality? Does that mean Teiid would use this one extra thread in addition to the ones that are initialized by the TeiidExecutor/WorkManager, and hence should show a perf improvement, OR is this thread used for something else?

For testing purposes I created a wrapper driver which handles whether it should fetch the driver from EmbeddedServer or create a socket connection (similar to what Teiid already does for itself). While testing I noted a good bit of perf gap between when I ran with embedded connection vs socket connection. The socket one was much faster to my surprise. I tried setting the useCallingThread flag to true/false - which showed no effect on the numbers.

For a dataset of 16 cols and 3Mn rows, the socket conn takes ~26 secs to read, whereas embedded takes ~2 mins for the same.
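The wrapper driver described above could be sketched roughly as follows. This is only an illustration, not the code used in the test: the class name RoutingTeiidDriver and its methods are hypothetical, and the URL check assumes that socket URLs carry an "@mm://" or "@mms://" part after the VDB name while embedded URLs do not. It uses only the JDK's java.sql.Driver interface, so the two delegates would have to be supplied by the application (for instance, the driver returned by embeddedServer.getDriver() and a regular Teiid driver instance).

```java
import java.sql.Connection;
import java.sql.Driver;
import java.sql.DriverPropertyInfo;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.util.Properties;
import java.util.logging.Logger;

/** Hypothetical wrapper that picks a delegate driver based on the URL shape. */
final class RoutingTeiidDriver implements Driver {
    private static final String PREFIX = "jdbc:teiid:";

    private final Driver embeddedDriver; // assumed to come from the embedded server
    private final Driver socketDriver;   // assumed to be a plain socket-capable driver

    RoutingTeiidDriver(Driver embeddedDriver, Driver socketDriver) {
        this.embeddedDriver = embeddedDriver;
        this.socketDriver = socketDriver;
    }

    /** Assumption: socket URLs look like jdbc:teiid:<vdb>@mm://host:port (or @mms://). */
    static boolean isSocketUrl(String url) {
        return url != null && url.startsWith(PREFIX)
                && (url.contains("@mm://") || url.contains("@mms://"));
    }

    /** Returns the delegate responsible for the given URL. */
    Driver select(String url) {
        return isSocketUrl(url) ? socketDriver : embeddedDriver;
    }

    @Override
    public boolean acceptsURL(String url) {
        return url != null && url.startsWith(PREFIX);
    }

    @Override
    public Connection connect(String url, Properties info) throws SQLException {
        if (!acceptsURL(url)) {
            return null; // JDBC contract: return null for URLs this driver does not handle
        }
        return select(url).connect(url, info);
    }

    @Override
    public DriverPropertyInfo[] getPropertyInfo(String url, Properties info) throws SQLException {
        return select(url).getPropertyInfo(url, info);
    }

    @Override
    public int getMajorVersion() { return socketDriver.getMajorVersion(); }

    @Override
    public int getMinorVersion() { return socketDriver.getMinorVersion(); }

    @Override
    public boolean jdbcCompliant() { return socketDriver.jdbcCompliant(); }

    @Override
    public Logger getParentLogger() throws SQLFeatureNotSupportedException {
        return socketDriver.getParentLogger();
    }
}
```

With such a wrapper the client code can keep calling connect(url, props) and only the URL decides which path is taken, which makes it easy to time both paths with the same reader code.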

In addition to this, I have also plugged in my own very basic WorkManager to handle security context of the new threads. Here CustomWorker is just a plain wrapper over a Work object:

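    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import javax.resource.spi.work.Work;
    import javax.resource.spi.work.WorkException;
    import javax.resource.spi.work.WorkManager;

    public class CustomWorkManager implements WorkManager
    {
        private ExecutorService executorService;

        public CustomWorkManager()
        {
            // Fixed pool of 64 threads that runs all scheduled work
            executorService = Executors.newFixedThreadPool(64);
        }

        @Override
        public void scheduleWork(Work work) throws WorkException
        {
            // CustomWorker wraps the Work so the security context can be set up on the new thread
            executorService.submit(new CustomWorker(work));
        }

        // All implemented methods..

        @Override
        public void doWork(Work work) throws WorkException
        {
            .........
            .........
        }
    }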
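For readers wondering what a wrapper like CustomWorker might look like, here is a minimal sketch of the general idea of carrying a context from the submitting thread to a pool thread. It is not the CustomWorker from this post: the class name, the String-valued SECURITY_CONTEXT thread-local, and the use of plain Runnable in place of the JCA Work interface are all assumptions made to keep the example self-contained.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical wrapper that copies a thread-local context onto the worker thread. */
final class ContextPropagatingWorker implements Runnable {
    // Stand-in for a real security context; a String is used only for illustration.
    static final ThreadLocal<String> SECURITY_CONTEXT = new ThreadLocal<>();

    private final Runnable delegate;
    private final String capturedContext;

    ContextPropagatingWorker(Runnable delegate) {
        this.delegate = delegate;
        // Captured here, i.e. on the thread that creates and submits the work.
        this.capturedContext = SECURITY_CONTEXT.get();
    }

    @Override
    public void run() {
        String previous = SECURITY_CONTEXT.get();
        SECURITY_CONTEXT.set(capturedContext); // install the caller's context on the pool thread
        try {
            delegate.run();
        } finally {
            // Restore whatever the pool thread had before, so contexts do not leak between tasks.
            if (previous == null) {
                SECURITY_CONTEXT.remove();
            } else {
                SECURITY_CONTEXT.set(previous);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            SECURITY_CONTEXT.set("alice");
            pool.submit(new ContextPropagatingWorker(
                    () -> System.out.println("worker sees: " + SECURITY_CONTEXT.get()))).get();
        } finally {
            pool.shutdown();
        }
    }
}
```

The restore step in the finally block matters with a fixed pool, because the same thread is reused for unrelated work later.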

Could you suggest whether this behavior is expected, or am I doing something wrong in my usage of the embedded connection?

> What's the exact idea behind this flag's functionality? Does that mean Teiid would use this one extra thread in addition to the ones that are initialized by the TeiidExecutor/WorkManager, and hence should show a perf improvement, OR is this thread used for something else?

In the embedded scenario everything runs in a single VM. Since JDBC is a blocking protocol, the thread that makes the JDBC query would otherwise just wait for the response, so that same thread is used for part of the processing instead.

Your findings surprise me. In the embedded case there is no marshaling of results over the network, so it should be much faster. I suggest profiling the app to see where the time is being spent. A tool like VisualVM should be able to help in this regard.
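The difference between the two modes can be illustrated with a small, generic sketch. This is not Teiid's implementation: the class and method names are made up, and the boolean parameter only mimics the idea of the useCallingThread flag. It shows the caller either doing the work itself or handing it to a pool and blocking until the result arrives.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Generic illustration of "use the calling thread" versus "hand off and wait". */
final class CallingThreadSketch {

    /** Runs the work and returns the name of the thread that actually executed it. */
    static String process(ExecutorService pool, boolean useCallingThread) throws Exception {
        Callable<String> work = () -> Thread.currentThread().getName();
        if (useCallingThread) {
            // The caller would only be waiting anyway, so it does the processing itself.
            return work.call();
        }
        // The caller hands the work to a pool thread and blocks until it completes.
        return pool.submit(work).get();
    }

    public static void main(String[] args) throws Exception {
        // Every pool thread is named "worker" so the two cases are easy to tell apart.
        ExecutorService pool = Executors.newFixedThreadPool(2, r -> new Thread(r, "worker"));
        try {
            System.out.println("calling thread mode ran on: " + process(pool, true));
            System.out.println("hand-off mode ran on:       " + process(pool, false));
        } finally {
            pool.shutdown();
        }
    }
}
```

In either mode only one thread is doing the processing for the query at a time; the flag changes which thread that is, not how many there are.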

Thanks Ramesh. I was taking note of the thread activity in the socket vs. embedded connection. While running on socket, all the initialized Workers (Teiid threads) were used simultaneously, whereas with the embedded connection only one thread was used for the same amount of data and the same config of the querying application. The attached screenshots should help you understand what I am talking about.

These screenshots provide a good reason for the perf gap. I tried toggling the useCallingThread flag too, just to try it out, but that didn't show any noticeable improvement here. So is there some extra configuration/tweaking that I am missing, which causes this single-threaded behavior?

[I am running this query over an MS SQLServer source model having a table with 3Mn rows (no view model). This is Teiid embedded v9.2.1. I did not explicitly set the UserRequestSourceConcurrency, hence it's using the default value of 6.]

Are you using a single JDBC connection to submit multiple queries? I am asking because of the multiple processing threads in the socket scenario. Typically there is one active processing thread per query in either scenario. When useCallingThread=true is used, the calling thread becomes the processing thread. Also, useCallingThread=false should restore the same thread activity as the socket connection, minus the marshaling part. If the behavior is the same with either setting, I suspect this flag is being overwritten somewhere.

Yes, it's a single connection made on top of the VDB for one query run: one connection made using the embedded URL, and the other using the longer socket URL to establish a socket conn. It's a simple select statement on a single table -> SELECT * FROM "sourceModel"."tableName".

I ran the same scenario again. I see Teiid making all 4 worker threads active (green in the screenshot) the moment the query is run over socket. Only 1 worker thread becomes 'green' when I run the embedded query. The same client reader code (in the same JVM) is used to query this deployed VDB.

I understand that query processing, especially reading from a single table of a single source, should happen in a single thread (I can't guess how this read could be multithreaded). But looking at the thread heat map, it seems like some multithreading activity on the Teiid workers is making the socket code work faster. I have not been able to figure out the reason for the difference yet; I will look into it. But do you not see this behavior/activity at all when you run this scenario?

Not really, but we have not checked recently. As for the 4 threads, the engine may be initializing the thread pool but using just one thread. A single query can be multi-threaded: the Teiid engine effectively uses one thread (it will context switch across the available threads), and the connector layer uses another thread or threads. The connector layer uses a separate thread so that, if it makes a blocking call, the main thread is not blocked.

I tried creating a sample app from scratch that just starts the Teiid embedded server without any other configuration and runs queries on a created model. I was able to see the expected behavior there, with embedded being faster. I can't find any reason yet for my environment being different with respect to Teiid, but I'll keep looking to see what's happening differently.