It's a girl

So for the last few hours I have been getting back into the Linq for NHibernate project, after having left it for far too long. I am beginning to think that building a Linq provider might not have been the best way to learn C# 3.0, but never mind that now.

It is with great pride that I can tell you that the following query now works:

Why is this trivial query important? Well, it is important because about a year ago I said that there is only a single feature in Linq for SQL that NHibernate doesn't have. I also promised to fix it the moment that someone would tell me what it is. No one did, so it waited until today.

The main issue had to do with the Criteria API and handling parameters; no one ever needed to do that, it seems. When they did, they generally used HQL, which did have this feature. Since I have based Linq for NHibernate on the Criteria API*, that was a problem.

Now that ReSharper works on C# 3.0, I can actually get things done there, so I sat down and implemented it. It is a surprisingly difficult issue, but I threw enough code at it until it gave up (I wonder if there is a name for nested visitors... ).

At any rate, I strongly recommend that you take a look at the project. Bug (fixes) and other patches are welcome**.

* This decision was important for two reasons: the first was that it is generally easier to use the Criteria API programmatically, and the second was that I wanted to ensure that the Criteria API (which I favor) was as full featured as the HQL route.

Comments

I haven't seen the code yet, but "nested visitor" gives me a vision of some sort of intermediary expression structure between Linq and NHibernate expressions.

Ayende, I am very pleased to hear about the revival of this project. My last version of the trunk gave me failing tests. I presumed that everyone lost interest in this project, but I never figured out why.

Maybe now that ReSharper finally works with Linq, we will see a "spike" of new interest in Linq and the other C# 3.0 features.

1) you have multiple fields in the query which are used to produce a single end value

2) you have to create in-memory delegates which are called to produce the end result.

This is only solvable with an all-purpose approach, otherwise you'll run into problems when someone instantiates an in-memory object, passes values from the projection to the ctor and calls a method on it to obtain the Real value etc.

Why would you need a nested visitor? The only reason I needed a new visitor inside a pass (I use 6 passes over the tree, rewriting it in every pass) is to look up things at that level by traversing a subtree, but only on some occasions. (I think 'handler' or 'crawler' is more appropriate. 'Visitor' suggests the visitor pattern is implemented by MS, which isn't the case, Expression doesn't have a Visit virtual method. :( )

"Broadly, Linq for NHibernate is using two major parts of NH, ICriterion and IProjection. ICriterion is used for booleans, IProjection for selects.

The problem is that Linq mixes them fairly freely, and that was never in the plan for Hibernate."

Yes, you need projections which represent 'derived tables' (SQL term) a LOT. I had to add it to LLBLGen Pro as well to make things work.

"Of course, I still think that whoever designed Linq was mad."

haha :D I agree. Some stuff is OK, but other things are flat-out stupid. I mean: who came up with the lame idea of deferred execution of Linq queries, where ONLY some of the queries are actually deferred!

This one is:

var q = from c in nw.Customer select c;

but this one isn't:

var q = (from c in nw.Customer select c.Country).Contains("USA");

That last one is executed immediately...
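The same split can be seen without a database. The following is a minimal sketch, assuming an in-memory source in place of nw.Customer; DeferredDemo, Countries and Pulled are hypothetical names made up for the illustration. It counts how many items the source hands out, so you can see that building a query pulls nothing, while Contains (which returns a bool rather than a sequence) has to run on the spot.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Counts how many items the source has actually handed out.
    public static int Pulled;

    // Hypothetical stand-in for nw.Customer: every item pulled is counted.
    public static IEnumerable<string> Countries()
    {
        foreach (var country in new[] { "USA", "Germany", "France" })
        {
            Pulled++;
            yield return country;
        }
    }

    public static void Main()
    {
        // Deferred: building the query pulls nothing from the source.
        var q = from c in Countries() select c;
        Console.WriteLine("after building q: " + Pulled);

        // Immediate: Contains returns a bool, so the query has to run right here.
        bool hasUsa = (from c in Countries() select c).Contains("USA");
        Console.WriteLine("after Contains: " + Pulled + " (found: " + hasUsa + ")");

        // q itself only runs once something enumerates it.
        List<string> all = q.ToList();
        Console.WriteLine("after ToList: " + Pulled + " (" + all.Count + " rows)");
    }
}
```

Against a real provider the mechanics differ (the call goes to the provider as an expression tree), but the rule of thumb is the same: operators that return a sequence are deferred, operators that return a single value are not.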

"The example that you gave is a good example.

There is no way to know where it is going to run, and that is a bad mojo.

This means that you have to load the ENTIRE TABLE to memory to do this.

Crazy, crazy, crazy."

True, that's the problem. However IGNORING this isn't helping as people will want to execute it in the projection.

The trick is that you should have a generic way to map a call onto a database function. If such a mapping isn't found, you keep the call around. When handling the projection, you have support for Call and MemberAccess; in all other places you don't. It then ends up in tears in an exception, what you want in this case. This still isn't foolproof though: a nested select in a join branch, for example, with a method call in the projection will cause problems.
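As a rough illustration of that idea, here is a minimal sketch, not the code of any actual provider. Customer, FunctionMapper, the mapping table and the inProjection flag are all hypothetical, and the SQL text is simplified to a bare column name. A mapped call becomes a database function, an unmapped call is kept (null here) when it sits in the projection, and it throws anywhere else.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical entity, used only for this illustration.
public class Customer
{
    public string Country { get; set; }
    public string City { get; set; }
}

public static class FunctionMapper
{
    // Hypothetical mapping table: .NET method -> database function name.
    static readonly Dictionary<MethodInfo, string> Map = new Dictionary<MethodInfo, string>
    {
        { typeof(string).GetMethod("ToUpper", Type.EmptyTypes), "UPPER" },
        { typeof(string).GetMethod("ToLower", Type.EmptyTypes), "LOWER" },
    };

    // Returns SQL text when the call is mapped, null when the call is kept
    // around for in-memory execution (projection only), and throws elsewhere.
    public static string Translate(MethodCallExpression call, bool inProjection)
    {
        string sqlName;
        var column = call.Object as MemberExpression;
        if (column != null && Map.TryGetValue(call.Method, out sqlName))
            return sqlName + "(" + column.Member.Name + ")";

        if (inProjection)
            return null; // no mapping: leave the call in the tree

        throw new NotSupportedException("No database mapping for " + call.Method.Name);
    }

    public static void Main()
    {
        Expression<Func<Customer, string>> mapped = c => c.Country.ToUpper();
        Expression<Func<Customer, string>> unmapped = c => c.Country.Trim();

        Console.WriteLine(Translate((MethodCallExpression)mapped.Body, true));
        Console.WriteLine(Translate((MethodCallExpression)unmapped.Body, true) == null);
    }
}
```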

"And I built my own primitive visitor for that.

I think that the fact that they didn't provide an expression visitor is a flat out shame. It is not like anyone else needs that, right?"

Matt Warren made one available on his blog: http://blogs.msdn.com/mattwar

It's the same as the one inside the .NET framework (which is internal. Joy..)

It has some flaky routines though, so you better write your own (it's fairly straightforward). I peeked into your code this morning and I saw you use a different approach than I do: you try to handle everything at once instead of re-writing the tree element by element. This is cumbersome, as with joins for example (groupjoin etc.) you need to refer to parts of the tree already processed, so calling out into different handlers isn't going to cut it: you need one big handler to merge everything together (which handles a tree which is pre-processed a couple of times by rewriting elements.)

@Mats: I meant a method which the visitor calls by passing itself to it :). Yes, abstract is fine, virtual doesn't make sense indeed, as you have to override it in all cases.

"Could you elaborate"

var q = from c in nw.Customers

select new Foo(c.Country, c.City).GetSomeValue();

I'm not completely done with this scenario, my 'new' handler finds this a projection to new Foo instances, which isn't the case: it's a list of result values from GetSomeValue(). I'm not sure if this is doable though, as it's pretty tough to distinguish whether it's a list of Foo's, or a list of result values from GetSomeValue().

I don't think it's a common scenario, but it illustrates the point. ;). (Haven't tried whether Linq to SQL can handle this, though ;))

Yes, in this case the operations inside GetSomeValue() could in theory have been transformable to SQL and executable by the database, so that all records in the table wouldn't have to be loaded into memory... is that what you are referring to? Because while this happens to be true in this particular case that the operation could be turned into SQL, it wouldn't be true in the general case.

Well, your query is a l2o (linq to objects) query using the results of a l2s query as its source of objects. Since l2o queries are executed directly, the observed behavior seems to make sense? The inner l2s query will be deferred until someone executes it, but since that someone is the outer l2o query, it will be executed directly.

"It then ends up in tears in an exception, what you want in this case."

Then that (what I just wrote, in code) is what I want to happen. It will come as no surprise to me if I write that statement that the whole table will be loaded, as that is what I have fairly explicitly asked for. Throwing an exception to inform me that I'm doing what I know I am doing and then refusing to do it doesn't seem helpful?

"Then that (what I just wrote, in code) is what I want to happen. It will come as no surprise to me if I write that statement that the whole table will be loaded, as that is what I have fairly explicitly asked for. Throwing an exception to inform me that I'm doing what I know I am doing and then refusing to do it doesn't seem helpful?"

There are two things going on there. The first is the technical feasibility of this. The second is the gross violation of the principle of least surprise.

"Yes, in this case the operations inside GetSomeValue() could in theory have been transformable to SQL and executable by the database, so that all records in the table wouldn't have to be loaded into memory... is that what you are referring to? Because while this happens to be true in this particular case that the operation could be turned into SQL, it wouldn't be true in the general case."

The problem is that the query feeds data to a delegate which is executed on the raw resultset coming from the db and the RESULT of that delegate is the result value for each row.

EVERY Linq provider has to implement this scenario, otherwise the query you tested won't work at all, you'll get a crash somewhere, as the method call to GetSomeValue() is inside the expression tree. You can't ignore it, you have to implement code to execute it.

So it's:

generate SQL to produce the input values for the in-memory delegate you're going to execute in the projection engine

"Well, your query is a l2o (linq to objects) query using the results of a l2s query as its source of objects. Since l2o queries are executed directly, the observed behavior seems to make sense? The inner l2s query will be deferred until someone executes it, but since that someone is the outer l2o query, it will be executed directly."

No it's not! It's a DB query! :) It results in something like:

SELECT CASE WHEN NOT EXISTS (.... ) THEN 1 ELSE 0 END FROM <...>

NONE of the queries I posted executes ANY linq to objects code. None. That's the hard part of writing a linq provider: you get an expression tree, you have to convert EVERY bit to sql, otherwise the WHOLE query will fail.

An exception is the stuff which can be converted to in-memory code, like:

here, the new string[] { "USA", "Germany" }.Where(x => x.StartsWith("U")) part is an in-memory construct. You can find these with a funcletizer (do a google search, you'll find the 3 entries about it and the code) and compile it into a delegate.
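To make the funcletizing step concrete, here is a minimal sketch, assuming a query shaped around the fragment quoted above; the surrounding predicate, Customer and LocalEvaluator are hypothetical. It relies on System.Linq.Expressions.ExpressionVisitor, which only became public in .NET 4.0, so on 3.5 you would substitute your own visitor. The idea: a subtree that references no parameter of the outer query can be wrapped in a parameterless lambda, compiled into a delegate and run in memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical entity, used only for this illustration.
public class Customer
{
    public string Country { get; set; }
}

public static class LocalEvaluator
{
    // Looks for parameters that are not declared by a lambda inside the subtree.
    // Such "free" parameters (like the query's c) tie the subtree to a database row.
    class FreeParameterFinder : ExpressionVisitor
    {
        readonly HashSet<ParameterExpression> bound = new HashSet<ParameterExpression>();
        public bool FoundFree;

        protected override Expression VisitLambda<T>(Expression<T> node)
        {
            foreach (var p in node.Parameters) bound.Add(p);
            return base.VisitLambda(node);
        }

        protected override Expression VisitParameter(ParameterExpression node)
        {
            if (!bound.Contains(node)) FoundFree = true;
            return node;
        }
    }

    public static bool CanEvaluateLocally(Expression subtree)
    {
        var finder = new FreeParameterFinder();
        finder.Visit(subtree);
        return !finder.FoundFree;
    }

    // Wraps the subtree in a parameterless lambda, compiles it and runs it.
    public static object Evaluate(Expression subtree)
    {
        return Expression.Lambda(subtree).Compile().DynamicInvoke();
    }

    public static void Main()
    {
        Expression<Func<Customer, bool>> predicate =
            c => new[] { "USA", "Germany" }.Where(x => x.StartsWith("U")).Contains(c.Country);

        var contains = (MethodCallExpression)predicate.Body;
        Expression list = contains.Arguments[0];   // the in-memory Where(...) part
        Expression column = contains.Arguments[1]; // c.Country, needs the database

        Console.WriteLine(CanEvaluateLocally(list));
        Console.WriteLine(CanEvaluateLocally(column));

        var values = (IEnumerable<string>)Evaluate(list);
        Console.WriteLine(string.Join(", ", values.ToArray()));
    }
}
```

A real funcletizer walks the whole tree and replaces every locally evaluable subtree with a constant; this sketch only shows the test and the evaluation for one subtree picked by hand.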

"Then that (what I just wrote, in code) is what I want to happen. It will come as no surprise to me if I write that statement that the whole table will be loaded, as that is what I have fairly explicitly asked for. Throwing an exception to inform me that I'm doing what I know I am doing and then refusing to do it doesn't seem helpful?"

Good luck with that. It can't be done. The problem is: you need results of the in-memory query INSIDE the db! Check:

you need the in-memory query result back in the db. You can imagine that it's possible to create a query where you need to pass result sets back and forth multiple times to be able to produce the results (if applicable at all). This is not doable.

This is the weak side of Linq: the developer can tie things together which actually can't be tied together. In Linq to objects I can group on boolean expression results, in the DB I can't. So the same linq query can't run on the DB. For a developer it's not obvious why this is.