Rick Strahl's Web Log

Several people have been flogging me for my LINQ and disconnected Entity post, so I thought I should follow up on the issues I've been having. First let me restate the entity issue I raised a couple of days ago, but this time in a simpler context with the Northwind database, so you can try it and experiment with this stuff yourself, and so those who keep telling me to RTFM can post a concrete reply <g>.

The issue is this: I want to create an instance of an entity through LINQ, then disconnect it from LINQ's data context - basically remove the object from change tracking. Then I want to reattach it later to a new context and have it appropriately update the database. Now, in the original example I used my own sample data, and no matter what I tried it didn't work. Period. I could not get updates to work (and in fact it still doesn't work with my data), but I did get it to work with Northwind. Even so, the experience is not what I would call intuitive.

Ok, so the following code is an approximation of a very simple disconnected operation: I load an entity, disconnect from the data context, and then reattach the entity to a new data context:

// Load the entity on an initial context (this part was elided in the
// original listing); ALFKI is a standard Northwind customer
NorthwindDataContext context = new NorthwindDataContext();
Customer cust = context.Customers.Single(c => c.CustomerID == "ALFKI");

// ... do other stuff - Web Service call, deserialization, whatever -
// to reload the entity, and modify the data in the entity
cust.Address = "Obere Str. 57 " + DateTime.Now.ToShortTimeString();

// *** We now have a disconnected entity - simulate a new context
NorthwindDataContext context2 = new NorthwindDataContext();
context2.Customers.Attach(cust, true);
context2.SubmitChanges();

Now, if I do this out of the box against the raw Northwind database, the above code fails. As several people pointed out, you can't do this because LINQ to SQL is essentially a connected model - not in terms of database connections, but in terms of data context instancing - and it requires the change-tracking state. As it sits, the code above fails with:

An entity can only be attached as modified without original state if it declares a version member or does not have an update check policy.

So, as mentioned in the last post, if there's no timestamp member or other 'versioning' mechanism, there's apparently no way for LINQ to determine whether the data has changed.

Now, to get this to work I can do the following:

1. Add a TimeStamp field to the Customers table in the database.

2. Mark the field's Time Stamp property as true in the LINQ to SQL designer (this maps to IsVersion in the column mapping).

Once the timestamp is in place, .Attach(cust, true) works. Looking at the SQL generated, LINQ updates every field, which is to be expected given that it has no change information. Note that I have to use Attach(cust, true) to get this to work, where true indicates that there are changes in the entity. If you pass just the entity, it is attached without its pending changes - only explicit changes you make after Attach will update.

Ok, that works, but it requires a TimeStamp field on every table, and that requirement may not be realistic. Are there other options?

There's another .Attach() overload that takes a second object instance which is supposed to hold the original state. This makes some sense: you can tell LINQ to attach, compare the object state against the existing instance, and update the change state based on that comparison.

So what I unsuccessfully tried in my previous post is code like the following:
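The original listing isn't shown on this page, but the failing attempt was along these lines (GetDetachedCustomer() is a hypothetical stand-in for however the detached entity arrives):

```csharp
// cust arrives detached - e.g. deserialized from a Web Service call.
// GetDetachedCustomer() is a hypothetical helper standing in for that step.
Customer cust = GetDetachedCustomer();
cust.Address = "Obere Str. 57 " + DateTime.Now.ToShortTimeString();

NorthwindDataContext context = new NorthwindDataContext();

// Load a second instance to serve as the 'original' state
Customer original = context.Customers.Single(c => c.CustomerID == "ALFKI");

// Fails at runtime: the context is already tracking an entity with this key
context.Customers.Attach(cust, original);
context.SubmitChanges();
```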

I load up a second instance from the context and use that as a comparison. But this code fails with:

Cannot add an entity with a key that is already in use.

The problem here is that the context can only track one instance of any given object. Because an instance with the same key already exists from the Single() load (based on the PK, I presume), the Attach() fails: Attach tries to sync up, finds another object in the context with the same PK, and bails.

So how are you supposed to get a current instance to compare against? You can do this only by hanging on to an older instance or - more likely - by using a separate data context:
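That listing isn't shown here either, but the working version looks something like this sketch - one throwaway context purely to fetch the original state, and a fresh one to attach and submit:

```csharp
// cust is the detached, modified entity from earlier

// A throwaway context used only to fetch the current database state
NorthwindDataContext lookupContext = new NorthwindDataContext();
Customer original = lookupContext.Customers.Single(c => c.CustomerID == "ALFKI");

// A fresh context performs the attach, using 'original' for comparison
NorthwindDataContext context2 = new NorthwindDataContext();
context2.Customers.Attach(cust, original);
context2.SubmitChanges();
```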

And that works and performs the update. But man, is that UGLY. It takes a second data context, a separate SQL statement to retrieve the existing state, and on top of that the SQL for the update is pretty massive:

The update includes a WHERE clause that compares every field - which is pretty much the case even when you don't do detached/attached updates.

So my question is: why is this necessary? Note that we are already sending ALL the data to the server for the WHERE clause, so the concurrency check is already occurring anyway - no worries there. Why not just update all fields at that point? If there are differences, they would be detected by the WHERE clause in the first place. This whole update mechanism around Attach seems completely redundant in light of the SQL already going to the server.

Another oddity here: if you use Attach(cust, cust2) on a table that has a TimeStamp field, SubmitChanges also fails with:

Value of member 'TimeStamp' of an object of type 'Customer' changed. A member that is computed or generated by the database cannot be changed.

I suspect that's a bug in the way Attach updates the entity from the 'original state' and is inadvertently updating the TimeStamp field. This happens inside the LINQ code - the update never hits the SQL backend.

So there you have it. It works - but if you ask me the process is about as clear as mud.

As an aside - the massive, DataAdapter-1.x-style SQL shown above applies only to these detached updates. If you do direct change tracking on the object (i.e. no detach/reattach), the SQL looks a bit better - this is an update WITH a timestamp field:

You will have to create a class for DataObjectTypeName, but you will do that anyway because of other 'problems' you run into (like returning a 'var' from a BLL). Or, for smaller projects, you can use the entity classes constructed by the Visual Studio ORM designer.

This better be a bug or an omission that will be fixed before RTM. If I can't easily predict what it will do, and it does funky stuff like what you've shown, then I'll never use it, and I imagine lots of other people won't either. Call me crazy, but a little bit of type safety for SQL - in places where you don't have to bypass LINQ to SQL anyway - is not worth not being able to tell what it's doing behind the scenes... just my opinion.

I completely agree with PBZ on this issue. LINQ to SQL (as it currently stands) seems similar to DataAdapters in that the results can be downright unpredictable, and when dealing with data of any value there is simply too much risk involved.

Adding a new timestamp attribute to every table? Are you kidding me?

I'd also like to hear the reasoning behind that creepy update statement. It's a neon sign blinking "I'm unstable."

Hate to get on the "hate wagon" here, but as I said in another comment, does the "nail" LINQ is trying to hammer really exist? Do we need this done in this way?

Disconnected updates are very useful. In the past I have written code (albeit in 1.1) that queries a user's own data and keeps it in a DataSet in Session. The user can update, add, and delete rows, and when they are done the DataSet is reconnected and the changes in the DataTable are pushed to the server, rather than hitting the back end for every change as it happens (akin to the old ADO "batch update" Recordset method).

This works fine because only 1 user can update these rows. A Timestamp is one way to test for "freshness" when updating rows that anyone can touch. Having a Timestamp field in (almost) every table is quite the norm in many situations.

I may be off base, but LINQ seems to tie data-access syntax more closely to the development language for the sake of working with data through objects (?).

@bzburns - I think it's important to understand that no matter what tool you use to automatically generate SQL, it will likely never generate the SQL you would write yourself - in most cases it generates sub-optimal code in order to be generic and safe. That's pretty much true of any ORM tool and business object toolkit. The thing that's scary about LINQ to SQL is that you don't get any sort of override. Using LINQ to SQL you can't provide your own SQL expressions to optimize, or create your own queries by hand and submit them - it simply lacks the facilities. So the only alternative to tweaking the SQL is to work around LINQ to SQL completely and go the raw ADO.NET route.

@Steve - I think there's definitely a place for a solid ORM solution to address CRUD data access. I've been using a home-grown business framework, haven't written any CRUD code in years, and it reduces the amount of code I have to write significantly without a perf hit. CRUD access typically deals with small amounts of data, so optimization at this level is not as crucial as it is for queries. If it's done right - yeah, I think there's definitely a spot for this sort of framework. Several ORM solutions out there today are close to this ideal already, but the missing piece has always been an effective query layer, and that's where LINQ comes in and where a lot of opportunity lies. I bet one of the third-party solutions will nail this before Microsoft does.

Thanks Rick. You've just summed up 1 of 2 issues (many-to-many mappings are rather odd. And don't even try and add a disconnected entity to a many-to-many relationship) that hold LINQ back.

I had a play, started creating a dummy app and was stumped. There's no way round it with the current LINQ implementation.

Incidentally, do we know the cost of holding a DataContext open in the app, i.e. the connected model? I'm assuming no connections are held open, but it would at least be nice to know the cost incurred.

Please keep it up. If there's a way around it I'd be intrigued; otherwise LINQ will need a huge disclaimer - 'ConnectedLINQ', perhaps. Seems like a high cost, especially when compared to the other frameworks in this space.

Again, I understand what you are wanting - I agree, I would want it as well, and I'm not trying to be a 'stick in the mud' - but the fact is, if you really need a tried-and-true ORM solution with a proven track record, I believe NHibernate is a better option until LINQ to SQL matures.

My concern is that the issues brought up here were raised quite some time ago and responded to by Matt in the MS forums, but I haven't really seen anything delivered yet.

Last thing: I'm glad to see MS finally presenting ORM as a legitimate solution. I hope it helps the other ORMs - I develop using a Domain Model architecture, and this will be further proof that 'even MS uses it' - lol (sad but true)!

Steve - agreed. Really I don't have a beef with LINQ per se - it's LINQ to SQL that's the emperor without clothes IMHO... the power of LINQ as a language construct is tremendously helpful and flexible in ways that we probably haven't even begun to discover. It's sad that it looks like the power of LINQ with a DAL is probably going to be shown off by a third party and not Microsoft. Ironic ain't it?

I have no argument with third-party solutions - in fact I'm looking at various things at the moment as I plan to rebuild my internal tools once 3.5 releases. I built my own data layer way back in .NET 1.0 when I was still struggling with .NET. It worked out surprisingly well and the toolset has matured over the years, but there's always this nagging feeling that it's a hack - and it is. It's not consistent, as most home-grown solutions aren't. It works very well for me, but frankly I wouldn't want to force my model onto developers as a generic solution. As has been pointed out many times before, building a solid ORM-based DAL is a complex thing. The key is balancing complexity, flexibility, performance and usability.

I have no argument with third-party tools, and I actually expect them to be more mature and in many ways more solid than what Microsoft cooks up. However, I sure would like to see a solid solution in the box, so that at least you have a good baseline to start from without having to go the third-party route.

As somebody who builds redistributable tools that must include data access code as part of apps, relying on anything third-party as part of a solution drastically complicates matters. I guess it's not going to happen - not in this first release.

From the outside - without applying LINQ to SQL seriously - it looks very promising. But the roadblocks you run into almost right out of the starting gate seem very jarring. Did Microsoft not see these scenarios coming? Seriously, if they missed the disconnected scenario, WTF were they thinking? Have we really degraded to applying .NET technology just to fit the drag-and-drop model, and proper application design be screwed?

@Oleg - yes in light of what I've seen I've thought about Reflection myself <g>. But the whole point of an entity layer is that it should handle this for you and hopefully without the overhead of Reflection.

However, if you uncomment the following line: Console.WriteLine(product.Category.CategoryName) which lazy loads the Category entity, some unexpected things happen. Calling SubmitChanges now causes a new category record to be created in the database and the product record is updated with the new category id.

It doesn’t seem like simply calling into the Category property should change the entity update behavior, but unfortunately it does. As it stands, this has serious implications.
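A minimal sketch of the scenario being described, assuming Northwind entities and a detached product (the commenter's exact code isn't shown):

```csharp
// product is a detached Product entity, reattached as modified
NorthwindDataContext context = new NorthwindDataContext();
context.Products.Attach(product, true);

// Touching the lazy-loaded relation changes the update behavior:
// with this line uncommented, SubmitChanges reportedly INSERTs a
// new Category row and repoints the product's CategoryID at it.
// Console.WriteLine(product.Category.CategoryName);

context.SubmitChanges();
```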

You claim that you cannot override the update implementation and handle it yourself. Sure you can: if you implement the partial UpdateT, InsertT and DeleteT methods, you can do whatever you want when the DataContext evaluates the item to be changed. The typical case is using these methods to route through stored procedures rather than the runtime-generated SQL.
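For reference, the generated DataContext declares these as partial methods named after each entity, so taking over the update path looks roughly like this (Customer_Update is a hypothetical stored procedure):

```csharp
public partial class NorthwindDataContext
{
    // Implementing this partial method replaces the runtime-generated
    // UPDATE for Customer entities
    partial void UpdateCustomer(Customer instance)
    {
        this.ExecuteCommand(
            "exec Customer_Update {0}, {1}, {2}",
            instance.CustomerID, instance.CompanyName, instance.Address);
    }
}
```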

I suspect some of the unexpected behavior you are seeing with the native update without the time stamp is due to not resetting the UpdateCheck flag on your objects after removing the time stamp. The problem you will run into when not using the timestamp is the fact that when you attach your object back to the context using the .Attach method or manually as Oleg did, you are now comparing the returned values with the current ones in the database. You are not checking for concurrency against the values that were in the database when the record was originally fetched and disconnected. To do this, you need to either 1) Use a timestamp, 2) Have the database generate a hash of the original values, store that and regenerate it from the current database values when you return, or 3) Have your object persist a copy of itself with the original state which you then use to generate your concurrency check back to the database.

Working with LINQ to SQL disconnected entails many of the same issues that working with disconnected ADO does. You will need to worry about many of the same things as you return. You won't need to worry about wiring up your database code manually.

@Andrew - you can always use ExecuteQuery() as long as you can properly project it into a type that exists. IOW whatever SQL result you create has to match the type that you pass in as the generic type parameter.

ExecuteQuery() will let us work around most issues, but of course it also defeats most of the strong typing benefits of LINQ to SQL.

What I was referring to in the comment above was the actual expressions that are supported in querying or ordering/grouping. Say you want to expose a custom function for calculation or even something that LINQ doesn't support. At that point you HAVE TO pretty much revert to using ExecuteQuery OR creating expressions yourself (which is decidedly complex and cryptic).

I have a couple of other posts that specifically talk about the dynamic code scenarios. My conclusion there is basically that there are scenarios where ExecuteQuery will be required and that I'm glad that there's an 'escape hatch' to get to the lower level if necessary.

As I read, the DataContext is not a lightweight object and therefore should not be instantiated too often. What I tried, and what worked well for me, was to instantiate a DataContext and keep it in a Session object, so you can reuse it to write the data back to the database. This works quite well, but it does not scale very well. So I agree with you that there is some work left to be done on LINQ. But you can't really compare it to NHibernate. I think LINQ is a much better approach and will be the future for less complex scenarios; the more complex scenarios will be covered by ADO.NET vNext, which will give you some enhanced features. The big advantage of LINQ is its simple usage - I've found no other O/R mapper that you can use that simply and that quickly.

As to heavy or not, I suppose that depends on perspective. What I did in my business object layer is support both connected and disconnected operation. If you call say the Save() method with no parameter it assumes you're connected and saves changes. If you pass in an entity it assumes you're detached and if so creates a new context to do the update. That way you get to choose on the situation.

However, persisting a DataContext across requests in ASP.NET is surely a really bad idea for many reasons: scalability, Session restarts - lots of issues there.

Maybe it's just me, but I wonder if the problem is not the DataContext but the entities. This might be a bit of a wild idea, but what about saving a copy of the entity within the entity itself, containing the original values? The state of what has changed would be automatically tracked within the object (still providing the change events). Although this method would have a bigger footprint, it would be nice to pass entity objects around in a disconnected fashion and then be able to track exactly what changes have occurred to a given entity. You might even want to use those change states in code not related to the database end.

I just got through testing adding an instance variable and property to the generated entities, capturing the original state in the OnLoaded() method, and supplying a Clone method to generate a detached entity instance.

Seems to work great: it has a copy of the original state and is completely detached. When I attach later on, I use the overload that allows original state, such as .Attach(myEntity, myEntity.Original).
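Something along these lines - a sketch, since the actual implementation wasn't posted; MemberwiseClone only gives a shallow copy, which is enough for flat column values:

```csharp
public partial class Customer
{
    // Snapshot of the entity as it came out of the database
    public Customer Original { get; private set; }

    // LINQ to SQL calls this partial method after materializing the entity
    partial void OnLoaded()
    {
        this.Original = this.Clone();
    }

    public Customer Clone()
    {
        // Shallow copy - sufficient for flat column values
        return (Customer)this.MemberwiseClone();
    }
}
```

The detached entity can then be reattached later with context.Customers.Attach(myEntity, myEntity.Original).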

Too bad they didn't address this issue properly. I can see great potential in LINQ in general, but there are too many loose ends, such as disconnected entities. Another thing that REALLY, REALLY bothers me is that there's no way (at least I haven't found one) of converting a LINQ query result into a DataTable or DataView. In our organization we have many custom/user controls which receive a DataTable as a data source and do manipulations on it; as of now, if we want to move on and use LINQ they are completely useless.

I'm guessing there will be some kind of service pack or a new Framework release coming out sooner than we all can imagine. I've been reading around for the past couple of days and the web is exploding with coders who are unsatisfied with LINQ.

Hi, I used to have a DBFacade that would handle my updates, but now I'm getting an error: 'An attempt has been made to Attach or Add an entity that is not new, perhaps having been loaded from another DataContext. This is not supported.'

I have a solution that works for me and is clear as day. If somebody else has mentioned this, I apologize. I haven't tested it in complex scenarios, but I know it works in basic ones, and based on the way it works I can only assume that it, or something similar, will work for anything. I posted about it here: http://forums.microsoft.com/msdn/ShowPost.aspx?postid=2524396&siteid=1

@Steve - I took a look at your post, but I'm not quite sure what you're doing. If I understand correctly, you're marking your PK as IsVersion? What else? I gave this a shot for kicks, but it didn't work for me - I still get the same attach error.

Also, I would think using your PK as the IsVersion field isn't going to work well: if you update a record, the PK doesn't change, so L2S isn't going to detect the record as updated.

Your post is a little vague, so can you clarify what else there is?

Just so we're on the same page: when you break your code down to its core, you are doing this, right?

Rick, you seem to have a slightly different scenario than the one I posted about. In my post, the advantage I have is that I'm essentially using the actual .aspx page as a layer of abstraction. Relating your code to mine... in my Page_Load I would do this:
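Presumably something like this in Page_Load (entity and control names here are illustrative, reconstructed from the snippets in the comment):

```csharp
protected void Page_Load(object sender, EventArgs e)
{
    if (!IsPostBack)
    {
        var context = new InvoiceDataContext();   // hypothetical context
        var entry = context.Entries.Single(en => en.InvoicePk == 1);

        // Push the entity values into page controls
        txtEntry.Text = entry.InvoicePk.ToString();
        // ... and so on for the editable fields
    }
}
```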

and then I would assign the entry values to controls on my .aspx page such as txtEntry.Text = entry.InvoicePk.ToString();

Now, once Page_Load is done, I don't have to worry about clearing out the context because it's out of scope and gone. I can edit the data on the .aspx page and press a submit button. On submit I would have this:
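On the submit side, the pattern as described is a brand-new entity carrying the posted values plus the pre-existing PK, attached as modified (again, names are illustrative):

```csharp
protected void btnSubmit_Click(object sender, EventArgs e)
{
    // Build a fresh, never-attached entity from the posted values,
    // re-using the pre-existing primary key
    var entry = new Entry();
    entry.InvoicePk = int.Parse(txtEntry.Text);
    // ... assign the remaining columns from the page controls

    var context = new InvoiceDataContext();
    context.Entries.Attach(entry, true);   // true = treat as modified
    context.SubmitChanges();
}
```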

I don't think your concern about using the PK as IsVersion is a problem, at least not in my scenario. When you call Attach and pass in an entry that was never attached to anything as the first param, and the boolean true as the second param, you are telling the Attach method that the entry entity is a modified version of what already exists in the database. I'd have to double-check the SQL that gets written, but I'm pretty sure LINQ just does a comparison between the entryId in your entity and the entryId in the database, and of course it finds a match. It then compares the properties between your entry and the database entry, and it doesn't care about the PK at that point because it is only used in the WHERE clause. I hope this is more clear.

@Steve - Duh, I missed that you create a new entity. So you're creating a new entity, assigning a pre-existing PK, and then assuming that to be fully updatable. Hmmm... sure, that works, but it assumes you're actually updating everything. If you leave fields blank, won't that wipe out existing values when you call .Attach(entry, true), which effectively marks all fields as updated? Also, have you tried this with related entities? .Attach works reasonably well with flat entities, but once you have related data it often falls flat.

And using Attach() is expensive, both in terms of the SQL generated - all fields are updated - and the .NET processing that has to sync up the entities...

To be honest in Web applications and page scenarios LINQ to SQL's DataContext management isn't really a problem. There are lots of ways to deal with the lifetime of the context fairly effectively mainly because pages are transient and create new contexts for each load.

The above scenario can be addressed in many different ways. I tend to load the entity I'm updating first by its PK, then write the values from the POST back into the entity, and then simply SubmitChanges. Logically that makes more sense to me. I use business objects to provide the entities and hold the context for me, but the basic gist in raw LINQ to SQL code boils down to this for unbinding values in, say, a Save button submit:
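In raw LINQ to SQL terms the pattern boils down to something like this (sketched against Northwind; the control names are made up):

```csharp
protected void btnSave_Click(object sender, EventArgs e)
{
    NorthwindDataContext context = new NorthwindDataContext();

    // Re-load the entity by its PK on a fresh context...
    Customer cust = context.Customers
                           .Single(c => c.CustomerID == txtId.Text);

    // ...unbind the POSTed values into it...
    cust.CompanyName = txtCompanyName.Text;
    cust.Address = txtAddress.Text;

    // ...and let normal change tracking generate a minimal UPDATE
    context.SubmitChanges();
}
```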

FWIW, this model jives much better with how LINQ to SQL wants to work than passing entities to methods like Update and Save. I've talked about creating a business object wrapper around LINQ to SQL that provides high-level business methods but still works within how LINQ to SQL wants to operate.

Excellent post! I have a big question. I have a data layer using LINQ, a business layer, and in the UI a FormView using an ObjectDataSource to edit the data. The problem is that in the business layer I created an Update method with the entity as a parameter. When I click Update, the entity object is empty. I just don't know how to use a context.

For what it's worth, I've been using a 'playback' pattern to solve this across layers. Rather than attempting to clone state or use reflection to keep the original entity around, I record the changes on the client in an Action delegate and send the whole Action down to the data layer. There, all I need to do is fetch the original entity on a fresh data context, 'play' the delegate against it, and submit my changes. Even though the changes came from the client, they are performed on the server. Sure, you pay a tax since you retrieve the entity twice (once when you first fetched it from the server, and again when you fetch the original to play the update), but it's much less than cloning or reflection.
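A sketch of that playback pattern, with illustrative names (the commenter's actual code isn't shown):

```csharp
public class CustomerService
{
    // Data layer: fetch a fresh copy and 'play' the recorded edits on it,
    // so change tracking happens entirely server-side on a fresh context
    public void UpdateCustomer(string id, Action<Customer> changes)
    {
        using (var context = new NorthwindDataContext())
        {
            Customer cust = context.Customers.Single(c => c.CustomerID == id);
            changes(cust);             // apply the client's edits
            context.SubmitChanges();   // minimal UPDATE via change tracking
        }
    }
}
```

The client would then call something like new CustomerService().UpdateCustomer("ALFKI", c => c.Address = "Obere Str. 57");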

Is the LINQ to SQL ORM fundamentally different from the battle-tested ORMs that have been in widespread use for a decade or more? It does not appear to be (there were many freely available models, complete with source, to borrow from), so why would "unmanaged" or "detached" objects be handled differently?

The context tracks state changes, and if you have changes spanning requests, place the context in the conversation (e.g., an HTTP Session object). Alternatively, fetch the object and update it based on your own comparison with the incoming "detached" object. This really does appear to justify an RTFM answer.

Honestly, how hard would it be to force an update via the primary key of an object for disconnected entities, with a bool to check for concurrency? A lot of the time people aren't concerned with concurrency anyway.

Also, the timestamp type is deprecated in SQL Server 2008 (in favor of rowversion), so I wouldn't suggest building on that database data type going forward.

What I don't understand is that the generated classes are made to serialize over remoting technologies like WCF, but when you return an object from the client and try to update it, you get these errors.

The simplest way I've found is to write your own update method in a wrapper/facade class that talks to LINQ for all the updates, gets, etc., connecting to the database directly using the DataContext's ExecuteCommand method.

@jean pierre - Er, it's all SQL. Let's not forget that LINQ to SQL is just a SQL generator. In certain situations string SQL queries are the only way to do certain things (especially with LINQ to SQL), or at least to do them WAY more efficiently than L2S does natively. For example, try to delete or update a group of items in a batch based on a query expression - sure, you can load up the entities, delete them all, and save them back, but that's hardly efficient. And there are many situations - especially at the generic tool level of a business layer - where dynamic queries are much more suitable than LINQ-based queries.

No doubt application level code should use the model as much as possible but I feel a heck of a lot better knowing that if I have to I can fall back on string based queries to do what I need to get the data. L2S (and any model only framework) has gotten me into spots where it was painful to do something with LINQ/Model and it's easier to do it 'by hand'.

Customer customer;
Order order;

using (var context = new SomeDataContext()) {
    // Just get the first; doesn't matter when playing!
    customer = context.Customers.First();
}

order = new Order();
order.Reference = "OrderRef";
order.Customer = customer;

using (var context = new SomeDataContext()) {
    // Doesn't work if you use InsertOnSubmit() or any other Attach() overload.
    context.Orders.Attach(order);
    context.SubmitChanges();
}

Then you've got an order without it copying the customer! In all fairness you might need a TimeStamp column though. Let me know if it works for you?

Realise this thread's a bit old, but I created a completely generic, functional implementation of this that uses reflection and specifically checks for the ColumnAttribute attribute on each DB property of the LinqSql entity ->
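The linked implementation isn't reproduced here, but the general technique - using reflection to walk only the properties carrying LINQ to SQL's ColumnAttribute mapping - can be sketched like this (a guess at the approach, not the commenter's actual code):

```csharp
using System;
using System.Data.Linq.Mapping;
using System.Reflection;

public static class EntityColumns
{
    // Copy every mapped column value from source onto target,
    // ignoring properties that aren't mapped with [Column]
    public static void Copy<T>(T source, T target)
    {
        foreach (PropertyInfo prop in typeof(T).GetProperties())
        {
            if (prop.CanWrite &&
                Attribute.IsDefined(prop, typeof(ColumnAttribute)))
            {
                prop.SetValue(target, prop.GetValue(source, null), null);
            }
        }
    }
}
```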