Solving the detached many-to-many problem with the Entity Framework 12

Introduction

This article is part of the ongoing series I’ve been writing recently, but can be read as a standalone article. I’m going to do a better job of integrating the changes documented here into the ongoing solution I’ve been building.

However, considering how much time and effort I put into solving this issue, I’ve decided to document the approach independently in case it is of use to others in the interim.

Note: There have been some minor edits to the ApplyStateChanges function to address some issues. I’ve left the original code here so you can see the changes.

Note that the next article will have a more refined solution.

The Problem Defined

This issue presents itself when you are dealing with disconnected/detached Entity Framework POCO objects,. as the DbContext doesn’t track changes to entities. Specifically, trouble occurs with entities participating in a many-to-many relationship, where the EF has hidden a “join table” from the model itself.

The problem with detached entities is that the data context has no way of knowing what changes have been made to an object graph, without fetching the data from the data store and doing an entity-by-entity comparison – and that assuming it’s possible to fetch the same way as it was originally.

In this solution, all the entities are detached, don’t use proxy types and are designed to move between WCF service boundaries.

Some Inspiration

There are no out-of-the-box solutions that I’m aware of which can process POCO object graphs that are detached.

I did find an interesting solution called GraphDiff which is available from github and also as a NuGet package, but it didn’t work with the latest RC version of the Entity Framework (v6).

I also found a very comprehensive article on how to implement a generic repository pattern with the Entity Framework, but it was unable to handle detached many-to-many relationships. In any case, I highly recommend a read of this article, it was inspiration for some of the approach I’ve ended up taking with my own design.

The Approach

This morning I put together a simple data model with the relationships that I wanted to support with detached entities. I’ve attached the solution with a sample schema and test data at the bottom of this article. If you prefer to open and play with it, be sue to add the Entity Framework (v6 RC) via NuGet, I’ve omitted it for file size and licensing reasons).

Here’s a logical view of the model I wanted to support:

Here’s the schema view from SQL Server:

Here’s the Entity Model which is generated from the above SQL schema:

In the spirit of punching myself in the head, I’ve elected to have one table implement an identity specification (meaning the underlying schema allocated PK ID values) whereas the other two tables the ID must be specified.

Theoretically, if I can handle the entity types in a generic fashion, then this solution can scale out to larger and more complex models.

The scenarios I’m specifically looking to solve in this solution with detached object graphs are as follows:

Add a relationship (many-to-many)

Add a relationship (FK-based)

Update a related entity (many-to-many)

Update a related entity (FK-based)

Remove a relationship (many-to-many)

Remove a relationship (FK-based)

Per the above, here’s the scenarios within the context of the above data model:

Add a new Secondary entity to a Primary entity

Add an Other entity to a Secondary entity

Update a Secondary entity by updating a Primary entity

Update an Other entity from a Secondary entity (or Primary entity)

Remove (but not delete!) a Secondary entity from a Primary entity

Remove (but not delete) a Other entity from a Secondary entity

Establishing Test Data

Just to give myself a baseline, the data model is populated (by default) with the following data. This gives us some “existing entities” to query and modify.

More work for the consumer

Although I tried my best, I couldn’t come to a design which didn’t require the consuming client to do slightly more work to enable this to work properly. Unfortunately the best place for change tracking to occur with disconnected entities is with the layer making changes – be it a business layer or something downstream.

To this effect, entities will need to implement a property which reflects the state of the entity (added, modified, deleted etc.). For the object graph to be updated/managed successfully, the consumer of the entities needs to set the entity state properly. This isn’t at all as bad as it sounds, but it’s not nothing.

Establishing some Scaffolding

After generating the data model, the first thing to be done is ensure each entity derives from the same base class. (“EntityBase”) this is used later to establish the active state of an entity when it needs to be processed. I’ve also created an enum (“ObjectState”) which is a property of the base class and a helper function which maps ObjectState to an EF EntityState. In case this isn’t clear, here’s a class view:

Constructing Data Access

To ensure that the usage is consistent, I’ve defined a single Data Access class, mainly to establish the pattern for handling detached object graphs. I can’t stress enough that this is not intended as a guide to an appropriate way to structure your data access – I’ll be updating my ongoing series of articles to go into more detail – this is only to articulate a design approach to handling detached object graphs.

Having said all that, here’s a look at my “DataAccessor” class, which can be used with generic data access entities (by way of generics):

As with my ongoing project, the Entity Framework DbContext is instantiated by this class on construction, and implements IDisposable to ensure the DbContext is disposed properly upon construction. Here’s the constructor showing the EF configuration options I’m using:

Updating an Entity

We start with a basic scenario to ensure that the scaffolding has been implemented properly. The scenario is to query for a Primary entity and then change a property and update the entity in the data store.

//set a breakpoint here to see the result in the DB
b.InsertOrUpdate<Primary>(existing);
}
//return the values to the original ones
existing.Description = existingValue;
other.AlternateDescription = existingOtherValue;
existing.State = ObjectState.Modified;
other.State = ObjectState.Modified;
using (DataAccessor c = new DataAccessor())
{
//update the entities back to normal //set a breakpoint here to see the data before it reverts back
c.InsertOrUpdate<Primary>(existing);
}
}

If we actually run this unit test and set the breakpoints accordingly, you’ll see the following in the database:

Database at Breakpoint #1 / Database at Breakpoint #2

Database when Unit Test completes

You’ll notice at the second breakpoint that the description of the first entities have both been updated.

Examining the Insert/Update code

The function exposed by the “data access” class really just passes through to another private function which does the heavy lifting. This is mainly in case we need to reuse the logic, since it essentially processes state action on attached entities.

Note: Edited Version

I’ve been doing some testing with some more complex scenarios, and made the following changes – these aren’t reflected in the solution attached at the end of this article, but will be discussed in the next article.

The changes allow for existing entities to have been attached already, and also draw a distinction between added, modified and deleted entity states.

Notes on this implementation

What this function does is to iterate through the items to be examined, attach them to the current Data Context (which also attaches their children), act on each item accordingly (add/update/remove) and then process new entities which have been added to the Data Context’s change tracker.

For each newly “discovered” entity (and ignoring entities which are unchanged or have already been examined), each entity’s DbEntityEntry is set according to the entity’s ObjectState (which is set by the calling client). Doing this allows the Entity Framework to understand what actions it needs to perform on the entities when SaveChanges() is invoked later.

You’ll also note that I set the entity’s state to “Processed” when it has been examined, so we don’t act on it more than once (for performance purposes).

Fun note: the AddOrUpdate extension method is something I found in the System.Data.Entity.Migrations namespace and it acts as an ‘Upsert’ operation, inserting or updating entities depending on whether they exist or not already. Bonus!

That’s it for adding and updating, believe it or not.

Corresponding Unit Test

The following unit test establishes the creation of a new many-to-many entity, it is then removed (by relationship) and then finally deleted altogether from the database:

SQL Profile Trace

Removing a many-to-many Relationship

Now this is where it gets tricky. I’d like to have something a little more polished, but the best I have come up with to date is a separate operation on the data provider which exposes functionality akin to “remove relationship”.

The fundamental problem with how the EF POCO entities work without any modifications, is when they are detached, to remove a many-to-many relationship, the relationship to be removed is physically removed from the collection.

When the object graph is sent back for processing, there’s a missing related entity, and the service or data context would have to make an assumption that the omission was on purpose, not to mention that it would have to compare against data currently in the data store.

To make this easier, I’ve implemented a function called “RemoveEnttiies” which alters the relationship between the parent and the child/children. The one bug catch is that you need to specify the navigation property or collection, which might make it slightly undesirable to implement generically. In any case, I’ve provided two options – with the navigation property as a string parameter or as a LINQ expression – they both do the same thing.

Notes on this implementation

The “ToObjectContext” is an extension method, and is akin to (DataContext as IObjectContextAdapter).ObjectContext. This is to expose a more fundamental part of the Entity Framework’s object model. We need this level of access to get to the functionality which controls relationships.

For each child to be removed (note: not deleted from the physical database), we nominate the parent object, the child, the navigation property (collection) and the nature of the relationship change (delete).

Note that this will NOT WORK for Foreign Key defined relationships – more on that below.

To delete entities which have active relationships, you’ll need to drop the relationship before attempting to delete or else you’ll have data integrity/referential integrity errors, unless you have accounted for cascading deletion (which I haven’t).

Removing FK Relationships

As mentioned above, you can’t just edit the relationship to remove an FK-based relationship. Instead, you have to follow the EF practice of setting the FK entity to NULL. Here’s a Unit Test which demonstrates how this is achieved:

This is a fairly expensive operation which is why it’s pretty much reserved for deletes and not more frequent operations. It essentially determines the target entity’s primary key and then checks whether the entity exists or not.

Note: I haven’t tested this on entities with surrogate keys, but I’ll get to it at some point. If you have surrogate key tables, you can define the PK key order using attributes on the model entity, but I haven’t done this (yet).

Summary

This article is the culmination of about two days of heavy analysis and investigation. I’ve got a whole lot more to contribute on this topic, but for now, I felt it was worthy enough to post as-is. What you’ve got here is still incredibly rough, and I haven’t done nearly enough testing.

To be honest, I was quite excited by the initial results, which is why I decided to write this post. there’s an incredibly good chance that I’ve missed something in the design and implementation, so please be aware of that. I’ll be continuing to refine this approach in my main series of articles with much cleaner implementation.

In the meantime though, if any of this helps anyone out there struggling with detached entities, I hope it helps. There’s precious few articles and samples that are up to date, and very few that seem to work. This is provided without any warranty of any kind!

If you find any issues please e-mail me rob.sanders@sanderstechnology.com and I’ll attempt to refactor/debug and find ways around some of the inherent limitations. In the meantime, there are a few helpful links I’ve come across in my travels on the WWW. See below.

About Rob Sanders

IT Professional and TOGAF 9 certified Enterprise Architect with nearly two decades of industry experience, 18 years in commercial software development and 11 years in IT consulting. Check out the
"About Rob" page for more information.

Hi Emad,
Do you have a sample solution I could take a look at? ModifyRelatedEntities relates to 0..1:Many rather than Many:Many so it depends entirely on your schema.
Unfortunately, I didn’t have time to test all potential relationship scenarios, if you could give me details about your schema I’d be happy to take a look.
Best,
Rob

So I changed the PK of the Secondary table to Identity, updated the model and re-ran the test. It failed, but this time it seems it’s because the Secondary entities default to a PK value of 0. As strange as this might sound, if you assign the entities unique PK values (they are treated as placeholders) and are assigned proper identity values when they are committed/saved. I’ve tested and this works, believe it or not!
Reference: http://stackoverflow.com/questions/11602683/error-seeding-database-foreign-key-issue

Thank you so much for sharing this excellent article. I’d been struggling to make sense of detached object graphs(POCOs) and getting them back into the context correctly. You absolutely solved all the problems I had with this insightful and informative post!

I’ve been using the information presented in this article to great effect, so thank you for taking the time to work all of this out.

I’ve been thinking about the issue regarding removing/altering M-M relationships and your remark that the relationship has to be explicitly specified. It seems to me that this issue can be resolved with a bit of reflection and a simple expression tree.

You can retrieve the list of M-M navigation properties like so (Modified from something i found here)

Hi Justin,
That’s excellent, many thanks. With my current client, we’re using Insight.DB, but I still maintain the generic implementation for my own projects.
I’ll look at integrating and see how it goes. How do you find the performance, given the incorporation of Reflection?
Cheers,
Rob

Thanks for the great article. I have developed a similar framework but I am facing an issue regarding M:M scenario. My scenario is that I need to add records in M:M table but with existing records from Primary and Secondary. In your code in the function ‘InsertManyToManyThenDelete’ I made a simple change that instead of creating a new Secondary I pull one from DB and add it to Primary but it does not work.

Post navigation

Stuff to cover the hosting costs

Disclaimer

Privacy: Ads are generated on this site. Google may collect cookies about your interests to make ads more relevant. To Opt out or find more information, see this blog or also see the Google Privacy Center