Working With Large Models In Entity Framework – Part 2

In my last post I talked about some of the issues you typically face when using a large Entity model in your application. I also described a few things that you can use to mitigate some of these problems. In this post I will walk through a couple of examples to demonstrate how you can split one large entity model into smaller ones while reusing types, thus avoiding duplication.

Sub-dividing the model into smaller models with type reuse

The concept of “Using” in CSDL schema will allow you to do this. This is a pretty powerful feature that will enable you to create multiple models and let you reuse types in different models.
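As a rough sketch of what this looks like in the XML, a schema imports another schema's namespace with a Using element and then refers to its types through the alias. The names MyModel, MyEntities, SharedTypes and Shared are placeholders for illustration, not taken from a real model, and the xmlns shown is the CSDL namespace of the first Entity Framework release (later versions use a different one).

```xml
<!-- Hypothetical consuming schema; all names here are placeholders. -->
<Schema Namespace="MyModel" Alias="Self"
        xmlns="http://schemas.microsoft.com/ado/2006/04/edm">
  <!-- Import the namespace of the schema that defines the shared types. -->
  <Using Namespace="SharedTypes" Alias="Shared" />
  <EntityContainer Name="MyEntities">
    <!-- The entity set is declared here, but its type lives in the imported schema. -->
    <EntitySet Name="Categories" EntityType="Shared.Categories" />
  </EntityContainer>
</Schema>
```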

There are a few problems with using this approach which include:

a) No designer support: The designer does not support “Using”. So you would have to create your model first using the designer and then edit the XML to use entities from a related model.

b) Bi-Directional Navigation Not Supported: Since you cannot create cycles when creating model dependencies, you can declare navigation properties only in one model.

But even with these shortcomings, dividing the model into smaller models will make sense in a lot of cases. There are two ways you can sub-divide your model.

a. Multiple CSDL files (models) while sharing MSL and SSDL:

The advantage of this approach is that the object model for your types is much cleaner. You don’t have the problem of having 1000 types in a single namespace. But it won’t solve the performance or IntelliSense problems.

How to create multiple CSDL files that will share MSL and SSDL:

The example uses the Northwind sample database.

1. Create a new ADO.NET Entity Data Model using the Entity Data Model Wizard by pointing to the Northwind database and choosing the Products, Categories and Suppliers tables.

2. Change the “Metadata Artifact Processing” property of the Edmx file to “Copy to Output Directory” and build the solution. This will drop the CSDL, SSDL and MSL files in the build output path.

3. Copy the schema files (CSDL, SSDL and MSL) to another location. This location will be used in the Metadata parameter of the EntityConnection string. Let’s call these files model1.csdl, model.ssdl and model.msl.

4. Open the CSDL file and move the Categories Entity Type to a separate CSDL file. Let’s call this file model2.csdl. Use a different namespace for this schema. Let’s say this is NorthwindModelBase.
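After step 4, model2.csdl might look roughly like the sketch below. The property list follows the Northwind Categories table, but the facets are simplified and may differ from what the wizard generates in your file. Any navigation property from Categories back to Products has to stay out of this file, since model dependencies cannot form a cycle.

```xml
<!-- model2.csdl: sketch of the base schema that holds only the shared type. -->
<Schema Namespace="NorthwindModelBase" Alias="Self"
        xmlns="http://schemas.microsoft.com/ado/2006/04/edm">
  <EntityType Name="Categories">
    <Key>
      <PropertyRef Name="CategoryID" />
    </Key>
    <Property Name="CategoryID" Type="Int32" Nullable="false" />
    <Property Name="CategoryName" Type="String" Nullable="false" MaxLength="15" />
    <Property Name="Description" Type="String" />
    <Property Name="Picture" Type="Binary" />
    <!-- No NavigationProperty to Products: this schema must not refer back
         to the model that imports it. -->
  </EntityType>
</Schema>
```

model1.csdl would then presumably drop its own copy of Categories, import NorthwindModelBase with a Using element, and refer to the type through that namespace or its alias.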

In the previous section, I described how to break your CSDL file into multiple CSDL files while still sharing the mapping. But as I mentioned earlier this would not solve the performance problems that could come up because of the size of the models. To solve the performance problems, you would have to actually map the database partially into different CSDL files.

Here are the things you need to remember when using this approach:

1. There could be cases where you might have to map the same table to two different models. So you would have some duplicate metadata lying around.

2. There could also be cases where you would expose foreign keys as scalar properties because you do not want to pull in all the related tables into your Entity model.

3. SSDL and MSL do not currently have any concept of reuse, so you can either reuse the types in CSDL (as described in the previous section) or duplicate information in CSDL too. Reusing the types definitely has some advantages, but given the pain of creating CSDL schemas that import other schemas, you might want to consider duplicating information in CSDL too. This would allow you to work with the designer. But if you are dividing the model for performance and maintainability reasons and you actually want to use these smaller models in a single application, duplicating the information would not be a viable option. There are definitely other disadvantages to duplicating information across multiple model files (typically the same problems that you would see with duplicate code). The way to avoid duplication is the “Using” element in CSDL. In the steps below, I describe how to do model splitting with the support of Using and no duplication in the model (CSDL) files.

How to split a single set of CSDL, SSDL and MSL files into multiple sets:

The example uses the Northwind sample database.

1. We want to create an application that uses the following tables: Products, Categories, Orders, Order Details, Customers and CustomerDemographics.

2. Let’s say for reasons of performance and maintainability we want to split these into two different models with two different containers.

3. To do this, create two new ADO.NET Entity Data Models using the Entity Data Model Wizard by pointing to the Northwind database. In the first model, choose the Products, Categories, Orders, Customers and Order Details tables. In the second model, choose Customers and CustomerDemographics. So you have included the Customers table in both the first set and the second set. Let’s refer to the first set of schemas as ProductDetails and the second set of schemas as CustomerDetails.

4. You can either choose to reuse the Customers type by using the “Using” element in CSDL or repeat the same type in both the sets.

5. In my sample (shared below) I have chosen to move the Customers type to a separate model called CustomerBase.csdl and reuse the Customers type in both the CustomerDetails model and the ProductDetails model.

6. Change the Customers end in the “FK_Customers_Orders” association in the ProductDetails model to refer to the Customers type defined in the CustomerBase model.

7. You need to make a similar change to the CustomerCustomerDemo association that relates Customers to CustomerDemographics.

8. You can also see that there is a navigation property on the Orders type that you can use to navigate to the related Customer. Also observe that the Customers type defined in the CustomerBase model does not have a navigation property to navigate back to the related Orders. Since you are sharing the Customers type between different models, you cannot add that navigation property. For example, if you add a navigation property for Orders on the Customers type, it won’t make sense when you use the Customers type in the CustomerDetails model, since the Orders type is not present in that model.

9. At runtime, you could create either one Context that works with both the schema sets or two different contexts. To create a single context with both the schema sets, you would use the ObjectContext constructor that takes in an EntityConnectionString. In the Metadata parameter of the connection string, specify the paths to both sets of files.

10. Problems with Navigation properties being absent:

As mentioned above, in the ProductDetails model there is a navigation property on the Orders type that you can use to navigate to the related Customer, but you cannot navigate back to Orders from Customers because the navigation property was not defined on the Customers type.

Here is some sample code that navigates from Orders to Customer and prints the name of Customer who ordered it. This is pretty simple since there is a Navigation property on Orders type that you can use to Navigate to related Customer.

Here is another sample that navigates from Customer to Orders and prints the OrderID of all the orders that a customer has placed. Since there is no navigation property, you would have to go to the Orders collection and use the Customer navigation property on Orders.
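For reference, the one-way navigation described above comes from how the schemas are declared. The sketch below shows what the relevant parts of the ProductDetails CSDL might look like after steps 5 to 7. The namespace names (ProductDetailsModel, CustomerBaseModel), the container name and the alias are assumptions for illustration, and most properties and types are omitted.

```xml
<!-- ProductDetails CSDL (sketch): Orders can reach Customers, not the reverse. -->
<Schema Namespace="ProductDetailsModel" Alias="Self"
        xmlns="http://schemas.microsoft.com/ado/2006/04/edm">
  <!-- Import the schema that defines the shared Customers type (assumed namespace). -->
  <Using Namespace="CustomerBaseModel" Alias="CustomerBase" />
  <EntityContainer Name="ProductDetailsEntities">
    <EntitySet Name="Orders" EntityType="ProductDetailsModel.Orders" />
    <!-- The set is declared in this container; the type comes from CustomerBase. -->
    <EntitySet Name="Customers" EntityType="CustomerBase.Customers" />
    <AssociationSet Name="FK_Customers_Orders"
                    Association="ProductDetailsModel.FK_Customers_Orders">
      <End Role="Customers" EntitySet="Customers" />
      <End Role="Orders" EntitySet="Orders" />
    </AssociationSet>
  </EntityContainer>
  <EntityType Name="Orders">
    <Key>
      <PropertyRef Name="OrderID" />
    </Key>
    <Property Name="OrderID" Type="Int32" Nullable="false" />
    <!-- Remaining Orders properties omitted. -->
    <NavigationProperty Name="Customers"
                        Relationship="ProductDetailsModel.FK_Customers_Orders"
                        FromRole="Orders" ToRole="Customers" />
  </EntityType>
  <Association Name="FK_Customers_Orders">
    <!-- Step 6: the Customers end points at the type in the imported schema. -->
    <End Role="Customers" Type="CustomerBase.Customers" Multiplicity="0..1" />
    <End Role="Orders" Type="ProductDetailsModel.Orders" Multiplicity="*" />
  </Association>
</Schema>
```

For the single-context option in step 9, the Metadata keyword of the connection string takes a pipe-delimited list of files, so it would name all three CSDL files along with both SSDL and MSL pairs, for example metadata=CustomerBase.csdl|ProductDetails.csdl|ProductDetails.ssdl|ProductDetails.msl|CustomerDetails.csdl|CustomerDetails.ssdl|CustomerDetails.msl (the file names are assumed).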

Join the conversation

We appreciate your efforts addressing this issue. I think that a lot of folks would like to see a much more simplified approach in the long term. My hope is that the Oslo repository has a key role to play here, especially in terms of dividing the models up into sub-domains, being able to reuse them among teams and the like. In fact, would it be possible for your team to let us know how they see the relationship between EF, M, and the Oslo repository? Just know that we’re looking for something more than modifying the EDMX files by hand.

We have talked about quite a few things that would improve this process. But other than a few small things here and there, some of the big items that we want to do in this area like designer support for Using etc probably won’t make it into V2.

I did not completely follow the scenario you are describing but my guess is that you are talking about allowing users to split and/or reuse their SSDL and MSL files as we allow them to do in the CSDL. If so, this is something we talked about and something we want to enable in the future.

In your option "b", step 5 (where you create a new CSDL called CustomerBase), I am not understanding this.

First, I need to incorporate your second option (b) into my application. I’m splitting into 3 separate models as I have over 250 tables in my app, but I need to share a table called Account across two of the models. All 3 of these models will be referenced in the one application.

Here’s my confusion: When you create separate models, these models are EDMX files, not CSDL, SSDL and MSL files. I do understand that the EDMX is comprised of these 3 components. But your instructions say (in step 5) to add a CSDL file called CustomerBase. How do you do that? Are you really saying to add another MODEL called CustomerBase, modify the CSDL section of the CustomerBase.edmx, then add "using" in the other 2 models (CustomerDetails, ProductDetails)?

Please let me know as I am trying to do this in my application in order to solve the performance issues with my overly large model.

Having one .edmx file might be good as far as it could be easily edited both manually and via designer.

– For the manual editing improvements you could add one simple feature which will provide some sort of "navigation links". Thus when you edit some piece of XML code related to some table in the SSDL part, VS will show you links (or previews) of other related pieces of XML in the CSDL and MSL parts.

– In the entity designer we need the ability to group entities into aggregates with different namespaces (e.g. Membership.User, Membership.Role, Store.Product, Store.CreditCard etc.). It would be great if we could collapse/expand those "aggregates", and zoom in/out between the aggregates view and the entities view.

Will any of the "large model" issues mentioned in Part I and Part II be addressed in the upcoming EF4 release? There is a lot of good information about some of the new features, but can you let us know if there is any movement towards making EF more friendly to large models? Or at least if there will be a less manual approach to splitting models in EF4?

My team is in the process of migrating an application with LOTS of tables. So far we only have ~80 tables in EF (out of 1,000+), but we will add more tables to the model as we add functionality. Unfortunately we couldn’t add all of the tables to EF due to performance problems and other issues mentioned in these blog posts.

I am using a T4 template to generate POCO classes which are then moved to another project. Can you please provide some tips on how we can divide the EF model into separate models while using POCO classes?

With the "using" approach, how does that affect the startup performance of the app? Currently, we have one big edmx, and start-up performance is bad, because EF 4.2 processes the whole model at startup. (even with pre-generated views it is still too slow). With the using approach, will EF only process stuff from one diagram on startup (i.e. process the stuff from one diagram the first time something on that diagram is queried) or will it still process all the metadata for the whole application at startup?

The very first thing we need is an EF that deals properly with sub-models, because we handle complexity with abstraction, clean code and less plumbing. Line-of-business applications are most of the time as serious as the number of entities, and 100..900 entities with structure is no exception. I am waiting for EF7 to be capable of handling these situations, and for the time being I use creative workarounds 🙂