After the nice talk with developers.ie, it is really nice to see that people have interest in the topic. Unfortunately the quality of the recording was not great and the connection dropped twice, so I decided to put together this blog post to show how we can stop leaving persistence-polluted entities behind us.

Why is it so important?

I can hear lots of comments from people around me, mainly along the lines of “Why do we need this much hassle, when we already have designer support and the VS-integrated goodies of an ORM mapper?”. First, I have to say that it is fair enough to think this way. But when things start to go beyond trivial, you start to have problems with persistence- or technology-polluted entities. Off the top of my head, I can think of the following:

Technology agnosticism is bliss: This concept usually revolves around PI (Persistence Ignorance), but it is not only that. Persistence Ignorance means that your entities should be cleared of any persistence-related code constraints that a framework - usually an ORM - forces on you. For example, if you have attribute-level mapping where those attributes are not part of your domain but are there just because some framework wants them to be there, then your domain is not persistence ignorant. Or if your framework requires specific types for handling associations, like EntitySet and EntityRef in Linq to SQL, the same applies. It can also be another technology that wants your entities to be serializable for some reason. We need to avoid these as much as possible and concentrate on our business concerns, not bend our domain to fit those technological constraints. This approach will also promote testability. The same goes for being forced to implement an abstract class, or interfaces like INotifyPropertyChanged, when you don’t want them.

Relying on the Linq to SQL designer is painful: The designer puts everything in one file and regenerates the files each time you save, so you lose your changes such as XML comments. Needless to say, the only OOTB support is attribute-level configuration; even for XML you need to use the sqlmetal tool outside the designer process.

Configuration should not be anything your domain is concerned about: unless you are building a configuration system.

Let’s get geared up

In the light of this, when we work with the Linq to SQL designer, we tend to think that it is impossible to achieve POCOs, but indeed it is possible: the solution is don’t ditch POCOs, just ditch the designer. While implementing POCOs, we need to know a couple of things beforehand about Linq to SQL internals, because we will be on our own when we have any problems.

EntitySet and EntityRef are indeed useful classes, and they are there to achieve something. When you add an entity to an association, EntitySet manages the identity and back references. That is, for children you need to assign the correct parent id to the child, otherwise you will lose the relationship. The same goes for EntityRef and 1-1 relations.

INotifyPropertyChanging and INotifyPropertyChanged are there not only to inform us, by providing the ability to subscribe to the necessary events and get notified when a property changes, but to leverage lazy loading as well. When we discard them, we are back to eager loading.
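For contrast, here is a rough sketch of the kind of entity the designer generates (heavily abbreviated; the attribute values and member names are illustrative), showing the EntitySet plumbing and notification interfaces that the rest of this post gets rid of:

```csharp
using System.ComponentModel;
using System.Data.Linq;
using System.Data.Linq.Mapping;

// Designer-style entity: mapping attributes, framework association types
// and change-notification interfaces all live inside the entity itself.
[Table(Name = "dbo.Question")]
public partial class Question : INotifyPropertyChanging, INotifyPropertyChanged
{
    // Framework-specific association storage instead of a plain List<T>.
    private EntitySet<Answer> _Answers = new EntitySet<Answer>();

    [Association(Storage = "_Answers", OtherKey = "QuestionId")]
    public EntitySet<Answer> Answers
    {
        get { return this._Answers; }
        set { this._Answers.Assign(value); }
    }

    // Events the runtime hooks into for change tracking.
    public event PropertyChangingEventHandler PropertyChanging;
    public event PropertyChangedEventHandler PropertyChanged;
}
```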

Enough rambling, let me see the wild world of code

For this post I will only focus on the first part, so lazy loading is a matter for another one. The approach we are going to take is to use XML mapping instead of attribute-based modeling. I am gonna use the trivial Questions and Answers model, where one Question can have multiple Answers associated with it. Here is how it looks:

Question and Answers entities

And their related code is pretty simple, nothing fancy. Here is the Answer POCO :

public class Answer
{
    public Answer()
    {
    }

    private int _QuestionId;

    public int QuestionId
    {
        get { return _QuestionId; }
        set { _QuestionId = value; }
    }

    private int _AnswerId;

    public int AnswerId
    {
        get { return _AnswerId; }
        set { _AnswerId = value; }
    }

    private string _AnswerText;

    public string AnswerText
    {
        get { return _AnswerText; }
        set { _AnswerText = value; }
    }

    private bool _IsMarkedAsCorrect;

    public bool IsMarkedAsCorrect
    {
        get { return _IsMarkedAsCorrect; }
        set { _IsMarkedAsCorrect = value; }
    }

    private int _Vote;

    public int Vote
    {
        get { return _Vote; }
        set { _Vote = value; }
    }
}

Yeah, clean, pure C#: no attributes, no EntityRefs, nothing. The same goes for Question as well, where the association is achieved through the good old simple List&lt;T&gt;:

public class Question
{
    private int _QuestionId;

    public int QuestionId
    {
        get { return _QuestionId; }
        set { _QuestionId = value; }
    }

    private string _QuestionText;

    public string QuestionText
    {
        get { return _QuestionText; }
        set { _QuestionText = value; }
    }

    private List<Answer> _Answer;

    public List<Answer> Answer
    {
        get { return _Answer; }
        set { _Answer = value; }
    }
}

To use these entities as POCOs, I need a way to externally define the mappings between the database tables and columns and the relevant object fields. I chose the other OOTB-supported way: XML. As I am too lazy to write it by hand, I ran the following sqlmetal command to generate it from the database:
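A typical sqlmetal invocation that emits an external XML mapping file rather than attributes is `sqlmetal /server:. /database:QA /map:QA.map.xml /code:QA.cs` (the server, database, and file names here are assumptions, not the original command), and the mapping it generates looks roughly like this sketch:

```xml
<?xml version="1.0" encoding="utf-8"?>
<Database Name="QA" xmlns="http://schemas.microsoft.com/linqtosql/mapping/2007">
  <Table Name="dbo.Question" Member="Question">
    <Type Name="Question">
      <Column Name="QuestionId" Member="QuestionId"
              IsPrimaryKey="true" IsDbGenerated="true" AutoSync="OnInsert" />
      <Column Name="QuestionText" Member="QuestionText" />
      <Association Name="FK_Answer_Question" Member="Answer"
                   ThisKey="QuestionId" OtherKey="QuestionId" />
    </Type>
  </Table>
  <Table Name="dbo.Answer" Member="Answer">
    <Type Name="Answer">
      <Column Name="AnswerId" Member="AnswerId"
              IsPrimaryKey="true" IsDbGenerated="true" AutoSync="OnInsert" />
      <Column Name="QuestionId" Member="QuestionId" />
      <Column Name="AnswerText" Member="AnswerText" />
      <Column Name="IsMarkedAsCorrect" Member="IsMarkedAsCorrect" />
      <Column Name="Vote" Member="Vote" />
      <Association Name="FK_Answer_Question" Member="Question"
                   ThisKey="QuestionId" OtherKey="QuestionId" IsForeignKey="true" />
    </Type>
  </Table>
</Database>
```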

Aha, well, this was kind of expected. We knew that we had to maintain the identity and back references, but we didn’t. Shame on us. But how are we going to do that? We don’t know the ID value before we insert, so how do we tell Linq to SQL to pick up the new identity? Are we back to square one, SCOPE_IDENTITY()?

Of course, since I am writing this post, the answer has to be no. The secret is in the back reference; the back reference is there precisely for this purpose.

What we need to do now is, in each Answer, preserve a reference to the parent Question; and for each answer that is added, or when the list is overridden, assign the Answer’s QuestionId property from this back reference. As we no longer have the EntitySet, we need to do that on our own, but it is easy enough. For Answer, here is the back reference:

private Question _Question;

public Question Question
{
    get { return _Question; }
    set
    {
        _Question = value;
        _QuestionId = value.QuestionId;
    }
}

And for the Question POCO, when the List is overridden, we need to put in our own logic to handle this, which is: for every child answer, ensure that the back reference and the reference id are set:

private List<Answer> _Answer;

public List<Answer> Answer
{
    get { return _Answer; }
    set
    {
        _Answer = value;
        foreach (var answer in _Answer)
        {
            answer.QuestionId = this.QuestionId;
            answer.Question = this;
        }
    }
}

And the test passes after doing this. Hope this gives some idea of what you can do and what you need to know beforehand.
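For completeness, wiring the mapping file up is a matter of loading it into an XmlMappingSource and handing it to a plain DataContext; a minimal sketch, assuming the mapping file is named QA.map.xml and a local QA database (the connection string and repository class are assumptions):

```csharp
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Linq;

public static class QuestionRepository
{
    // Load the external XML mapping once; the entities themselves
    // stay persistence-ignorant.
    private static readonly MappingSource Mapping =
        XmlMappingSource.FromUrl("QA.map.xml");

    public static Question GetQuestion(int questionId)
    {
        using (var context = new DataContext(
            @"Server=.;Database=QA;Integrated Security=SSPI;", Mapping))
        {
            return context.GetTable<Question>()
                          .Single(q => q.QuestionId == questionId);
        }
    }
}
```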

Conclusion

The desire to decouple domain entities from technological aspects is important for SoC &amp; SRP, and these principles are important for nearly everything, from the pure basics to DDD. To achieve this in Linq to SQL, we need to say goodbye to EntityRef, EntitySet, and the INotifyPropertyChanged and INotifyPropertyChanging interfaces.

The next subject I am going to attack is Lazy Loading with POCOs, stay tuned till then!


From a modeling point of view it may make sense that an “Answer” knows to which “Question” it is an “Answer”, however, the decision was forced upon you by the framework, was it not?

What is more, in your code there’s precious little you can do with just the knowledge of the question id. You would have to look up the corresponding question first, quite probably accessing your persistence infrastructure.

I hate the fact that Linq to SQL requires empty public constructors. A trick I’ve been using is to decorate the constructor with [Obsolete("Persistence only", true)]; that way the compiler complains if any code tries to call the constructor. Linq to SQL doesn’t care, so it works…

Also, I’d like to have all the ids in your example private. I think Linq to SQL supports that (protected at least).

The decision is forced by the framework indeed, but it is fair enough, because you need to tell the framework what to put into those ID fields upon insertion. This enforcement exists even at the DB level, as a foreign key in the Answers table.

The easiest way to achieve this is to have back references, and XmlMappingSource supports them natively. But that is not the only way; you can build a custom mapping source which doesn’t need back references but uses some other kind of metadata to manipulate the IDs.

On the second point, you don’t only have the ID, you have the parent Question object itself. So with that you can do anything you want (except lazy loading, which you need to handle yourself).

That is pretty cool that you can do that. What parts of this solution make it better than what NHibernate does (XML mapping files)? Also, the beauty of Linq2Sql is that I don’t have to worry about writing DAL code, but I do hate gross generated code that has a lot of dependencies. It still is really awesome and you did a great job; can’t wait to see the lazy loading post.

I am very impressed with your work here. I love Linq and I love how simple Linq2Sql is, but I was never a fan of the ugly, ugly generated code it produced with the entity persistence frameworks.

The approach you listed here is quite possibly the best approach to a DAL / ORM wrapper that can be achieved. I’m also impressed with how clean the XML is for the mapping, as I’ve worked on projects with mapping frameworks that were just abysmal to work with.

Sidar Ok said,

On the NHibernate mappings comparison: NHibernate XML mappings are even more powerful in a lot of circumstances, and also handle multiple databases. I tried to achieve something that NHibernate already supports and encourages.

One advantage I can point to, though: you don’t have to mark your members virtual to get them populated, as in NHibernate.

Sidar Ok said,

I have been watching the series, but it has been a while now so I’ll recheck. The site is down currently.

On the other hand, the XML is generated, and you barely change it. And it is not complex. Look at NHibernate XMLs to see what complex mapping is

That’s indeed why you can do a lot of things in NHibernate XML, and the same reason applies to changing it. You can’t do complex things in the built-in XML model of Linq to SQL; that’s why the XML is not that complex.

I hope it makes sense.

Update: Site is up now, I am downloading the screencast. I already have some words to say, keep an eye here.


Sidar Ok said,

I rewatched the screencast to make sure, and as you said he is mapping to his own entities, which has a completely different scope from what I am trying to do here. He is creating another layer of mapping, and the whole purpose of this post is to avoid that.

What are the advantages of that approach? First, you are not bound to any mapping mechanism (TPH is no longer a limit for you); second, the mappings are testable, as he also tests them in Storefront 3.

What are the disadvantages? Loads of handcrafted mapping. You end up with a lot of classes, which means you need to plan, develop and, worse, maintain them. When you work with mapped objects, you will also not get the goodies of the ORM mapper, and you will need to manage the mapping process in both directions.

If my ORM mapper of choice is good enough for mapping domain objects, then I prefer to avoid hand-crafting the M part of the ORM, because I will likely reinvent a flat tire.

Hi Sidar - very nice write up and well done with SQL Metal :). Stephen Walther turned me on to the XmlMappingSource a while back and suggested I use it instead of the “hand-coding” stuff I was doing with the repos. I flat refused - I hate XML :).

For me the “Loads of handcrafted mapping” is really not an issue. It’s apparent, it’s readily visible, and I do it once. Moreover I can use the power of code rather than weird XML translation to do things like cast and create objects - as well as implement my favorite new thing the LazyList :).

In terms of “ending up with a lot of classes” - you’re getting at precisely the point I was going to raise. If you generate XML to handle the mapping and you add a new class (let’s say you add a User class with a Users table) - won’t you need to regenerate? And when you do, what happens to the custom mapping stuff you’ve done? I’m going to assume that you’re going to want to manually change your XML here, yes? This is the precise reason I didn’t go with XML - it’s 10 times the code and you end up with a complete nightmare when you want to change a mapping setup (my preference - some people love XML).

So in terms of maintenance - I prefer the simple Linq Projection approach - clean and easy. But that’s me - and I LOVE what you’ve done here. I’m curious as to how you’d maintain it as your application grows to include Users, Tests, Courses, Curriculum, etc…


Sidar Ok said,

Thanks for the nice words, I love and constantly watch what you are doing with MVC Storefront, especially encouraging TDD.

You will be surprised, but I don’t like XML either. The main reason I chose it over another way is that I focused on mapping directly from the tables to the entities using Linq to SQL’s own mapping mechanism, rather than home-baking my own; it was not because I am a big fan of XML.

The point here is not whether XML is good for mapping. An even better approach is to create a fluent binder for POCOs while STILL avoiding another level of mapping. If you are doing another level of mapping, there are 2 possibilities: either your ORM is not good at mapping and does not support your custom needs (such as multiple-table mapping rather than TPH), or you are not fully utilizing your ORM’s capabilities. If you stay within the TPH model in your domain (or AR), there is absolutely no gain in maintaining a 1-1 mapping between Linq to SQL DTOs and your domain model, and that layer is a code smell in my opinion.

The other point you don’t seem to be mentioning is that mapping is not a 1-way issue, it is a 2-way issue. When you GetCategories, that is 1 mapping, and when you UpdateCategories, that is another. This is TWICE the maintenance cost. (Fancy right-left comments in Store Front?)

The only case where I can understand this mapping layer is when you need custom mapping and your ORM does not support it. If it does, then use it. If it doesn’t, but what it supports is enough for your domain needs (as is the case in MVC Storefront, where most of the mapping I could see is 1-1), then use it again.

And lastly, for XML goodness: it is 1 single file. It is external. It is generated. There are cases where you do custom mapping, like if your database naming is crap and you want to use good names, but otherwise it is rare. When you use even the same naming as in MVC Storefront, the maintenance cost is just copying &amp; pasting the sqlmetal command. In most cases it is even guaranteed to be correct without writing tests. When my application grows, I won’t even change my sqlmetalscript.txt file, but you will need to change a lot of mappings.

As a conclusion, my take on using ORMs is: if you can use the Mapping part of it, do. That’s what the M stands for. Don’t reinvent the wheel when you are traveling in a Ferrari.

The point is that you as a developer who’s hired to build an app for a client, should precisely do that: build the app for the client. What I then find surprising is that you think it’s necessary to invest time to get to the point where you can start writing the app for the client, i.e. to avoid being busy with infrastructure etc… but to get there, you need to be busy with infrastructure, at the expense of the client.

If the infrastructure is a proven one, well tested in many scenarios (so other than the one you initially think you’re going to need), so no surprises there, why ignore that one to build your own infrastructure?

Oh, and O/R mapping isn’t about mapping a class field/property to a table / view field. That’s the persistence part of an o/r mapper. O/R mapping is much much more than that. Sure that ‘much more’ comes at a cost, namely you likely have to obey the rules the O/R mapper forces upon you, but you also have to obey the boundaries you’re creating yourself by achieving the exact same thing. You see, the myth of ‘PI’ or ‘POCO’ is just that: a myth: an entity could be clean from persistence code, but still not be POCO. Or you have to obey rules like parameterless ctors, usage of IList, virtual properties etc. and that’s just for persistence. Once entity management gets into the spotlight, you definitely don’t want to use POCO, as it is then becoming synonym to ‘do a lot of hand-coding to achieve the same’.

Example:
myOrder.Customer = myCustomer;
in line of business apps, the following is then very handy: what if that line above also did:
myCustomer.Orders.Add(myOrder);
_and_ syncing fk/pk (if the system uses that system) under the hood? And vice versa? That way, with simple databinding and a grid, you can build an entity graph without problems which is kept in sync. Sure one can add that with hand-coding, but that’s precisely the point: with a system which is much more than persistence, you don’t have to: it’s built in.

The advantages?
- you don’t have to spend time on this, so your client has the app earlier and cheaper
- you don’t have to debug this, as it’s proven code in a framework which is tested in many different scenarios.

The main problem I think with the current ‘PI/POCO or bust’ myth is that it is treated as a synonym for ‘model first’, which is not correct. POCO on .NET is only doable if you can live with the boundaries/rules which are forced upon you due to the lack of a class-loader approach as in Java. And then again, the abstraction is leaky anyway: a database isn’t ‘free’, it’s not some resource you can ‘abstract away’.

Like how to model an entity hierarchy in a relational model? 1 table? Multiple tables? There are rules for how to do that, as a lot of research has been spent on it, and also on what the consequences of both approaches are and when you should do which. Just looking at classes and lying to yourself that the database ‘doesn’t exist’ is silly; you will likely be bitten by the consequences your decision will have when the app is put in production.

Btw, Sidar, if you want to take a stab at me again in a mailinglist I don’t post to, please mail me directly, ok? Thanks.

>> You see, the myth of ‘PI’ or ‘POCO’ is just that: a myth: an entity could be clean from persistence code, but still not be POCO

That’s why I used technology agnosticism instead of PI.

As I said in the post, the aim is to have your domain cleared of any technological concerns so that you can test, and focus only on your business concerns, not on DB-related issues. As you said, I am not paid to write an ORM on my client’s time, and the same goes for any other framework that I need to “obey”. E.g. your entities need to be data-contract serializable to put them on the wire with WCF, so they are not POCOs either.

>> The main problem I think with the current ‘PI/POCO or bust’ myth is that it is synonym to ‘model first’, which is not correct.

Even if you are working in a DB First approach, you will want POCOs for the reasons above.

>> And then again, the abstraction is leaky anyway: a database isn’t ‘free’, it’s not some resource you can ‘abstract away’

If you mean the DAL, it is silly to try that abstraction anyway. But in the BL, there are more important concerns to focus on than bending your rules to match the ones in the database. That’s the whole purpose of the DAL.

>> Btw, Sidar, if you want to take a stab at me again in a mailinglist I don’t post to, please mail me directly, ok? Thanks.

It wasn’t a stab, it was a shield against a stab. I didn’t know that you weren’t on the list, I was warned by Casey, so apologies. (I didn’t want to have that argument under that blog post as well).

“As I said in the post, the aim is to make your domain cleared out any of the technological concerns so that you can test, and focus only and only on your business concerns=not to DB related issues.”
That’s a nice idea but in practice you’re not going to get that. The reason is simple: every framework you use, even if it’s the .NET framework, gives you boundaries you have to obey. Instead of bending the framework to what you want it to be, bend your work so it matches the framework. With POCOs you get other problems, like databinding for example: if you really want databinding-aware stuff, you have to write code for that. If you get it for free from a framework, why bother doing it yourself? It’s similar for other aspects of what you have to do.

The core point is: if you can’t focus on your core work, the business logic / functionality of the app you have to build, due to a boundary of a given framework, then choose another framework. Agreed. However, you can’t achieve something like ‘this idea will match EVERY framework out there’; there’s always a concession to make. You always have to do a lot of work one way or the other, simply because to achieve that you have to choose a framework which lacks certain things, so it doesn’t impose a boundary someone might object to.

And this goes very far. You might argue that the BL shouldn’t be limited by a resource like a database, but that’s just hogwash. In practice your BL _IS_ limited by what a resource like a database can give you. If your choice of BL construction performs very poorly with thousands and thousands of users and millions of rows in a db, you will have a problem. I.o.w.: ignoring it isn’t wise.

Sure, that doesn’t mean you therefore should build everything around stored procs etc. ;). What I mean is that you should go the route you illustrated however don’t avoid the boundaries by ignoring them but by realizing that they’re there and work with the situation at hand.

With my entities I can write db agnostic code, without persistence in sight (e.g. in a client app which pulls the entities from a service), and at the same time without having to write 1 line of infrastructure code to make this happen. Sure, I don’t get the POCO Foundation’s medal as the entities aren’t POCO, but that’s also not important: if the framework at hand gives me the space WITHIN I can write the software required why bother rebuilding the environment?

Have you checked out EFPocoAdapter? It’s something (http://blogs.msdn.com/jkowalski/archive/2008/09/09/persistence-ignorance-poco-adapter-for-entity-framework-v1.aspx) that will allow you to utilize EF and work with POCO objects. I have successfully used it in a number of projects and architectures. I am planning on doing a 6-part series of blog posts on how to use it, set it up, etc. Give it a shot, and be on the lookout for my blog posts on this.

It just seems that once you have it setup, there is no need to manually regen xml or code files by using this approach.

Sidar Ok said,

Yes, I did check it out - it is a code generator, which generates another layer of mapping. Please see my response to Rob in the comments above on having another layer of mapping.

On top of that, EF is already very heavy with 2 levels of mapping; adding a 3rd level to this stack where I just need 1 level is overkill and overhead. It is not a solution to the problem, it is just a *workaround*. And a workaround that inherits all the problems of code generation, e.g. source control nightmares, maintaining untested mappings, loads of classes that are not used, being bound to the generation technology, etc. The more layers you are away from the ORM’s Context, the less you get its goodies, or you depend on generation to get them - why would I, when I don’t need to increase the distance?

I still need to regen my adapter layer and proxies for lazy loading when a change occurs in the db; it is inevitable. So no positives on that side either. (Btw, the amount of mapping you need to regen when the db model changes in EF is already tremendous, not to mention the POCO adapter.) L2S is very easy and provides lots of flexibility in this context; it does what it needs to, and doesn’t do what it can’t do well.

On the other hand, I will keep an eye on your blog and already looking forward to the posts. Thanks for the comment.

Sidar Ok said,

Lately I have been spending a lot of time with that and bumping into the limitations and development discrepancies of L2S. I will discuss them in another blog post too, but bear with me, I am getting close.


Chris said,

I ran SQL Profiler against a LINQ query similar to the one shown in your GetQuestion() implementation. The generated sql is a join of the parent/child tables producing a single result set. A row will be returned for each child and that row will include the parent info as well. Sending the parent info over with each child seems unnecessary. Is there a way to have LINQ generate multiple queries for a parent/child relationship?

Sidar Ok said,

If you set the load options to eager load, it is logical for the DataContext to perform a join between children and parents. Try doing context.Parents.Select() and context.Children.Select() separately and joining them up on your own. This way, it will perform 2 separate queries as you are looking for.

But out of curiosity, why do you want to perform separate retrievals for children and parents and not do a join? What’s the requirement?
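To illustrate the two approaches discussed above; `context` here is an assumed open DataContext, wired up with the XML mapping as earlier in the post:

```csharp
using System.Data.Linq;
using System.Linq;

// Option 1: eager load via DataLoadOptions -- this is what produces the
// single JOINed result set seen in SQL Profiler.
var options = new DataLoadOptions();
options.LoadWith<Question>(q => q.Answer);
context.LoadOptions = options;
var joined = context.GetTable<Question>().ToList();

// Option 2: two separate queries, stitched together in memory.
var questions = context.GetTable<Question>().ToList();
var answers = context.GetTable<Answer>().ToList();
foreach (var question in questions)
{
    // Assigning the list runs the back-reference sync shown in the post.
    question.Answer = answers
        .Where(a => a.QuestionId == question.QuestionId)
        .ToList();
}
```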