Introduction

The purpose of this article is to describe the technique I have used to implement the Repository pattern in .NET applications. I will provide a brief description of the Repository pattern and LINQ-to-SQL; however, if you are unfamiliar with these technologies, you should research them elsewhere. The goals of my implementation are:

it must be a general purpose design that can be reused for many projects

it must facilitate domain driven design

it must facilitate unit testing and testing in general

it must allow the domain model to avoid dependencies on infrastructure

it must provide strongly typed querying

Repository Pattern

The Repository Pattern, according to Martin Fowler, provides a "layer of abstraction over the mapping layer where query construction code is concentrated", to "minimize duplicate query logic". In practice, it is usually a collection of data access services, grouped in a similar way to the domain model classes.

By accessing repositories via interfaces, the Repository pattern helps to break the dependency between the domain model and the data access code. This is invaluable for unit testing because the domain model can be isolated.

I implement the Repository pattern by defining one repository class for each domain model entity that requires specialized data access methods (other than the standard create, read, update, and delete). If an entity does not require specialized data access methods, then I will use a generic repository for that entity. A repository class contains the specialized data access methods required for its corresponding domain model entity.

The following class diagram shows an example implementation with two domain entity classes: Shape and Vertex. Shape has a specialized repository (IShapeRepository). Vertex does not have a specialized repository, so it will just use the generic repository (IRepository<Vertex>).

LINQ-to-SQL

LINQ is a strongly typed way of querying data. LINQ-to-SQL is a dialect of LINQ that allows the querying of a SQL Server database. It also includes object / relational mapping and tools for generating domain model classes from a database schema. LINQ is an excellent addition to object / relational mapping tools because it facilitates strongly typed queries, such as:
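As a minimal sketch of such a query: the Shape class and the NamesOfFourSidedShapes helper below are hypothetical names chosen for illustration, and an in-memory collection stands in for a LINQ-to-SQL table so that the example is self-contained. The query syntax would look the same against a real table.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity, for illustration only.
public class Shape
{
    public string Name { get; set; } = "";
    public int NumberOfSides { get; set; }
}

public static class QueryExample
{
    // An in-memory sequence stands in for a LINQ-to-SQL table here.
    public static List<string> NamesOfFourSidedShapes(IEnumerable<Shape> shapes)
    {
        // Property names are checked by the compiler: a typo in
        // NumberOfSides or Name is a build error, not a runtime SQL error.
        var query = from s in shapes
                    where s.NumberOfSides == 4
                    orderby s.Name
                    select s.Name;
        return query.ToList();
    }
}
```

Because the predicate and projection are ordinary C# expressions, refactoring tools and IntelliSense work on them just as they do on the rest of the code.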

IRepository<T>

The generic interface IRepository<T> defines the methods that are required on each repository.

using System;
using System.Collections.Generic;

public interface IRepository<T> where T : class
{
    /// <summary>Return all instances of type T.</summary>
    IEnumerable<T> All();

    /// <summary>Return all instances of type T that match the expression exp.</summary>
    IEnumerable<T> FindAll(Func<T, bool> exp);

    /// <summary>Return the single entity matching the expression.
    /// Throws an exception if there is not exactly one such entity.</summary>
    T Single(Func<T, bool> exp);

    /// <summary>Return the first element satisfying the condition.</summary>
    T First(Func<T, bool> exp);

    /// <summary>Mark an entity to be deleted when the context is saved.</summary>
    void MarkForDeletion(T entity);

    /// <summary>Create a new instance of type T.</summary>
    T CreateInstance();

    /// <summary>Persist the data context.</summary>
    void SaveAll();
}

Repository<T>

IRepository<T> is implemented by a generic repository base class, Repository<T>. Repository<T> is a base implementation that provides data access functionality for all entities. If an entity (T) does not require a specialized repository, then its data access will be done through Repository<T>.
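The following is only a rough sketch of the shape such a base class can take, not the article's Repository<T> itself. It assumes a hypothetical InMemoryRepository<T> backed by a List<T> instead of a LINQ-to-SQL data context, which also makes it usable as a test double. Pending inserts and deletes are held until SaveAll is called, mimicking the "mark now, persist later" behaviour the interface describes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface IRepository<T> where T : class
{
    IEnumerable<T> All();
    IEnumerable<T> FindAll(Func<T, bool> exp);
    T Single(Func<T, bool> exp);
    T First(Func<T, bool> exp);
    void MarkForDeletion(T entity);
    T CreateInstance();
    void SaveAll();
}

// Hypothetical stand-in: a List<T> plays the role of the database table.
public class InMemoryRepository<T> : IRepository<T> where T : class, new()
{
    private readonly List<T> _saved = new List<T>();
    private readonly List<T> _pendingInserts = new List<T>();
    private readonly List<T> _pendingDeletes = new List<T>();

    public IEnumerable<T> All() { return _saved.ToList(); }

    public IEnumerable<T> FindAll(Func<T, bool> exp) { return _saved.Where(exp).ToList(); }

    // Enumerable.Single throws unless exactly one element matches.
    public T Single(Func<T, bool> exp) { return _saved.Single(exp); }

    // Enumerable.First throws if no element matches.
    public T First(Func<T, bool> exp) { return _saved.First(exp); }

    public void MarkForDeletion(T entity) { _pendingDeletes.Add(entity); }

    public T CreateInstance()
    {
        // The new entity is not visible to queries until SaveAll is called.
        var entity = new T();
        _pendingInserts.Add(entity);
        return entity;
    }

    public void SaveAll()
    {
        _saved.AddRange(_pendingInserts);
        foreach (var entity in _pendingDeletes) _saved.Remove(entity);
        _pendingInserts.Clear();
        _pendingDeletes.Clear();
    }
}
```

A database-backed version would replace the three lists with calls on the data context, but consumers that depend only on IRepository<T> would not need to change.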

IShapeRepository and ShapeRepository

It is usually desirable to provide more specialised repositories for entity classes. If our domain included a shape entity, we might like to have a ShapeRepository with a RetrieveByNumberOfSides(int sideCount) method. Such a class would be exposed to consumers as a specialized interface IShapeRepository:
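A minimal sketch of what such an interface and class could look like. The Shape class, its NumberOfSides property, and the constructor taking an in-memory source are assumptions made for illustration. IRepository<T> is trimmed here to the single member the sketch needs; the full interface is listed above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity, for illustration only.
public class Shape
{
    public int NumberOfSides { get; set; }
}

// Trimmed to one member to keep the sketch short.
public interface IRepository<T> where T : class
{
    IEnumerable<T> FindAll(Func<T, bool> exp);
}

// The specialized interface adds shape-specific queries to the generic ones.
public interface IShapeRepository : IRepository<Shape>
{
    IEnumerable<Shape> RetrieveByNumberOfSides(int sideCount);
}

public class ShapeRepository : IShapeRepository
{
    // Stands in for the LINQ-to-SQL table in this sketch.
    private readonly IEnumerable<Shape> _source;

    public ShapeRepository(IEnumerable<Shape> source) { _source = source; }

    public IEnumerable<Shape> FindAll(Func<Shape, bool> exp)
    {
        return _source.Where(exp).ToList();
    }

    // The specialized query is expressed in terms of the generic one.
    public IEnumerable<Shape> RetrieveByNumberOfSides(int sideCount)
    {
        return FindAll(s => s.NumberOfSides == sideCount);
    }
}
```

Consumers would depend on IShapeRepository only, so a test can substitute any other implementation of that interface.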

Firstly, thank you for not blasting me for sounding rude. Your code seems to be well thought out and well written, although I have not actually stepped through it. My problem is not with your code but with Linq to SQL, a technology I greatly enjoy but whose shortcomings give me great angst. L2S has great potential. Unfortunately, in its current implementation, as it relates to web applications, L2S is little more than a glorified pipe to a database. Don't get me wrong: there are a lot of things I like about L2S, namely entity class generation. But L2S does not function as a repository for n-tiered apps. The code you've written is good but, IMHO, is written against a model that is not yet complete.

My basic question was: My objective is to allow the user to create an entire order without writing anything to the database. Once the user clicks save I want to write the whole thing at once. How do I accomplish that?

The relevant portion of the answer is: "...if you use entity objects to store data collected from users before submitting to the database you need to track entity state (new/modified/deleted etc) yourself across postbacks/page lifecycles and then use the L2S datacontext to read/write from the database."

So all the state tracking built into L2S is not only useless but cumbersome. L2S is not a repository. It does not afford me any mechanism for tracking entity state across layers of my application. It provides minimal support for (de)serialization, and I have to manage storage of objects in viewstate or session myself. That should be L2S's job! The code you wrote, while filling a certain need, does not appear to address this deficiency. Thus it becomes more icing on an inedible cake.

I'd love to be wrong about this because I'm knee deep in a very large app where these problems are killing my schedule!

I agree that Linq-to-sql has some terrible flaws, and it now seems that they will never be fixed. However, your problem is with the web and the stateless nature of the HTTP protocol. Nearly everything in a web app happens in the lifetime of an HTTP request, so there is no good way to persist state across requests. The problem you described exists in linq-to-sql, nhibernate, entity framework, ruby on rails and every other web development environment.

NHibernate has a solution called session-per-conversation which effectively uses the asp.net session to store the object context between requests. The other options are to use ajax to rewrite your application so that your entire order is created in a single HTTP request. Finally you can have a way to flag the various parts of the order as not being complete until the final save.

Don't store objects in session / viewstate. There are great performance problems with such an approach. Either keep everything on the same page, or persist to the database.

>>The other options are to use ajax to rewrite your application so that your entire order is created in a single HTTP request. Finally you can have a way to flag the various parts of the order as not being complete until the final save.

Yikes!! Do you have any idea how difficult that would be!! Plus there are concurrency issues.

I really need a repository that is, in fact, a repository. And, as you said in your article, it should be reusable, domain driven, independent, and strongly typed. I should not have to write to temp tables or wrestle with session or write out temp files. Indeed, L2S would be just fine for me if it did what its documentation says it does.

Just a few hours ago I stumbled on .NET RIA, which you can read about here.[^]
I'm not sure if it is just another blind alley, but it looks interesting.

I know I sound bitter, but actually I'm not casting blame. If I were to blame L2S, it would not be for the nature of web apps, but for a failure to work with the nature of web apps. I'm sure there are ways for L2S to work with a stateless protocol. You can probably think of one or two yourself right now. For example, wouldn't it be great to be able to call a method to serialize a datacontext and write it to a row in a temp table, and another method to read and deserialize it? If something like that could be used with code like you've written in your article, the other layers in the app would see what appears to be a stateful repository.
For the record, I am not proposing the preceding as a single solution to the problems I've discussed. It is just an example I've pulled out of thin air to make the point that some kind of stateful repository is not an impossible objective. It may not be optimal or scalable, but it's not impossible. See this[^]

Is there any possibility to implement something like T FetchByKey(object key)? I tried using FindAll with a Func that checked the key property value, but of course this could not be translated to SQL so I failed.

I'm a bit confused: when you create a repository, you inject an instance of IDataContextFactory. Why not an instance of DataContext itself? If the factory is important, could you provide an implementation?

Hmm could you please check it again? I couldn't find any implementation. Do you just keep the context in HttpContext? Why put SaveAll in the factory interface, isn't it the context's concern?

I'm totally new to the DLinq stuff, and I'm trying to build my first DLinq Web site based on your article. Frankly, I don't have a clue about the scope, so any further info would be appreciated.

So, I think your article is good, but it could be improved by either removing the Factory stuff (since it's not absolutely required for DLinq and leaves questions for newbies like me) or explaining a little about why you need the Factory.

I tried your approach because it sounded very nice. I do have some comments I'd like to share.

By defining your virtual methods like this: public IEnumerable<T> FindAll(Func<T, bool> exp) you create an enormous bottleneck. This is because the Func will not be (what I call) a delayed method; it is a compiled delegate that cannot be translated to SQL, so it is evaluated immediately in memory. I checked the log from the datacontext and it shows that the table is no longer queried using a where clause. Instead, a full table load is performed and the function is evaluated in memory.

However, the solution is very easy, although this means you'll have to reference the LINQ assemblies in your test bed:
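A minimal sketch of the change, assuming a hypothetical QueryableRepository<T> that wraps any IQueryable<T> (a LINQ-to-SQL table is one; here an in-memory list converted with AsQueryable keeps the example self-contained). The two methods differ only in parameter type, and that difference decides which Where overload the compiler picks.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical wrapper used only to contrast the two signatures.
public class QueryableRepository<T> where T : class
{
    private readonly IQueryable<T> _source;

    public QueryableRepository(IQueryable<T> source) { _source = source; }

    // A Func binds to Enumerable.Where: the query provider never sees the
    // predicate, so a database provider has to load every row first.
    public IEnumerable<T> FindAllInMemory(Func<T, bool> exp)
    {
        return _source.Where(exp);
    }

    // An Expression binds to Queryable.Where: the provider receives the
    // predicate as an expression tree that it can translate (e.g. to SQL).
    public IEnumerable<T> FindAll(Expression<Func<T, bool>> exp)
    {
        return _source.Where(exp);
    }
}
```

Callers pass the same lambda to either method, for example repo.FindAll(s => s.Length > 1), because the compiler converts a lambda to a delegate or to an expression tree depending on the parameter type.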

The trick is to wrap your functions in a System.Linq.Expressions expression, i.e. Expression<Func<T, bool>> instead of Func<T, bool>. Client usage is exactly the same. The performance gain on my project was huge, considering that tables are no longer loaded completely into memory.

I was curious how you are using the entity objects. Are you passing the "Shape" and "Vertex" Linq-to-SQL entities through the rest of your tiers, or are you creating separate business data transfer objects and mapping the data from the Linq entities to them?

I ask because I have implemented a similar method and was curious how you got past this difficult problem.

Firstly, I only use one tier. The accepted definition of a tier (as opposed to a layer) is that one tier corresponds to one physical machine. Therefore, I use architectures that are multi-layered but single-tiered.

The "Shape" and "Vertex" objects are passed through the rest of my layers, within the lifecycle of a single http request.

I see the same problem. In fact, I am having trouble seeing whether this code does anything at all. If I have to use a different application, it keeps the service layer intact, but it doesn't allow me to easily change the data layer. If we change any of the data layer, the change will in effect go right through and break my application(s). Correct me if I am wrong.

What it does is provide a clean abstraction for data access, flexible enough to cope with the needs of the application but still allowing proper separation of concerns. For more information I recommend Domain Driven Design.

I have been using LINQ to SQL for some time now and I have discovered that LINQ to SQL with multiple repositories does not work. The reason is that when you create a repository object and add an item to it, the repository uses its DataContext to insert the item into the database. If you then use another repository to create an object, and assign to it the object that was created by the first repository, your update, insert, etc. operations will fail. This is because object1 was retrieved by one DataContext while object2 is being persisted by a different DataContext.

This is a big issue with LINQ to SQL. If you like to read more then check out the following post:

I agree. I use Linq to sql for web applications and I scope the data context to an HTTP context, so I am never working with more than one data context. I believe this is the same for all object / relational mapping tools.

HttpContext is the natural scope for a unit of work in a web application. We need some way to create a DataContext when a request comes in, and abandon / complete that DataContext when the response is written. This is what Jeremy Miller recommends when scoping StructureMap (his DI container) in a screencast he did with Rob Conery (http://blog.wekeroad.com/mvc-storefront/mvcstore-part-13/).
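A rough sketch of that scoping idea, under stated assumptions: PerRequestScope<TContext> is a hypothetical helper, not part of the article's code or of StructureMap. It takes a delegate returning the current request's item bag, so it can be exercised without a web server. In ASP.NET the delegate could return HttpContext.Current.Items, which is an IDictionary.

```csharp
using System;
using System.Collections;

// Hypothetical helper: hands out one context object per request.
public class PerRequestScope<TContext> where TContext : class
{
    private readonly string _key = "PerRequestScope:" + typeof(TContext).FullName;
    private readonly Func<IDictionary> _requestItems;
    private readonly Func<TContext> _create;

    // requestItems: returns the item bag of the current request.
    // create: builds a new context (for example, a new data context).
    public PerRequestScope(Func<IDictionary> requestItems, Func<TContext> create)
    {
        _requestItems = requestItems;
        _create = create;
    }

    public TContext Current
    {
        get
        {
            var items = _requestItems();
            var context = items[_key] as TContext;
            if (context == null)
            {
                // First use in this request: create and remember the context.
                context = _create();
                items[_key] = context;
            }
            return context;
        }
    }
}
```

Every repository that resolves its context through the same scope object then shares one context within a request, and a fresh one is created for the next request. Disposing the context when the request ends is left out of this sketch.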

I understand your concern, but in practice I don't think it is a problem.

Hi, I'd like to begin by saying that I like your article and how the repository pattern can be implemented with LINQ.

I've got to say that since your data context is already exposed in the UI layer, there's little to stop developers (assuming that a team is working on this) from accessing it directly. I don't feel comfortable with having the data context in the presentation layer. This approach can also potentially result in a very bloated object graph (even taking into consideration lazy loading). Master-detail screens are an example.

I'm also curious how this works for transactions which span multiple screens. With the data context scoped to a HttpContext, is this ever a problem for you?

"I've got to say that since your data context is already exposed in the UI layer, there's little to stop developers (assuming that a team is working on this) from accessing it directly."

This is not correct. There is not even a reference between the web project and the project containing the datacontext. Even if there were, I don't feel comfortable assuming that developers are undisciplined and not trustworthy. Not every design principle needs to be enforced by the compiler.

I suppose I probably do have what you would consider a "bloated object graph", but that is the way I like it. Domain-driven design requires a rich object model. Since objects are inexpensive to create and destroy, I don't see the problem.

I design my applications so that transactions do not span multiple requests. I can't think of a good reason why you would want them to, and it is difficult to implement with any design. Do you design your desktop applications so that transactions can span a reboot? Scoping the datacontext to a request is a common design pattern that is used regularly with ORMs including NHibernate and Wilson ORMapper. For a web application there is no other scope for a domain model.