Deferred execution on LINQ to Entities

Date posted: 01/11/2015

Name:

*

My email:

*

Recipient email:

*

Message:

*

Fields marked as bold are compulsory.

You haven't filled in compulsory values.The email is not correct

Many people use entity framework but are not aware of what deferred execution is and how it may affect their projects. In this article, we are going to have a little talk about it and see how things work.

LINQ to entities and deferred execution

We take into account that we all have some understanding of entity and LINQ or lambda expressions.

To go through this post we created a table called Videogames containing four columns (ID, Name, MainCharacterName and Publisher) and filled it with some popular vintage videogames. Let's start with the following piece of code.

(There's nothing wrong in using lambda expressions when talking about LINQ to entities. It's all the same in the end and I personally think that it is much easier to read. If you preferred LINQ queries you could have written

Nothing strange so far. What is interesting about it is the deferred execution we talked about. In other words, the MarioVideogames variable will not be filled with data the moment it is declared but rather the moment it is used. To be exact, the moment it is iterated.

The IEnumerable interface requires the implementation of the GetEnumerator() method, which returns an enumerator to iterate through a collection, regardless of whether that collection is an IEnumerable, an IQuerable, or even a List.

So whenever we have a variable based on IEnumerable and we want to do something with that (that could be using foreach, First, Average etc) we need to iterate through that variable and as a result ask the query to be executed.

Why do we need deferred execution?

All right, since we know the query is not executed instantly, what would happen in the following case?

Sure, firstVideogameName would be "Super Mario Bros" and lastVideogameName would be "Super Mario Bros 2". Yet, what's interesting is that we requested the database twice. First, is when we executed MarioVideogames' query to get firstVideogameName value, second in the case of lastVideogameName.

Was that a good thing? Typically, yes it is. There's no reason in requesting data before we need it. Take a look at this.

In this case, MarioVideogames contains different data, depending on the type. Let's say, by default MarioVideogames would return 50 rows of videogames, where in case of type == 1 or type == 2, would return way less videogames. If things did not work that way, we would ask the database for these 50 rows, and after we got all of them, then we would filter out the ones who got the MainCharacterName we want. On the contrary, what happens now is we ask the database for videogames containing Mario in their Name and having specific MainCharacterName at the same time. The records we get will be the ones we actually want.

If for any reason we did want to execute the query at a given moment and get the data into memory we may use something like ToList or ToArray method. This way, instead of a simple IEnumerable, we can create a list that is stored in memory and has no longer anything to do with the database.

//Someone changes Super Mario Bros into Super Mario Bros 3 at this point

string firstVideogameName = videogames.First().Name;

string lastVideogameName = videogames.Reverse().First().Name;

If the query had already been executed before we needed it to, then firstVideogameName would be "Super Mario Bros", which would by now be outdated. So, as you can see deferred execution works and by default we get no such problems. Unless of course you do not wish to have updated data from the moment your program starts executing.

This time we request data as many times as videogameNamesList's contents are. If it consisted of one thousand strings, we would request data one thousand times.

Enumerating the query using ToList would do some good in such cases. Still, handling 50 Nintendo videogames is fine. What would you do if your database had tens of thousands of Nintendo videogames? Things would go really slow if you had to fetch fifty thousand records in order to get no more than five of them in the end.

Caching could be used to get us out of such trouble, even though there is no easy way out of this. It all depends on what you want to do, what problems may arise and how you expect your project to work.

Summary

Creating a LINQ query on entity framework does not instantly get us data from the database. Instead, that will happen the moment we need them. This is called deferred execution and it helps us avoid unnecessary database requests. However this may also turn out to be more of a trouble than help so we should always keep an eye on how our queries are executed.