I have a query which is running using a different execution plan on the development environment, compared to the pre-production and production environments, resulting in the development environment being significantly faster (it’s doing a lot less IO), despite the data in the databases being the same.

To try to rule out any issues being caused by server configuration, hardware, or load, I created a new virtual machine, and did a fresh install of SQL Server on it. I then synchronised the data back from the production- to the development environment using Red Gate SQL Data Compare, before taking a backup of the development database. I then:

1. Copied the .BAK file to the virtual machine, and restored it as a new database called “RestoredDB”.

2. Created a new (empty) database called “SyncdDB”, making sure the options were the same as those for “RestoredDB”.

3. Used Red Gate SQL Compare and SQL Data Compare to synchronise the database structure (tables, indexes, statistics, etc), and all the records, from “RestoredDB” to “SyncdDB”.

4. Rebuilt all the indexes on both databases (at this point they should be logically the same).

5. Ran the query on both databases (literally all I did was change the “USE” statement at the top of the query and hit “Execute”) – the query ran quickly on “RestoredDB” (it used the same plan as on the development environment), but slowly on “SyncdDB” (it used the same plan as on the production environment). The quick plan was doing ~6000 read operations, the slow one ~300,000 read operations – one ran in 2s, the other ~30s.

6. Restored the virtual machine to a snapshot taken immediately after step 4, rebuilt all of the statistics, and then re-ran step 5, with the same result.

Now, I’m a little puzzled as to why it’s doing this – the logical structure of the databases are identical (Red Gate SQL Compare, OpenDB Diff, and ApexSQL Diff all agree on this), the records are identical, and the indexes and statistics should be the same. It can’t be a hardware or loading issue, as the databases are on the same VM, which isn’t doing anything else.

Any ideas what could be causing the difference in plans? Could it be down to the way the data pages have been laid down on the disk..? Or..?

The query is actually two related queries - one to get a page of records for a grid/table control, and the other to get the count of all records (to do pagination).

P.S. I don’t have any control over the query – it’s being automatically generated by Microsoft Entities Framework and can potentially change – so I doubt forcing a plan is an option (although when I did this manually to make sure, the query ran at the same speed on “RestoredDB” and “SyncdDB”, as you would expect).

There are a lot of variables to this one that would be hard to help with. However if you get to the point of forcing the plan, look at Plan Guides. This feature was purposely added for exactly the situation you're describing, no control over the query/plan

I don't know the answer to your question - but given that the job of the query optimizer is to find a good plan in a reasonable amount of time, it is quite possible that two servers generate two different query plans. This could be dependent on the server settings - for example, once I ran into a similar problem which in the end turned out to be because one one server Auto Create Statistics was turned off and on the other it was turned on.

One solution would be what mfemenel suggested. Also go through the database settings (such as Auto Create Statistics, auto Updte Statistics etc.) to see if there are differences.

Thanks for the replies - it's all appreciated as this is a little out of my area of expertise!

quote:Originally posted by mfemenelHowever if you get to the point of forcing the plan, look at Plan Guides. This feature was purposely added for exactly the situation you're describing, no control over the query/plan

Thanks - this wasn't something I knew existed! Looks like an interesting and useful feature.

One question - the way they query comes in is as a parameterised query via sp_executesql. Looking at Books Online, I think the way to set up the Plan Guide would be as a "template" type, but that requires the query strings to be identical once the parameter placeholders are inserted..? The query sent to the DB server varies upon what filter, sorting, and display options the user selects (as a contrived example, the user may choose to omit the column "First Name", which is displayed by default, and replace it with "Initials", and opt to change sorting from "Surname" to "Staff ID" - both the "Select"- and "Order By" clauses would then change) - would that mean I'd have to create one Plan Guide for each possible resultset?

quote:Originally posted by mfemenelThere are a lot of variables to this one that would be hard to help with.

Out of idle curiosity, what kind of things would they be?

quote:Originally posted by James KI don't know the answer to your question - but given that the job of the query optimizer is to find a good plan in a reasonable amount of time, it is quite possible that two servers generate two different query plans.

I'm guessing that different hardware would give the possibility for different plans to be sensible. What's confusing me is that I get this when I put the copy of the Dev DB, and a clone (courtesy of Red Gate's tools) of it, onto the same server, and still get 2 different plans.

quote:Originally posted by James KThis could be dependent on the server settings - for example, once I ran into a similar problem which in the end turned out to be because one one server Auto Create Statistics was turned off and on the other it was turned on.

I've trawled through all of the database options on both, and they're the same. The database also has a nightly rebuild of the statistics, with 100% sample, on the production environment.

quote:Originally posted by James KAlso go through the database settings (such as Auto Create Statistics, auto Updte Statistics etc.) to see if there are differences.

As far as I can see, they're the same.

quote:Originally posted by jeffw8713How many processors do you have on the DEV system?

It's got a pair of Xeon L5410's in it (with hyperthreading switched off), and 8GB RAM. The production server has 16 cores (Opteron somethings, IIRC - don't have permission to see) and 24GB RAM.

quote:Originally posted by jeffw8713And what is the MAXDOP server setting?

On both Dev and Prod, it's "0".

quote:Originally posted by jeffw8713I have seen these kinds of issues when the production system is able to use a parallel plan vs a non-parallel plan and the non-parallel plan is much more efficient.

I couldn't see any "parallelism" entries in the plan, but tried running them with MAXDOP = 1 - it didn't make any difference. What's baffling me, is that if I put a restored copy of the DB onto a new virtual server, and sync the tables and data onto a new DB on said server, I still get 2 plans.