Smile! You’re at the best WordPress.com site ever

Main menu

Post navigation

Improved Merge Join algorithm in SQL server 2008

I assume that you know how merge join works. There is however an issue with the existing merge join algorithm in SQL server 2008. I am sure that it could be improved further to improve the perfroamcne by reducing the number of rows processed and reducing the number of IO’s.

When the inner table is accessed the scan starts at the first row and the table is scanned till it find the value of join key which is more than the value of join key from outer table. As soon as the join key value is more than the outer table join key row the scan is not required. This is quite efficient algorithm to end the scan. However, the scan always starts at the first value or first row( I am assuming that there are no other sarg’s which are causing the scan to start somewhere else or cause other index to be used). This part could be veru inefficient if say the last value of join key is at almost at the end of the inner table. Thus the whole clustered index or table is scanned where it might have possible that we just got say 1000 rows and thus scanning the pages for those 1000 rows.

Solution: For me the scan should start at the minimum value of join key from outer table. e.g. say we have clustered index on id and say outer table has minimu value 100000 and max value 105000. Thus the scan should start at 100000 and end at 105000.Currently it is not happening.Currently it is starting at 1 and finsihing at 105000. Thus the rows from 1 to 99999 are uselessly processed. This is waste of CPU as well as it causes more logical reads.

Below script will show you the issue in more detail and will show you the alternate method.

–The proc got executed but a nested loop join was used as it is better than merge join.But I will be showing the better merge alogorithm
–so i will force the mereg joins so that nested loop and hash joins are not used at all. This is not recommneded in prod systems.This is just for demo
–purpose.

–This is the first date in the table and thus the see actual number of rows from msyod table is 362 which is one more than
–actual number of rows retruned. Thus the merge join is quite good.
–Now I will use the next date say 20010702

set statistics io,time on
go
exec ExistingMergeAlgo @orderdate = ‘20010704’
go
set statistics io,time off
go
–now the rows are 374 instead of 381..as the date value will increased so will be actual number of rows processed by CPU
–and also the IO’s as we have to start the scan and have to read till the last value of salesorderid.

–now for this date..actual number of rows are 121317 which are total number of rows and look at the IO’s it is around 1502
–which is very high..There were just 96 rows retruned…I expected that optimizer would have started the scan for mysod on
–the minimu salesorderid value..

–I just wanted to show that the merge join was starting the scan of the table mysod from the first leaf page
–and then keep on scanning unless the salesorderid value exceed the salesorderid value in the mysoh table
–for given date range. as the date will increase the scan will start at the first page and will continue
–till the last value in salesorderid.

–This is the problem I am going to discuss with respect to the merge join. The optimizer is smart enough to know
–where to stop scanning the second table. However, optimizer is not smart enough to find the start point of the scan.
–The left part is sorted and we have the 47311 as the starting salesorderid (for date 20010704).Thus optimizer should start the scan
–from the 47311 and then do as it does usually scan till the last value is reached..
–I hope in future optimizer will be able to do this and this will make sure that we scan minimum number of pages and rows thus
–reduce both the logical reads ,cpu processing.

–WE can not wait till the Microsoft put this feature in the sql server optimizer. However, we could have a workaround for this.

drop procedure ExistingMergeAlgo
go
create procedure ExistingMergeAlgo
(
@orderdate datetime
)
as
begin
declare @minsalesorderid int
select @minsalesorderid= MIN(salesorderid) from mysoh where OrderDate between @orderdate and DATEADD(dd,1,@orderdate)
select *
from
mysoh soh
inner merge join mysod sod
on soh.SalesOrderID = sod.SalesOrderID
where
soh.OrderDate between @orderdate and DATEADD(dd,1,@orderdate)
–we will start processing from the first salesorderid of the date range rather than first row of the table..
and sod.SalesOrderID >= @minsalesorderid
end
go

–Now actual number of rows are 10 instead of 367 rows. This means that the optimizer had to process 3% rows of what it was doing actually.
–This has reduced the logical Io’s as well because actual scan started at the min value of salesorderid rather than the first row in the table
–try for next date

–actual number of rows are now just 96 instead of 121317 as shown above..
–Also the IO;s are just 5 instead of 1502. This is quite a big imrovement in
–the IO’s and it is just not IO’s the rows processed are just 96. This means
–less cpu intensive work..

–the data was perfectly clustered and thus the perfromanc eimprovement was way too good. Now Let us try some other column say customer id..

create nonclustered index idx_customer_mysoh on mysoh(Customerid)
go

drop procedure ExistingMergeAlgo
go
create procedure ExistingMergeAlgo
(
@customerid int
)
as
begin
–declare @minsalesorderid int
–select @minsalesorderid= MIN(salesorderid) from mysoh where customerid = @customerid
select *
from
mysoh soh
inner merge join mysod sod
on soh.SalesOrderID = sod.SalesOrderID
where
soh.CustomerID =@customerid
–we will start processing from the first salesorderid of the date range rather than first row of the table..
–and sod.SalesOrderID >= @minsalesorderid
end
go
–find the customer with minimum number of rows and max number of rows

–1477 IO’s and 118K rows processed for mysod table. Thus rows processed are simply determined by the max salesorderid
–lower this value lower will be the IO’s and lower will be the number of rows processed.

–Thus new technique is perfroming similar to the actual method in case of the min and max values are very much apart..
–But it performs superbly if the diff is quite less…

–Thus you could see that this method in worst case scenario will perform almost equal to the actual method
–but it could boost the perfromance by a magnitude for lots of scenario…

/*
Conclusion:
This is not to replace the nested loop join or any other costing method. This is just to show that
optimizer should start the scan from appropriate point for inner table rather than the first row while doing a merge join specially on already sorted data.
This suggestion is just to show that merge join alogorithm could be handled better to reduce the logical reads and cpu processing by determining
the start point of the scan for the inner table for merge join.

The workaround I have shown should not be used unless you do the following.
1. Make sure that other performance technique are used before going for this approach.
2. This technique requires code change as you have to find the minimum value of the clustered index key..
3. Before applying this technique benchmark performance of this method against the best plan and method for your queries.based on the result you can use this technique.Also this will work if inner table has a clustred index already or being accesses through other index ( on which we have join).
4. However, microsoft might provide an alternate for this in future then the code introduced will be redudant.
*/

4 thoughts on “Improved Merge Join algorithm in SQL server 2008”

Thanks dude, always awesome to find a clever new trick like this. I am already using it in anger – have shaved a few seconds off of a function which is itself called hundreds of times – perf. test is now running – can’t wait to see what it has netted me.

Quick question regarding merge joins; usually they result in tablescans on both tables, but every now and again, I do manage to bully the engine into performing merge joins with seeks, as happens in your example above. Do you know the set of conditions which the engine requires to be willing to do this? Does the filter predicate need to be separate from the join predicate, as it is in your examples, etc? I find that quite often getting neat stuff like this to happen is a bit hit and miss. What are the rules?

Why isn’t there a section on MSDN with a *huge* flashing heading reading ‘How to get seeks with the merge join’ or ‘how the MS SQL execution engine actually f**king works’. Why is this such a ‘black art’.

I have been a programmer for 15 years or so, but until a year ago, had I had to use a so much as a ‘group by … having’, I would probably have told my both my psychologist and my girlfriend about it. Then I broke up with my girlfriend, and around the same time, was forced by work to dig a little deeper into sql than before, and now I am an addict. I swear that I feel a rush of dopamine whenever I see that green-missing-index text at the top of an execution plan!

Actually, I think I get it now, having played some more. It seems to me like what I mean by ‘get merge join to seek’ is exactly what you described, almost anyway… It will seek to the first match, and from there, scan to the end of the table. Your trick is to stop it as early as possible (sort of a seek on both sides of the scan). That seems to be ask good as its going to get until MS implement some form of ordered hash join (which I believe is in fact possible – google knows).

Anyway, thanks again. Your article inspired a new approach which has cut 50% off of the run duration of the longest running query in our system. You rock.

You wont find it in MSDN.I even tried it in sybase as Sybase Optimizer is not clever enough as well.

Trick is simple. Optimizer knows where to end but it doesnt know where to start.All my article is about helping Oprimizer to tell where to start. This actually should be done by optimizer when it looks first time..However, this trick wont work when you do not have either clustered or non clustered index on the joining column.In that case anyway Optimizer will prefer the Hash join (in most of cases).

Also, you could provide the end as well by using the max. In that case you will save memory and maybe some physical IO’s as well.

Maybe Implementing it is not so simple and that is why MS or Sybase are not so keen on providing this. There is another trick which is quite similar to this but is related to non clustered index + book mark lookup. You can find it here.