VB.Net - Designing Duplicate Search Across Multiple Databases

Good Day Experts:

In my VB.Net project I am having a speed/performance issue with my Duplicate record search. I need to find a better way to do it and I hope you can help.

I have 5 SQL databases, each representing an Account. All 5 of them belong to the same Company. Accounts are processed 1 time a week on Friday. I need to take the ID's from Account 1 and look for those ID's in all of the data in Account 2, Account 3, Account 4 and Account 5 to see if they have been used before for a given week. For example, today is 5/8, so I take the ID's for Account 1 for the week of 5/8 and search through Account 2, Account 3, Account 4 and Account 5 looking for re-used ID's. Then I have to take the ID's for Account 2 and search all of the other Accounts. Then repeat for the remaining Accounts.

I have the looping constructs in place to handle all the Account iteration. I also have my query in place to look for the duplicates. The problem is that the performance of the query is very slow...and performing it multiple times compounds the trouble.
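One way to avoid running the slow query once per account pair is to pool all of the accounts' ID's for the week and find duplicates in a single set-based pass. Here is a minimal sketch of that idea using Python with sqlite's ATTACH to simulate the 5 separate databases (the thread uses SQL Server and VB.Net; the table/column names `Shipments`, `ProNumber`, `WeekEnding` are assumptions, not from the original post):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simulate 5 per-account databases by attaching extra in-memory databases.
for i in range(1, 6):
    conn.execute(f"ATTACH DATABASE ':memory:' AS acct{i}")
    conn.execute(f"CREATE TABLE acct{i}.Shipments (ProNumber TEXT, WeekEnding TEXT)")

# Seed sample data: ID 'P100' is reused in accounts 1 and 3 for the same week.
conn.execute("INSERT INTO acct1.Shipments VALUES ('P100', '2015-05-08'), ('P101', '2015-05-08')")
conn.execute("INSERT INTO acct3.Shipments VALUES ('P100', '2015-05-08')")

# One set-based query replaces the nested account-by-account loops:
# pool all accounts for the week, then keep IDs seen under more than one account.
union = " UNION ALL ".join(
    f"SELECT {i} AS Acct, ProNumber FROM acct{i}.Shipments WHERE WeekEnding = ?"
    for i in range(1, 6)
)
dupes = conn.execute(
    f"SELECT ProNumber FROM ({union}) "
    "GROUP BY ProNumber HAVING COUNT(DISTINCT Acct) > 1",
    ["2015-05-08"] * 5,
).fetchall()
print(dupes)  # only the reused ID surfaces
```

On SQL Server the equivalent would be a UNION ALL across the five databases' tables (three-part names) with the same GROUP BY/HAVING, run once per week instead of once per account pair.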


My first question is: why are there duplicate records for sales? Is it just that they have the same ID numbers, and something should be changed to make those universally unique?

If not, my next question is: do you have the Pro Number indexed? Also, any function calls probably increase the time -- not sure by how much; it depends on the number of records.


Jimbo99999Author Commented: 2015-05-08

Hello Thanks for replying.

We receive this shipment information from the actual Company, so we are handling this as a 3rd party. As a result, we are checking to make sure the shipment information was not duplicated by "accident".

I'm still a little confused. Do you compare ID's in a table in one SQL Database with a table in another SQL Database? Nothing in the query shown suggests this. Or are you just checking one database for multiple usages of a single ID and then repeating the process in each database? Or is it actually multiple tables?


So none of this "[Pro Number],SCAC,[PRO CODE],[SALES ORDER NUMBER]" is a uniquely identifying value?

My thought is that the query is not able to take the faster index path because of all the WHERE conditions that must be met.
I would try joining the tables on a single ID number that uniquely represents the order, make sure that field is indexed on both ends, and then put the other parts in the WHERE clause and see if that makes it any faster.
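The suggestion above can be sketched as follows, again using sqlite in Python purely for illustration (the thread is SQL Server; the table names `AccountA`/`AccountB` and the column set are assumptions based on the fields quoted earlier):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AccountA (SalesOrderNumber TEXT, SCAC TEXT, ProCode TEXT);
    CREATE TABLE AccountB (SalesOrderNumber TEXT, SCAC TEXT, ProCode TEXT);
    -- Index the single joining ID on both ends so the join can seek, not scan.
    CREATE INDEX ix_a ON AccountA (SalesOrderNumber);
    CREATE INDEX ix_b ON AccountB (SalesOrderNumber);
    INSERT INTO AccountA VALUES ('SO-1', 'ABCD', 'X'), ('SO-2', 'ABCD', 'Y');
    INSERT INTO AccountB VALUES ('SO-1', 'ABCD', 'X');
""")

# Join on the one indexed ID; the remaining columns move to the WHERE clause
# rather than each being part of the lookup.
rows = conn.execute("""
    SELECT a.SalesOrderNumber
    FROM AccountA a
    JOIN AccountB b ON b.SalesOrderNumber = a.SalesOrderNumber
    WHERE a.SCAC = b.SCAC AND a.ProCode = b.ProCode
""").fetchall()
print(rows)
```

The design point is that the optimizer only needs one indexed column to narrow the candidate rows; the remaining equality checks then run against a much smaller set.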

MySQL has an EXPLAIN statement you can add in front of the SELECT to see how it is getting values -- whether by brute force (the long way) or via an index, etc. I don't think SQL Server has that exact feature.

I read online where someone says to:
"Execute the command SET SHOWPLAN_TEXT ON and then execute the query. This will display the query execution plan in text format." So you might turn on SET SHOWPLAN_TEXT and see how it is gathering your values.
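SET SHOWPLAN_TEXT ON is SQL Server syntax, so as a runnable stand-in, here is the same kind of check in Python using sqlite's EXPLAIN QUERY PLAN: it reports whether a query will scan the whole table or seek via an index (table/column names are assumed for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Shipments (ProNumber TEXT, WeekEnding TEXT)")

# Without an index on ProNumber, the plan is a full table scan.
before = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM Shipments WHERE ProNumber = 'P100'"
).fetchall()
print(before[0][-1])

# After indexing, the same query can seek via the index.
conn.execute("CREATE INDEX ix_pro ON Shipments (ProNumber)")
after = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM Shipments WHERE ProNumber = 'P100'"
).fetchall()
print(after[0][-1])
```

Reading the plan before and after adding an index is the quickest way to confirm whether the index is actually being used, on whichever engine you are on.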

Adding indexes where they make sense will improve the most situations. If you can avoid calling a function such as CONVERT() on each row, that might improve performance too.
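The function-call point can be demonstrated concretely: wrapping an indexed column in a function in the WHERE clause usually prevents the optimizer from seeking on the index. Again using sqlite in Python as a stand-in for SQL Server's CONVERT() behavior (names assumed):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Shipments (ProNumber TEXT, WeekEnding TEXT)")
conn.execute("CREATE INDEX ix_week ON Shipments (WeekEnding)")

# Function wrapped around the indexed column: forces a full scan.
wrapped = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM Shipments WHERE date(WeekEnding) = '2015-05-08'"
).fetchall()
print(wrapped[0][-1])

# Bare column compared to a literal: the index can be used.
bare = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM Shipments WHERE WeekEnding = '2015-05-08'"
).fetchall()
print(bare[0][-1])
```

The fix on SQL Server is the same idea: compare the raw column against a pre-converted parameter value, rather than running CONVERT() over the column on every row.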

Yes, there's a better way. Avoid using functions like CONVERT, since they are performance killers, especially when used on indexed fields, and try to join on fields to reduce the number of returned records instead of bringing them all back. So, this is my suggestion: