I have a query that is supposed to do batch inserts of TOP 5000 until @RowCount = 0. However, this loop just keeps going and going and going and...

I'm trying to take data from a staging table and move only certain fields to a destination table, and each record needs to be inserted exactly once. Individual fields may have duplicate values, but each row as a whole must be unique (imagine the entire row concatenated into a single value).

Basically I want to go down the list of a source table, and batch insert certain fields into a destination table until @rowcount = 0.

The query only stops executing when it hits the database size limit! The source table has 7 million records and the destination table has 15.2 million. That can't be right unless the query is wrong and inserting the same data repeatedly.

The query simply selects the top 5000 rows from the source table every time it goes through the loop. So you have to do something to make sure that the rows that have been visited do not get picked up again.

An efficient way to do this might be to use SSIS or Import/Export wizard, rather than trying to construct your own loop. You can control the batch size when you do that, and it is very efficient at moving data from one table to another. To use Import/Export wizard, right-click on the database name in SSMS object explorer and select Tasks -> Import Data and follow through the wizard's instructions.

What you have there will just keep inserting the TOP 5000 arbitrary rows from DB_SourceName. You need some way to limit the rows, and there are several ways to do that. One is to use an ID column, assuming the table has one, or the ROW_NUMBER() function to generate one. Another is to use an OUTER JOIN or NOT EXISTS against the destination so you won't insert duplicate rows, etc.
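Here's a rough sketch of the NOT EXISTS approach. Table and column names (DB_SourceName, DB_DestName, Field1–Field3) are placeholders for whatever your actual schema uses, and I'm assuming all three fields together define a unique row:

```sql
-- Loop until a batch inserts nothing, i.e. every distinct row has been copied.
DECLARE @RowCount INT = 1;

WHILE @RowCount > 0
BEGIN
    INSERT INTO DB_DestName (Field1, Field2, Field3)
    SELECT TOP (5000) s.Field1, s.Field2, s.Field3
    FROM DB_SourceName AS s
    WHERE NOT EXISTS (
        SELECT 1
        FROM DB_DestName AS d
        WHERE d.Field1 = s.Field1
          AND d.Field2 = s.Field2
          AND d.Field3 = s.Field3
    );

    -- @@ROWCOUNT is the number of rows the INSERT just affected;
    -- when it hits 0 the loop terminates.
    SET @RowCount = @@ROWCOUNT;
END;
```

The key point is that each pass excludes rows already in the destination, so the same TOP 5000 can't be picked up twice.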

What you could do is add an identity column to the staging table and then loop based on that. I still don't get the reason for doing 5000 inserts per batch -- is it because of the large data volume? At the end of each population you can truncate the staging table to prepare it for the next data load, so the identity will reset.
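A sketch of the identity-column loop, assuming you've added an identity column called RowID to the staging table (names here are illustrative). Walking a numeric range avoids re-scanning rows you've already copied:

```sql
-- Copy the staging table in 5000-row windows keyed on the identity column.
DECLARE @LastID INT = 0, @MaxID INT;

SELECT @MaxID = MAX(RowID) FROM DB_SourceName;

WHILE @LastID < @MaxID
BEGIN
    INSERT INTO DB_DestName (Field1, Field2, Field3)
    SELECT s.Field1, s.Field2, s.Field3
    FROM DB_SourceName AS s
    WHERE s.RowID > @LastID
      AND s.RowID <= @LastID + 5000;

    -- Advance the watermark; gaps in the identity just mean a smaller batch.
    SET @LastID = @LastID + 5000;
END;
```

Note this only controls batching; it doesn't deduplicate by itself, so you'd still need a uniqueness check (or dedupe the staging table first) if the source contains repeats.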

I tried the SSIS route, and it was so easy to set up, but unfortunately it errored out after the first 1,071 rows and I can't figure out why. I even have a data conversion process between the source and destination (nvarchar to int), and the process stops there.

So now I'm running the process with the query above in batches of 5000, but using a LEFT JOIN ON source.XYZ = dest.XYZ WHERE dest.XYZ IS NULL.
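For reference, the LEFT JOIN variant I'm running looks roughly like this (XYZ and the other column names stand in for my real ones):

```sql
DECLARE @RowCount INT = 1;

WHILE @RowCount > 0
BEGIN
    INSERT INTO DB_DestName (XYZ, Field2, Field3)
    SELECT TOP (5000) s.XYZ, s.Field2, s.Field3
    FROM DB_SourceName AS s
    LEFT JOIN DB_DestName AS d
        ON s.XYZ = d.XYZ
    WHERE d.XYZ IS NULL;  -- only rows not yet in the destination

    SET @RowCount = @@ROWCOUNT;
END;
```

One thing to check if this crawls: without an index on the destination's join column, every batch has to scan all 15.2 million destination rows to find the non-matches, so something like `CREATE INDEX IX_DestName_XYZ ON DB_DestName (XYZ);` can make a big difference.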

This query is running super slow. I'm at the 45-minute mark and it's not even halfway through! Any suggestions? I have to do it in batches, by the way, because the DBAs got a little PO'd :D

Is any of Field1, Field2, and Field3 unique on its own? Or a combination of any two? Or do all three together make a single record unique? If none, then is there any unique key/field in the DB_SourceName table? If the source table has one, use its unique key for looping purposes.