I have to compare two columns in two different Excel sheets and delete the duplicates. I have come up with several strategies that work, but I am searching for the most efficient way to do this. At first, I was sorting the columns and looping through them in the manner of a sequential update and deleting when the cell values were equal. But I discovered here an article by Mike Rosenblum that explained that it is much more efficient to stick the columns into an array and iterate them from there. So I was able to do that. There was a complication in that once I had deleted a row, then the array and the spreadsheet were out of sync. So I had to compensate by incrementing a variable each time a row was deleted, and subtracting this variable from the array index to calculate the row to be deleted in the spreadsheet.

So now I am wondering whether there is a better strategy. I thought that if I could merely hide the rows to be deleted and then find a way to delete all hidden rows at once after the loop was over, then this would eliminate the problem of the array and the spreadsheet getting out of sync and would make the code easier to understand. I thought of putting the row numbers of the duplicates into another array and deleting the rows after the comparing loop. But this creates the same problem of having to compensate the row numbers in the array.

So what I am looking for is a way to delete all the rows that need to be deleted at once without having to loop. Does anyone have a strategy I could use to solve this problem in a more efficient manner?

Thanks, I found the answer to my question elsewhere (on the Excel VBA board). The answer is to sort the worksheets on the key field in Descending order and iterate the array from the bottom up. When a row is deleted from the sheet during the iteration, the rows are shifted upward. That leaves the numbering of the remaining rows above in the same position they were before thus eliminating the problem of having to compensate for deleted rows.

Here is the code to demonstrate:

Code:

'Put the Listing Numbers of MainList and Act-1 into arrays and iterate them from there. This prevents the overhead of creating a
'Range Object for each cell in the iteration and speeds up execution considerably
With oMLSheet
oRng = .Range(.Cells(2, 1), .Cells(iFinalRow, 1))
End With
MainListArray = CType(oRng.Value, Object(,)) 'Fill MainListArray with Listing Numbers from MainList Sheet
iColumn = CInt(oAct1Sheet.Range("G2").Column) 'Column "G" = 7
'Sort Act-1 on Listing Number Descending (higher numbers at the top
iFinalRow = FindLastDataRow(oAct1Sheet)
SortSub(oAct1Sheet.Range(oAct1Sheet.Cells(1, 1), oAct1Sheet.Cells(iFinalRow, iColumn)), oAct1Sheet.Range("A1"), XlSortOrder.xlDescending, _
XlYesNoGuess.xlYes, XlSortDataOption.xlSortNormal, XlSortOrientation.xlSortColumns)
With oAct1Sheet
oRng = .Range(.Cells(2, 1), .Cells(iFinalRow, 1))
End With
Act1Array = CType(oRng.Value, Object(,)) 'Fill Act1Array with Listing Numbers from Act1 Sheet
iMainList = MainListArray.GetUpperBound(0) 'We are going to start at the bottom and work up
iAct1 = Act1Array.GetUpperBound(0)
'Read Act1Array to Beginning from end
Do While iAct1 > 0 'Loop over the arrays from bottom to top; Array is 1 based
If MainListArray(iMainList, 1) < Act1Array(iAct1, 1) Then
iMainList -= 1
ElseIf MainListArray(iMainList, 1) > Act1Array(iAct1, 1) Then
iAct1 -= 1
Else 'They have to be equal
oAct1Sheet.Rows(iAct1 + 1).Delete(Shift:=XlDirection.xlUp) 'IAct1 +1 because the array is 1 based
txtItemsDeleted.AppendText(oAct1Sheet.Range("A" & (iAct1 + 1).ToString()).Value.ToString() & vbLf)
iMainList -= 1
iAct1 -= 1
End If
'In case last entry in Act1 is greater than last entry in MainList; Otherwise MainListArray index would go out of bounds
If iMainList < 1 Then Exit Do
Loop