Removing duplicates from a table

So in my day job I was asked to investigate why some data was not matching up. After a bit of investigation I found a table that had 90-95% of it’s records duplicated, so I needed a way of removing them quickly. That’s where the following code snippet came in handy:

Removing duplicates from a table

Transact-SQL

1

2

3

4

5

6

7

8

9

10

;WITHDuplicatesAS(

SELECT

FirstCol,

SecondCol,

ROW_NUMBER()OVER(PARTITIONBYFirstCol,SecondColORDERBYDateValue)ASRN

FROM

MyTable

)

DELETEDuplicates

WHERERN>1

It’s a simple CTE that assigns a row number to each record. We can then issue the DELETE command to any row number over 1. If you have a DATETIME column, you could keep the latest value if you wanted.