How to compare two columns in Excel and delete duplicates (highlight, color, move)

It will take you about 10 minutes to read this article and in the next 5 minutes (or even faster if you choose the 2nd solution described in the article) you will easily compare two Excel columns for duplicates and remove or highlight the found dupes. Okay, the countdown is started!

Excel is a very powerful and really cool application for creating and processing large arrays of data. Now that you have lots of workbooks with a pool of data, or maybe just one huge table, you may want to compare 2 columns for duplicates and then do something with found entries, for example delete duplicate rows, color dupes or clear the contents of duplicated cells. These two columns may be located in one table, contiguously or non-contiguously, or they may reside in 2 different worksheets or even workbooks.

Say, you have 2 columns with people names - 5 names in column A and 3 names in column B, and you want to compare data between these two columns to find duplicates. As you understand, this is bogus data just for a quick example; in real worksheets you usually have thousands and tens of thousands of entries.

Variant A: Both columns are located on one sheet, in a single table: Column А and Column B

Variant B: Two columns are located on different sheets: Column A in Sheet2 and Column A in Sheet3

The built-in Remove Duplicate tool available in Excel 2016, Excel 2013 and 2010 cannot handle this scenario because it cannot compare data between 2 columns. Furthermore, it can only remove dupes, no other choice such as highlighting or coloring is available, alas :-(.

Further on, I am going to describe 2 possible ways of comparing two Excel columns that let you find and remove duplicate entries:

Compare 2 columns to find duplicates using Excel formulas

Variant A: both columns are on the same list

In the first empty cell, in our example this is Cell C1, write the following formula:

=IF(ISERROR(MATCH(A1,$B$1:$B$10000,0)),"Unique","Duplicate")

In our formula, A1 is the first cell of the first column that we want to use for comparison. $B$1 and $B$10000 are the addresses of the first and the last cell of the 2nd column that you want to compare against.Pay attention to the absolute cell reference - dollar signs ($) preceding the column letters and row numbers. I use the absolute reference on purpose, in order for the cell addresses to remain unchanged when copying the formula.

If you want to find dupes in Column B, swap the column names so that the formula looks like this:=IF(ISERROR(MATCH(B1,$A$1:$A$10000,0)),"Unique","Duplicate")

Instead of "Unique"/"Duplicate" you can write your own labels, e.g. "Not found"/"Found", or leave only "Duplicate" and type "" instead of "Unique". In the latter case, you will have empty cells next to cells for which duplicates were not found, I believe such presentation is more convenient for data analysis.

Now let's copy the formula to all cells of column C, up to the last row that contains data in column A. To do this, put the cursor to the lower right corner of cell С1, and the cursor will change to a black cross, as shown in the image below:

Click the left mouse button and holding it down drag the border downward selecting all cells where you want to copy the formula. When all needed cells are selected, release the left mouse button:

Tip: In large tables, it is faster to copy the formula using shortcuts. Click on cell C1 to select it and press Ctrl + C (to copy the formula to clipboard), then press Ctrl + Shift + End (to select all non-empty cells in Column C), and finally hit Ctrl + V (to paste the formula into all selected cells).

Awesome, all duplicated cells are flagged as "Duplicate":

Variant B: two columns are on different worksheets (workbooks)

In the 1st cell of the 1st empty column in Sheet2 (column B in our case), write the formula:

=IF(ISERROR(MATCH(A1,Sheet3!$A$1:$A$10000,0)),"","Duplicate")

Where Sheet3 is the name of the sheet on which the 2nd column is located, and $A$1:$A$10000 are the addresses of the first and last cells of that 2nd column.

Working with found duplicates

Perfect, we have found the entries in the first column (Column A) that also exist in the second column (Column B). Now we need to do something with them :)

It would be rather ineffective and would take too much time to look through the entire table and review the duplicate entries manually. There are much better ways.

Show only duplicated rows in Column A

If your columns do not have headers, you need to add them. To do this, put the cursor on the number indicating the 1st row and it will change to a black arrow as shown in the screenshot:

Right click the selected row and choose "Insert" from the context menu:

Give names to your columns, e.g. "Name" and "Duplicate?". Then switch to the Data tab and click Filter:

After that click a tiny grey arrow next to "Duplicate?" to open a drop down list, uncheck all items other than Duplicate in that list, and click OK:

That's it, now you see only those cells of Column A that have duplicated values in Colum B. There are only three such cells in our test worksheet, as you understand in real sheets there are likely to be more, far more of them:

In order to display all rows of Column A again, click the filter symbol in Column B that now looks like a funnel with a tiny arrow and check "Select all". Alternatively, you can do the same via Data tab -> Select & Filter -> Clear, as shown in the screenshot:

Color or highlight found duplicates

If the "Duplicate" flag does not suffice for your purposes and you want to mark duplicated cells by font color or fill color or in some other way…

Then filter the duplicates as explained above, select all filtered cells and press Ctrl + F1 to open the Format Cells dialog box. As an example, let's change the background color of duplicated rows to bright yellow. Of course, you can change the background color of cells using the Fill color option on the Home tab, but the advantage of the Format Cells dialog box is that it lets you make all formatting changes at a time:

Now you definitely won't miss a single duplicated cell:

Remove duplicates from the first column

Filter your table so that only cells with duplicated values show up, and select all those cells.

If 2 columns you are comparing are located on different worksheets, i.e. in separate tables, right-click the selected range and choose "Delete Row" from the context menu:

Click OK when Excel will ask you to confirm that you really want to "Delete entire sheet row" and then clear the filter. As you can see, only the rows with unique values are left:

If 2 columns are located on one worksheet, next to each other (adjacent) or not touching each other (nonadjacent), the removing duplicates is a bit more complex. We cannot delete entire rows that contain duplicate values because this would delete corresponding cells in the 2nd column too. So, in order to leave only unique entries in Column A, you do the following:

Filter the table so that only duplicated cells are displayed and select all those cells. Right click the selection and choose "Clear contents":

Clear the filter.

Select all cells in Column A starting from cell A1 up to the last cell that contains data.

Go to the Data tab and click Sort A to Z. In the dialog window that opens, choose "Continue with the current selection" and click Sort:

Delete the column containing the formula because you do not need it any longer, only "Uniques" are left there by now.

That's all, now Column A contains only unique data that do not exist in Column B:

As you see, it's not so difficult to remove duplicates between two Excel columns using formulas. Though it's very time-consuming and boring process to write and copy the formula, apply and clear the filter every time you need to compare 2 columns in your worksheets. The other solution I am going to bring to your attention is much simpler and will take just a fraction of time we've spent on the first method. I believe you will find more pleasant things to spend the saved time on ;)

Compare 2 Excel columns for duplicates using a visual wizard

Here are 10 quick and easy steps to compare two columns in your Excel worksheet to find duplicates:

Close all open Excel windows, extract the files from a zip folder, run setup.exe and the Setup Wizard will walk you through the installation:

Open the worksheet (or worksheets) where the columns you want to compare are located.

Select any cell within the 1st column, switch to the newly added Ablebits Data tab and click the "Compare Two Tables" button to start the wizard:

On step 1 of the wizard, you will see that your first column is already selected, so simply click Next.

Note. If you want to compare not simply 2 columns, but 2 tables by two or more matching columns, you need to select the entire first table on this step.

On step 2 of the wizard, select the 2nd column that you want to compare against, we choose Sheet3 from the list of sheets. The smart wizard will select the 2nd column automatically, if for some reason this does not happen, click on the Select range icon. (If you are comparing entire tables, select the whole 2nd table):

Choose to Find Duplicate values:

Select the pairs of columns you want to compare. Check the checkbook next to Column A and select Column A from the drop-down list in the right-hand part of the wizard, under Table 2 columns:

Tip. You can compare not only 2 columns but 2 tables with multiple columns. At that, you can compare by several pairs of columns at the same time, e.g. Column A in the 1st table against Column A in the 2nd table, and Column B in the 1st table against Column D in the 2nd table, etc. For more details, please see How to remove duplicates from two Excel spreadsheets.

And finally, you are to decide what you want to do with found dupes. You can choose to delete the duplicate entries, move or copy them to another worksheet, add a status column (the result will be similar to our first solution with Excel formulas), highlight duplicates, or simply select all cells with duplicated values:

Tip. Do not choose to delete duplicates especially if you are using the wizard for the first time. You'd better choose to move dupes to another worksheet, because when you work with real data you may want to cast at least a quick glance at the entries recognized as duplicates, especially if you are comparing by several columns in large tables. It may happen that you accidentally forgot to select a certain column when specifying matching columns, and if that column contained unique data, it would be really frustrating to irretrievably lose it.

Click Finish and enjoy the result. What we have now is a nice, clean table with no duplicates:

Remember the previous solution and feel the difference :) It is fast and easy indeed to dedupe your worksheets using the Duplicate Remover add-in. In fact, it will take you less time than you've spent on reading this article. The clock is ticking, so hurry up and download Excel Duplicate Remover right now!

If you have questions or something is left unclear, please drop me a comment and I'll happily elaborate further. Thank you for reading!

If you want to quickly find the previous occurrence, the easiest way is apply Autofilter to your table and filter according to the necessary value when it is needed. This variant is more convenient then using Find All in the column.

Thank you for this guide. Using the formulas, is it possible to after comparing the two columns and identifying that the value is unique, instead of returning a text value of "unique" or "duplicate" to return the unique value itself?

these formulas are very helpful, but does anyone have a way to find duplicates in two columns when the data isn't an exact match? I'm comparing ISBN's, and column a contains multiple values in each cell. Column b contains one value (ISBN) per cell. So, I really need to find cells in column a that partially match the value of cells in column b. Make sense?

Hi,
I have two columns as below. Would like to know how I can get the unique data from these two columns in a third column. I have provided below the sample data for merging and the expected data as well. Sample data is in Column A & B while unique data should be in Column C.

I have 2 columns with some duplicate data within one column and need to eliminate the duplicated portion of the second column. Can anyone please tell me how to do this without manually doing so? See below example; the S.A. Comunale Co. needs to be removed from the second column

First Column Second Column
S.A. Comunale Co. Mr. Art Pecca Comunale Co.
United Way Of St. Paul Mr. Sam Gild United Way Of St. Paul
Basley's Point Sea-Doo Mr. Paul Innoce Beesley's Point Sea-Doo

Please help! I have data in 2 columns on same worksheet. There is a portion within the second column that is duplicate of the first and I need to eliminate the portion that is duplicate of the first. Is there a way to do this without having to go through the excel spreadsheet manually and do?
See example below:

HI Alex, I am not sure if I am missing or not understanding a step but I am trying to compare addresses. I have set up my addresses on two different sheets in Excel and I see it populate the duplicated lists but when I compare my original list to the list that shows duplicates, I see that not all of the addresses are accurately marked. Do you have any additional pointers?

For me to understand the problem better, please send me a small sample workbook with your source data and the result you expect to get to support@ablebits.com. Please don't worry if you have confidential information there, we never disclose the data we get from our customers and delete it as soon as the problem is resolved.
Please also don't forget to include the link to this comment into your email.

Post a comment

Unfortunately, due to the volume of comments received we cannot guarantee that we will be able to give you a timely response. When posting a question, please be very clear and concise. We thank you for understanding!