How To Find Duplicate Files In Excel
Finding duplicate records in Excel is easy. The following procedure finds duplicates using Excel’s easy-to-use filter feature.
- In the record set, select any cell.
- On the Data menu, click Filter, and then click Advanced Filter to open the Advanced Filter dialog box.
- In the Action area, select Copy To Another Location.
- Then, in the Copy To control, enter a copy range.
- Check Unique Records Only, and then click OK.
At this point, Excel copies a filtered list of unique records to the range you specified in Copy To. If you want to delete the duplicates, you can now replace the original record set with the copied list.
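As a rough illustration of what the filter produces, the following Python sketch keeps the first occurrence of each record and drops later copies. The function name `unique_records` and the sample data are hypothetical, and the sketch only approximates Excel's behavior (for example, it compares text case-sensitively).

```python
def unique_records(rows):
    """Return rows with duplicates removed, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # a whole row is the unit of comparison
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical sample data: the third record repeats the first.
records = [
    ("Ann", "Lee"),
    ("Bob", "Ray"),
    ("Ann", "Lee"),
    ("Cy", "Poe"),
]
print(unique_records(records))
```

The original list is left untouched, which mirrors copying the filtered list to another location rather than filtering in place.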
Finding duplicates in a single column or across multiple columns is a bit more difficult. To highlight duplicates in a single column, use conditional formatting:
- On your worksheet, select the first data cell in the column.
- From the Format menu, choose Conditional Formatting.
- In the first control’s drop-down list, choose Formula Is.
- Then enter the COUNTIF formula that tests for duplicates.
- Click the Format button to specify the appropriate format.
The conditional format highlights the values that have been duplicated. If you want to highlight only the copies and leave the first occurrence of each value unformatted, enter the formula =COUNTIF($A$2:$A2,A2)>1.
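The expanding range $A$2:$A2 counts only the rows from the top of the list down to the current row, so the first occurrence of a value has a count of 1 and is not flagged. A minimal Python sketch of that logic, next to a variant that flags every occurrence, might look like this. The function names and sample values are hypothetical, and unlike COUNTIF the comparison here is case-sensitive.

```python
from collections import Counter


def flag_repeats(values):
    """Flag the second and later occurrences, like COUNTIF($A$2:$A2,A2)>1."""
    seen = Counter()
    flags = []
    for value in values:
        seen[value] += 1              # running count up to the current row
        flags.append(seen[value] > 1)
    return flags


def flag_all_duplicates(values):
    """Flag every occurrence of a value that appears more than once."""
    counts = Counter(values)          # count over the whole column
    return [counts[value] > 1 for value in values]


names = ["Ann", "Bob", "Ann", "Cy", "Ann"]
print(flag_repeats(names))
print(flag_all_duplicates(names))
```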
Conditional formatting works well for a single column. To find duplicates across multiple columns, use two expressions: one that concatenates the columns you are comparing and one that counts the duplicates.
You can insert a space character between the two names if you like, but it is not required. Copy the formula down to accommodate the rest of the list.
Next, enter an IF formula built on the test COUNTIF(D$2:D$7,D2)>1, and then copy it down to accommodate the rest of the list.
Notice that the worksheet now has a new record in row 6. This record duplicates a first name but not a last name.
The conditional format highlights the first name because it is a duplicate within the first column. The combined values in the second and third columns, however, are not identified as duplicates, because the first and last names together are not repeated.
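The concatenate-then-count approach can be sketched in Python as follows. This is an illustration under stated assumptions rather than a reproduction of the worksheet: the function name `flag_combined_duplicates`, the `sep` parameter, and the sample names are all hypothetical.

```python
from collections import Counter


def flag_combined_duplicates(rows, sep=" "):
    """Flag rows whose combined column values appear more than once."""
    keys = [sep.join(row) for row in rows]   # step 1: concatenate the columns
    counts = Counter(keys)                   # step 2: count each combined value
    return [counts[key] > 1 for key in keys]


# Hypothetical sample data: the last row repeats a first name only.
people = [
    ("Ann", "Lee"),
    ("Bob", "Ray"),
    ("Cy", "Poe"),
    ("Ann", "Lee"),
    ("Ann", "Ray"),
]
print(flag_combined_duplicates(people))
```

Although a separator is optional, it can prevent accidental matches: without one, the pairs ("ab", "c") and ("a", "bc") would both concatenate to "abc".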