Find Duplicates in Multiple Worksheets

28
692

The ExcelHelpDesk Support team received the following request for help

Problem – Duplicate Names in multiple Worksheets

What is the best way to find duplicates in a workbook with over 40 worksheets? This workbook has over 5,000 names separated into different groups (worksheets). We use this list as a basis for invitation lists. We usually have over 100 names duplicated and are trying to find the best way to highlight the duplicates – we have tried many different methods in the past, but none that finds all the dupes. Please help…

For this problem we received a copy of the workbook from the subscriber. We then developed a specific macro that went through each of the worksheets and determined if any names included were duplicated. Find below a screen shot of the existing lists and worksheets. Then the code that was developed to find the duplicates.

You can use the same code in your own workbook if you have a similar problem. Let us know if you need help in applying the sample code to your situation.

Example of Names and Worksheets with Duplicates

See below an example of the Names provided for Duplicate comparison.

and a screenshot of the Worksheets provided in the sample

Step 1 – Determine approach to find Duplicates

A couple of approaches were possible to find these duplicates, the first involved transferring all names and details to a single worksheet and then re-sorting the list. This would then identify the duplicates by repeated entries close together. The second approach and then one we decided to implement for this post involved the creation of a new worksheet called the “Duplicate Report” on this worksheet each name is included and if a duplicate entry is found a count is increased and the worksheet on which it was found is added to the list of worksheets for that name.

This method allowed for the extension of the workbook and re-use in future years.

Step 2 – VBA Code to search for and report Duplicates

Now here is the VBA Code developed to search for and create the duplicate report. This can be re-used for similar applications in other workbooks. The code is documented to show how each of the key functions are implemented and how the result is dervied.

Sub FindDuplicates()
'This routine will go through each worksheet in the workbook and if
'the worksheet contains names in the expected format, those names will be compared
'to the existing list, if the name is not found it will be included to the list
'if it is found the count will be increased and the worksheet name added to a list of
'worksheet names for that name

Dim myOutputRange As Range
Dim myRange As Range
Dim mySearchRange As Range
Dim i As Integer
Dim j As Integer
Dim k As Integer
Dim intNumberRecordsRead As Integer
Dim intNumberOfDuplicates As Integer
Dim myWorksheet As Worksheet
Dim blnFoundDuplicate As Boolean

Const strStartRange = "A1"
Const strStartText = "Title"

'First lets clear the previous report results
Set myOutputRange = ThisWorkbook.Worksheets("Duplicate Report").Range("DuplicateReportStartHeading")
Set myOutputRange = Range(myOutputRange.Offset(1, 0), myOutputRange.Offset(1, 0).End(xlToRight))
Set myOutputRange = Range(myOutputRange, myOutputRange.SpecialCells(xlLastCell))
myOutputRange.Clear

ThisWorkbook.Worksheets("Duplicate Report").Range("DuplicateReportLastRunDate").Value = "Running Report....Wait"
ThisWorkbook.Worksheets("Duplicate Report").Range("DuplicateReportNumberOfRecordsRead").Value = 0
ThisWorkbook.Worksheets("Duplicate Report").Range("DuplicateReportNumberOfDuplicates").Value = 0

'Now lets go through the workbook and find each of the worksheets in the right format to create the duplicate list
Set myOutputRange = ThisWorkbook.Worksheets("Duplicate Report").Range("DuplicateReportStartHeading")
Set mySearchRange = ThisWorkbook.Worksheets("Duplicate Report").Range("DuplicateReportStartHeading")
i = 1
For Each myWorksheet In ThisWorkbook.Worksheets
If myWorksheet.Name <> "Duplicate Report" Then
If myWorksheet.Range(strStartRange).Value = strStartText Then
Set myRange = myWorksheet.Range(strStartRange)
j = 1
Do While myRange.Offset(j, 1).Value <> ""

'Search for the record if it already exists then update the counts
k = 1
blnFoundDuplicate = False
Do While mySearchRange.Offset(k, 1).Value <> ""
If mySearchRange.Offset(k, 8).Value = myRange.Offset(j, 0).Value & " " & myRange.Offset(j, 1).Value & " " & myRange.Offset(j, 2).Value Then
mySearchRange.Offset(k, 6).Value = mySearchRange.Offset(k, 6).Value + 1
mySearchRange.Offset(k, 7).Value = mySearchRange.Offset(k, 7).Value & ", " & myWorksheet.Name
intNumberOfDuplicates = intNumberOfDuplicates + 1
blnFoundDuplicate = True
Exit Do
End If
k = k + 1
Loop

'If the record did not already exist then write it to the report
If Not blnFoundDuplicate Then
myOutputRange.Offset(i, 0).Value = myRange.Offset(j, 0).Value
myOutputRange.Offset(i, 1).Value = myRange.Offset(j, 1).Value
myOutputRange.Offset(i, 2).Value = myRange.Offset(j, 2).Value
myOutputRange.Offset(i, 3).Value = myRange.Offset(j, 3).Value
myOutputRange.Offset(i, 4).Value = myRange.Offset(j, 4).Value
myOutputRange.Offset(i, 5).Value = myRange.Offset(j, 5).Value
myOutputRange.Offset(i, 6).Value = 1
myOutputRange.Offset(i, 7).Value = myWorksheet.Name
myOutputRange.Offset(i, 8).Value = myRange.Offset(j, 0).Value & " " & myRange.Offset(j, 1).Value & " " & myRange.Offset(j, 2).Value
i = i + 1
End If

j = j + 1
Loop
intNumberRecordsRead = intNumberRecordsRead + j - 1
End If
End If
Set myOutputRange = Range(myOutputRange, myOutputRange.End(xlToRight))
Set myOutputRange = Range(myOutputRange, myOutputRange.SpecialCells(xlLastCell))
With myOutputRange
.Cells.Sort Key1:=.Columns(7), Order1:=xlDescending, Header:=xlYes, _
Key2:=.Columns(3), Order2:=xlAscending, _
Key3:=.Columns(2), Order3:=xlAscending
End With
Set myOutputRange = ThisWorkbook.Worksheets("Duplicate Report").Range("DuplicateReportStartHeading")
Next

ThisWorkbook.Worksheets("Duplicate Report").Range("DuplicateReportLastRunDate").Value = Now()
ThisWorkbook.Worksheets("Duplicate Report").Range("DuplicateReportNumberOfRecordsRead").Value = intNumberRecordsRead
ThisWorkbook.Worksheets("Duplicate Report").Range("DuplicateReportNumberOfDuplicates").Value = intNumberOfDuplicates

MsgBox "Duplicate Report has Completed", vbInformation, "Report Complete"

End Sub

Step 3 – Run the routine and review the Results

Now the routine can be run via the Tools > Macros options or you can create a Command Button on the worksheet to run the routine. Refer below to sample output from this routine to list duplicate names found in the workbook.

Here the report shows the names for each of the entries in the worksheets within the workbook. It also reports on the worksheet in which the name was found and how many times it was found.

Following many requests for assistance on this post we have now provided a link to the sample file so that you can apply this routine to your own situation. Click on the following link to download the sample file

Download Sample File

If you have a question on this post for the Excel Help Desk team or have something you would like to share on this topic then please leave a comment