
Steps To Use Fuzzy Lookup in Excel

This document provides steps to use Fuzzy Lookup in Excel to identify potential duplicate records in a dataset. It involves creating a data table, running Fuzzy Lookup to generate matches, filtering the output to identify true duplicates, and inserting columns in the original data to flag potential duplicates and their similarity scores. The process allows identifying duplicate records with similar but not identical values.

Uploaded by

SJ SJ SJ

Steps to use Fuzzy Lookup in Excel
1. Open the data set in Excel

The data can be a CSV file or an Excel Workbook

2. Create data table in Excel


1. Select range to define your data table
2. Press [Ctrl + L] to open the Create Table dialog, then click OK
3. Give the data table a name, e.g. DATA_TABLE

Figure 1: Create data table from selected region

Figure 2: Naming the data table

3. Fuzzy Lookup Sheet


1. Add a new sheet to the Workbook
2. From the Fuzzy Lookup Ribbon, click the Fuzzy Lookup button
3. Select DATA_TABLE as both Left and Right tables to compare
4. Select C_CUST_NAME for both Left and Right columns to match
5. Click the Configure button
6. Adjust the EditTransformationThreshold value to 0.75 for the Default column configuration and click OK
7. Increase the number of matches to 5
8. Increase the similarity threshold value to 0.7
9. Click the Go button to generate the Fuzzy Lookup output
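
The comparison configured above can be sketched outside Excel as follows. This is only an illustration of a self-join with a similarity threshold of 0.7 and at most 5 matches per record. The sample records, the function names and the output field names are hypothetical, and difflib's SequenceMatcher is a stand-in scorer, not the algorithm the Fuzzy Lookup add-in uses, so the scores will differ from Excel's.

```python
from difflib import SequenceMatcher

# Hypothetical sample records; only the column names come from the steps above.
DATA_TABLE = [
    {"C_CUST_ACCT": "A001", "C_CUST_NAME": "Acme Trading Ltd"},
    {"C_CUST_ACCT": "A002", "C_CUST_NAME": "Acme Trading Limited"},
    {"C_CUST_ACCT": "A003", "C_CUST_NAME": "Bright Star Foods"},
    {"C_CUST_ACCT": "A004", "C_CUST_NAME": "Brite Star Foods"},
    {"C_CUST_ACCT": "A005", "C_CUST_NAME": "Northwind Supplies"},
]


def similarity(a, b):
    """Character-level similarity in [0, 1] (stand-in for the add-in's score)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_self_join(rows, column, threshold=0.7, max_matches=5):
    """Compare the table with itself, keeping up to max_matches per left row."""
    output = []
    for left in rows:
        scored = []
        for right in rows:
            score = similarity(left[column], right[column])
            if score >= threshold:
                scored.append((score, right))
        # Best matches first, then cap the number of matches per left row.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        for score, right in scored[:max_matches]:
            output.append({
                "left_acct": left["C_CUST_ACCT"],
                "right_acct": right["C_CUST_ACCT"],
                "Similarity": round(score, 4),
            })
    return output


matches = fuzzy_self_join(DATA_TABLE, "C_CUST_NAME")
for row in matches:
    print(row)
```

Because the table is compared with itself, every record also matches itself with a similarity of 1. Those self-matches are what the SAME? column in the next section is used to remove.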

4. Identifying potential duplicates


1. In the output sheet, hide all columns except the two C_CUST_ACCT columns (one from the left table, one from the right table) and the Similarity column.
2. Create another column called SAME? and enter the formula:
=H2=CM2
Fill down for all rows. Assuming H and CM are the two C_CUST_ACCT columns, the formula returns TRUE when a record has been matched with itself.
3. Add a filter to the columns:
a. In the SAME? column, exclude the TRUE rows so that only matches between different records remain
b. Sort by the Similarity column in descending order
4. Copy the remaining rows to a separate sheet
5. In the main data sheet, insert two columns to the right of column H
a. Name the first column Potential Duplicate?
b. Name the second column Similarity
6. In the Potential Duplicate? column enter the formula:
=IFERROR(VLOOKUP([@[C_CUST_ACCT]],Sheet3!$A$1:$C$125,2,FALSE),"")
Fill all rows. Sheet3 here is the sheet holding the rows copied in step 4; adjust the range so that it covers all of the copied rows.
7. In the Similarity column enter the formula:
=IFERROR(VLOOKUP([@[Potential Duplicate?]],Sheet3!$A$1:$C$125,3,FALSE),"")
Fill all rows, using the same lookup range as in step 6
8. Apply conditional formatting to the Similarity column
a. Select Data Bars > Gradient Fills > Red Data Bar
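
The filtering and lookup logic of this section can be sketched in a few lines of Python. This is a minimal illustration under assumptions: the rows in fuzzy_output and the account values are hypothetical, and vlookup is a hand-written helper that imitates VLOOKUP with FALSE (first exact match) wrapped in IFERROR, not an Excel or library function.

```python
# Hypothetical Fuzzy Lookup output: (left account, right account, similarity).
fuzzy_output = [
    ("A001", "A001", 1.00),
    ("A001", "A002", 0.89),
    ("A002", "A002", 1.00),
    ("A002", "A001", 0.89),
    ("A003", "A003", 1.00),
]

# Steps 2 and 3a: SAME? is TRUE when a record matched itself; drop those rows.
candidates = [row for row in fuzzy_output if row[0] != row[1]]

# Step 3b: sort by similarity, highest first.
candidates.sort(key=lambda row: row[2], reverse=True)


def vlookup(key, table, col_index, default=""):
    """Return column col_index (1-based) of the first row whose first column equals key."""
    for row in table:
        if row[0] == key:
            return row[col_index - 1]
    return default  # plays the role of IFERROR(..., "")


# Steps 5 to 7: flag each record in the main data with its potential duplicate.
main_data = ["A001", "A002", "A003"]
flagged = []
for acct in main_data:
    duplicate = vlookup(acct, candidates, 2)   # Potential Duplicate? column
    score = vlookup(duplicate, candidates, 3)  # Similarity column
    flagged.append((acct, duplicate, score))

for row in flagged:
    print(row)
```

As in the worksheet formulas, the second lookup is keyed on the potential duplicate, so it returns the score of that record's own top match. This equals the score of the pair only when the two records are each other's best match.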

Final Output
