Alteryx Designer Desktop Discussions

Fuzzy match, Duplicates

gmamtani
8 - Asteroid

Hello everyone, I'm working on a process to update our main hierarchy. We have two sample workbooks for this.

  1. Raw Data Input: This workbook contains details like Vendor, Product, Buyer, About Buyer, and Dates. From this raw data, we create two hierarchies: Competitor hierarchy and Product hierarchy.

  2. Hierarchy Input: This workbook includes fields such as Competitor Code, Competitor Name, City, and Country or Territory.

Here's what we do:

  • We clean up the Vendor (Competitors' Names) field to find unique names and use them for the hierarchy. Then, we assign Competitor Code, City, and Country or Territory.
  • We also identify unique competitor products from this sheet to create the product hierarchy.
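To make the two steps above concrete outside of Alteryx, here is a minimal Python sketch. It is only an illustration: the field names `Vendor` and `Product` come from the raw data workbook, while the sample rows ("Acme Corp", "Video Ads", "Widgets") and the helpers `clean_name` and `unique_values` are made up for the example.

```python
def clean_name(name: str) -> str:
    """Trim the value and collapse runs of internal whitespace."""
    return " ".join(name.split())


def unique_values(rows, field):
    """Return cleaned, case-insensitively de-duplicated values of `field`.

    The first spelling seen is kept, and first-seen order is preserved.
    """
    seen = {}
    for row in rows:
        value = clean_name(row[field])
        seen.setdefault(value.casefold(), value)
    return list(seen.values())


# Hypothetical sample rows standing in for the raw data workbook.
raw_rows = [
    {"Vendor": "SpotX ", "Product": "Video Ads"},
    {"Vendor": "spotx", "Product": "Video Ads"},
    {"Vendor": "Acme  Corp", "Product": "Widgets"},
]

competitors = unique_values(raw_rows, "Vendor")   # input to the Competitor hierarchy
products = unique_values(raw_rows, "Product")     # input to the Product hierarchy
```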

We download these datasets once a month. However, we're facing a challenge: the data provider changed a company's name, so the same company now appears twice in the hierarchy. For instance, the hierarchy contains "SpotX", but the raw data now lists it as "SpotX (by Magnite)." The product names are unchanged. I know these are the same company, but I need the process to handle such renames automatically.

I'm seeking help to resolve this issue and streamline the process. Your assistance would be greatly appreciated.

 

Let me know if anything is unclear; I have tried my best to explain the situation.
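For illustration, the kind of matching described above can be sketched in Python as follows. This is a rough sketch under stated assumptions, not an Alteryx workflow: it assumes renames mostly add a parenthetical suffix such as "(by Magnite)", the helpers `normalize` and `match_to_hierarchy` are hypothetical names, the 0.85 threshold is an arbitrary starting point, and "Acme Corp" is a made-up hierarchy entry.

```python
import re
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, drop parenthetical suffixes like "(by Magnite)", collapse spaces."""
    without_parens = re.sub(r"\s*\([^)]*\)", "", name)
    return " ".join(without_parens.casefold().split())


def match_to_hierarchy(raw_name, hierarchy_names, threshold=0.85):
    """Return the closest existing hierarchy name, or None if nothing is close enough.

    The threshold is an assumption and would need tuning on real data.
    """
    target = normalize(raw_name)
    best_name, best_score = None, 0.0
    for candidate in hierarchy_names:
        score = SequenceMatcher(None, target, normalize(candidate)).ratio()
        if score > best_score:
            best_name, best_score = candidate, score
    return best_name if best_score >= threshold else None


# Hypothetical hierarchy; only "SpotX" comes from the example in this post.
hierarchy = ["SpotX", "Acme Corp"]

renamed = match_to_hierarchy("SpotX (by Magnite)", hierarchy)  # maps to existing entry
unknown = match_to_hierarchy("Brand New Vendor", hierarchy)    # no close match
```

A name that maps to an existing entry would reuse that entry's Competitor Code, while a `None` result would flag a genuinely new competitor for manual review.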
