Hi, I have been attempting to use the fuzzy match tool for something, and I have had no luck. I tried to use RegEx formulas but apparently Fuzzy Match is the only tool in Alteryx that will return values that have close matches, such as 80% match.
Here are my details:
- I have an Excel (xslx) file that has two tabs. The first tab is called First Copy and it has 2 columns ("Names" and "ID"). The second tab is called Organization. In that tab, I have 3 columns ("Level", "Organization Name", "Date"). "Names" from the first tab and "Organization Name" from the second tab have the exact same information, but some of the wording is slightly different for some of the cells. Some are already exact matches. In the second tab, "Level" is the column that has the parent/child entity numbering. For example, here is what column A & B look like in the second tab:
| 1 | Marketplace LLC |
| 2 | Engine Limited |
| 3 | Gallery |
| 3 | Moon |
| 3 | Star |
| 3 | Dynamic |
| 2 | Chair |
| 3 | Email |
| 3 | Dataset |
| 3 | Custom Tools |
| 4 | Fuzzy |
- "1" means Parent entity. "2" means child entity, "3" etc means child of that child. So in this example, "Marketplace LLC" is the overall parent entity. "Chair" is a child of Marketplace LLC that has its own children, such as Email, Dataset, and Custom Tools. Then, Custom tools has its own child - Fuzzy. And there is a huge list of these.
Here is where I am getting stuck / what the ask is:
-In the first tab in my file, there is no "Level" column that links these together. So, I've been working on a small workflow. It needs to have some sort of join/fuzzy match that compares Column A from the first tab and Column B from the second tab (so just the names). If there is no close match, it returns a blank. At the end, I want it to return 2 columns (Org Names and First Copy). The parent entity with all of it's children and the children's children underneath should be returned. They are already in order in the file. There are also some cases where an entity is in the first tab but not in the second tab. That is when it should return blank in the column of the tab is wasn't found in. For example, if "Vision LLC" was found in the first tab but not the second, it should return "Vision LLC" in First Copy column and "Not found" or a null in the Org Names column.
- My Alteryx workflow so far is two inputs (for each tab), data cleansing tools on each. I tried various methods (appending/unioning/joining/etc) and then attaching the fuzzy match tool, but it never works. The tool itself is hard for me to navigate as well. Any help here would be greatly appreciated, thank you.