Can Alteryx compare data sets where there is a mismatch between some names, like:
1. Accenture PLC or Accenture Public Limited Company
2. SG Fund or Son Gate Funds.
That's a very big mismatch. If you're talking about similarity scores and checking, then you can use Fuzzy Matching on that.
But if you're talking about pure matching between two data sets, then you can try a Join tool to see what matches in the J output, and what doesn't in the R / L outputs.
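To make the J / L / R idea concrete outside of Alteryx, here is a minimal Python sketch of an exact-match join. The function name `exact_join` and the record layout are assumptions for illustration only; this is not how the Join tool is implemented, just the same three-way split.

```python
def exact_join(left, right, key):
    """Split two lists of dict records into J (matched pairs),
    L (left-only) and R (right-only), using exact equality on `key`."""
    # Index the right-hand records by key value.
    right_by_key = {}
    for rec in right:
        right_by_key.setdefault(rec[key], []).append(rec)

    joined, left_only = [], []
    matched_keys = set()
    for rec in left:
        matches = right_by_key.get(rec[key])
        if matches:
            matched_keys.add(rec[key])
            joined.extend((rec, m) for m in matches)
        else:
            left_only.append(rec)

    # Anything on the right whose key never matched falls out as R.
    right_only = [r for r in right if r[key] not in matched_keys]
    return joined, left_only, right_only


left = [{"name": "Accenture PLC"}, {"name": "SG Fund"}]
right = [{"name": "Accenture PLC"}, {"name": "Son Gate Funds"}]
J, L, R = exact_join(left, right, "name")
print(len(J), L, R)
```

With the names from the question, only identical strings land in J; "SG Fund" and "Son Gate Funds" fall out as L and R, which is why an exact join alone is not enough here.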
Yes, Alteryx Designer can be used to compare datasets for matches with high similarity scores for fields like names, even when there are variations or mismatches in the names. Alteryx provides a variety of tools and functionalities that can help in data preparation, cleansing, and matching.
To compare names with high similarity scores, you can use tools like:
Fuzzy Match Tool: This tool allows you to compare and match strings based on their similarity. It considers variations in spelling, typos, and other discrepancies. You can adjust the similarity threshold to capture matches with high similarity scores.
Join Tool alongside Fuzzy Matching: The Join tool itself matches only on exact values, so it does not perform a fuzzy match on its own. It is typically used after the Fuzzy Match tool, for example to bring the full records back in for the record pairs that Fuzzy Match flags as similar.
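As a rough illustration of what a similarity threshold does, here is a small Python sketch using the standard library's `difflib.SequenceMatcher`. This is an assumption-laden stand-in: it is not one of the matching algorithms the Fuzzy Match tool offers, and the helper names `similarity` and `fuzzy_pairs` and the 0.5 threshold are made up for the example.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_pairs(left_names, right_names, threshold=0.5):
    """Return (left, right, score) for every pair scoring at or above
    the threshold, best scores first."""
    pairs = []
    for l in left_names:
        for r in right_names:
            score = similarity(l, r)
            if score >= threshold:
                pairs.append((l, r, round(score, 2)))
    return sorted(pairs, key=lambda p: p[2], reverse=True)


pairs = fuzzy_pairs(
    ["Accenture PLC"],
    ["Accenture Public Limited Company", "Son Gate Funds"],
)
for pair in pairs:
    print(pair)
```

Raising the threshold gives fewer, safer matches; lowering it catches more variants but also more false positives, so it usually takes a few tries on real data.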
Here's a basic outline of the steps you might follow:
1. Input Data: Bring your datasets into Alteryx.
2. Data Cleansing: Cleanse the name fields to standardize the format (e.g., removing extra spaces, converting to uppercase).
3. Fuzzy Matching: Use the Fuzzy Match tool to score pairs of names, then a Join tool if you need to bring the full records back in for the matched pairs. Adjust the similarity threshold to capture matches with high similarity scores.
4. Output: Review the output to identify matched records and explore the similarity scores.
Keep in mind that the effectiveness of fuzzy matching depends on the specific use case and the nature of your data. You may need to experiment with different settings and tools to achieve the desired results.
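The outline above can be sketched end to end in plain Python. Treat this as a hedged illustration of the cleanse, match, and review steps rather than an Alteryx workflow: the `ABBREVIATIONS` table, the function names, and the 0.6 threshold are all hypothetical, and `difflib` is only a stand-in for the tool's own matching algorithms.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table; extend it for your own data.
ABBREVIATIONS = {"PLC": "PUBLIC LIMITED COMPANY"}


def cleanse(name):
    """Uppercase, drop punctuation, collapse extra spaces,
    and expand known abbreviations."""
    words = re.sub(r"[^A-Za-z0-9 ]+", " ", name).upper().split()
    return " ".join(ABBREVIATIONS.get(w, w) for w in words)


def best_matches(left_names, right_names, threshold=0.6):
    """For each left name, return (left, best right name or None, score).
    The right name is None when the best score is below the threshold."""
    cleaned_right = [(r, cleanse(r)) for r in right_names]
    results = []
    for l in left_names:
        cl = cleanse(l)
        best_name, best_score = None, 0.0
        for r, cr in cleaned_right:
            score = SequenceMatcher(None, cl, cr).ratio()
            if score > best_score:
                best_name, best_score = r, score
        if best_score < threshold:
            best_name = None
        results.append((l, best_name, round(best_score, 2)))
    return results


source_a = ["Accenture PLC", "SG Fund"]
source_b = ["Accenture Public Limited Company", "Son Gate Funds"]
results = best_matches(source_a, source_b)
for row in results:
    print(row)
```

Expanding abbreviations during cleansing does most of the work for a case like "PLC", since the two names become identical before any fuzzy scoring happens. Initialisms like "SG" are harder and rely on the threshold, so reviewing the scores in the output still matters.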
Hi Hammad,
I like your answer and would like to have a workflow that illustrates these steps. I am trying to compare datasets from various data sources for matches with a high similarity score on a specific named field.