This site uses different types of cookies, including analytics and functional cookies (its own and from other sites). To change your cookie settings or find out more, click here. If you continue browsing our website, you accept these cookies.
I have 2 columns side-by-side. They each include a name but in differing name formats as shown in the attached excel. Can anyone here help me with the login to add a third column that would tell me whether or not the names agree even though they're in differing formats?
- Unify the case - Using the Data Cleansing tool I turn both columns into Title Case
- Then we need to identify how many names each person has and only take the first and the last for both columns. I split the data and use a combination of the Text-to-columns tool to split the data into one row per name and then count how many there are using the Multi-row formula
- I then use the Summarize tool to get the Min and Max (First and Last names) and join the data back.
- I then bring my columns side by side again and compare them using a simple conditional
Fuzzy match might work best here. Please see the attached workflow and let me know if this helps!!
The challenge with Fuzzy, is you need to have what you're matching in the same column, hence the split (with select) and union. I'm concerned with the threshhold being set so low (if you click Edit in the Fuzzy Match tool, you'll see it's set to a 70% match threshhold.
Let me know if this helps! I'll continue to noodle on this