Hi,
I have data from two different apps (the appointment sheet and the check-in sheet in the attached Excel file). People are supposed to make an appointment, get it approved, and then check in on the same day. They can submit multiple entries in both apps. I need two lists: people who checked in without an approved appointment on the same day, and people who had an approved appointment but didn't check in that day. I am not sure whether this calls for a fuzzy match or a fuzzy join. I would appreciate it if someone could help me solve this problem.
Thank you very much!
I assume you're trying to decide whether to use a fuzzy match or a join. The choice between the two mainly depends on whether the values you are matching on are exact matches across your datasets.
For example, let's say you have email addresses in both datasets. There really can't be differences in the values aside from case (which can be easily normalized with the Data Cleansing tool). Something like a person's name, on the other hand, can differ materially each time it is typed (spaces, punctuation, misspellings, etc.).
If you have exact matches, you can simply use the Join tool to see which names are common between the two datasets. If you have non-exact matches, that's when you'd want to use the Fuzzy Match tool. Even with the Fuzzy Match tool, though, you will still need a Join afterward: the fuzzy match essentially finds similar values in your dataset, and the join then brings the matched records together.
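To illustrate the difference outside of Alteryx, here is a minimal Python sketch. It is not how the Join or Fuzzy Match tools work internally; it uses the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure. The helper names, the sample values, and the 0.85 threshold are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def normalize(value: str) -> str:
    """Lowercase, trim, and collapse repeated whitespace."""
    return " ".join(value.lower().split())


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0 and 1 for two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


# Emails differ only by case here, so normalizing is enough for an exact join.
exact_match = normalize("Jane.Doe@Example.com") == normalize("jane.doe@example.com")

# Names can differ materially, so compare a similarity score to a threshold.
THRESHOLD = 0.85  # assumed cutoff; tune it against your own data
fuzzy_match = similarity("Jon  Smith", "John Smith") >= THRESHOLD
```

A higher threshold gives fewer false matches but misses more misspellings, so it is worth checking a handful of borderline pairs by hand.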
https://community.alteryx.com/t5/Alteryx-Designer-Knowledge-Base/Tool-Mastery-Fuzzy-Match/ta-p/45485
It looks like there's not really variability in your dataset, so you can just use a join. See attached for example.
Hope this helps!
Thank you for your reply. I need help joining the sample dataset on the date and the name (first name, last name). The date needs to be an exact match, while the name needs a fuzzy match.
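As a rough sketch of that logic (exact match on date, fuzzy match on name), the Python below builds both lists from the original question. The field names, the sample records, and the 0.85 threshold are made up for illustration and do not come from the attached file; `difflib.SequenceMatcher` stands in for whatever similarity measure you end up using.

```python
from datetime import date
from difflib import SequenceMatcher

THRESHOLD = 0.85  # assumed name-similarity cutoff


def name_key(first: str, last: str) -> str:
    """Build a normalized 'first last' string for comparison."""
    return " ".join(f"{first} {last}".lower().split())


def same_person(a: dict, b: dict) -> bool:
    """Fuzzy comparison on the full name."""
    ratio = SequenceMatcher(
        None, name_key(a["first"], a["last"]), name_key(b["first"], b["last"])
    ).ratio()
    return ratio >= THRESHOLD


def matches(appt: dict, checkin: dict) -> bool:
    """Exact match on date, fuzzy match on name."""
    return appt["date"] == checkin["date"] and same_person(appt, checkin)


# Hypothetical sample data.
appointments = [
    {"first": "John", "last": "Smith", "date": date(2020, 3, 2), "status": "Approved"},
    {"first": "Mary", "last": "Jones", "date": date(2020, 3, 2), "status": "Approved"},
    {"first": "Ann", "last": "Lee", "date": date(2020, 3, 3), "status": "Denied"},
]
checkins = [
    {"first": "Jon", "last": "Smith", "date": date(2020, 3, 2)},
    {"first": "Ann", "last": "Lee", "date": date(2020, 3, 3)},
    {"first": "Mary", "last": "Jones", "date": date(2020, 3, 4)},
]

approved = [a for a in appointments if a["status"] == "Approved"]

# Checked in without an approved appointment on the same day.
checked_in_without_approval = [
    c for c in checkins if not any(matches(a, c) for a in approved)
]

# Approved appointment but no check-in on the same day.
approved_without_checkin = [
    a for a in approved if not any(matches(a, c) for c in checkins)
]
```

Comparing every appointment against every check-in is fine for a small sheet; for larger data you would group records by date first so that names are only compared within the same day.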
Hi @echuong1
Thank you for looking into my data. In the first join (J) output, I would need to remove #2 and #3, because the appointment and the check-in have to be on the same day for each person. Do you think I should add a calculation on this output to filter out #2 and # ?
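One way to picture such a filter, sketched in Python rather than Alteryx: if the join was done on name only, each joined row carries both timestamps, and a same-day test keeps only the valid pairs. The column names and rows below are hypothetical and are not taken from the attached workflow.

```python
from datetime import datetime

# Hypothetical rows from a name-only join: one appointment paired with
# every check-in by the same person.
joined = [
    {"name": "john smith",
     "appointment_time": datetime(2020, 3, 2, 9, 0),
     "checkin_time": datetime(2020, 3, 2, 9, 5)},
    {"name": "john smith",
     "appointment_time": datetime(2020, 3, 2, 9, 0),
     "checkin_time": datetime(2020, 3, 5, 14, 0)},
    {"name": "john smith",
     "appointment_time": datetime(2020, 3, 2, 9, 0),
     "checkin_time": datetime(2020, 3, 9, 8, 30)},
]

# Keep only rows where both events fall on the same calendar day.
same_day = [
    row for row in joined
    if row["appointment_time"].date() == row["checkin_time"].date()
]
```

Including the date in the join keys instead avoids creating the cross-day rows in the first place.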