Dear Community,
I would like to ask if we can match the two data below into one data:
May i know if this is possible done in the alteryx?
Solved! Go to Solution.
Hi @SH_94
It took sometime. Here is how you can do it. Fuzzy match
Workflow:
1. Using formula tool I am converting all last name, first name formats to first name last name.
2. Using data cleanse tool to remove duplicate spaces and convert all to upper case.
3. Using fuzzy match to match the columns.
4. Using unique tool to get unique match rows.
5. Setting ID for matches
6. Tranposing match column to rows with ID as key.
7. Using find & replace to ID to the name.
8. Using ID to map 2 data sources.
Hope this helps 🙂
Dear @atcodedog05 ,
This is really smart application . I would like to clarify with you on the following concept:
1. Could you briefly explain the concept for the formula - REGEX_Replace([Employee name], "(.+),(.+)", "$2 $1") ? Why we need the $2 and @$ 1?
2. Would there be any impact if we choose lower case for step 2? Why we need to change it all to Upper case in this case?
3. For the Matchscore and MatchScore_format , normally how many percentage is acceptable ? ie more than 90% ? as i can see the threshold set is 80%. Based on your experience, may i know whether the 80% safe enough?
4. Could you briefly explain how is the following screenshot work? As in how do we know which one to choose for the field name?
5. May i know what is the difference between U and D in this case?
many thanks for your help.
Hi @SH_94
Happy to answer
@SH_94 wrote:
Dear @atcodedog05 ,
This is really smart application . I would like to clarify with you on the following concept:
1. Could you briefly explain the concept for the formula - REGEX_Replace([Employee name], "(.+),(.+)", "$2 $1") ? Why we need the $2 and @$ 1?
Few names were in last name, first name considering last and first name as groups "$1, $2" i need to have it in first name last name to genralize the format hence "$2 $1"
2. Would there be any impact if we choose lower case for step 2? Why we need to change it all to Upper case in this case?
No, it would still work. But both should be in upper case or lower case
3. For the Matchscore and MatchScore_format , normally how many percentage is acceptable ? ie more than 90% ? as i can see the threshold set is 80%. Based on your experience, may i know whether the 80% safe enough?
This is purely subjective towards usecase. 80% is default tool set standard.
4. Could you briefly explain how is the following screenshot work? As in how do we know which one to choose for the field name?
Format is the preped emp name which i want to use to check does it match with any other name. Record ID is identifiter so basiaclly its matching on format, mapping it and showing emp name
5. May i know what is the difference between U and D in this case?
U gives unique rows D gives duplicate rows
many thanks for your help.
Hope this helps 🙂
Dear @atcodedog05 ,
I was trying to use the workflow built and it does not apply to certain cases. After checking, i notice that i would need to remove some wording , as per example below:
Data 1
1. TAN KTN ING,@URU
2. KIRIN KTNIT WER KUNG TUNG
3. Pong Ktn Ine Ou
4. weu jin jie ktnit Pang
5. Ktn ,Derrick Tang
6.KTNBU KING
7.CHAO KAU JGN KTNIT
Basically i plan to remove all the wordings that consist of KTN or KTNIT ( regardless of capital letter- upper /lower/sentence case) and the result would be as follow.
Result wanted -1
1. TAN ING,@URU
2. KIRIN WER KUNG TUNG
3. Pong Ine Ou
4. weu jin jie Pang
5. ,Derrick Tang
6.KTNBU KING
7.CHAO KAU JGN
Is it possible we remove certain wording in the name and remain the same for others?
Many thanks in advance
Hi @SH_94
I checked the workflow with new given input data. Its working for me even with KTNIT and KTN
Workflow:
And here is how you remove these characters if required. Here I am saying remove KTNIT & KTN case insensitive. If you have another word to remove just add like KTNIT|KTN|NIT where NIT is the new word to be removed.
Hope this helps 🙂
Dear @atcodedog05 ,
I just tried one of the formula and it looks like not working. Perhaps due to the following issue: I checked yours and it look like normal. But mine one is grey and can't edit (cant choose the data type). May i know if you know how to fix this ?
Many thanks in advance.
Hi @SH_94
In my scenario i am creating a new column hence i am allowed to change data type and size. Since you are using formula on existing column data type option is frozen. To can use select tool before formula tool to set the desired datatype size or create a new column.
Hope this helps 🙂
Dear @atcodedog05 ,
Thanks a lot for the guidance and it works. May i know what are these two functions under the find and replace tools?
Thank you.
Hi @SH_94
As the heading says I am trying to find Value in Employee name. In this scenario I am trying to do an exact match join tool can also be used in this scenario. But I prefer find & replace since it works like a vlookup.
This short interactive lesson might help you more https://community.alteryx.com/t5/Interactive-Lessons/VLookUps-with-Designer/ta-p/80201#done
Hope this helps 🙂