Hi all,
I have a file that looks similar to the following example:
RecordID Comments name value
0 Franco Moro is cool Franco Moro 100
0 Franco Moro is cool Franco Minione 67
1 Jane and Jon Mare Jane Love 63
1 Jane and Jon Mare Jane Franklin 65
1 Jane and Jon Mare Jon McDonagh 71
1 Jane and Jon Mare Jon Mare 100
Now I want to apply the logic that if a name has a value of 100 and there is another record in the same group with a lower score and either the first name or the surname is equal than it should be flagged as OUT. However, if there are many options on the same name but none scores 100 then it should be flagged with a "?".
RecordID Comments name value Flag
0 Franco Moro is cool Franco Moro 100 OK
0 Franco Moro is cool Franco Minione 67 OUT
1 Jane and Jon Mare Jane Love 63 ?
1 Jane and Jon Mare Jane Franklin 65 ?
1 Jane and Jon Mare Jon McDonagh 71 OUT
1 Jane and Jon Mare Jon Mare 100 OK
Can anybody help?
Thanks,
Michael
Solved! Go to Solution.
Hi @MA74,
Workflow is attached, The crucial part of getting towards your desired output is utilising the Multi-Row Formula to convert your logic into a series of formulas based on some pre-processing with the Sort tool.
Identifying First names & Surnames to identify repeated names within each RecordID and then using the below formula in the Multi-Row Formula tool returns the desired output.
Some considerations here are how to handle the same logic when applied to Last Names and how to consolidate the results across First Name + Last Name flag processing if the data you intend on applying this to is significantly larger.
Hope this helps!
Hi @Imorell,
Thanks for your suggestion.
Do you know if there is a way to split the names to columns which recognizes what is likely to be part of the first name and what belongs to the surname? For instance: Marco van Basten -> Marco van Basten
Gian Marco Crüzer -> Gian Marco Crüzer
Other then this your workflow does exactly what I was looking for. Thank you!
I implemented also a parallel stream on the last name and then merged them so that both parts are checked.
Glad to hear the solution was helpful 🙂 Unfortunately I don't have any consistent way to split out first name and last name - it would be ideal to have it as a feature of the dataset but from personal experience that's often easier said than done.
However there's a good thread here that covers using RegEx statements to parse out the names based on space locations which might get you a step or two closer than just using a Text-To-Columns Tool.