Hi all,
Here are some simplified facts about the data set:
It has a Label column that was free text, so many people entered slightly different things.
I want to organize this mess by sorting the entries into the following labels:
1. No Action
2. Client request
3. Report issue
4. Data issue
5. Operations request
Also, some people might have put two labels into the same cell, but I of course only want one label. The numbering starting with 1. is therefore also the hierarchy. Say a cell has NoAction, report issue: I want the new cell to show No Action.
It would be great if somebody could help with solving this problem. I know it is not the easiest.
Thanks very much!
Below are some data examples.
These are some Label values that the solution would need to handle:
reports-issue
reportissue
data-issue, reportable_rule
data-issue
data-driven
clientrequest
Report-Issue
Report-Issue, data-issue
Operations-Request
Operations-Request, data-issue
I used fuzzy match and it works for almost everything. But the hierarchy is of course not implemented, and unfortunately it doesn't show all labels. Grateful for any help!
I'm not sure what pre-prep you did before fuzzy match, but the way you would most likely want to go about this is:
Fuzzy Match is usually an iterative process, as you want to set the limits at a level that avoids false positives.
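If a scripting step is an option, the overall idea (split each cell into separate entries, clean them up, fuzzy match each one against the five target labels, then keep the label that ranks highest in the hierarchy) could be sketched in Python roughly like this. This is only an illustration, not the Fuzzy Match tool itself: it uses the standard library's difflib as a stand-in, the function names are made up for the example, and the 0.85 cutoff and the list of separators are assumptions you would tune against your own data.

```python
import re
from difflib import SequenceMatcher

# Target labels in hierarchy order (first entry wins), taken from the question.
LABELS = ["No Action", "Client request", "Report issue",
          "Data issue", "Operations request"]


def normalize(text):
    """Lowercase and keep letters only, so 'Report-Issue' becomes 'reportissue'."""
    return re.sub(r"[^a-z]", "", text.lower())


KEYS = [normalize(label) for label in LABELS]


def best_index(token, cutoff):
    """Return the index of the closest target label, or None if below the cutoff."""
    key = normalize(token)
    if not key:
        return None
    scores = [SequenceMatcher(None, key, k).ratio() for k in KEYS]
    best = max(range(len(KEYS)), key=scores.__getitem__)
    return best if scores[best] >= cutoff else None


def resolve(cell, cutoff=0.85):
    """Map one free-text cell to a single label, honoring the hierarchy.

    The cutoff (assumed value) is the knob to iterate on: too low gives
    false positives, too high leaves real variants unmatched.
    """
    # Assumed separators between multiple entries in one cell.
    tokens = re.split(r"[,;|]", cell)
    hits = [i for i in (best_index(t, cutoff) for t in tokens) if i is not None]
    # The lowest index is the highest-priority label.
    return LABELS[min(hits)] if hits else None


if __name__ == "__main__":
    for cell in ["NoAction, report issue", "Report-Issue, data-issue",
                 "reports-issue", "data-driven"]:
        print(repr(cell), "->", resolve(cell))
```

With this cutoff, a cell like Report-Issue, data-issue resolves to Report issue, while data-driven stays unmatched. Whether such a value should be reviewed by hand or caught by a lower cutoff is exactly the kind of decision the iterative tuning is for.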