Hi there,
I've got a complex string field that looks like this:
No. | String Fields |
1 | LED Daytime Running Lights |
2 | LED Daytime Running Lights with Adaptive Control |
3 | Daytime Running Lights with Fog Lights |
What I wish to do is to group the string fields into separate clusters (Group A / Group B / Group C) based on a complex rule which may need to write some regular expression if necessary.
And a key-words sheet is also available to feed into the algorithm.
Contains(Key Words 1) | Contains(Key Words 2) | Not Contains(Key Words 3) | Group |
Daytime Running Lights | Fog Lights | A | |
LED | Daytime Running Lights | B | |
LED | Adaptive Control | C |
If similar string fields are added, what we need to do is to update the key words sheet without modifying the regular expression. (The Sting fields have millions of records so it is not practical to define a fixed group.)
Say if someone write "Daytime Running Lights" as "DRL", we just add DRL to the key words list so that the string fields can be grouped accordingly.
The find replace tool does not solve my problem because it uses a simple contains function.
Any help is appreciated,
Thanks!
Solved! Go to Solution.
hi @steve_tang
How many columns do you have in your keyword sheet? Is it only these three?
Also: according to the rules that you have defined, row 3 will not be assigned to any group. Is this because you haven't listed all the groups?
Dan
Hi @danilang
We have only 3 columns in the keyword sheet.
If the string field do not match any pattern in the keyword sheet. That will be assigned to a default group (Group 0).
The original sheet is in another language which is a little bit ambiguous
I have added some additional rows in the attached sample data so that you can understand the problem better.
Hi @steve_tang
This was a tricky one.
The left half of the WF, up to the three vertically aligned joins, prepares the source and look-up data. The three joins match on each of the key word columns. This is why I asked If you had only three of these. If you had a varying number, the logic would have been more complex, probably involving a macro or two. The right part of the macro performs the 1 and 2 but Not 3 logic and cleans up the results to be displayed as
Note that No. 2 matches 2 groups and No. 3 doesn't match any (default group 0)
Dan
Hi Dan,
Thank you very much!
This is exactly what I need.
Steve Tang