I have an input data in which I have one particular field like mentioned below:-
Project Name
New event
Offsite workshop
CEO fireside chat
I also have a list of 80 keywords which I want to check in the above-mentioned field whether those keywords are in the field or not.
I know one solution where I can use a filter tool and use contain a formula for all those 80 keywords but that will be a huge manual task to do as I have to write contains for each 80 words.
Is there any shortcut way to do the same?
Sample of the list of keywords
Event
Workshop
Fireside.
Please let me know if any further explanation is required.
Thank you in advance.
Solved! Go to Solution.
@vsheth312
you can use the "IN" function
[New event] IN ("Event","Workshop","Fireside")
hope this helps.
Assumption is that this is not an exact match (or you'd use IN or a join to segregate out or in matches). so if you are looking for contains style search (or regex_match) - you can create a batch macro where you send your main datastream in - and your match terms becomes a control parameter. the control paramter values change your terms in your filter. you use a formula tool with a second action tool to flag which term was matched. and then you use an output data tool.
If you’re certain of the keywords and they don’t have many repetitions across - you can consider REGEX_CountMatches([YourField], “A|B|C|D”)
A B C D can be your keyword and the | is the OR statement. So you don’t have to write so many OR statements with CONTAINS.
You can also explore the batch principle above by @apathetichell that utilizes the control parameter given a set of words. It then cycles through them and gives you what you need.
Borrowed from @caltang - best way I can think of.
your match terms feed into a summarize tool. summarize tool concats them as configured here (basically to create a list of .*{value0}.*|.*{value1}.*|.*{valueN}.* - this list is then appended to data stream. formula tool does regex_match from this list.
Thanks, everyone for your input :)