Hi all,
I am using the following statement: if contains([verbatim],"discover") then 1 else 0 endif, and I am picking up on other words such as "discovered" and "discovery". I am looking for a solution on how to edit this statement so that I do not get a "1" in my output for words like discovery, discovered, etc. I ONLY want to see a 1 for "discover" as-is.
Thanks!
hmmm - definitely a better way using the lookups - but this will work in the regex tool set to match:
.*\sdiscover\s|discover\s.*|.*\sdiscover|discover
It might be as simple as adding a space before and after (i.e. " discover "). If it's not that simple, then REGEX_Match is probably going to be required.
what do you mean by look ups? I am currently using the formula tool and just repeating the statement using "or" in between as I have about 100 other words I am also using this for aside from discover.
regex lookup - are these in sentences or are they solo? you can use what I've posted if they are in sentences. If they are individual words standardize for case (ie upper([field1])) and just use "=" instead of contains ie upper([field1])="DISCOVER"
If they are in sentences and you got the regex way - mark a solution and start a new thread so I can say "throw it in a batch macro and edit the match term (ie discover) in your batch macro."
On the topic of lookups, especially if you have 100 cases as you have mentioned previously, it may be a good idea to make a lookup table and use the Find Replace tool as a more powerful 'Contains' check. Here's a quick example of 3 strings containing 'discover' in some sense - 1 exact and 2 within alternative words as you mention above:
Now, I have made a lookup table for the word 'discover'. The [Match] field here will be appended if this is found within the string as you'll see in the next steps:
On to the Find Replace configuration... We look within our [Input] field i.e. where the string sits, for the target - in the [Lookup] field of the lookup table. It's important to note here that Match Whole Word Only is ticked, which will avoid things like discovery being flagged as discover, which is the initial problem. Finally, we select to append a field when a match is found, which is our [Match] field:
You'll notice that the 1 is appended to the final record which is the expected behaviour. Now, to get 0s next to the others it's just a case of a simple formula to replace the nulls:
Hope this helps to get you started. All you need to do to expand/maintain this is add all the words you're checking to the lookup table. You'll also notice that there's the option to be case-sensitive or not in the Find Replace tool which you can set as required.
@kayla_o Could use the RegEx tool, using Match:
RegEx: .*\bdiscover\b.*
\b is a word boundary.
.* first one means anything before
.* second one means anything after
Here's a link to play about with the RegEx:
https://regex101.com/r/7uKYpA/1
A line will be highlighted if it matches what you're after, in this case, contains the word discover (where discover is not connected to anything).