Text Mining Alternative for Contains
Hi All,
I am using a formula tool that has various topics that I'd like to tag (example below)
Column 1
if Contains([VERBATIM], "limit")
or Contains([VERBATIM], "discover")
or Contains([VERBATIM], "interest") then 1 else 0 endif
The problem is, the output is giving me 1's for words like "discovered" and "limited" but I only need the exact words.
I also tried the Find Replace tool, but that doesn't work for me either, as it only seems to pick up the first word in the cell and not the exhaustive list of keywords in "Column 1".
Any suggestions would be greatly appreciated!
Can you simply do:
if [VERBATIM]="limit"
or [VERBATIM]="discover"
or [VERBATIM]="interest" then 1 else 0 endif
That does not work for my situation, as my field [VERBATIM] contains an entire sentence, so doing it that way only returns the records where the whole [VERBATIM] value is an exact match for the keyword.
Understood - then try adding spaces around your words to isolate them:
if Contains([VERBATIM], " limit ")
or Contains([VERBATIM], " discover ")
or Contains([VERBATIM], " interest ") then 1 else 0 endif
Hi @alexnajm
I would use the same... though cautiously. I've done so before and had misleading results, because it will not identify the isolated word if it's the first or last word, or if it is preceded or followed by a comma, period or other character. If a use case allows for some exceptions, or the data is unlikely to contain the words in such positions, fine... otherwise consider using RegEx, reverse string, etc.
VERBATIM                               Match
we are not limited                     0
there is no limit to our abilities     1
there is much to discover              0
he discovered that the earth is...     0
...take interest in doing...           1
i am not interested in...              0
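To make the pitfall concrete outside Alteryx, here is a rough Python sketch of the same space-padded check run against the rows above. It is not Alteryx code: `padded_contains` and `KEYWORDS` are hypothetical names, and lower-casing the text is an assumption meant to mimic a case-insensitive Contains.

```python
# Hypothetical re-creation of the space-padded Contains check (not Alteryx code).
KEYWORDS = ["limit", "discover", "interest"]


def padded_contains(text: str) -> int:
    """Return 1 if any keyword appears with a space on both sides, else 0."""
    lowered = text.lower()  # assumption: mimic a case-insensitive Contains
    return int(any(f" {word} " in lowered for word in KEYWORDS))


rows = [
    "we are not limited",
    "there is no limit to our abilities",
    "there is much to discover",
    "he discovered that the earth is...",
    "...take interest in doing...",
    "i am not interested in...",
]

for row in rows:
    print(padded_contains(row), row)
```

Note that "there is much to discover" comes back 0 even though it ends with the exact word, because nothing follows the keyword. A keyword followed by a comma or period is missed for the same reason.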
@alexnajm with these complex text queries, RegEx is your friend.
Stealing @HomesickSurfer's examples, we can use the "\b" word boundary token to search specifically for these words (separated by pipes, which act as ORs).
This filter should work for you:
RegEx_Match([VERBATIM],'.*(\blimit\b|\bdiscover\b|\binterest\b).*')
Hope that helps,
Ollie
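For anyone who wants to sanity-check the word-boundary idea outside Alteryx, a small Python sketch follows. It is only an illustration, not the Alteryx implementation: `whole_word_match` and `WORD_PATTERN` are hypothetical names, and the `re.IGNORECASE` flag is an assumption about the matching you want. Python's `re.search` scans the whole string, so the leading and trailing `.*` from the formula above are not needed here.

```python
import re

# \b marks a word boundary, so "limit" will not match inside "limited".
# Assumption: case-insensitive matching is wanted, hence re.IGNORECASE.
WORD_PATTERN = re.compile(r"\b(limit|discover|interest)\b", re.IGNORECASE)


def whole_word_match(text: str) -> int:
    """Return 1 if any keyword occurs as a whole word anywhere in text."""
    return int(WORD_PATTERN.search(text) is not None)


rows = [
    "we are not limited",
    "there is no limit to our abilities",
    "there is much to discover",
    "he discovered that the earth is...",
    "...take interest in doing...",
    "i am not interested in...",
]

for row in rows:
    print(whole_word_match(row), row)
```

Unlike the space-padded approach, this flags "there is much to discover" (keyword at the end of the sentence) and would also catch a keyword followed by punctuation, such as "limit,".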
@OllieClarke Love it. Works. I'm coming straight to you for this regex stuff I don't understand much about!
@kayla_o do you have access to the Intelligence Suite? You can do this through the Text Pre-processing tool.