Alteryx Designer Desktop Discussions


Text Mining Alternative for Contains

kayla_o
7 - Meteor

Hi All,

 

I am using a Formula tool to tag records that mention various topics (example below).

 

 

Column 1

if Contains([VERBATIM],"limit")
or Contains([VERBATIM],"discover")
or Contains([VERBATIM],"interest") then 1 else 0 endif

 

The problem is, the output is giving me 1's for words like "discovered" and "limited" but I only need the exact words. 

I also tried the Find Replace tool, but that doesn't work for me either, as it only seems to pick up the first matching word in the cell and not the exhaustive list of keywords in "Column 1".

 

Any suggestions would be greatly appreciated!

7 REPLIES
alexnajm
17 - Castor

Can you simply do:

if [VERBATIM]="limit"
or [VERBATIM]="discover"
or [VERBATIM]="interest" then 1 else 0 endif

kayla_o
7 - Meteor

That does not work for my situation, as my field [VERBATIM] contains an entire sentence, so doing it that way only returns the records where the whole [VERBATIM] value is exactly one of the keywords.

alexnajm
17 - Castor

Understood - then try adding spaces around your words to isolate them:

 

if Contains([VERBATIM]," limit ")
or Contains([VERBATIM]," discover ")
or Contains([VERBATIM]," interest ") then 1 else 0 endif

HomesickSurfer
12 - Quasar

Hi @alexnajm 

 

I would do the same, though cautiously. I've done so before and had misleading results: padding with spaces will not identify the isolated word if it is the first or last word, or if it is preceded or followed by a comma, period, or other character. If the use case allows for some exceptions, or the data is unlikely to contain the words in those positions, fine. Otherwise, consider using RegEx, reverse string, etc.

 

VERBATIM                              | Match
we are not limited                    | 0
there is no limit to our abilities    | 1
there is much to discover             | 0
he discovered that the earth is...    | 0
...take interest in doing...          | 1
i am not interested in...             | 0
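
For anyone who wants to reproduce the table outside Alteryx, here is a rough Python sketch of the space-padding approach. It is only an illustration: the function name `padded_contains` is made up for this example, and lowercasing the text assumes the check should be case-insensitive.

```python
# Keywords from the original question.
KEYWORDS = ["limit", "discover", "interest"]


def padded_contains(text, keywords=KEYWORDS):
    """Return 1 if any keyword occurs with a space on both sides, else 0."""
    lowered = text.lower()  # assumption: matching should ignore case
    return int(any(f" {kw} " in lowered for kw in keywords))


samples = [
    "we are not limited",
    "there is no limit to our abilities",
    "there is much to discover",
    "he discovered that the earth is...",
    "...take interest in doing...",
    "i am not interested in...",
]

for sentence in samples:
    print(padded_contains(sentence), sentence)
```

Note the third row: "discover" is a standalone word there, but it is the last word in the sentence, so there is no trailing space and the padded check misses it.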

OllieClarke
15 - Aurora

@alexnajm with these complex text queries, RegEx is your friend

Stealing @HomesickSurfer's examples, we can use the "\b" word boundary token to search specifically for these words, separating the alternatives with pipes, which act as ORs.

 

This filter should work for you:

RegEx_Match([VERBATIM],'.*(\blimit\b|\bdiscover\b|\binterest\b).*')
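
If it helps to see the word-boundary idea in isolation, here is a small Python sketch of the same pattern. It uses Python's `re` module rather than the Alteryx RegEx engine, so treat it as an approximation: `tag_exact_words` is a made-up helper name, and the `re.IGNORECASE` flag is an assumption about the desired behaviour. Python's `re.search` looks for the pattern anywhere in the string, so the leading and trailing `.*` from the Alteryx expression are not needed here.

```python
import re

# \b marks a word boundary, so "limit" matches but "limited" does not.
WORD_PATTERN = re.compile(r"\b(limit|discover|interest)\b", re.IGNORECASE)


def tag_exact_words(text):
    """Return 1 if any keyword appears as a whole word in text, else 0."""
    return int(WORD_PATTERN.search(text) is not None)


samples = [
    "we are not limited",
    "there is no limit to our abilities",
    "there is much to discover",
    "he discovered that the earth is...",
    "...take interest in doing...",
    "i am not interested in...",
]

for sentence in samples:
    print(tag_exact_words(sentence), sentence)
```

Unlike the space-padding approach, this also catches a keyword at the start or end of the sentence, or next to punctuation, for example "there is much to discover".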

 

 


 

Hope that helps,

 

Ollie

HomesickSurfer
12 - Quasar

@OllieClarke Love it.  Works. I'm coming straight to you for this regex stuff I don't understand much about!

aatalai
14 - Magnetar

@kayla_o do you have access to the Intelligence Suite? You can do this with the Text Pre-processing tool.
