Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Knowledge Base

Definitive answers from Designer Desktop experts.

Search for Keywords in a Text String

DanC
Moderator
Moderator
Created

This article features a workflow that will allow you to search for keywords within a column of text strings. Potential applications for this might be to scan a Twitter feed, customer product feedback or Facebook comments for keywords such as the mention of particular people, descriptive words or subject matters. For this example, the purpose will be to search famous science quotes for a select group of keywords.

In the first section of the workflow, we are performing the following functions:

  1. Reading in the data - The first table contains the text strings to be searched (quotes); the second table contains the key words that we want to search for.
  2. Adding a Record Id in order to join the original text strings back into the data.
  3. Using the Text to Columns tool, we will split the text strings into individual rows for each word within each text string.
  4. Append each of the key words to each of the records/words resulting from the previous step.

Section_1.png

In the next section of the workflow, we will:

  1. Use the Multi-Field Formula tool to change all words in both the text strings and keywords to uppercase as the FindString function used downstream is case sensitive.
  2. Use the RegEx tool to remove non-word characters from the text string words (such as periods, commas, etc.) so that we can do a double-check on the FindString by using a length test. This will prevent the FindString function from returning a match of the keyword 'know' to the text string 'knowledge'.
  3. Match words using the Formula tool and the FindString function in conjunction with comparing the length of the keywords to the text string words.

Section_2.png

In the final section of the workflow, we will complete the steps of:

  1. Summarizing the counts of each keyword to each Record ID of the text strings.
  2. Joining back in the original text strings.
  3. Sorting the data back to its original state.
  4. Transposing the data in order to perform a cross-tab count of the keywords.
  5. Performing a final summary count of the keywords.

Section_3.png

Things to consider:

  • Your text strings might contain non-word characters not accounted for in the RegEx in this example. Make adjustments to the RegEx as appropriate.
  • The workflow results can be used to make a Word Cloud

As with almost anything Alteryx, there is more than one way to do things. Try some variations and post anything you think is worthwhile!

Attachments
Comments
lcox777
5 - Atom

Great workflow, thanks!

Ywtiof
5 - Atom

Great article! Could you also tell us how to do word search within multiple columns? This one searches words within one column. Other thing in my mind - how to get an output where all records (one row multiple columns) will be displayed if the word is found in at least one column within one row? Thanks.

RodLight
8 - Asteroid

@Ywtiof 

Can't attach the sample workflow I built, but here's the workflow screenshot.

MultiColum Search.png

 

The conceptual logic behind many multi-column processes is to Transform them into a single column, do the process, and then join back up to the original data.

Hope this make sense. 

Rod

AlanE
5 - Atom

Great template for this process! This helped me solve a long-standing question for an associate, and made me a hero! DanC is MY hero!

Lily2019
5 - Atom

Great article, thanks!

 

I have a quick question about the join tool. I re-run the workflow by removing the join and sort tools and got the same result for the output. Could you please also explain a bit more why we need to join the original input file?

 

Thank you,

dalarconmood
5 - Atom

Hello,

 

This is really cool, did you created a macro with this?

 

If so would you share it?

 

Thanks

 

Diego

Ragini1
8 - Asteroid

The current workflow in helpful in searching one word. If in Keyword for search input tool, I want to two words like "science technology". How to do that? Suggestions will be highly helpful.

hellyars
13 - Pulsar

@DanC  This is very useful.  What do you do if your key words are key phrases, such as "machine learning" or country names, such as "United Kingdom", or "United Arab Emirates"?