
Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Tagging multiple keywords group by different projects

knozawa
11 - Bolide

Hello,

 

I have a workflow that tags content with multiple keywords, grouped by project. This is done with a batch macro. The workflow works fine with a small number of keywords. However, when we increase the number of keywords, it runs for a long time and sometimes fails with an "unexpected error".

 

Is there a better way to tag multiple keywords so that the workflow runs faster and more efficiently?

 

I have attached a sample workflow that contains the batch macro.

 

Input 1 (contents)

contents.png

Input 2 (keywords)

keywords.png

Desired output

desired output.png

Sincerely,

knozawa

 

7 REPLIES
ponraj
13 - Pulsar

I used an iterative macro to achieve the desired result.

 

Capture1.PNG

Capture.PNG

kelly_gilbert
13 - Pulsar

Hi, @knozawa - are your keywords always one word (such as Alteryx or Banana), or can they be phrases (multiple words)?

 

If they are always one word, you could parse the [content] into words, and then join them to your keywords (rather than using the macro). I randomly created 250,000 content phrases and 100,000 keywords, and this method ran in 25 seconds on my laptop. This won't work if your keywords could be multiple words, though.
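For readers who think in code rather than tools, the parse-and-join idea can be sketched in Python. This is only an illustration under assumptions, not the attached workflow: the `(project, text)` and `(project, keyword)` tuple layouts and the function name `tag_by_word_join` are hypothetical, and matching is assumed to be case-insensitive on whitespace-separated words.

```python
from collections import defaultdict


def tag_by_word_join(contents, keywords):
    """Tag each content row with the single-word keywords of its project.

    contents: list of (project, text) tuples      (assumed layout)
    keywords: list of (project, keyword) tuples   (assumed layout)
    Returns a list of (project, text, matched_keywords) tuples.
    """
    # Build one keyword set per project so each lookup is a set intersection.
    kw_by_project = defaultdict(set)
    for project, keyword in keywords:
        kw_by_project[project].add(keyword.lower())

    results = []
    for project, text in contents:
        # "Parse into words": split on whitespace, then "join" to the keywords.
        words = set(text.lower().split())
        matched = sorted(words & kw_by_project.get(project, set()))
        results.append((project, text, matched))
    return results


contents = [("A", "Alteryx makes banana bread"), ("B", "Banana split")]
keywords = [("A", "Alteryx"), ("A", "Banana"), ("B", "Alteryx"), ("B", "split")]
tagged = tag_by_word_join(contents, keywords)
```

Each content row is split once and compared against a set, so nothing loops per keyword. As noted above, this only works for one-word keywords, and it also relies on the text having spaces between words.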

 

kelly_gilbert_0-1580537266872.png

knozawa
11 - Bolide

@ponraj ,

 

Thank you for your suggestion. However, I wonder whether an iterative macro performs faster than a batch macro. I also wonder whether I can filter by keywords for specific groups only (my desired output shows multiple keywords tagged per project).

 

Sincerely,

knozawa

 

knozawa
11 - Bolide

@kelly_gilbert ,

 

Thank you for your suggestion. Yes, indeed, your method is much faster than using a macro. However, as you mentioned, I have some phrases to filter by. I wonder if there is a way to achieve similar performance when filtering by phrases.

 

Sincerely,

knozawa

knozawa
11 - Bolide

I've attached the solution workflow. 

 

Thank you all, especially @DanielBr, an Alteryx support engineer, who helped me solve this issue.

 

Batch macro vs. regex with find replace method:

The regex with find replace method was 1148 times faster because it looks up the keywords in a single pass instead of looping multiple times. It also produced more matches, because some of the contents have no spaces between words (Japanese text).

 

Two takeaways for using the regex with find replace method:

1. When the contents contain several overlapping keywords, only the longest keyword matches.

 

Contents = John Aaron Smith

Keyword 1 = John

Keyword 2 = John Aaron

Keyword 3 = John Aaron Smith

Here only Keyword 3 is tagged.

 

2. Short keywords can produce false matches (e.g. "app" matches inside "appliances" and "applications", not only the standalone word "app").
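Both takeaways can be reproduced outside Alteryx with a small Python sketch. It is an analogy only, not the attached workflow or the Find Replace tool's internals: the helper names `build_pattern` and `find_keywords` are hypothetical, and the keywords are assumed to be combined into one alternation sorted longest first.

```python
import re


def build_pattern(keywords, whole_word=False):
    """Compile all keywords into one alternation, longest keyword first."""
    # Sorting longest-first makes the regex prefer "John Aaron Smith" over "John".
    ordered = sorted(set(keywords), key=len, reverse=True)
    body = "|".join(re.escape(k) for k in ordered)
    if whole_word:
        # Optional guard against substring hits such as "app" in "appliances".
        body = r"\b(?:" + body + r")\b"
    return re.compile(body, re.IGNORECASE)


def find_keywords(text, pattern):
    """Return every non-overlapping keyword hit found in a single pass."""
    return [m.group(0) for m in pattern.finditer(text)]


names = build_pattern(["John", "John Aaron", "John Aaron Smith"])
print(find_keywords("John Aaron Smith", names))
# ['John Aaron Smith']  (takeaway 1: only the longest keyword is reported)

loose = build_pattern(["app"])
print(find_keywords("kitchen appliances and applications", loose))
# ['app', 'app']  (takeaway 2: substring hits inside longer words)

strict = build_pattern(["app"], whole_word=True)
print(find_keywords("kitchen appliances and applications", strict))
# []
```

The `whole_word` option is one possible way to suppress the short-keyword false matches, but word boundaries depend on separators between words, so it would likely undo the benefit for Japanese or Chinese text that has no spaces.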

 

As a result:

The batch macro method is suitable when:
1. Every matching keyword should be tagged on the same contents (e.g. keywords 1, 2, and 3 should all be tagged for contents = "John Aaron Smith")

2. The list of keywords contains short keywords (e.g. "app")

 

The regex with find replace method is suitable when:

1. There are many keywords in the list to match

2. Words are not separated by spaces (e.g. Japanese or Chinese text)

 

I hope this helps people who have similar use cases.

 

Sincerely,

knozawa

 

TomWelgemoed
12 - Quasar

Hi @knozawa ,

 

Thanks for posting the approach you used! Also, I believe @MarqueeCrew is working on a new macro that helps with this.

 

@MarqueeCrew , can you jump in - hopefully I haven't got the wrong end of the stick?

 

Best,

Tom

kelly_gilbert
13 - Pulsar

@knozawa - thanks so much for reporting back with your solution (and providing the details on when each approach may/may not work for different scenarios)!
