Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Identifying and counting occurrence of a string

ries9112
7 - Meteor

Hey, this is my first post to the forum and I haven't reviewed the rules too thoroughly, so I hope my post does not violate any. I have a dataset (pictured below) with 100+categories that just have letters. I would like to be able to identify strings that appear less than 5 times and categorize rows that contain that value as "Rare". I can't use a summarize node with a group by and a count, because I would have to add all 100+ of them and the count would just count the exact match of the combination of letters instead of the individual letter. Any thoughts or ideas?

alteryxproblem.png

7 REPLIES 7
MarqueeCrew
20 - Arcturus
20 - Arcturus
How about transpose the data with a group by ID. Then summarize tool and group by ID and concatenate with no concatenation specified. Then you can summarize on the new value and count the occurrences. Filter where count is less than 5 and then join back to the data to find the IDs.

Clear as mud?
Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
simon
11 - Bolide

try the field summary tool and run all your fields through it. It should generate a nice meta data table and interactive report.

 

 

ries9112
7 - Meteor

Thank you for your answer! The transpose and first summarize worked perfect. I am, however, a bit confused on the second summarize. Could you explain what you mean by "Then you can summarize on the new value and count the occurences"? So now that I have the two columns "name" which includes all the categories and the "concat_value", what do I do in the second summarize? I tried grouping it by the concat_value and counting the concat_value but I'm running in the same problem as before where it's counting the exact match instead of the letter. 

 

Apologies on my lack of knowledge, I'm a student taking a class that involves alteryx, so by no means an expert.

ries9112
7 - Meteor

Simon, thanks for your reply! Again, I'm not an expert so I could be wrong, but wouldn't the field summary give me a summary of each column? I would like to know how often the string occurs in the data as a whole instead of in individual columns, and finding that out by looking at the report for each one seems challenging. I just ran it, and I can only seem to find unique values for each column as well.

MarqueeCrew
20 - Arcturus
20 - Arcturus
If you grouped by id, you should have id, name, value. Group by id before you concatenate.

Summarize with group by concatenate-value and then add a count. When you find rare occurrences, you can join back to id, concat-value to find the ids to flag.

If I were near a computer I would show you what I'm thinking of more clearly. Thanks for working with me to help you.

Sent from my iPhone
Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
Joe_Mako
12 - Quasar

Attached is an example of what Mark described.

 

Find Rare.png

ries9112
7 - Meteor

Amazing!! Thank you so much for all your help!!! It worked like a charm Smiley Very Happy

Labels