

How to filter dataset with entries that are repeated at least X number of times?

greenv1nes
6 - Meteoroid

Hello all,

 

I have a dataset, and I only want the subset of rows whose identifier appears at least 10 times. I used a Summarize tool to "group by" my identifier and count the number of times each identifier shows up. There are about 8,000 unique identifiers, and about 1,300 of these have more than 10 repeats. How can I continue working on JUST the rows belonging to those 1,300 identifiers?

 

Thank you!

5 REPLIES
Luke_C
17 - Castor

Hi @greenv1nes 

 

If you post some sample data I could mock something up, but essentially once you have your group-by and count, filter that to >= 10, then join that back to the original data on the identifier. The J output of the join will have just the identifiers with a count greater than or equal to 10.
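For anyone who wants to see the logic outside of Designer, here is a minimal Python sketch of the same three steps (summarize, filter, join back). It is only an illustration: the function name `keep_frequent`, the `id` field, and the sample rows are made up, not taken from the original data.

```python
from collections import Counter

def keep_frequent(rows, key, min_count=10):
    """Return the rows whose `key` value occurs at least `min_count` times."""
    # Summarize step: group by the identifier and count occurrences.
    counts = Counter(row[key] for row in rows)
    # Filter step: keep only identifiers that meet the threshold.
    frequent_ids = {ident for ident, n in counts.items() if n >= min_count}
    # Join step: keep the original rows whose identifier survived the filter.
    return [row for row in rows if row[key] in frequent_ids]

# Hypothetical sample: "A" appears 12 times, "B" only 3 times.
rows = [{"id": "A", "value": i} for i in range(12)]
rows += [{"id": "B", "value": i} for i in range(3)]

subset = keep_frequent(rows, "id", min_count=10)
print(len(subset))  # 12, only the "A" rows remain
```

Changing `min_count` (or switching `>=` to `>`) is the equivalent of adjusting the Filter expression on the count field.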

 

DawnDuong
13 - Pulsar

Hi @greenv1nes 

One way is to use a Join tool.

Left: connect to your original data

Right: connect to (original data grouped by identifier, count occurrences, filter to keep only the IDs where count >= 10)

Then the Join (J) output will give you the required subset.
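As a rough picture of what comes out of each Join anchor in this setup, the sketch below splits the original rows into a "J" list (identifier found in the filtered counts, with the count attached) and an "L" list (identifier not found, so fewer than 10 entries). The function name `split_by_join`, the `Count` field name, and the sample data are assumptions for illustration only.

```python
from collections import Counter

def split_by_join(left_rows, key, min_count=10):
    """Mimic the J and L outputs when the Right input is the filtered counts."""
    # Right input: identifier -> count, already filtered to count >= min_count.
    counts = Counter(row[key] for row in left_rows)
    right = {ident: n for ident, n in counts.items() if n >= min_count}
    # J output: original rows that match, carrying the count from the Right side.
    joined = [{**row, "Count": right[row[key]]}
              for row in left_rows if row[key] in right]
    # L output: original rows with no match on the Right side.
    left_only = [row for row in left_rows if row[key] not in right]
    return joined, left_only

# Hypothetical sample: "X" appears 10 times, "Y" twice.
data = [{"id": "X"} for _ in range(10)] + [{"id": "Y"} for _ in range(2)]
j_out, l_out = split_by_join(data, "id")
```

Here `j_out` holds the 10 "X" rows and `l_out` holds the 2 "Y" rows, so the L anchor is a handy way to inspect what was dropped.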

Dawn

apathetichell
18 - Pollux

Attach a filter first to the summarize stream to filter out those with counts less than 10 and then join your original dataset ids to the ids from your summarize tool.

greenv1nes
6 - Meteoroid

It worked! Thanks so much.

greenv1nes
6 - Meteoroid

It worked! Thanks so much, Dawn.
