Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Count identicial fields and then unique across all?

jbradshaw
6 - Meteoroid
I have a bunch of records that belong to groups, with a field assigning the group.  I.e. field name is "Group Name" and then each record has something like "Group 1" or "Group 2" and so on.  Record 1, lets ay "John Doe" may be in the whole batch several times in multiple groups and I need it deduplicated so ultimately he is only in there once listed with one group name.

The tricky part, I want to first figure out how many records each group contains.  I want john doe to stay in the smallest group, so that his multiples are pulled out of the larger groups.  This should help ensure the groups that start small are not weeded down to nothing after the dupes are pulled out.

Any ideas?
1 REPLY 1
kane_glendenning
10 - Fireball

Hi Jeffrey,

I must admit, I got a little carried away on my lunch break and so this may be able to be done easier, but one way is to create a little iterative macro. The Macro would have the following:

  • Summarize: Group By "Group", Count "Group"
  • Join: On Group back to original data
  • ONLY Unique (CReW Macro, downloadable from Adam's blog): On Name
  • Sample as many as you want per run. This is to make sure that the biggest groups aren't decimated
  • Multi-Row: Group By Name, Rows that don't exist are Null
  • Filter: Keep field
  • Union: Unique Output (Only Unique Tool), Keep Output (Filter Tool), The non-sampled data
  • Select: Remove the Count & Keep fields
  • The I output gets re-iterated (via the configs in the Interface Window) and the O output will have your de-duped list

If you didn't want to create a Macro, then you could just build this set of tools once and then keep copying & pasting it, however you wouldn't want to do many pass throughs if you're doing it that way. Because this just takes the group that has the lowest members for each person, you have to take into account that those group numbers change as people are removed and so I suggest batches of 20-50 (The Numeric Up/Down on the Macro).
 

rtaImage (8).png


Let me know if you have any questions about this..

Kane

Labels