Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.
Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

remove some dupes but not all

kpmg_lc_analyst
7 - Meteor

I have a situation where I need to remove some duplicates from a dataset. For example, i need to remove 4 of the 8 records that have the same record ID.  Is there a better way to do this than simply filtering to the record ID and using the sample tool? I'm trying to figure out the best way to do this.  Obviously the unique tool won't work. Thanks in advance. 

 

record idcount should be count is
124
248
348
4816
4 REPLIES 4
NicoleJohnson
ACE Emeritus
ACE Emeritus

Hmmm. Could you add a second level "RecordID2" (using a multi-row tool) to add a RecordID within each RecordID grouping... then join the desired count by original RecordID... then filter out any "RecordID2" that is greater than the desired count?

 

RecordIDRecordID2DesiredCountIncluded when filtering for RecordID2 <= DesiredCount?

1

12Yes
122Yes
132No
142No
214Yes
224Yes
234Yes
244Yes
254No
264No
274No
284No
Joe_Mako
12 - Quasar

How about an iterative macro? This will effectively divide the records returned by half, and keep the remainder. For example:

 

1 record, 1 record returned

2 records, 1 record returned

3 records, 2 records returned

4 records, 2 records returned

5 records, 3 records returned

6 records, 3 records returned

etc...

 

Is this the logic you are looking for, or is it something else?

 

Filter Unique.png

MarqueeCrew
20 - Arcturus
20 - Arcturus

@kpmg_lc_analyst,

 

Like @NicoleJohnson suggested, here is a workflow that joins the data to a table (to assign max rows per record id).  It then counts rows and keeps only the needed dupes.

 

Capture.png

 

Cheers,

Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
William_Urciuoli
5 - Atom

Tried out the sample and it works great, I altered the multirow formula to "IIF([RecordID]=[Row-1:RecordID], [Row-1:RowNum]+1, 1)" and all expected can be displayed. 

Labels
Top Solution Authors