
SOLVED

Identifying duplicates help

summarizer
9 - Comet

Let's say I have this for an example:

 

Item description    Invoice #
5 lbs Rice          101
5 lbs Rice          101
5 lbs Rice          101
5 lbs Rice          201
5 lbs Rice          201
5 lbs Rice          201

 

In our data, the first 3 rows that say "5 lbs Rice" from Invoice 101 are not necessarily duplicates, because they could be white rice, brown rice or yellow rice.  But when the same items show up again on a later invoice (#201), that invoice likely IS a duplicate, and we want it marked as a dup for further review.  I'm still relatively new to Alteryx, and I've tried grouping, uniquing and summarizing, but I haven't come up with the proper combination yet.  As I'm currently doing it, Alteryx is marking lines 2, 3, 5, and 6 as duplicates, which isn't what I want.

 

Any ideas?  Thanks!

8 REPLIES
MarqueeCrew
20 - Arcturus

@summarizer,

 

My first thought is to use a Summarize tool: Group By Invoice # and Concatenate the item description.

 

101 -> 5 lbs Rice, 5 lbs Rice, 5 lbs Rice

201 -> 5 lbs Rice, 5 lbs Rice, 5 lbs Rice

 

Now you can use a Unique tool on the concatenated field and look up the duplicates.

 

This would find invoices with the exact same content (in the same order, too).  Of course, it would also match invoices that happen to contain the same single item, which could come up many times.  I guess seeing some real data would help.
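For readers who want to see the idea outside of Designer, here is a minimal Python sketch of the same logic: group by invoice, concatenate the descriptions, then treat any invoice whose concatenated string has already appeared as a possible dup. The function names and the sample `rows` list are made up for illustration and are not part of any Alteryx API; the sketch also assumes lower invoice numbers arrive first.

```python
from collections import defaultdict

# Sample data from the question: (item description, invoice number)
rows = [
    ("5 lbs Rice", 101),
    ("5 lbs Rice", 101),
    ("5 lbs Rice", 101),
    ("5 lbs Rice", 201),
    ("5 lbs Rice", 201),
    ("5 lbs Rice", 201),
]

def invoice_signatures(rows):
    """Group by invoice number and concatenate descriptions in input order."""
    items = defaultdict(list)
    for description, invoice in rows:
        items[invoice].append(description)
    return {invoice: ", ".join(descs) for invoice, descs in items.items()}

def duplicate_invoices(rows):
    """Return invoices whose concatenated content matches a lower-numbered invoice."""
    signatures = invoice_signatures(rows)
    seen = set()
    dups = []
    # Assumption: ascending invoice number means "arrived earlier".
    for invoice in sorted(signatures):
        signature = signatures[invoice]
        if signature in seen:
            dups.append(invoice)
        else:
            seen.add(signature)
    return dups

print(duplicate_invoices(rows))  # → [201]
```

Because the comparison is on the whole concatenated string, the three identical lines inside invoice 101 are never compared with each other, only invoice against invoice.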

 

Cheers,

Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
Rohit_Bajaj
9 - Comet
Hi,

There is one thing I don't understand, and it might be a basic one:
Why can't two invoice numbers have the same line items?
One customer can purchase the same thing twice, or two customers can purchase the same thing, generating two invoice numbers with the same line items.

Thanks,
Rohit Bajaj
summarizer
9 - Comet

Thank you.  I will try that and let you know... or post up what new challenge I have found. :-)


Appreciate the quick response! 

summarizer
9 - Comet

Rohit, I gave that example to try to simplify the kinds of duplicates I'm seeing in my data, so your question is valid, but my situation is actually medical billing data, which sometimes gets billed again at a later date.  MarqueeCrew's answer actually did work for me, so I'm all good now.  Thanks for the response! 

summarizer
9 - Comet

Thank you, MarqueeCrew.  That worked.  The only additional thing I had to do was sort ascending by the invoice number to make sure 101 was okay, but 102, 103, etc. would be flagged as a possible dup.

 

[Attached image: workflow pic for disc board.jpg]

MarqueeCrew
20 - Arcturus
P.S. Depending on the number of services, you might need more than 254 bytes for the concatenated field.
summarizer
9 - Comet

Good point!  Thanks!

summarizer
9 - Comet

Actually, I was wrong.  It is still treating the 2nd and 3rd Rice items on my invoice 101 as dups, when I actually only want the rows on invoice 201 to be flagged as dups.  I think this might need a Multi-Row Formula?

 

Plus, if I start out with 6 rows, I need to have 6 in my output as well.
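One way to picture the requirement (flag only the later invoice, and keep every input row) is the Python sketch below. It is only an illustration of the logic, not a Designer workflow: the function name `flag_repeat_invoices` is hypothetical, it assumes a lower invoice number means the invoice came in earlier, and it sorts the items within each invoice so that line order does not matter, which may or may not suit real billing data.

```python
from collections import defaultdict

def flag_repeat_invoices(rows):
    """Return one (description, invoice, is_dup) tuple per input row.

    A row is flagged only when its whole invoice repeats the content of an
    earlier (lower-numbered) invoice. Repeated lines within one invoice are
    never flagged against each other.
    """
    items = defaultdict(list)
    for description, invoice in rows:
        items[invoice].append(description)

    seen_signatures = set()
    dup_invoices = set()
    for invoice in sorted(items):
        # Assumption: item order within an invoice is ignored.
        signature = tuple(sorted(items[invoice]))
        if signature in seen_signatures:
            dup_invoices.add(invoice)
        else:
            seen_signatures.add(signature)

    # Attach the invoice-level flag back to every original row.
    return [(d, inv, inv in dup_invoices) for d, inv in rows]

rows = [("5 lbs Rice", 101)] * 3 + [("5 lbs Rice", 201)] * 3
for row in flag_repeat_invoices(rows):
    print(row)
```

With the six sample rows, the output still has six rows: the three on invoice 101 are unflagged and the three on invoice 201 are flagged. The key design choice is that the duplicate test happens once per invoice, and the result is then attached back to the detail rows rather than comparing row to row.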
