Hi All:
I have an invoice and payment file (see example below for your reference). My end goal is to find if there was a duplicate invoice #, duplicate vendor # or duplicate payment were made throughput the year.
My issue in that the file list all invoices on a header level. What does this mean is that same invoice number appears multiple time due to the fact that the invoice was breakdown by line by line item (different products purchased within the same invoice number) (see below table for your reference):
Thank you in advance.
Vendor Name | Vendor # | Invoice number | INVOICE DATE | INVOICE_AMOUNT | Payment amount | Line type |
A | 200073 | 50103116 | 2020-11-30 | 100 | 100 | L1 |
A | 200073 | 50103116 | 2020-11-30 | 100 | 100 | L2 |
A | 200073 | 50103116 | 2020-11-30 | 100 | 100 | L3 |
B | 15 | 20 | 2020-5-25 | 200 | 200 | L1 |
B | 15 | 20 | 2020-5-25 | 200 | 200 | L2 |
B | 15 | 20 | 2020-5-25 | 200 | 200 | L3 |
Hi @Hussein982
If you use a summarize tool first to group your data together, you can combine line items that all belong to the same invoice. From there, you can use a unique tool to check for duplicates.
@Kenda Thank you Kenda for your quick response. I did follow it and it worked but as you can see below the invoice number 001/2020 appears multiple times but for different suppliers (so false positive).
How can I get rid of these so to pick real duplicate invoices if applicable/any?
Hey @Hussein982
In the summarize tool, when you "Group By" a field, Alteryx will combine the records with identical values in that field into a single record. Therefore, by grouping by multiple fields, each unique combination of those fields will result in a separate row.
In your case, if you don't want the level of detail where it's creating multiple rows that you don't want to see, do not include that field in your summarize tool.
For example, if this is my input data:
Group | Sub Group | Field3 |
A | a | 1 |
A | a | 1 |
A | b | 1 |
B | b | 1 |
B | b | 1 |
Here is how the data will look if I group by both the Group and Sub Group fields:
Group | Sub Group | Sum_Field3 |
A | a | 2 |
A | b | 1 |
B | b | 2 |
Here is how the data will look if I only group by the Group field:
Group | Sum_Field3 |
A | 3 |
B | 2 |
And here is how the data will look if I only group by the Sub Group field:
Sub Group | Sum_Field3 |
a | 2 |
b | 3 |
Hopefully this provides you with enough information to apply to your problem here.
User | Count |
---|---|
19 | |
14 | |
13 | |
9 | |
8 |