
SOLVED

Finding Duplicate Transactions (with a couple catches)

JordyMicheal
11 - Bolide

Hey Alteryx,


I've hit a bit of a wall, so I'm reaching out for help.

I'm parsing and prepping some data for a client who is testing for duplicate billing.

 

I have a list of around 22k transactions with Project Invoices, Billed Amounts, Cost Codes, Dates, Billing Names, Labour Types, and Hours Billed (plus a few other fields if needed).

 

The only logical way I can think of doing this with the tools would be to sort the data by Master Name, Document Date, and Billed Amount, then run a Multi-Row Formula with an IF/ELSE like this:

 

IF
([Billed Amount] = [Row+1:Billed Amount]
AND [Document Date] = [Row+1:Document Date]
AND [Cost Code Description] = [Row+1:Cost Code Description]
AND [Master Name] = [Row+1:Master Name]
AND [Labour Type] = [Row+1:Labour Type]
AND [Qty] = [Row+1:Qty])
THEN "YES"
ELSE "NO"
ENDIF

 

This will match all the billed amounts, BUT WAIT!!! There are bill backs as well, i.e. they bill 5 hours and then -5 hours afterwards. So I added a second case joined with an OR:

 

OR

([Billed Amount] = ([Row+1:Billed Amount] * -1)
AND [Document Date] = [Row+1:Document Date]
AND [Cost Code Description] = [Row+1:Cost Code Description]
AND [Master Name] = [Row+1:Master Name]
AND [Labour Type] = [Row+1:Labour Type]
AND [Qty] = ([Row+1:Qty] * -1))
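
For anyone following along outside Alteryx, the same next-row comparison can be sketched in Python/pandas (a minimal sketch; the file name and column names are assumptions based on the fields above):

import pandas as pd

# Hypothetical input; columns assumed from the field list above.
df = pd.read_csv("transactions.csv")
df = df.sort_values(["Master Name", "Document Date", "Billed Amount"]).reset_index(drop=True)

nxt = df.shift(-1)  # the Row+1 record
keys = ["Document Date", "Cost Code Description", "Master Name", "Labour Type"]
same_keys = (df[keys] == nxt[keys]).all(axis=1)

# Exact duplicates, plus the bill-back case where both amounts flip sign.
exact = (df["Billed Amount"] == nxt["Billed Amount"]) & (df["Qty"] == nxt["Qty"])
billback = (df["Billed Amount"] == -nxt["Billed Amount"]) & (df["Qty"] == -nxt["Qty"])

df["Duplicate"] = (same_keys & (exact | billback)).map({True: "YES", False: "NO"})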

 

 

Right now I feel a little far down the rabbit hole and wanted to take a step back to see if anyone would do this any differently.

SAMPLE TAKEN DOWN

4 REPLIES
BenMoss
ACE Emeritus

Could you post some sample data with the complexities mentioned and then your desired result?

 

Ben

JordyMicheal
11 - Bolide

Uploaded a sample with modified data but the same fields :)

BenMoss
ACE Emeritus

Can you post what the desired result would be here too?

 

I think there is a relatively simple solution. Absolute any negative values (this can be done using a Formula tool with the function ABS([Field])), then simply perform a Unique on the fields you have mentioned. Your U stream will then contain one of every record; the D stream will contain any subsequent duplicates.
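
The equivalent logic is easy to sketch in Python/pandas as well (again, the file name and column names are assumptions based on the thread):

import pandas as pd

# Hypothetical input; columns assumed from the field list above.
df = pd.read_csv("transactions.csv")

keys = ["Billed Amount", "Document Date", "Cost Code Description",
        "Master Name", "Labour Type", "Qty"]

# Absolute the signed fields so a bill back matches its original charge.
df["Billed Amount"] = df["Billed Amount"].abs()
df["Qty"] = df["Qty"].abs()

# duplicated(keep="first") marks every record after the first occurrence,
# so ~mask mirrors the Unique tool's U stream and mask mirrors its D stream.
mask = df.duplicated(subset=keys, keep="first")
u_stream = df[~mask]
d_stream = df[mask]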

 

Does that work for you?

 

Ben

JordyMicheal
11 - Bolide

Nail on the head, Ben, even when I don't tell you what I want.

 

I used a Formula tool to ABS the values, but then hit Google.

I wanted all the duplicates in my list, not just the second occurrence (if a transaction appears twice, I want two records in my list).

Thankfully, some wonderful people out there have published the Chaos Reigns Within (CReW) macro pack (I'm going with the publisher being Adam Riley):

http://www.chaosreignswithin.com/2014/08/only-unique-macro.html


Some great tools in there, such as the "Only Unique" tool that I used to generate that list.
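
In the pandas sketch above, the same result (every copy of each duplicate, rather than only the records after the first occurrence) comes from changing the keep flag; names are assumptions as before:

import pandas as pd

# Hypothetical input; columns assumed from the field list above.
df = pd.read_csv("transactions.csv")
keys = ["Billed Amount", "Document Date", "Cost Code Description",
        "Master Name", "Labour Type", "Qty"]
df[["Billed Amount", "Qty"]] = df[["Billed Amount", "Qty"]].abs()

# keep=False flags every member of a duplicate group, so all copies
# of each duplicate transaction land in the list.
all_dupes = df[df.duplicated(subset=keys, keep=False)]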

Saved our auditors a great deal of time scanning invoices for duplicates with this logic.


THANKS BEN!

 
