Alteryx Designer Desktop Discussions

brianfia · ‎06-24-2022

I am creating a workflow that will match up two transactional files on Account Id and Amount, then identify if the transaction was found in both (Duplicate) or not. So far I am using the Join Tool to 'join' the two tables on Account Id and Amount, the Left and Right out going anchors are giving me the truly unique records that I need, but the Join Anchor is where my problem is arising.

The Join Anchor is giving me records where the account is truly in both tables and thus a "duplicate", but I am also getting records where the account is in one table with 2 rows of the same amount and in the other table with a single row of that same amount.

Think something like this:

Table #1

V_ID	V_Act	V_Amt
1	123	100
2	123	100
3	234	50

Table #2

L_ID	L_Act	L_Amt
7	123	100
8	234	50
9	234	50

After Join

V_ID	V_Act	V_Amt	L_ID	L_Act	L_Amt
1	123	100	7	123	100
2	123	100	7	123	100
3	234	50	8	234	50
3	234	50	9	234	50

I have created unique row ids for each record to identify if the row is being duplicated from the join tool, below is an example of the type of results my Join Anchor is providing.

V_Str	L_Str
97	544
98	544
149	316
156	316
221	94
224	94
521	64
521	65
522	64
522	65
561	858
562	858
615	106
616	106
622	257
622	258
757	25

As you can see the L_Str row 544 is being joined to the V_Str rows 97 and 98. In this case V_Str row 97 and L_Str row 544 are consider a "duplicate" and V_Str row 98 would need to be kept to continue through the process.

In another case V_Str row 622 is being joined with L_Str rows 257 and 258. In this case L_Str row 257 and V_Str row 622 are consider a "duplicate" and L_Str row 258 would need to be kept to continue through the process.

There are also cases like V_Str rows 521 and 522 which are both joining individually to L_Str rows 64 and 65. In this case both V_Str rows 521 and 522 AND L_Str rows 64 and 65 are considered 'duplicate'.

Finally in a case like V_Str row 757, this record would be a straight forward duplicate so V_Str 757 and L_Str 25 should be considered "duplicate".

I am hoping that someone can come up with a good solution for this problem.

TIA

AndrewSu · ‎06-24-2022

@brianfia , I may be understanding the problem incorrectly, but why not just union the two tables and then do a unique tool based on Account and Amount to find the values? see screenshot and attached workflow.

brianfia · ‎07-07-2022

I may not have explained it well enough. Basically we are getting transactions from two different systems, A and B, but only transactions in A are fulfilled by A while System B contains some but not all the same transactions as system A and also contains unique transactions to system B.

Another way to think of it is:

System A fulfills orders and keeps a list of the transactions that it fulfills.

System B sends transactions to A to fulfill but also fulfills orders on its own. The transactions that it sends to A are the ones that I need to eliminate, but there are no identifiers that state these were sent to A.

I need to get a list of all orders fulfilled

Certain transactions can reside in both systems and other than account id and dollar amount of the transaction, there are no other unique fields.

** I am working to get more uniquely identifiable fields, but in the interim I need to find a solution for what I am currently given.

Amol_Telore · ‎07-07-2022

This solution will exclude those transactions which are present in both the systems and will only consider unique transaction from each of them. Hope this answers your query.

leozhang2work · ‎07-07-2022

Take your system A as the left join table, system B as the right. Then L and R anchor should give you distinct orders.

You problem is handle the inner join. Since only B sends to A, if there is a match, then any matching ID from B will give you a list of things you can drop from B table.

Alteryx Designer Desktop Discussions

Identifying and Removing Duplicate Transactions

Re: Is there any way the computer vision tools can...

Re: Batch Macro

Re: How to get cell reference address from excel

Re: Replacing Forecast columns with Actual Data

Re: Row creation