Hi,
I'm working on an Alteryx workflow and currently need some guidance.
I am trying to identify the knock off entries from a dataset in Alteryx. Where one amount is a positive entry (a debit) and the other amount which offsets it is a negative (a credit). But the positive value & negative vales are not same like -100,000 & +100,000. It will be in split of 100,000.
Knock off Criteria should be with in Same code (Column C) & same Account (Column D).
For Example:
In below data set, there are 2 set of knock off entries with same Code (Column C) & same Account (Column D).
One set of records are highlighted in Yellow & another is in Green. Output is in last (Column F)
For your ref i have attached a excel file with sample data & output comment in last column.
Thank you in advance.
Hi @Khanzali007
I think I have a solution that will work for you. This solution combines all possible combination of numbers in different scenarios, then filters to just the ones that sum up to zero, and finally selects the first scenarios that solve for the problems. This is built grouped by Code and Account, so should be scalable.
Attached is the workflow based on the sample data you provided.
Let me know if this works for you.
Cheers!
Phil
hi @Khanzali007
Just from the descriptions looks like you work in accounting and you are working something that akin to invoice/payment testing.
The solutions that @Maskell_Rascal provided work for small data set where the possible number of combinations is manageable (e.g. 2^n -1 possible combinations if we have n rows) but if you are looking at typical GL and the number of combinations could be too large for a typical laptop to handle.
You may want to ask for an additional data field such as customer ID or invoice to limit the combination counts.
Cheers,
Dawn.
@Maskell_Rascal Thanks for the help. It is working for sample data. Let me check for large set of records, how it will react will update you.
@DawnDuong You're right, I am working in accounting and taxation. There is very large data >1M records. For example: In one Code and account have 60000 records, in which 59800 records will knock off one another. for remaining 200 records i need to work further or intimate to the Business unit. My job is to find 59800 records that it.
@Maskell_Rascal Attaching the one more sample file with 6957 records. In which 6948 records were knocked off. If you have time you may just check once. Thanks in advance.
Hi @Khanzali007
You need a bit of a tweak ... with 2^N-1 combinations and N = 1M records, there is no way your laptop can survive....
I have done a similar task for only 1+ records and it's quite painful. Is it possible to check if you can get any additional data fields to narrow down?
If it's down to brute force... you will need to tweak it such that instead of finding all possible combinations, you do an iterative macro that run down the list and "cache" unique values of the running total and identifies knock offs when the running sum matches one of the "cached" values. once it happens, knock of the instance and re-loop. This method trades of the time but reduces the memory requirement (i.e. don't need to store 2^1M combi).
Dawn.
Dawn.