(I'm going to find it hard to upload sample data - it's proprietary and large - but will try if no-one can make sense of the below.)
+ I have data - 779k records. Total value: £13m.
+ I summarise - 55k records. Total value: £13m.
+ I run a JOIN. Input:
LEFT: My 55k records
RIGHT: A reference sheet, in which I'm pulling in only BARCODE and PRODUCT TITLE, joining on BARCODE and pulling in the PRODUCT TITLE on a match.
+ The join output:
RIGHT: Irrelevant
LEFT: 290 records, SUMMARISE value £8k
JOIN: 68,618 records, SUMMARISE value £15.5m(!!)
I simply do not understand how this happened. All the JOIN should be doing is pulling in a text value if it matches barcodes. How is it increasing the number of records and the value in the db?
The Join is reading off an Excel - my current theory is that perhaps that reference excel lists the some barcodes twice but with a different text entry for each listing. Would the JOIN then duplicate the record it's matching to in the non-reference document, so it can put a match one with each of the found text entries? (I have checked and this is not the case)
Solved! Go to Solution.
Hi @pupmup
What's happening is that you have more than one matching records per concept in one (or both) datasources.
Remember that Alteryx returns Joined records through the J output, and Unjoined ones through the L & R outputs.
Maybe if you summarize(Group by) or Unique the BARCODE for the R input, and then try the join, maybe you'll get close to what you need.
Regards
simple example but hope it helps
suppose you have one row on the left value 100 and it has barcode "abc"
on the right you have 10 rows, 3 of which have barcode "abc"
the join on barcode "abc" will produce 3 rows, each having value 100
is there additional field(s) that can be added to the join to make it unique ?
Hi @ChillEDog,
For what I understand of your example, the structure of your data is something like this:
What I would do is to Summarize the INVENTORY table, grouping by [barcode] and SUM([units]). That'll give you a unique value for [barcode] with the total of values summed.
It's result is:
Then you can join against your other table.
If I misinterpreted anything of your use case, please let me know, and we'll see it.
Thank you. I was already summarising the Inventory by barcode at an earlier stage in the process - the duplication only went away when I added a Unique process to the retail price table, based off barcode.