Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Join of duplicates

nicolas567893
5 - Atom

Dear all,

 

I have a challenge that you may help me to resolve.

 

I want to link two files with the "Join" tool. The problem is that there are duplicates...

 

The left source file (1 column) shows how many IDs were processed, in this case three times the ID "123456". The available information is limited to the "ID", i.e. I cannot distinguish the individual lines from each other - there are several duplicates:

 

ID

123456

123456

123456

 

 

The right source file (two columns) shows a list of IDs with the respective gift packaging fee. In the right file, the ID "123456" is only listed twice. This means that two shipments with the ID "123456" were gift wrapped and one shipment was not. Therefore, the ID only appears twice instead of three times:

 

ID

Gift Wrap Fee

123456

5

123456

5

 

 

The desired result looks like this:

 

ID

Gift Wrap Fee

123456

5

123456

5

123456

[Null]

 

 

However, if I use a "Join" tool, the following table results (see below). It is clear to me that it is because of the duplicates.

 

ID

Gift Wrap Fee

123456

5

123456

5

123456

5

123456

5

123456

5

123456

5

 

 

I "solved" the problem by switching numerous "Unique" and "Join" tools one after the other. However, there is a danger that I have not built in enough "loops" if, for example, an ID appears 100 times...

 

Do you have a solution / idea for me?

 

Many thanks in advance for your help!

 

Kind regards
Nicolas

2 REPLIES 2
JosephSerpis
17 - Castor
17 - Castor

Hi @nicolas567893 what I do when I know there are duplicates in my data when i'm joining is I create a rank based upon that field using the multi row tool. I've mocked up a workflow let me know what you think? 

 

Duplicates_260819.PNG

Cal_A
7 - Meteor

Can't go through a proper solution right now.  But you could try numbering each instance of each ID.  In this way you create a unique reference (in each table) than can be used for the join.  Once you have this, the rest is easy.  A few different ways to create the instance numbering e.g. running total, but there must be a more elegant way.

 

IDinstance  IDinstanceGiftwrap?
65461  65461$5
65462  65462$5
65463  1231231$5
1231231 vs1231232$5
1231232  1231233$5
1231233     
1231234    

 

Labels