Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Multiple Join gives duplicate records

AnandKumar1
8 - Asteroid

Hi,

 

I'm trying to join 2 different files based on some condition .

This is my 1st file data

FundSecurity
1A89236THM0
2B05253JAU5
3CUSD

 

This is my 2nd file data

FundISINCusipBBSEC
1A49456BAM389236THM089236THM0
2B05253JAU528404Hk424290424244
3CUSD2404JL20880USD8392040848

 

I've used 3 Join tool 

1) Fund=Fund and Security = ISIN

2) Fund= Fund and Security= Cusip

3) Fund= Fund and Security= BBSEC

 

Now the problem is some value in present in both of security is present in both cusip and bbsec . So same record is coming in 2 join output which is actually duplicate. Like Security= 89236THM0  is present in Cusip & BBsec of second file.

 

How to handle 

 

 

AnandKumar1_0-1627644255581.png

 

11 REPLIES 11
atcodedog05
22 - Nova
22 - Nova

Hi @AnandKumar1 

 

You can unique tool to remove duplicates. If rows are exact duplicate you can select all columns as key columns.

https://help.alteryx.com/20212/designer/unique-tool

 

 

atcodedog05
22 - Nova
22 - Nova

Hi @AnandKumar1 ,

 

Can you share a sample workflow till union tool so that we can see the join tool configuration and build/suggest accorindginly.

HomesickSurfer
12 - Quasar

Hi @AnandKumar1 

 

What are you attempting to accomplish with a workflow?

What is your desired output?

AnandKumar1
8 - Asteroid

@HomesickSurfer 

This is my problem statement

I've many files in my directory which i'm trying to join based on a condition. I've some files in one folder which has basic structure like below.

Some files has only Sec_Id or isin or sip. Some files has all 3 fields.

Fundcusipisinbbsec
123637ABCUS4971VA744573AP1
124US2130US27CM6040WAW
567IN3133US185BA8894AK5
89013TKJGUS56VAM2040WW5

 

Then i've few files in different folders which has structure like below

FundSecurity
123744573AP1
124US2130
567US185BA8
890

040WW5

 

So i wanted to join these files based on a condition 1) that if the Security of 2nd files exist in any of cusip or isin or bbsec from 1st files. 

Some security values are present in cusip and isin both. Some are present in all 3 . But it will be there in at least one of these 3 fields.

 

Now i've applied 2 solution first with find -replace which is working fine for single file but when i use multiple files i'm getting lot of duplicate

2nd solution is joins but this is also giving almost double records.

 

HomesickSurfer
12 - Quasar

Hi @AnandKumar1 

 

Would this work?

 

Capture2.PNGCapture1.PNG

atcodedog05
22 - Nova
22 - Nova

Hi @AnandKumar1 

 

Just a possibility can you check whether the duplicating is happening when you are reading the file.

AnandKumar1
8 - Asteroid

@atcodedog05 Duplicates are happending only after Join.

Actually there are some records which are common in many files. I think a cartesian join is happening here

atcodedog05
22 - Nova
22 - Nova

Hi @AnandKumar1 

 

Then clearing duplicates before head and using find & replace approach after it should help in your usecase🙂

Labels