Hi,
I've many files in my directory which i'm trying to join based on a condition. I've some files in one folder which has basic structure like below.
Some files has only Sec_Id or isin or sip. Some files has all 3 fields.
Fund | Sec_id | isin | sip |
123 | 637ABC | US4971VA | 744573AP1 |
124 | US2130 | US27CM6 | 040WAW |
567 | IN3133 | US185BA8 | 894AK5 |
890 | 13TKJG | US56VAM2 | 040WW5 |
Then i've few files in different folders which has structure like below
Fund | Security |
123 | 744573AP1 |
124 | US2130 |
567 | US185BA8 |
890 | 040WW5 |
So i wanted to join these files based on a condition 1) that if the Security of 2nd files exist in any of Sec_id or isin or sip from 1st files.
2) If in case all Sec_id or isin or sip exist in 1st file, then security should look for exact value and then join. Only one column out of Sec_id or isin or sip will have matching values as of Security in 2nd file.
Is it possible to achieve this?
Solved! Go to Solution.
hi @AnandKumar1 ,
To join both files I reccomned to Transpose first one, then you can join on Fund and Security from 2nd file.
As a Join anchor output you will have joined Fund, Security and Matched key (isin or sip or sec_id) - of cource you can exclude it from output if not needed.
Then I apply Union and added comments if matched or not matched.
I hope it is something you need.
Karolina
Hi @AnandKumar1
This would be my take on the usecase. Here is how you can solve it.
Workflow:
1. On 1st file using formula tool concat [Sec_id] - [isin] - [sip] to a combo key.
2. Using find and replace tool check whether security is present in the combo key. If yes join the row.
Non matchs will be null
Hope this helps 🙂
After using your logic i'm getting lot of records . I think most of it are duplicates.
when i run for one single file its working fine. but when i use all files in directory i'm getting huge number of records.
Any idea why is that?
Hi @AnandKumar1
This method should actually reduce duplicates. Are you using directory on first file or second file.
@atcodedog05 Yes i'm using directory. I've almost 20 files in my directory. When i'm running for one single file, its giving correct output. But problem happens when i use all my files at once
Hi @AnandKumar1
In the above example you have File 1 and File 2 which you are joining. For which connection are you using directory tool is it both?
Hi @AnandKumar1
Can you enable show connection progress and share the snapshot of workflow till and after find & replace tool.
So that we can see rows going in and coming out.
Workflow:
One thing i've noticed . First this find-replace tool should match the exact value . For ex. in my security field i've a value=
06366RJJ5 |
And this value exist in all 3 below fields also which is coming from another directory. But ideally this should look for exact value not any pattern
isin | cusip | bbsec |
US06366RJJ59 | 06366RJJ5 | 06366RJJ5 |
I want to know if this value matched with cusip first, will it still match with next bbsec or it will skip it and produce only 1 output for this match?
Hi @AnandKumar1
Sorry for the late response.
@AnandKumar1 wrote:
I want to know if this value matched with cusip first, will it still match with next bbsec or it will skip it and produce only 1 output for this match?
It will skip and produce only single row as output.
Hope this helps : )