Hello All,
Kindly help me understand the two statements below and how to implement them in Alteryx. I am not sure exactly what these two statements mean.
df6=df6[df6['_merge']=='left_only']
df7.drop_duplicates(subset=['ECI'],keep='first',inplace=True)
Regards,
Niru
Hi @Niru
What does your data look like? And the rest of the code?
From what I understand, with df6 you just want to keep the "L" output of the Join tool you are using on the two dataframes, i.e. everything on the left side that didn't join.
The df7 statement drops duplicates on the ECI field, which means you need a Unique tool with the ECI field checked. The Unique tool already keeps the first occurrence in its "U" output.
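For reference, here is a minimal pandas sketch of where a '_merge' column usually comes from and what the filter keeps. It assumes the earlier code merged two dataframes with indicator=True, which is not shown in this thread; the dataframes left and right and the columns name and score are made up for illustration.

```python
import pandas as pd

# Hypothetical example data; the real dataframes are not shown in the thread
left = pd.DataFrame({"ECI": [1, 2, 3, 4], "name": ["a", "b", "c", "d"]})
right = pd.DataFrame({"ECI": [2, 4], "score": [10, 20]})

# indicator=True adds a '_merge' column with the values
# 'left_only', 'right_only' or 'both'
df6 = left.merge(right, on="ECI", how="left", indicator=True)

# Keep only the left rows that found no match on the right
# (comparable to the "L" output of the Join tool)
df6 = df6[df6["_merge"] == "left_only"]
print(df6)
```

In this made-up data only ECI 1 and 3 have no match on the right side, so only those two rows remain.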
Cheers,
Hi @Thableaus, thank you for your reply.
Kindly help me: I cannot find any tool in Alteryx to get the left-only merge information.
df6=df6[df6['_merge']=='left_only']
Regards,
Niru
Hi @Niru
Assuming these are Python pandas dataframes.
df6=df6[df6['_merge']=='left_only']
This basically means: only keep rows where the value in column ['_merge'] is 'left_only'. It is a filter action, so the equivalent in Alteryx is a Filter tool with the expression [_merge] == 'left_only'.
df7.drop_duplicates(subset=['ECI'],keep='first',inplace=True)
This removes duplicates using ECI as the key; keep='first' means the first row for each unique ECI value is kept. A Unique tool with ECI set as the key does the same thing.
Input:
ECI | _merge |
1 | left_only |
1 | left_only |
1 | |
2 | left_only |
2 | |
3 | left_only |
3 | left_only |
3 | |
4 | |
4 | |
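To compare against the Alteryx results, the two statements can be run in pandas on the sample input above. This is only a sketch: the blank _merge cells are assumed to be empty strings, and each statement is applied to the sample separately, since the thread does not show how df6 and df7 are related.

```python
import pandas as pd

# Sample input from the table above; blank _merge cells are
# assumed to be empty strings
df = pd.DataFrame({
    "ECI": [1, 1, 1, 2, 2, 3, 3, 3, 4, 4],
    "_merge": ["left_only", "left_only", "", "left_only", "",
               "left_only", "left_only", "", "", ""],
})

# Filter equivalent: keep rows whose _merge value is 'left_only'
filtered = df[df["_merge"] == "left_only"]

# Unique equivalent: keep the first row for each ECI value.
# Assigning the result does the same job as inplace=True.
deduped = df.drop_duplicates(subset=["ECI"], keep="first")

print(filtered)
print(deduped)
```

The filter keeps 5 of the 10 rows, and the de-duplication keeps one row per ECI (4 rows), which should match the "T" output of a Filter tool and the "U" output of a Unique tool respectively.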
Workflow:
Hope this helps 🙂
Hi @atcodedog05, thanks for your quick reply.
I am trying to port Python code to Alteryx. The Python code produces the output in 10 minutes for one ID, but the Alteryx workflow has been running for more than 1 hour. Any troubleshooting tips, or how can I fix this issue?
Regards,
Niru
Hi @Niru
That can't be right. In my experience Alteryx is faster than Python, and an Alteryx In-DB workflow is much faster than a normal workflow, since In-DB does all the calculation on the database system itself.
You might want to check the output of both Python and Alteryx. It is possible that one of the Join tools is performing a one-to-many join, which multiplies records and slows down your workflow.
Please check the output and the number of records produced by both systems.
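On the Python side, a rough way to check for this kind of join fan-out is to compare record counts before and after the merge and to look for duplicated keys. The sketch below uses made-up data, and ECI is only assumed to be the join key:

```python
import pandas as pd

# Hypothetical data: the right side has a duplicated key (ECI 1)
left = pd.DataFrame({"ECI": [1, 2, 3]})
right = pd.DataFrame({"ECI": [1, 1, 2], "value": ["x", "y", "z"]})

# Number of right-side rows whose key appears more than once
dup_right = int(right["ECI"].duplicated(keep=False).sum())

# If the join fans out, the merged row count exceeds the left row count
merged = left.merge(right, on="ECI", how="left")
print(len(left), len(merged), dup_right)

# validate="many_to_one" raises MergeError when right keys are not unique
fan_out_detected = False
try:
    left.merge(right, on="ECI", how="left", validate="many_to_one")
except pd.errors.MergeError as exc:
    fan_out_detected = True
    print("Join is not many-to-one:", exc)
```

The same comparison can be made in Alteryx by looking at the record counts on the input and output anchors of each Join tool.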
Hi @atcodedog05, I'm able to run the query in SQL Server in 6 minutes, but when I run the same query from Alteryx using In-DB tools it runs for more than 30 minutes. Any troubleshooting tips, please?
Regards,
Niru
Hi @Niru
This is something we cannot help with without looking at the workflow, and since it is an In-DB workflow, people outside your company like us won't have access to the data stream. The best you can do is reach out to the Virtual Solution Center. I don't know how it works, but they might be able to help you.
https://community.alteryx.com/t5/Virtual-Solution-Center/tkb-p/vsc
Hope this helps 🙂