SOLVED

Python code in Alteryx

Niru
8 - Asteroid

Hello All,

 

Please help me understand the two statements below. I am not sure exactly what they mean or how to implement them in Alteryx.

 

df6=df6[df6['_merge']=='left_only']

df7.drop_duplicates(subset=['ECI'],keep='first',inplace=True)

 

Regards,

Niru

Thableaus
17 - Castor

Hi @Niru 

 

What does your data look like? And the rest of the code?


From what I understand, with df6 you just want to keep the "L" output of the join between the two dataframes, that is, every record from the left side that didn't join.

 

The df7 statement drops duplicates on the ECI field, so you need a Unique tool with the ECI field checked. The Unique tool already keeps the first occurrence on the "U" output.
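
For reference, here is a minimal pandas sketch of where a '_merge' column usually comes from and why filtering on 'left_only' matches the "L" output of the Join tool. It assumes df6 was produced by a merge with indicator=True, which the posted snippet does not show. The frame names, the extra columns and the values are made up for illustration; only the ECI field name comes from the post.

```python
import pandas as pd

# Hypothetical sample data; only the "ECI" column name is taken from the thread.
left = pd.DataFrame({"ECI": [1, 2, 3, 4], "amount": [10, 20, 30, 40]})
right = pd.DataFrame({"ECI": [2, 4], "status": ["open", "closed"]})

# indicator=True adds a "_merge" column whose values are
# "left_only", "right_only" or "both".
merged = left.merge(right, on="ECI", how="left", indicator=True)

# Keep only the left rows that found no match on the right.
# This corresponds to the L output anchor of the Join tool.
left_only = merged[merged["_merge"] == "left_only"]

print(left_only["ECI"].tolist())  # [1, 3]
```

In Alteryx the Join tool gives you this split directly, so no separate indicator column is needed.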

 

Cheers,

Niru
8 - Asteroid

Hi @Thableaus Thank you for your reply

 

Kindly help me, I cannot find any tool in Alteryx that gives the left-merge information.

 

df6=df6[df6['_merge']=='left_only']

 

Regards,

Niru 

atcodedog05
22 - Nova

Hi @Niru 

 

Assuming these are Python pandas dataframes.

 

df6=df6[df6['_merge']=='left_only']

This basically means: keep only the rows where the value in the '_merge' column is 'left_only'. It is a filter action, so a Filter tool with the expression [_merge] == 'left_only' is the equivalent.

 

df7.drop_duplicates(subset=['ECI'],keep='first',inplace=True)

This removes duplicates using ECI as the key. keep='first' means the first row of each unique ECI is kept, and inplace=True means df7 itself is modified. A Unique tool with ECI set as the key does the same thing.

 

Input:

ECI  _merge
1    left_only
1    left_only
1
2    left_only
2
3    left_only
3    left_only
3
4
4
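
As a cross-check, the sample input above can be run through both pandas statements. This is only a sketch: the blank '_merge' cells are represented as empty strings (an assumption, they could equally be nulls), and the two steps are chained on one frame purely for illustration, whereas the original code applies them to df6 and df7 separately.

```python
import pandas as pd

# Sample input from the post. Blank _merge cells are written as empty strings,
# which is an assumption about how the blanks are stored.
df = pd.DataFrame({
    "ECI": [1, 1, 1, 2, 2, 3, 3, 3, 4, 4],
    "_merge": ["left_only", "left_only", "", "left_only", "",
               "left_only", "left_only", "", "", ""],
})

# Equivalent of the Filter tool: keep rows where _merge is 'left_only'.
filtered = df[df["_merge"] == "left_only"]

# Equivalent of the Unique tool on ECI: keep the first row per ECI value.
deduped = filtered.drop_duplicates(subset=["ECI"], keep="first")

print(deduped)
```

The filter keeps five rows, and the de-duplication leaves one row each for ECI 1, 2 and 3. ECI 4 drops out because none of its rows are 'left_only'.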

 

Workflow:

[Screenshot of the workflow: atcodedog05_0-1622708922127.png]

 

Hope this helps 🙂

 

Niru
8 - Asteroid

Hi @atcodedog05 Thanks for your quick reply,

 

I am trying to implement Python code in Alteryx. Python produces the output in 10 minutes for one ID, but Alteryx has been running for more than 1 hour. How can I troubleshoot or fix this?

 

[Attached image: Query.PNG]

Regards,

Niru

atcodedog05
22 - Nova

Hi @Niru 

 

That can't be right. In my experience Alteryx is faster than Python, and an Alteryx In-DB workflow is much faster than a normal workflow, since In-DB does all the calculation on the database system itself.

 

You might want to compare the output of Python and Alteryx. It is possible that one of the Join tools is performing a one-to-many join, which inflates the record count and slows down your workflow.

 

Please check the output and the number of records produced by both systems.
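
On the Python side, one way to run this check is to look for repeated join keys and compare record counts before and after the merge. The sketch below uses made-up frames and a hypothetical helper, duplicate_keys. Only the ECI field name comes from the thread, and the real join keys may differ.

```python
import pandas as pd
from pandas.errors import MergeError


def duplicate_keys(df, key):
    """Return the key values that occur more than once, with their counts."""
    counts = df[key].value_counts()
    return counts[counts > 1]


# Hypothetical frames: ECI 1 appears twice on the right side.
left = pd.DataFrame({"ECI": [1, 2, 3], "amount": [10, 20, 30]})
right = pd.DataFrame({"ECI": [1, 1, 2], "status": ["a", "b", "c"]})

# Any key listed here will multiply rows in the join.
dupes = duplicate_keys(right, "ECI")
print(dupes)

# Compare record counts before and after the join.
joined = left.merge(right, on="ECI", how="inner")
print(len(left), len(right), len(joined))

# validate= makes pandas raise an error instead of silently multiplying rows.
try:
    left.merge(right, on="ECI", how="inner", validate="one_to_one")
    one_to_one = True
except MergeError:
    one_to_one = False
print(one_to_one)
```

If the Alteryx J output has far more records than the pandas result for the same inputs, a duplicated key on one side of a Join tool is a likely cause.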

Niru
8 - Asteroid

Hi @atcodedog05 I am able to run the query in SQL Server in 6 minutes, but when I run the same query from Alteryx using In-DB tools it runs for more than 30 minutes. Any troubleshooting tips, please?

 

Regards,

Niru 

 

atcodedog05
22 - Nova

Hi @Niru 

 

This is something we cannot help with without looking into the workflow, and since it is an In-DB workflow, people outside your company like us won't have access to the data stream. The best you can do is reach out to the Virtual Solution Center. I don't know how it works, but they might be able to help you.

 

https://community.alteryx.com/t5/Virtual-Solution-Center/tkb-p/vsc

 

Hope this helps 🙂
