

Python Code Tool Script Runner Macro (Code Injection)

DavidM
Alteryx

 

Hi everyone,

 

More and more often in sessions with our customers, I get questions about using existing Python scripts with the Python Code tool in Alteryx Designer.

 

This comes up especially when developers, coders, and data scientists use some form of version control, such as Git, to manage their Python code repos.

 

How can we automate loading a PY script into the Python Code tool in Alteryx Designer (in effect, inject the code into the tool)?

 

image.png

 

Note: As much as I would like to take all the credit for this one and say I spent many a sleepless night putting this together, this bit was actually created by @PetrT from the Alteryx engineering team.

 

Oh wait - I did find this picture above. And created the macro around it... And tried to be (unsuccessfully) funny about it.

 

Here is a simple sample workflow that uses the Python Script Runner/Injector macro:

 

0/ The macro has two inputs: one for the file info (F) and an optional one for data (D)

1/ Provide the FOLDER where your PY script is saved (relative to the workflow directory, or absolute) and the FILE name of your script (PY format)

2/ Provide the DATA in tabular format from the Alteryx workflow to feed to the script (you can extend the macro to accept multiple inputs)

3/ The OUTPUT of the macro streams out the data from the Python Code tool

 

image.png

 

The macro itself is relatively simple. All the magic really happens in the injection script within the Python Code tool.

 

image.png

 

Here is the code:

 

 

# List all non-standard packages to be imported by your 
# script here (only missing packages will be installed)
from ayx import Package
#Package.installPackages(['pandas','numpy'])

 

 

from ayx import Alteryx

df = Alteryx.read("#1")

# Load the params from the input
folder = "" #Placeholder for folder with a script
script_file = "" #Placeholder for file with a script

# If the input has several rows, the last one wins
for index, row in df.iterrows():
    # .iloc reads by position regardless of the column names
    folder = row.iloc[0]
    script_file = row.iloc[1].replace("\\", "/")

#print(folder)
#print(script_file)
import os
import xml.etree.ElementTree as ET
import re

workflow_dir = os.path.normpath(os.path.normcase(Alteryx.getWorkflowConstant("Engine.WorkflowDirectory")))

try:
    temp_dir = os.path.normpath(os.path.normcase(os.environ['TEMP']))
except Exception as e:
    print('Cannot determine temp dir. ' + str(e))
    temp_dir = None

if workflow_dir is not None and temp_dir is not None:
    if os.path.samefile(workflow_dir, temp_dir):
        is_debug_workflow = True
    else:
        is_debug_workflow = False
else:
    is_debug_workflow = None

if is_debug_workflow:
    debug_workflow_path = os.path.join(temp_dir, 'debug_temp.yxmd')

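    try:
        # Debug workflows are saved as UTF-16; close the file once it is read
        with open(debug_workflow_path, "r", encoding='utf-16') as f:
            str_xml = f.read()
    except Exception as e:
        print('Cannot read debug workflow: ' + str(e))
        str_xml = None

    if str_xml:
        workflow_xml = ET.fromstring(str_xml)
        # Compare with None: an Element without children is falsy
        if workflow_xml is not None: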
            for element in workflow_xml.iter('Node'):
                try:
                    if element.find('GuiSettings').attrib['Plugin'] == 'AlteryxGuiToolkit.TextBox.TextBox':
                        debug_text_box = element.find('Properties').find('Configuration').find('Text').text
                        original_workflow = debug_text_box[
                                            debug_text_box.find('<Module>') + 8:debug_text_box.find('</Module>')]
                        workflow_dir = re.search(r'(.*)\\(.*)', original_workflow).group(1)
                except Exception:
                    pass
print('workflow_dir: ' + str(workflow_dir))

code_path = os.path.join(workflow_dir, folder, script_file)
exec(open(code_path).read(), globals())

 

 

Actually, all that magic boils down to the last two lines, which simply load the PY script from your file system, using the FOLDER and SCRIPT FILE inputs, and execute it.

 

Why all the code then? To make this work in situations such as running the workflow in DEBUG mode, where it runs from the TEMP location and the original workflow directory has to be recovered.
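To illustrate the DEBUG case, here is a minimal sketch of recovering the original workflow folder from a text that contains a <Module> tag, which is what the injection script does with the text box of the debug workflow. The helper name original_workflow_dir and the sample text are hypothetical; the real text box content may look different.

```python
import ntpath
import re


def original_workflow_dir(text_box):
    """Return the folder of the path inside <Module>...</Module>, or None.

    Hypothetical helper, not part of the macro.
    """
    match = re.search(r"<Module>(.*?)</Module>", text_box, re.DOTALL)
    if match is None:
        return None
    # ntpath handles Windows-style paths on any operating system
    return ntpath.dirname(match.group(1).strip())


# Made-up sample text, for illustration only
sample = "Debug copy of <Module>C:\\Projects\\iris\\cluster.yxmd</Module>"
print(original_workflow_dir(sample))  # prints C:\Projects\iris
```

Returning None when the tag is missing lets the caller fall back to the directory it already has, which mirrors how the script above keeps workflow_dir unchanged if parsing fails.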

 

What actually happens when you hit run in your DESIGNER? This code just simply finds and reads your PY script and executes it from the Python Code tool.
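Stripped of the Alteryx specifics, the find, read and execute step can be sketched as below. This is only an illustration under assumptions: run_script is a hypothetical helper, a temporary folder stands in for the workflow directory, and the script is executed in a fresh dictionary rather than in globals() as the macro does.

```python
import os
import tempfile


def run_script(base_dir, folder, script_file, namespace=None):
    """Read a PY file and execute it in a namespace (hypothetical helper)."""
    code_path = os.path.join(base_dir, folder, script_file.replace("\\", "/"))
    if not os.path.isfile(code_path):
        raise FileNotFoundError("Script not found: " + code_path)
    if namespace is None:
        namespace = {}
    with open(code_path, encoding="utf-8") as f:
        source = f.read()
    # Compiling with the real path makes tracebacks show the file name
    # instead of <string>
    exec(compile(source, code_path, "exec"), namespace)
    return namespace


# Demo: a temporary folder plays the role of the workflow directory
with tempfile.TemporaryDirectory() as base:
    os.makedirs(os.path.join(base, "scripts"))
    with open(os.path.join(base, "scripts", "hello.py"), "w", encoding="utf-8") as f:
        f.write("result = 6 * 7\n")
    ns = run_script(base, "scripts", "hello.py")
    print(ns["result"])  # prints 42
```

Because os.path.join discards earlier components when a later one is absolute, an absolute FOLDER simply overrides the base directory, which is why both relative and absolute folders can work.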

 

Note: You may wonder what the Iris script I am running this way looks like, especially how it handles the input and output data connections.

Note that the macro is currently built to pass DATA in on connection #2, so the script reads it with Alteryx.read("#2"). What gets written back depends on how you call Alteryx.write() in your script.

 

 

from ayx import Alteryx

#read the iris dataset
data = Alteryx.read("#2")

#print out input data
data
                    
#import scikit learn kmeans package
from sklearn.cluster import KMeans

#subset data to features used for clustering
clusData = data[['SepalLengthCm','SepalWidthCm','PetalLengthCm','PetalWidthCm']]

#fit subset feature to 3 clusters
kmeans = KMeans(n_clusters=3, random_state=0).fit(clusData)

print(kmeans.labels_)

#import matplot lib
import matplotlib.pyplot as plt

#create scatter plot
plt.scatter(clusData["SepalLengthCm"], clusData["SepalWidthCm"],c=kmeans.labels_)


#add scatter title
plt.title("Iris Clusters")

#Write the kmeans labels back to the workflow downstream
#by adding them as a new column to the input DataFrame
data['labels'] = kmeans.labels_
Alteryx.write(data, 1)

 

 

 

The sample workflow + code injection macro is attached. By all means, you can play some more with this and make it more to your liking.

 

Hope you find this useful 🙂

 

Big shout out to @PetrT again for trying to hide this one away from me for not that long actually. Cheers!

 

image.png

DM

 

David Matyas
Sales Engineer
Alteryx
9 REPLIES
DermotOB
5 - Atom

Hi, thanks so much for sharing. I was wondering why the input data comes from a Text Input tool rather than an Input Data tool. Also, is it possible to get it working with the Input Data tool, since Text Input has a 1000-row limit? When I attempt to use the Input Data tool instead of Text Input, I get this error:

 

Error: Python Script Runner (1): Tool #10: ---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-3-4b0cb414fd18> in <module>
44
45 code_path = os.path.join(workflow_dir, folder, script_file)
---> 46 exec(open(code_path).read(), globals())
<string> in <module>
<string> in main()
c:\program files\alteryx\bin\miniconda3\pythontool_venv\lib\site-packages\pandas\core\series.py in apply(self, func, convert_dtype, args, **kwds)
3589 else:
3590 values = self.astype(object).values
-> 3591 mapped = lib.map_infer(values, f, convert=convert_dtype)
3592
3593 if len(mapped) and isinstance(mapped[0], Series):
pandas\_libs\lib.pyx in pandas._libs.lib.map_infer()
<string> in <lambda>(x)
TypeError: object of type 'NoneType' has no len()

End: Designer x64: Finished running Python InjectionV2.yxwz in 28.2 seconds with 1 error and 11 field conversion errors

 

Any help would be much appreciated.

aricukier
5 - Atom

TypeError: object of type 'NoneType' has no len()

 

I received this same error when trying to test for a NULL value in an API call using the Python module. I thought I would be clever and just check whether len(variable) > 0, but I got the error. What I had to do instead was test "variable is not None", and it worked.

 

Not sure if this fixes your entire issue, but hopefully it helps.
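As a minimal sketch of that check, assuming the values may be strings, empty strings or None (safe_len is a hypothetical helper name, and the sample values are made up):

```python
def safe_len(value):
    """Length of value, treating None as empty (hypothetical helper)."""
    # len(None) raises TypeError, so test for None first
    return 0 if value is None else len(value)


responses = ["abc", "", None]
print([safe_len(r) for r in responses])  # prints [3, 0, 0]
```

The same guard can go inside a lambda passed to a pandas apply() call, which is where the traceback above shows len() being called on a None value.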

H3MINGW4Y
5 - Atom

Thanks for your amazing work!

 

 

 

 

 

So far I've tried providing a folder name as the folder input, and also an absolute path to it; neither worked.

 

DavidM
Alteryx

Hey @H3MINGW4Y,

 

I think this is primarily an issue with the code itself rather than with the injection, but I may be wrong.

 

Can you try to load up the Python code into the Python code tool manually first before trying the injector? Does the code work?

 

d

H3MINGW4Y
5 - Atom

Hey 

 

 

H3MINGW4Y
5 - Atom

OK, I figured out what was wrong. One of the connections I was referring to in my script had no input. So it seems that when reading cached data, it is quite useful to implement error handling that returns friendlier and more accurate information than a bare RuntimeError.
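A minimal sketch of such error handling, assuming the read function raises an exception when a connection has no input. The wrapper read_connection and the stand-in fake_read are hypothetical; inside the Python tool you would pass Alteryx.read instead.

```python
def read_connection(read, name):
    """Call read(name) and re-raise failures with a friendlier message.

    Hypothetical wrapper; `read` would be Alteryx.read inside the Python tool.
    """
    try:
        return read(name)
    except Exception as exc:
        raise RuntimeError(
            "Could not read input connection " + name
            + ". Is anything connected to it? Original error: " + str(exc)
        ) from exc


def fake_read(name):
    """Stand-in for Alteryx.read that always fails (for illustration only)."""
    raise RuntimeError("cached data unavailable")


try:
    read_connection(fake_read, "#2")
except RuntimeError as err:
    message = str(err)
    print(message)
```

Chaining with "from exc" keeps the original traceback available while the first line of the error names the connection that failed.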

 

@DavidM 

I would like to ask what the "code injection" means exactly? 

I checked the XML content of the workflow where I'm using the Python Script Runner macro, and of the macro itself, and it seems that the code is not literally embedded there as it is when using the Python tool.

 

Thanks,

Adrian

 

 

DavidM
Alteryx

Hi @H3MINGW4Y,

 

Cheers for sharing the fix regarding missing connections.

 

Regarding what injection means, the original article puts it this way: "What actually happens when you hit run in your DESIGNER? This code just simply finds and reads your PY script and executes it from the Python Code tool."

 

Python code is never part of the XML. It sits on your machine in a PY file. At the time of execution of the workflow, the latest version of the PY file is loaded up into the Python code tool and executed.

 

The main idea is that you don't have to distribute newer versions of the PY script into the workflow. Say the project you are working on is under a VCS such as Git: you can keep the Git repo somewhere on your machine and push the code into the Python Code tool with the process shown in the post.

 

d

cam_w
11 - Bolide

Hi @DavidM

 

Thanks for the post! 🙂

 

What's the benefit of running the Python Tool inside a macro in this situation? Is there an execution benefit, or just a way to share quickly with colleagues?

 

Also, is there a benefit to using exec() over the Alteryx.importPythonModule()?

 

Regards!

DavidM
Alteryx

@cam_w the benefit is mainly operational: there is no need to push PY scripts into the Python Code tool by copy-pasting.

 

For example, if you use a code repo with Git or similar, you can keep all your PY projects in that repo and push the code from there into the Python tool with the injection.

 

d
