More and more often in my sessions with customers, I get questions about using existing Python scripts with the Python Code tool in Alteryx Designer.
This comes up especially when developers, coders, and data scientists use some form of version control, such as Git, to manage their Python code repos.
How can we automate loading a PY script into the Python Code tool in Alteryx Designer (de facto injecting the code into the tool)?
Note: As much as I would like to take all the credit for this one and say I spent many a sleepless night putting this together, this bit was actually created by the engineering team at Alteryx (@PetrT).
Oh wait - I did find this picture above. And created the macro around it... And tried to be (unsuccessfully) funny about it.
Here is a simple sample workflow that uses the Python Script Runner/Injector macro:
0/ The macro itself has two inputs, one for the file info (F), one for data (D) which is left optional
1/ Take FOLDER where your PY script is saved (could be relative or absolute) and FILE name of your script (PY format)
2/ Take DATA in the tabular format from Alteryx workflow to feed to the script (you can play with the macro to expand to multiple inputs)
3/ OUTPUT of the macro streams out the data from the Python Code tool
The macro itself is relatively simple; all the magic really happens in the injection script within the Python Code tool.
The code is here:
# List all non-standard packages to be imported by your
# script here (only missing packages will be installed)
from ayx import Package
from ayx import Alteryx
import os
import re
import xml.etree.ElementTree as ET

df = Alteryx.read("#1")

# Load the params from the input
folder = ""  # Placeholder for folder with a script
script_file = ""  # Placeholder for file with a script
for index, row in df.iterrows():
    # The column selectors are missing from the listing; positional access is an
    # assumption (folder in the first column, file name in the second).
    folder = row.iloc[0]
    script_file = row.iloc[1].replace("\\", "/")

# Work out where the workflow lives and where TEMP is
workflow_dir = os.path.normpath(os.path.normcase(Alteryx.getWorkflowConstant("Engine.WorkflowDirectory")))
try:
    temp_dir = os.path.normpath(os.path.normcase(os.environ['TEMP']))
except Exception as e:
    print('Cannot determine temp dir. ' + str(e))
    temp_dir = None

# A workflow running from TEMP is treated as a debug copy
if workflow_dir is not None and temp_dir is not None:
    if os.path.samefile(workflow_dir, temp_dir):
        is_debug_workflow = True
    else:
        is_debug_workflow = False
else:
    is_debug_workflow = None

# In debug mode, recover the original workflow folder from the debug copy
if is_debug_workflow:
    debug_workflow_path = os.path.join(temp_dir, 'debug_temp.yxmd')
    try:
        str_xml = open(debug_workflow_path, "r", encoding='utf-16').read()
    except Exception as e:
        print('Cannot parse debug workflow: ' + str(e))
        str_xml = None
    if str_xml is not None:
        workflow_xml = ET.fromstring(str_xml)
        for element in workflow_xml.iter('Node'):
            if element.find('GuiSettings').attrib['Plugin'] == 'AlteryxGuiToolkit.TextBox.TextBox':
                debug_text_box = element.find('Properties').find('Configuration').find('Text').text
                original_workflow = debug_text_box[
                    debug_text_box.find('<Module>') + 8:debug_text_box.find('</Module>')]
                workflow_dir = re.search(r'(.*)\\(.*)', original_workflow).group(1)

print('workflow_dir: ' + str(workflow_dir))

# Load the script from disk and execute it inside the Python Code tool
code_path = os.path.join(workflow_dir, folder, script_file)
exec(open(code_path).read(), globals())
Actually, all that magic boils down to the last two lines, which simply load the PY script from your file system location using the FOLDER and SCRIPT FILE inputs.
Why all the code then? To make this work across situations such as running the workflow in DEBUG mode, from a TEMP location, etc.
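The DEBUG-mode handling can be illustrated in isolation. The sketch below assumes, as the script does, that the debug copy of a workflow contains a TextBox node whose text wraps the original workflow path in <Module> tags. SAMPLE_DEBUG_XML and the function name original_workflow_dir are hypothetical stand-ins for illustration, not a real debug_temp.yxmd file or an Alteryx API.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for the content of a debug workflow file.
SAMPLE_DEBUG_XML = """<AlteryxDocument>
  <Nodes>
    <Node ToolID="1">
      <GuiSettings Plugin="AlteryxGuiToolkit.TextBox.TextBox" />
      <Properties>
        <Configuration>
          <Text>Debug of &lt;Module&gt;C:\\Projects\\iris\\Clustering.yxmd&lt;/Module&gt;</Text>
        </Configuration>
      </Properties>
    </Node>
  </Nodes>
</AlteryxDocument>"""


def original_workflow_dir(xml_text):
    """Return the folder of the original workflow named in a debug copy, or None."""
    root = ET.fromstring(xml_text)
    for node in root.iter("Node"):
        gui = node.find("GuiSettings")
        if gui is None or gui.attrib.get("Plugin") != "AlteryxGuiToolkit.TextBox.TextBox":
            continue
        text = node.findtext("Properties/Configuration/Text") or ""
        start, end = text.find("<Module>"), text.find("</Module>")
        if start == -1 or end == -1:
            continue
        original_workflow = text[start + len("<Module>"):end]
        # Greedy match: everything before the last backslash is the folder.
        match = re.search(r"(.*)\\(.*)", original_workflow)
        if match:
            return match.group(1)
    return None


print(original_workflow_dir(SAMPLE_DEBUG_XML))  # C:\Projects\iris
```

Unlike the injection script, this version skips nodes that lack the expected elements instead of raising, which is a design choice of the sketch rather than something the macro does.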
What actually happens when you hit Run in Designer? This code simply finds and reads your PY script and executes it from the Python Code tool.
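The load-and-execute step can be tried outside Designer with nothing but the standard library. This is a minimal sketch, not the macro's code: run_script is a hypothetical helper, the temporary folder stands in for your workflow directory, and it executes the script in an explicit dictionary, whereas the injection script passes globals().

```python
import os
import tempfile


def run_script(workflow_dir, folder, script_file, namespace):
    """Read a .py file from disk and execute it in the given namespace."""
    code_path = os.path.join(workflow_dir, folder, script_file)
    with open(code_path, encoding="utf-8") as handle:
        exec(handle.read(), namespace)
    return namespace


# Demo with a throwaway script standing in for a file from your repo.
with tempfile.TemporaryDirectory() as workflow_dir:
    os.mkdir(os.path.join(workflow_dir, "scripts"))
    script_path = os.path.join(workflow_dir, "scripts", "hello.py")
    with open(script_path, "w", encoding="utf-8") as handle:
        handle.write("greeting = 'hello from ' + source\n")
    result = run_script(workflow_dir, "scripts", "hello.py", {"source": "disk"})

print(result["greeting"])  # hello from disk
```

Because the file is read at call time, whatever version of the script is on disk at that moment is the one that runs.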
Note: You may wonder what the Iris script I am running this way looks like, especially how it handles the input and output data connections.
Notice that the macro is built so that DATA arrives on connection #2. Writing back depends on how you call the Alteryx.write() function in your script.
from ayx import Alteryx

# read the iris dataset
data = Alteryx.read("#2")

# print out input data
print(data)

# import scikit learn kmeans package
from sklearn.cluster import KMeans

# subset data to features used for clustering
clusData = data[['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm']]

# fit subset feature to 3 clusters
kmeans = KMeans(n_clusters=3, random_state=0).fit(clusData)

# import matplot lib
import matplotlib.pyplot as plt

# create scatter plot
# add scatter title

# Write back the kmeans labels to the workflow downstream
# First need to convert to Pandas DF
import pandas as pd
# labelsDf = pd.DataFrame(data=kmeans.labels_)
data['labels'] = kmeans.labels_
# The write call is not shown in the listing; output anchor 1 is an assumption
Alteryx.write(data, 1)
The sample workflow + code injection macro is attached. By all means, you can play some more with this and make it more to your liking.
Hope you find this useful 🙂
Big shout out to @PetrT again for trying to hide this one away from me for not that long actually. Cheers!
Hi, thanks so much for sharing. I was wondering why the input data comes from a Text Input tool and not just an Input Data tool. Also, is it possible to get it working with the Input Data tool, since Text Input has a 1000 row limit? When I attempt to use the Input Data tool instead of Text Input, I get this error:
Error: Python Script Runner (1): Tool #10:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-4b0cb414fd18> in <module>
     44
     45 code_path = os.path.join(workflow_dir, folder, script_file)
---> 46 exec(open(code_path).read(), globals())

<string> in <module>

<string> in main()

c:\program files\alteryx\bin\miniconda3\pythontool_venv\lib\site-packages\pandas\core\series.py in apply(self, func, convert_dtype, args, **kwds)
   3589         else:
   3590             values = self.astype(object).values
-> 3591             mapped = lib.map_infer(values, f, convert=convert_dtype)
   3592
   3593         if len(mapped) and isinstance(mapped, Series):

pandas\_libs\lib.pyx in pandas._libs.lib.map_infer()

<string> in <lambda>(x)

TypeError: object of type 'NoneType' has no len()
End: Designer x64: Finished running Python InjectionV2.yxwz in 28.2 seconds with 1 error and 11 field conversion errors
I received this same error trying to test for a NULL value in an API call using the Python module. I thought I would be clever and just see if len(variable) > 0, but I got the error. What I had to do instead was test "variable is not None" and it worked.
Not sure if this fixes your entire issue, but hopefully it helps.
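The check described in this reply can be sketched in a few lines. has_content is a hypothetical helper name; the point is only that the None test has to come before the len() call.

```python
def has_content(value):
    """Return True only for a non-None value with at least one element."""
    return value is not None and len(value) > 0


# Calling len() directly on None raises the TypeError from the traceback.
try:
    len(None)
except TypeError as error:
    message = str(error)

print(message)  # object of type 'NoneType' has no len()
print(has_content(None), has_content(""), has_content("abc"))  # False False True
```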
Ok, I figured out what was wrong. There was no input for one of the connections to which I was referring in my script. So it seems that, when reading cached data it is quite useful to implement error handling that will return more friendly and accurate information than RuntimeError.
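One way to get a friendlier message for a missing connection is to wrap the read call. This is only a sketch under assumptions: read_connection and fake_read are hypothetical names, fake_read stands in for Alteryx.read so the example runs anywhere, and the broad except is used because the exact exception raised for an unconnected input is not documented here.

```python
def read_connection(read, name):
    """Call read(name) and re-raise any failure with the connection name."""
    try:
        return read(name)
    except Exception as error:
        raise RuntimeError(
            f"Could not read input connection {name!r}; "
            f"is something connected to it? Original error: {error}"
        ) from error


# Hypothetical reader standing in for Alteryx.read.
def fake_read(name):
    connected = {"#1": ["folder", "script.py"]}
    if name not in connected:
        raise RuntimeError("cached data unavailable")
    return connected[name]


print(read_connection(fake_read, "#1"))
try:
    read_connection(fake_read, "#2")
except RuntimeError as error:
    friendly_message = str(error)
    print(friendly_message)
```

Inside the Python Code tool, the same wrapper could be called as read_connection(Alteryx.read, "#2").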
I would like to ask what the "code injection" means exactly?
I checked the .xml content of my workflow where I'm using the Python Script Runner Macro and the macro itself and it seems that the code is not literally embedded there as it is when using Python Tool.
Cheers for sharing the fix regarding missing connections.
Regarding the question about what injection means - the original article points out that "What actually happens when you hit run in your DESIGNER? This code just simply finds&reads your PY script and executes it from the Python Code tool.".
Python code is never part of the XML. It sits on your machine in a PY file. At the time of execution of the workflow, the latest version of the PY file is loaded up into the Python code tool and executed.
The main idea behind this is that you don't have to distribute newer versions of the PY script into the workflow. Say the project you are working on is under a VCS such as Git: you can keep the Git repo somewhere on your machine and push the code into the Python Code tool with the process shown in the post.