Thought I'd share a little trick that was very handy when diagnosing and fixing a problem I hit while working on a TensorFlow-based tool with the Python SDK.
It looks like it's choking on the attempt to import tensorflow.
Hmmmm, this is interesting, because I know TensorFlow imports without any issues from a standard Python session using the SDK's Python installation.
Imports with no issues from a standard Python session.
What's going on here, and how do I dig in and solve the problem?
The key thing to understand is that when a Python SDK tool's code is executed, it is run in a special Python process embedded in Alteryx's main C++ process. This provides a number of huge advantages, by allowing the SDK's "plumbing" oriented operations to be performed by low-level, lightweight, and efficient C++ code, leaving the pure Python for the unique functionality provided by the tool.
The key disadvantage is that most Python IDEs don't support embedded interpreters, leaving you with a smaller tool set for debugging your code. There also isn't a way to fire up a REPL that runs C++ with an embedded Python interpreter to help debug code in isolation. Finally, certain libraries rely on interpreter attributes (such as sys.argv) that aren't set in an embedded process, and these won't readily import without errors.
The good news:
Python comes with a built-in debugger called pdb. Take a wild guess what it stands for.
Let's take a spin with pdb and see if we can fix our Tensorflow error.
First, we comment out the tensorflow import line and add an import of the pdb library. Then we add a line below it to tell pdb to set an interactive breakpoint.
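A minimal sketch of that edit (the DEBUG flag is my own addition so the snippet can also run non-interactively; the original article simply adds an unconditional pdb.set_trace() call):

```python
import pdb

DEBUG = False  # flip to True when running under AlteryxEngineCmd

# import tensorflow  # commented out while we investigate the failure

if DEBUG:
    # Execution pauses here and drops into an interactive (Pdb) prompt
    pdb.set_trace()
```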
Next, after saving our edited python file, we use the AlteryxEngineCmd application to run a workflow which contains the tool we are working on. As you can see, it pauses at the line where we placed the pdb.set_trace() command, and we see a REPL with the (Pdb) prefix.
Now that we have a REPL available to us in the context of the embedded Python process, let's try importing Tensorflow again.
It looks like the tensorflow library is expecting the sys.argv attribute to be present. This is the kind of failure we expect in an embedded process: the library assumes it's running in a non-embedded Python process, where the interpreter sets attributes like sys.argv at startup.
No surprise here. Tensorflow wants a variable that is present in the non-embedded process, but not available in our embedded Python.
Let's see if we can fix that, shall we?
First, let's fire up the non-embedded Python REPL again, and take a look at the sys.argv variable.
Our non-embedded Python process has a list with a single, empty string assigned to sys.argv.
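The check itself is one line (note: in an interactive REPL the value is a list with one empty string; a script run would show the script's path instead):

```python
import sys

# In an interactive, non-embedded session this prints [''] --
# a list holding a single empty string.
print(sys.argv)
```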
We know that Tensorflow successfully imports in the non-embedded process, and that one delta is the lack of this variable in the embedded process. Let's launch the workflow again, and do a quick test in the pdb REPL to see if we can learn more.
First, we import the sys library. Then we set the value of the sys.argv attribute to the value we saw in the non-embedded process. We try to import Tensorflow, and it works.
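The same three steps, as typed at the (Pdb) prompt (the try/except is my addition so the snippet also runs where TensorFlow isn't installed):

```python
import sys

# Mirror the value we saw in the non-embedded process
sys.argv = ['']

try:
    import tensorflow  # with sys.argv in place, the import succeeds
except ImportError:
    pass  # tensorflow isn't installed in this environment
```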
As you can see, the sys.argv variable is the "A/B switch" for the failed Tensorflow import.
As a pragmatist, I simply add the lines to my plugin to give TensorFlow what it wants.
Add in the sys.argv variable that Tensorflow wants to our plugin code.
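A sketch of the fix near the top of the plugin file (the hasattr guard is my own defensive addition; the article's core change is just assigning sys.argv before importing TensorFlow):

```python
import sys

# TensorFlow's import path expects sys.argv to exist. The embedded
# interpreter doesn't set it, so provide the same value a normal
# session would have. The guard leaves sys.argv alone when the
# interpreter has already set it.
if not hasattr(sys, 'argv'):
    sys.argv = ['']
```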
The tool now runs without incident.
Here, we can now see that the problem was solved.
Another neat trick to keep in mind is that this is a great tool for exploring the Python SDK's core data objects and methods, as well as the data being passed around to various methods.
Here we add the pdb.set_trace() command into the body of the pi_init function.
Now that we've added the set_trace command, let's run the workflow again from the command prompt, and inspect the contents of the str_xml variable passed into the method.
Here, we can see the contents of the str_xml that the Alteryx engine passes in from the config window of the tool.
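A sketch of where that breakpoint sits (pi_init and its str_xml parameter follow the Python SDK's plugin convention; the class body shown here and the DEBUG_BREAK flag are my own simplifications so the file also runs non-interactively):

```python
import pdb

DEBUG_BREAK = False  # flip to True to pause inside pi_init

class AyxPlugin:
    """Minimal sketch of a Python SDK plugin class."""

    def pi_init(self, str_xml):
        # The engine hands us the tool's configuration XML here.
        if DEBUG_BREAK:
            pdb.set_trace()  # at the (Pdb) prompt, inspect str_xml
        self.str_xml = str_xml
```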
Hopefully, this is useful to all of you folks out there hacking away on the Python SDK.
Looking forward to seeing you build great things with it.
JP Kabler, Lead Software Engineer, Assisted Modeling, Alteryx
Hi @pavloko, the Python SDK is interacting with the engine, so in order to test changes made within your Python code, you will need to run the tool in a workflow (you shouldn't have to reopen the workflow each time). Some Python backend errors may appear when you click on and off of the tool and they are likely related to the initialization of the tool.