A workflow that includes a Python tool may throw an error of the form "Error: unable to read data (C:\AppData\Alteryx\Engine\Engine_23200_be6a9480b4fc4e038a8668b82debdf74_\aa37b5ac-6323-472b-8f0f-5cb0b95b822e\4460abb7be83bae8f01b9bf1238a923c.yxdb)"
When your Python libraries don't work the way they should in the Python tool, restoring the tool to its original state could be the solution. This article walks through how to restore the Python libraries and the virtual environment associated with the Python tool.
With the Python Tool, Alteryx can manipulate your data using everyone’s favorite programming language: Python! Included with the tool are a few pre-built libraries that go beyond what ships with the native Python download, so you can take your data manipulation much further. The installed libraries are listed below, with a bit more detail on what they do and why they are so useful.
Each library is well documented, and there’s usually an introduction or examples on their sites to get you started on how a basic function in their library works.
ayx – Alteryx API – simply enough, we’re using Alteryx, sooo yea, kind of a requirement for the translation between Alteryx and Python.
jupyter – Jupyter metapackage – If you’ve used a Jupyter notebook in the past, you’ll notice the interface for the Python Tool is similar. This interface allows you to run sections of code outside of actually running the workflow, which makes understanding and testing your data that much easier.
matplotlib – Python plotting package – Any charting, plotting, or graphical needs you would want will be in this package. This provides a great deal of flexibility for whatever you want to visualize.
numpy – NumPy, array processing for numbers, strings, records, and objects – Native Python processes data in what some would call a cumbersome way. For instance, if you wanted to make a matrix, such as a 4x4 table, you would need to create a list within a list, which can slow processing a bit. NumPy, however, has its own “array” type that stores the data in this matrix pattern and allows for faster processing. Additionally, it has a range of methods for handling numbers, strings, and objects that make processing a whole lot easier and a whole lot faster.
pandas – Powerful data structures for data analysis, time series, and statistics – This is your staple for handling data within Alteryx. Those who have used Python, but never pandas, will enter a whole new beautiful world of data handling and structure. Data manipulation within Python is faster, cleaner, and easier to code with. The best part about it is that the Python Tool will read in your Alteryx data as a pandas data frame! Understanding this library should be one of the first things to know when tackling the Python code.
geopandas – Extends the data types used by pandas to allow spatial operations on geometric types. Are you interested in geospatial analysis using Python? Try this package. It makes working with geospatial data in Python much easier and faster.
requests – Python HTTP for Humans – for all the connector/Download Tool fans out there. If any of you are familiar with making HTTP requests (API calls and the like), then you should introduce yourselves to this package and explore how Python performs these requests.
scikit-learn – a set of Python modules for machine learning and data mining – Welcome to the world of machine learning in Python! This library is your go-to for statistical and predictive modeling and evaluation. Any crazy and wild methods you’ve learned for machine learning will most likely be found here and can really push the boundaries of data science.
scipy – Scientific Library for Python – All your scientific and technical computing can be found here. This library builds on NumPy and works alongside the other packages installed here, such as pandas and matplotlib. Mathematical models and formulae are usually found within this library, which can help provide that higher-level analysis of your data.
six – Python 2 and 3 compatibility utilities – For those who are unfamiliar, Python comes in two major versions, 2.x and 3.x (with 3.x being the most recent). Even though Python 3 is supposed to be the latest and greatest, there are still many users out there who prefer Python 2. Integration between the two is a bit tricky because of syntax differences and the like. The six module provides functions that work in both versions so everyone can remain calm and happy! Its documentation usually notes which version each function most closely aligns with, so a user can get a better idea of its functionality.
SQLAlchemy – Database Abstraction Library – SQL in Python! Covers all your database needs from connecting to and extracting data, allowing it to interact with your Python code and thus, Alteryx itself.
statsmodels – statistical computations and models for Python – This library complements scikit-learn but focuses more on statistical tests and data exploration. Additionally, it uses R-style formulae with pandas data frames to fit models!
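To make the NumPy entry above a little more concrete, here is a minimal sketch that builds the same 4x4 table once as a list within a list and once as a NumPy array. The values are made up for illustration; only the difference in handling matters.

```python
import numpy as np

# A 4x4 table as a plain Python list of lists.
nested = [[row * 4 + col for col in range(4)] for row in range(4)]

# Doubling every value needs an explicit loop over rows and values.
doubled_nested = [[value * 2 for value in row] for row in nested]

# The same table as a NumPy array; arithmetic applies to every element at once.
table = np.array(nested)
doubled_table = table * 2

# Column totals without writing a loop.
column_totals = table.sum(axis=0)

print(doubled_table)
print(column_totals)
```

Both approaches produce the same numbers; the array version simply expresses the operation on the whole table in one step.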
These are the libraries installed with the Python Tool, and together they cover almost any data function imaginable. Of course, if you’re looking to do something that these libraries don’t provide, there are myriad other Python libraries that I’m sure will help you with your use case. Most of these are also well documented, so search away and let your mind float away in the beautiful cosmos created by Python.
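As the pandas entry in the list above notes, the Python Tool hands your Alteryx data to your code as a pandas data frame. The sketch below shows the kind of filtering and grouping that becomes available once you have one. The column names and values are hypothetical, and the `Alteryx.read` and `Alteryx.write` calls appear only as comments because they run only inside Designer.

```python
import pandas as pd

# Inside the Python Tool the frame would come from the incoming connection:
#     from ayx import Alteryx
#     df = Alteryx.read("#1")
# A small stand-in frame (hypothetical columns) keeps this sketch runnable anywhere.
df = pd.DataFrame(
    {
        "region": ["East", "West", "East", "West"],
        "sales": [100.0, 250.0, 175.0, 50.0],
    }
)

# Keep only the larger orders.
large_orders = df[df["sales"] >= 100]

# Total sales per region.
sales_by_region = df.groupby("region")["sales"].sum()

print(large_orders)
print(sales_by_region)

# Inside the Python Tool the result would go to output anchor 1:
#     Alteryx.write(sales_by_region.reset_index(), 1)
```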
Python: ModuleNotFoundError: No module named 'ayx'
When running a workflow using the Python tool, you may see the error below:
ModuleNotFoundError: No module named 'ayx'
The error may appear when running your code interactively in the Jupyter notebook. It may also appear when running your workflow.
Product - Alteryx Python Tool
This error occurs when the ayx module that is required for Designer's integration with Jupyter notebook cannot be found. It may have been overwritten by other modules installed by the user.
Follow the steps in the article below to reset your Python tool to its original state. This article will walk you through re-installing the required packages using the requirements.txt file included with Designer.
Reset the Python tool
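Before resetting anything, it can help to confirm which interpreter the notebook is using and whether it can actually locate the module. The following is a minimal diagnostic sketch using only the Python standard library. The helper name `module_is_importable` is made up for this example and is not part of the ayx package.

```python
import importlib.util
import sys


def module_is_importable(name: str) -> bool:
    """Return True if the current interpreter can locate the named top-level module."""
    return importlib.util.find_spec(name) is not None


# Report which interpreter is running, then check a few package names.
print("Interpreter:", sys.executable)
for package in ("ayx", "pandas"):
    status = "found" if module_is_importable(package) else "MISSING"
    print(f"{package}: {status}")
```

If `ayx` is reported as missing when this runs inside the Python tool, that is consistent with the cause described above, and the reset steps are the next thing to try.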
Saving or running the workflow in the Designer causes the following error to occur:
An Unhandled Exception occurred. A previous action may not have completed successfully. Click OK to send the development team the error log so that we can fix this error in a subsequent release.
Checking the logs from %PROGRAMFILES%\Alteryx\ErrorLogs\AlteryxGUI shows the following error:
Alteryx Designer x64 - 2019.2.5.62427
Type: System.ArgumentException
Message: Cannot have ']]>' inside an XML CDATA block.
Source: System.Xml
OS Version: Microsoft Windows NT 6.2.9200.0
OS Is x64 Capable: True
Selected Plugin: LockInGui.LockInSelect.LockInSelect
Processor: Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz
Private Memory: 380923904
--------------------------------------------
at System.Xml.XmlTextWriter.WriteCData(String text)
at System.Xml.XmlElement.WriteElementTo(XmlWriter writer, XmlElement e)
at System.Xml.XmlElement.WriteContentTo(XmlWriter w)
The use of "]]>" in the following tools (not an exhaustive list) causes the error:
Report Text tool
Avoid using "]]>" in the tool, or escape the ">" as "&gt;".
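If the offending text is generated programmatically, the escaping can be applied before the text reaches the tool. Below is a minimal sketch in Python. The function name `make_cdata_safe` is hypothetical, and whether a given tool displays "&gt;" as ">" afterwards is an assumption to verify in your own workflow.

```python
def make_cdata_safe(text: str) -> str:
    """Replace the CDATA terminator so the text can sit inside a CDATA block."""
    return text.replace("]]>", "]]&gt;")


# An expression that happens to contain the "]]>" sequence.
sample = "values[rows[0]]> 5"
print(make_cdata_safe(sample))
```

Only the ">" that completes the "]]>" sequence is changed; any other ">" in the text is left alone.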
FAQ: Getting started with Jupyter Notebook

What is Jupyter Notebook?
Jupyter Notebook is an open-source application used for statistical modeling, machine learning, data transformation, and other data science purposes. The application is also referred to as Jupyter. Notebooks can contain documentation read by users, executable code such as Python that is run for data analysis, and the results of an interactive session. Notebooks can include output such as HTML, images, video, and plots. Notebooks are processed by a computational engine called a kernel. Starting in release 2018.3, the Python Tool includes a customized version of Jupyter that allows you to run Python code directly in Alteryx Designer. Because it is used within the constraints of Designer, some options may not behave the same as in a standard online Jupyter Notebook. For details see the Python Tool Help Page: Python Tool.

What if I cannot configure a Jupyter notebook in the Python Tool?
Try clicking on any blank space on the workflow canvas, then back on the Python Tool to make it available for configuration. It may be necessary to do this several times if the Python Tool is not responding at first. Each time you do this, the Python Tool attempts to connect to Jupyter again.

A proxy server is blocking use of Jupyter notebooks. What should I do?
When proxy credentials are needed for Designer, the username and password can be configured by clicking on the Options menu and selecting User Settings, Edit User Settings, Advanced tab, and then Proxy Settings. If that does not resolve the issue, click in the Windows Search bar, type Internet Options, and press the Enter key. On the Connections tab, click the LAN Settings button. Check the box for “Use a proxy server for your LAN”, and then enter the proxy server address and port. Also check the box for “Bypass proxy server for local addresses”. If a proxy auto-config (PAC) file is in use, you may need to check the settings in the file, as it defines how web browsers and other user agents automatically choose the appropriate proxy server.

What if Jupyter Notebook is still not working?
Check to see if a jupyter_server.log file exists in the Alteryx default temporary folder. To locate this folder, click on the Options menu and select User Settings, Edit User Settings, and then the Default tab. For more information, please see: How To: Obtain Web Configuration (Jupyter) Logs for the Python tool.

Is connection debugging available?
Yes. Try enabling the CEF Developer Tools using the instructions listed here: Debugging the CEF. Afterwards, add a Python tool to the Designer canvas and check the Console tab in the AlteryxCEF DevTools window that appears. This should show whether there are any errors connecting to Jupyter.

How can I avoid caching issues with notebooks?
If you open two separate instances of Designer for testing, each with a New Workflow1 workflow and the Python Tool as tool #1, the notebook is shared between both workflows. To avoid this issue, save each workflow with a unique name.

What if I get a File not found error?
Try saving the workflow and running it again, as listed in the Python Tool Mastery article on Community.

Additional Resources
Python Tool
Tool Mastery Python
Jupyter Notebook Quick Start Guide
Python Tool Doesn't Show Any Results or Errors on Run
Python Tool Libraries - An Introduction to Python
How To: Use Alteryx.installPackages() in Python tool
How to reset the Python tool
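For the jupyter_server.log check mentioned in the FAQ above, a small script can search the temporary folder for you. This is a minimal sketch using only the Python standard library. The function name `find_jupyter_logs` is hypothetical, and the folder you pass in should be whatever the Default tab in User Settings shows on your machine.

```python
from pathlib import Path
from typing import List


def find_jupyter_logs(temp_dir: str) -> List[Path]:
    """Return every jupyter_server.log under temp_dir, newest first."""
    root = Path(temp_dir)
    if not root.is_dir():
        # A missing or mistyped folder simply yields no results.
        return []
    logs = [path for path in root.rglob("jupyter_server.log") if path.is_file()]
    return sorted(logs, key=lambda path: path.stat().st_mtime, reverse=True)
```

Calling `find_jupyter_logs()` with your temporary folder returns the matching files with the most recently modified first, which is usually the one relevant to the session that just failed.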