This site uses different types of cookies, including analytics and functional cookies (its own and from other sites). To change your cookie settings or find out more, click here. If you continue browsing our website, you accept these cookies.
When your Python libraries don't work the way they should in Python tool, restoring the tool to it's original state could be the solution. This article walks through how to restore Python libraries and the virtual environment associated with the Python tool.
With the Python Tool, Alteryx can manipulate your data using everyone’s favorite programming language - Python! Included with the tool are a few of pre-built libraries that extend past even the native Python download. This allows you to extend your data manipulation even further than one could ever imagine. The libraries installed are listed here - and below I’ll go into a bit more detail on what and why these libraries are so useful.
Each library is well documented, and there’s usually an introduction or examples on their sites to get you started on how a basic function in their library works.
ayx – Alteryx API – simply enough, we’re using Alteryx, sooo yea, kind of a requirement for the translation between Alteryx and Python.
jupyter – Jupyter metapackage – If you’ve used a Jupyter notebook in the past, you’ll notice the interface for the Python Tool is similar. This interface allows you to run sections of code outside of actually running the workflow, which makes understanding and testing your data that much easier.
matplotlib – Python plotting package – Any charting, plotting, or graphical needs you would want will be in this package. This provides a great deal of flexibility for whatever you want to visualize.
numPy – NumPy, array processing for numbers, strings, records, and objects – Native Python processes data in what some would call a cumbersome way. For instance, if you wanted to make a matrix, a.k.a. a 4x4 table, you would need to create a list within a list, which can slow processing a bit. However, NumPy has its own “array” type that fits the data in this matrix pattern that allows for faster processing. Additionally, it has a bunch of methods of handling numbers, strings, and objects that make processing a whole lot easier and a whole lot faster.
pandas – Powerful data structures for data analysis, time series, and statistics – This is your staple for handling data within Alteryx. Those who have used Python, but never pandas, will enter a whole new beautiful world of data handling and structure. Data manipulation within Python is faster, cleaner, and easier to code with. The best part about it is that the Python Tool will read in your Alteryx data as a pandas data frame! Understanding this library should be one of the first things to know when tackling the Python code.
geopandas – Extends the data types used by pandas to allow spatial operations on geometric types. Are you interested in geospatial analysis using Python? Try this package. It makes working with geospatial data in Python much easier and faster.
requests – Python HTTP for Humans – for all the connector/Download Tool fans out there. If any of you are familiar with making HTTP requests (API calls and the like), then you should introduce yourselves to this package and explore how Python performs these requests.
scikit-learn – a set of Python modules for machine learning and data mining – Welcome to the world of machine learning in Python! This library is your go-to for statistical and predictive modeling and evaluation. Any crazy and wild methods you’ve learned for machine learning will most likely be found here and can really push the boundaries of data science.
scipy – Scientific Library for Python – all your scientific and technical computing can be found here. This library builds off the packages already installed here, like numPy, pandas, and matplotlib. Dealing with mathematical models and formulae are usually located within this library and can help provide that higher level analysis of your data.
six – Python 2 and 3 compatibility utilities – For those who are unfamiliar, Python versions come in 2 forms, version 2.x and 3.x (with 3.x being the most recent). Now, even though Python 3 is supposed to be the latest and greatest, there are still many users out there who prefer using Python 2. Therefore, integration between the two is a bit tricky with syntax differences, etc. The six module provides functions that are usable between the two so everyone can remain calm and happy! Their documentation is usually coupled with which version the functions most closely align to, so a user can get a better idea to its functionality.
SQLAlchemy – Database Abstraction Library – SQL in Python! Covers all your database needs from connecting to and extracting data, allowing it to interact with your Python code and thus, Alteryx itself.
statsmodels – statistical computations and models for Python – This library builds off sci-kit learn but focuses more on statistical tests and data exploration. Additionally, it utilizes R-style formulae with pandas data frames to fit models!
These are the libraries installed with the Python Tool, which can do almost any data function imaginable. Of course, if you’re looking to do something that these libraries don’t provide, there are myriad other Python libraries that I’m sure will help you with your use case. Most of these are also well documented in how to use so search away and let your mind float away in the beautiful cosmos created by Python.
Saving or running the workflow in the Designer causes the following error to occur:
An Unhandled Exception occurred. A previous action may not have completed successfully. Click OK to send the development team the error log so that we can fix this error in a subsequent release.
Checking the logs from %PROGAMFILES%\Alteryx\ErrorLogs\AlteryxGUI shows the following error:
Alteryx Designer x64 - 2019.2.5.62427 Type: System.ArgumentException Message: Cannot have ']]>' inside an XML CDATA block. Source: System.Xml OS Version: Microsoft Windows NT 6.2.9200.0 OS Is x64 Capable: True Selected Plugin: LockInGui.LockInSelect.LockInSelect Processor: Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz Private Memory: 380923904 -------------------------------------------- at System.Xml.XmlTextWriter.WriteCData(String text) at System.Xml.XmlElement.WriteElementTo(XmlWriter writer, XmlElement e) at System.Xml.XmlElement.WriteContentTo(XmlWriter w)
The use of "]]>" in the following tools (not an exhaustive list) causes the error:
Report text tool
Avoid using "]]>" in the tool or escape the ">" to ">".
FAQ: Getting started with Jupyter Notebook What is Jupyter Notebook? Jupyter Notebook is an open-source application used for statistical modeling, machine learning, data transformation, and other data science purposes. The application is also referred to as Jupyter. The notebooks can contain documentation read by users, executable code such as Python that is run for data analysis, as well as the results of an interactive session. Notebooks can include output such as HTML, images, video, and plots. Notebooks are processed by a computational engine called a kernel. Starting in release 2018.3 the Python Tool includes a customized version of Jupyter that allows you to run Python code directly in Alteryx Designer. Because it is used within the constraints of Designer, some options may not behave the same as a standard online Jupyter Notebook. For details see the Python Tool Help Page: Python Tool. What if I cannot configure a Jupyter notebook in the Python Tool? Try clicking on any blank space on the workflow canvas, then back on to the Python Tool to make it available for configuration. It may be necessary to do this several times if the Python Tool is not responding at first. Each time you do this, the Python Tool attempts to connect to Juypter again. A proxy server is blocking use of Jupyter notebooks what should I do? When proxy credentials are needed for Designer, the username and password can be configured by clicking on the Options menu and selecting User Settings, Edit User Settings, Advanced tab, and then Proxy Settings. If that does not resolve the issue, you can try clicking in the Windows Search bar, type Internet Options, and press the Enter key. On the Connections tab, click the LAN Settings button. Check the box for “Use a proxy server for your LAN”, and then enter the proxy server address and port. Also, check the box for “Bypass proxy server for local addresses”. If a proxy auto-config (PAC) file is in use, you may need to check settings in the file as it defines how web browsers and other user agents can automatically choose the appropriate proxy server. What if Jupyter Notebook is still not working? Check to see if a jupyter_server.log file exists in the Alteryx default temporary folder. To locate this folder, click on the Options menu and select User Settings, Edit User Settings, and then the Default tab. For more information, please see: How To: Obtain Web Configuration (Jupyter) Logs for the Python tool. Is connection debugging available? Yes, try enabling the CEF Developers Tools using the instructions listed here:Debugging the CEF. Afterwards, add a Python tool to the Designer canvas and check the Console tab in the AlteryxCEF DevTools window that appears. This should verify if there are any errors connecting to Jupyter. How can I avoid caching issues with notebooks? If you open 2 separate instances of Designer for testing each with a New Workflow1 workbook, and the Python Tool as tool #1, the notebook would get shared between both workflows. To avoid this issue, save each workflow with a unique name. What if I get a File not found error? Try saving the workflow and running it again as listed in the Python Tool Mastery article on Community. Additional Resources Python Tool Tool Mastery Python Jupyter Notebook Quick Start Guide Python Tool Doesn't Show Any Results or Errors on Run Python Tool Libraries - An Introduction to Python How To: Use Alteryx.installPackages() in Python tool How to reset the Python tool
Alteryx Designer comes with tools (based on both R and Python) to create and use predictive models without needing to write any code. But what if you've got custom models written in R or Python outside of Designer that you want to use in Designer, or vice versa?
UPDATE: After reviewing the decision to deprecate the C# (.NET) SDK, we are committed to the C# (.NET) SDK working as it currently does in Alteryx Designer and Server for at least the next two years. This change is reflected in the specified time frame below.
tl;dr The C# (.NET) SDK will be deprecated in December 2021. Users currently leveraging the C# SDK for custom tools will need to convert their tools to a different backend (C++, Python, or macro) ahead of its deprecation. As part of this, the UI of the tool will also need to be reconfigured using the HTML GUI SDK, or macro if using a macro. FAQ What Is Happening? The C# (.NET) SDK will be deprecated in December 2021. See the 'Next Steps' section for more information. What is the C# SDK? A way for .NET developers to create their own custom tools for Alteryx. Documentation can be found here: [installdirectory]\Alteryx\APIs\SampleCode\DotNetCustomTools.pdf Why is this changing? Alteryx is putting more resources into improving the C++ and Python SDKs.
Who's Impacted? Anyone who is leveraging the C# (.NET) SDK. Who's Not Impacted? Anyone who is not utilizing the C# (.NET) SDK. What should we use instead? The Python SDK, C++ SDK, or macro backend (https://help.alteryx.com/developer/current/BuildCustomTools.htm). Next steps? The C# (.NET) SDK will be deprecated in December 2021. Custom tools utilizing the C# (.NET) SDK might continue to work on Windows, but not all existing functionality will be available and you should consider converting the backend of your tools to something that will be supported, such as Python, C++, or macro. The UI of the tool will also need to be reconfigured using the HTML GUI SDK, or a macro (Interface tools) if using a macro. Links to backend options documentation: Python, C++, or macro. Links to frontend options documentation: macro or HTML GUI SDK documentation. Questions?
Contact firstname.lastname@example.org Alter Everything!
The Dynamic Rename Tool is part of the developer category of tools. It allows the user to quickly rename any or all fields within an input stream by employing the use of different methods.
The user has the option to rename only certain fields, all fields, or even dynamic/unknown fields at runtime (e.g. after a Cross Tab Tool). The option for renaming fields are:
This article is part of the CS Macro Development Series. The goal of this series is to communicate tips, tricks, and the thought process that goes into developing good, dynamic macros.
Suppose you have a dataset that will pass through a macro if a condition is true, and an entirely different macro if a condition is false. In the event the condition is true for all records, no records will be sent to the false-side macro. The false-side macro is expecting data and throws an error whenever it doesn't find any. Similarly, in a case where the condition is false for all records, no records will sent to the true-side macro and errors. You need a process that will bypass the macros whenever data is unavailable. Let’s see how you can do that.
First, create data senarious where all conditions can be tested. In the first situation, I assigned a field, Sum_Test, with a value of 1 to half of the records and a value 0 to the other half. A second situation assigns a value of 1 to all records. A third situation assigns a value of 0 to all records. Setting up these situations will allow me to test all possibilities.
The next step is to filter by condition. Depending which data senario from above you use, data may not exist if the condition is true or false. No data means your workflow will fail. We need a work-around so that doesn't happen.
What happens after the data goes down the true side or the false side is essentially the same in terms of process. The batch macro below is found downstream from the true side as well as the false side. Functionally, they work the same. The batch macro determines if data is available. If there is, data is detoured to the formula tool. If not, the detour will bypass the formula tool entirely and keep the workflow from throwing a error.
NOTE: the formula tool in the illustration below is a representation of any process that requires data if an error is to be avoided. This could anything, usually another macro. For the purpose of this illustration, I'm simply showing a single tool.
What follows are instructions for how each tool in the batch macro/detour combination are configured.
First, enter ‘Sum_Test’ as the label for the Control Parameter.
Then write an expression in the Condition tool that checks if ‘Sum_Test’ is null.
If ‘Sum_Test’ is null (True), then direct the detour tool to go to the right in the ‘T’ side action box.
Similarly, if ‘Sum_Test’ is not null (False), then direct the detour tool to go to the left in the ‘F’ side action box.
Connect both action boxes to a Detour tool. The Detour tool has no configuration.
Every Detour must be stopped by a Detour End tool.
Build a similar batch macro for the false side of Filter tool.
Union the results from the true and false sides.
I put a Frequency Table tool after the Union to verify the results.
The entire process looks like this:
Disconnect the input to the filter tool and connect a new test condition to test all the various conditions.
To find the full path and filename of a saved file in Excel, you use the =CELL function.
In Alteryx, you use a Field Info Tool to get this information:
The Field Info Tool allows you to see in tabular form the name of fields in a file as well as the field order, field type, and field size.
Name: field names within the file
Type: type of data field
Size: length of a data field
Scale: with respect to fixed decimal data types, scale refers to the digits of precision
Source: contains the full path and filename
Description: may or may not contain information; you can add a description via the Select tool
We're only interested in the Source field and this information will be the same for each field.
Using a Sample Tool, we select just the first record:
Notice the data in Source begins with 'File:'. We don't want that in the final output so we'll use a SUBSTRING function in the expression of a Formula tool to clean it up. Complete the workflow with a Select tool so we only get the Source field:
I'll mention here you can use the Directory Tool to find the full path and filenames in a directory.
Select the directory you want to search. File Specification has wildcard characters so you can limit your search to files containing specific character patterns or file types. In the example below, let's set up the File Specification to only return files with the '.xlsx' file extension:
We're only interested in the field 'FullPath' (first column) so we'll use a Select tool to drop the remaining fields.
While the Directory tool returns multiple filenames, it will not contain a worksheet name if the file is an Excel file. To get that information, you'll need to use the Field Info Tool as we did above.
Building out a workflow and find yourself stumped when trying to add needed functionality to your process? While the Designer does an incredible job of packaging just about every operation an analyst could need, you might need that extra mile. We get that. Here at Alteryx we are all about going that extra mile; if we don’t have a tool that doesn’t explicitly capture functionality for you, we try to equip you with tools that can make for an easy reach to that functionality from resources just outside of the Designer environment (see our R Tool, the API based Connector Tools, and the Run Command Tool). In this article we’ll go over an introduction on how to make and use batch files – these will easily incorporate command line based scripting into your workflows that will help you do just about everything short of feeding your dog.
In short, a batch file is a plain text file that lists a series of commands for the command line interpreter to run in Windows. They’re frequently used to make, remove, rename, move, or even copy directories or files, ping IP addresses, run other programs or services, and manipulate environment variables – and that’s hardly even the tip of the iceberg. There’s a wealth of resources online listing the different batch commands available to you, but the Ben/Peter Parker rule applies to them all: “with great power comes great responsibility.” Please use them responsibly!
To make a batch file, all you need to do is take your desired batch command(s) and write them into a text file:
Be sure to take note of the paths being used! This batch command will make a directory named “batchfolder” in whatever directory it is in when run. Before it can be run as a script, however, we have to save it as a .bat file:
In “Save As” change the “Save as type” to “All Files (*.*)” and change the “File name” .txt extension to .bat:
In your working directory you should now see a different icon, extension, and type, describing the file:
How easy was that? If you want, you can test to see if it works by navigating to the directory the batch file resides in and typing its name into the Command Prompt:
You’ll now see the directory we made with the execution of the batch file:
See, I’m not making this stuff up. Now let’s get all this set up to run for us in the designer. All you need to do is specify the .bat file name in the “Command” configuration option - by default, it will look for this file in the directory the workflow is saved in:
In the attached example, I use the Run Command Tool as an input of a test file (specified in the “Read Results”) before writing the file to the new directory made from executing the .bat file.
If the API that you are working with requires you to sign or authenticate your requests, it may utilize an implementation of OAuth 2.0 or another authentication method to show that you have the access needed to consume the web service. There are some key words that you can look for in the API documentation that you are using that will help you quickly choose the appropriate grant flow to use in Alteryx.
Now you are ready to take your workflow one step further and add some error checking. If something is wrong with your request or something goes wrong on the server, an API will usually return an error message. It is useful to capture this information and clearly display it to the user of the connector. The Message tool will come in handy here. It will help you to troubleshoot the error and actually stop the workflow from processing if you would like.
Alteryx can use the Run Command Tool to run Powershell scripts and perform any Powershell specific commands. Note that i n order to run PowerShell scripts you must make sure you have scripting enabled. You should consult with your IT department to see if you are allowed to enable this functionality.
Sometimes clients have asked how they can re-input the same excel file that they have just outputted into the same workflow. Normally, their gut instinct would be to open up a new workflow and start fresh with the updated file. However, this can be a bit cumbersome especially if they want to do this multiple times or for those wanting to do some sort of logging process in an app. Luckily, there is a quick and easy trick for this.
Attached to this article is a workflow that I go through below...
Now the initial step to this might be different depending whether or not you already have a file that you want to bring in or not. If you are using just a regular Input tool, make sure you obtain the file path by selecting “Full Path” in the “Output File Name as Field” drop down. If you are writing in a text input make sure you have a field specifying the full path. This will be the same file path where we will output the data and bring it back in. The filepaths must be consistent throughout the workflow.
After your data blending and data preparation is complete and you are ready to output the file, place a “Block Until Done” tool at the end of your stream. Make sure your first output is connected to an “Output” tool with your specified settings with the proper file path. Next, place a “Dynamic Input” tool and make sure you are reading the field with the file path and change the action dropdown to “Change Entire File Path”. For the “Input Data Source Template “ box, I just reference the same file that I am going to write to.
And that is how you output a file then re-input it within the same workflow.
How do I pass a parameter/constant in a workflow?
You can use a workflow constant to pass a parameter. Click on the Workflow - Configuration - Workflow tab. Click the + button to add a new User type constant, and give your constant a name and value!
It is quite common to hardcode values (‘constants’) in workflows using any of the following ways:
Text input tool
Variables within formula tool
Hardcode the values within formula/any other transformation tools
If the values are hardcoded within tools, especially in very large or complex workflows, maintenance can be difficult. The ‘constants’ property allows us to define all static variables in one place, so that they are accessible throughout the entire workflow.
Can you wait X seconds between processing each row in Alteryx?
Yes! Thanks to Invisio for creating an Inviso Macro Pack and posting on their blog here.
The "Wait a Second" macro lets you wait X number of seconds before processing each row in the dataset.
One application is if you are contacting an API with multiple requests. The WaitAsecond macro may help to pause the API long enough to process multiple rows without issue.
It can also be used to scrape sites without putting heavy loads on their server. An Invisio sample of scraping the Alteryx community (See Insights to the Alteryx Community)
As you can see, the part of the flow that runs through the WaitASecond tool gets a NOW timestamps which are 5 seconds a part, whereas the bottom stream, not running through the WaitASecond tool, all gets the same timestamp.
There are essentially two macros:
The first one assigns a unique id to each record and then uses that ID for the batch macro.
The batch macro has a “Command tool” that runs a ping that waits x seconds before timeout (126.96.36.199 if that exist in your network it won’t work).
The macro can be downloaded here (InvisoMacros.zip).
API connections give access to many web-based applications, database systems, or programs by exposing objects or actions to a developer in an abstracted format that can easily be integrated into another program.