Engine Works

Under the hood of Alteryx: tips, tricks and how-tos.
grossal
15 - Aurora
15 - Aurora

This is the follow-up blog article to my Alteryx Inspire Denver 2022 presentation, “The Beauty of Python Macros.” Have you ever run into an issue and thought it’s impossible to solve in Alteryx? Yes? Then come with me so I can show you the power of Python and Alteryx together!

 

Python SDK vs. Python Macro

 

First, we should clarify what a Python macro actually is. A Python macro simply is a standard Alteryx macro that has the Python tool inside it – either exclusively or combined with other tools. The Python SDK (Software Development Kit), on the other hand, is a custom development kit to build custom tools. The SDK allows you to bring in various functionalities that a macro wouldn’t allow, and it simplifies building a custom interface for the custom tool.

 

When should you use what?

 

If you want to build a feature-rich custom connector, then you should go for the Python SDK. Otherwise, Python macros cover the other 99% and are fast and easy to use.

 

2022-06-06-08-13-32-POWERPNT.png

 

The Basics

 

You don’t need to be a full-time Python developer to use the Python tool in Alteryx. For most of the tools shown below, you theoretically don’t need to know Python at all. In general, I’d say that a Python beginner course that teaches for-loops, variables, and if-else concepts is good enough. If you pack a one-hour Pandas course on top, you are more than ready to try your first steps.

 

No matter if you have used Python before or not, there are always the same three questions:

  • How do we get data in?
  • How do we get data out?
  • How do we install python packages?

 

Reading data into your Python tool can be done using the Alteryx.read(‘#1’) command. The #1 needs to be the name of the connection that you input. If you have multiple incoming connections, it is highly recommended to rename them to make your code inside the tool easier to read. Alteryx works with Pandas dataframes. Therefore the read function returns one of them, and the write function expects to receive a pandas dataframe.

 

When trying to get data out, you need to pass a dataframe object as the first argument and the output number as the second. While we allow having infinite input sources, we can only output to five output anchors, and therefore the second number has to be between 1 and 5.

 

Lastly, we need to know how to install additional packages to really get the most out of Python. There are two ways how we can do this:

  • Package.installPackages([‘Package1’,’Package 2’]) is the Alteryx recommended way of doing it.
  • !pip install Package – this is sort of a hack. You simply put an exclamation mark in front of the normal pip command and install packages.

 

2022-06-06-08-17-03-POWERPNT.png

 

What’s the difference?

 

Alteryx also uses the pip command inside the Alteryx.installPackages command, but I noticed that I often had to open Alteryx Designer as an Admin to allow the Python tool to install packages. On the other hand, the !pip install bypasses this and does not need to have admin rights, but it has downsides as well. The command only works in the interactive and not in the production mode. This might not be a problem if you simply use it on your own because you can completely remove the package install line after the installation. Still, once you share a macro and put it in production mode, you should probably go with the Alteryx intended way – even though the people would need to open Designer with Admin rights – once – to install the packages.

 

Python Tool Interface

 

The interactive mode in the Python tool uses a Jupyter notebook. In case you ever need to find help, search for "Jupyter notebook " + your question in Google. There are three things that are important for this guide:

1) The scissors icon on top is used to remove the boilerplate code because we never need them.

 

2021-02-07 15-45-33.png

 

2) Once you have written the code in one block/box, you can hit CTRL + ENTER to execute it.

3) If you want to open a new code block, you can click on the plus icon next to the scissor icon that we marked in the screenshot above.

 

The Magic

 

Let’s take a look at a couple of examples that were shown at my Inspire presentation.

 

Sending Smartphone notifications

 

For this one, I’d like to pass you on to my previous Python article, which explains how to build a pushbullet tool for notifications in all its details. This article will focus on tools that have not been published before. Read it here.

 

Create ZIP file

 

Why and when do we need it?

 

I had two reasons when building the tool:

  • E-Mail attachments that were simply so big that they stopped my mail from being delivered.
  • Too many e-mail attachments. This is not really a problem, but it doesn’t look good if you send 20 attachments in a single mail.

 

How do we build it?

 

If you have not read my article on the smartphone notification, you might be surprised, but the answer is: we simply google it. Search for “Python create zip file ” and pick the first one.

 

 

import shutil
shutil.make_archive(output_filename, 'zip', dir_name)

 

 

The best part about that is that we don’t need to install a package because shutil is part of the default packages that are already installed with Python. Therefore, we can simply copy and paste the code. Afterward, we can use two control parameters with action tools to replace output_filename (file path without .zip at the end) and dir_name (the folder you'd like to zip), and we are set and done if we keep it simple. Make sure to put both variables in quotation marks.

 

Afterward, this would look like this:

 

grossal_3-1654497064293.png

 

Then we can use two control parameters like this:

 

grossal_5-1654497127485.png

 

We need to select the productionModeScript value (after we have switched the Python tool to production mode) and select “Replace a specific string” at the bottom. For the first one, we replace “output_filename,” and for the second, we replace “dir_name.”              

 

grossal_4-1654497103852.png

 

Surely, we can improve this in many ways, but the first target is always: Make it work. This gets the job done without any additional lines of Python code. Would I recommend making improvements? 100%. At some point, I'll probably write another blog about it, but today's blog is about what was shown at Inspire this year.

 

Ensure File Path Exists

 

Why and when do we need it?

 

When using custom output locations using the ‘Change Entire Filepath,’ we often run into an issue that we can either only save all files to a single folder OR have to create all folders in advance. But what if we have customer folders and get a couple of new customers? The Output will error if we don’t have them created before. It would be nice if we’d automatically create all the folders before trying to write to them – right? That’s exactly what this tool does.

 

How do we build it?

 

Would you like to guess? If you said ‘Google it,’ then you are totally right. Look out for ‘Python create folder if not exists’ and pick the first one again.

 

 

import os
os.makedirs(path, exist_ok=True)

 

 

Yet again, the os package is part of the default installation, and we can copy and paste both lines.

 

Afterward, we only need to replace the path with the folder path (using another control parameter and an action tool) that we’d like to create, and we are ready to go! Like the previous example, it is important that you wrap the path variable with quotation marks before using an action tool on it.

 

Create QR Code

 

When and why do we need it?

 

This falls more into the very specific cases. I had a couple of customers who used Excel-VBA solutions or went to websites to create their QR codes. This works fine if you only need to generate a couple of QR codes, but what if you need to create 100,000? It might get just a bit overwhelming!

 

How do we build it?

 

Take a wild guess! Right. Google. You are noticing a theme here, aren’t you? I’m no magician--I’m a normal user like you are, and if I face a new problem, I google it. Search for ‘Python create QR code.’ In my case, the pypi package site showed up as the first hit. You can think of pypi as the Alteryx Gallery for Python packages.

 

The site shows us how to install it using pip (screenshot below). We can copy this line and add a ! (exclamation point) in front to install the package.

 

 

!pip install qrcode[pil]

 

 

Afterward, we can copy the following lines to create the QR code.

 

 

import qrcode
img = qrcode.make('Some data here')
type(img)   # qrcode.image.pil.PilImage
img.save("some_file.png")

 

 

The third line can be removed as it is only used to show the type of the image variable. Yet again, we can simply use two action tools to replace the text "Some data here" with the text we'd like to have in the QR code and the "some_file.png" with our file path.

 

Merge PDFs

 

When and why do we need it?

 

This was actually a request/post from one of the Alteryx community members. Due to compliance issues, they had an RPA bot that extracts single PDFs for every sheet in a dashboard. Afterward, they wanted to send a single PDF with all relevant sheets to a customer. It's probably a rather uncommon use case but let's do it anyway!

 

How do we build it?

 

I hope you can guess by now, but if not, I'll tell you one more time: Google it. My first hit included the following code snippet:

 

 

from PyPDF2 import PdfMerger

pdfs = ['file1.pdf', 'file2.pdf', 'file3.pdf', 'file4.pdf']

merger = PdfMerger()

for pdf in pdfs:
    merger.append(pdf)

merger.write("result.pdf")
merger.close()

 

 

When you copy-paste this one, you'll quickly run into an error because the PyPDF2 is not installed by default. If we search for 'Python install PyPDF2', we'll quickly find the pypi-page for it.

 

The installation is as simple as before, we can add a ! in front of the shown command below, and we are good to go.

 

 

!pip install PyPDF2

 

 

For this one, it's a bit “trickier” to get the code to work. I'd recommend concatenating all file paths that you want to combine with a summarize tool and then replacing the "['file1.pdf','file2pdf'...]" expression with just a single action tool.

 

What should I take away?

 

Python is easy, and there are a lot of pre-built packages like macros in the Alteryx Gallery. You can easily install them and make them work in Alteryx. I hope this blog or my session at Inspire Denver got you interested in Python programming. 

 

What would you recommend to get started?

 

I'd recommend searching for free YouTube courses that involve:

  • Variables, Loops, If-Else
  • Pandas beginner course

 

The first one is important to get used to general concepts, and the second can be used to get familiar with Pandas DataFrames and how to work with them.

 

But ... can you please share your production macros?

 

No 😊 This blog is meant to get you started and you won't learn a single thing if you just use my macros. Try it yourself. It is not that hard and I have faith you can do it!

 

But ... I have really tried it for hours and hours and it doesn't work. Can you please share it?

No, but you can write me a DM with what you tried, and I'll be more than happy to hop on a Zoom/Teams to help you get it right.

Additionally you might just look at the Alteryx Gallery. I already saw that someone built a macro for one of the use cases shown above. Maybe someone else is kind enough to post a ready-to-use macros for you, but it won't be me. It's my goal to teach it and I really want you to learn not copy.

 

What do you usually change in a production macro?

 

It’s usually just two things. First, I combine it with a couple other Alteryx tools. E.g. the unique tool for the zips as there is no point in creating the same zip file multiple times. The second aspect is error handling and making the tool more robust. I add error messages if users pass wrong inputs (e.g. no valid file path) to my macro/script.

 

If you have made it that far, let me know your feedback and if you'd like to see a webinar on this topic :-)

 

Best,

Alex

 

P.S. Feel free to connect to me on LinkedIn.

P.P.S. Thank you @joshuaburkhow for proofreading this blog!

 

Comments
MarqueeCrew
20 - Arcturus
20 - Arcturus

Alex,

 

giphy

 

Marvelous article for those tempted to the dark side.  I like to stay with Base-A in my solutions.  I admit to using a python macro here or there, but only under extreme duress.  I'll take you up on your offer to get free python training though.  I have concerns about management and control over these packages.  But maybe I just worry too much.

 

I've stepped into so many workflows and there's enough fright that I've encountered just with Alteryx tools that coming into someone's python code can put me over the edge.  I do like the idea however of creating a python tool (macro) and linking the tool to an IDEA post in community.  Hey AYX Product team, here's what we need....

 

Thanks for taking the time to write this article and to share it with me (along with everyone else).  Thanks even more for that training.

 

Cheers,


Mark

AkimasaKajitani
17 - Castor
17 - Castor

Hi @grossal 

 

Incredible article for a great step into the ocean of Python. Of course, I also like to make the workflow with Base A style. But using the Python tool opens up tremendous possibilities.

 

TheOC
15 - Aurora
15 - Aurora

awesome post @grossal!

I was sad to miss your Python chat at Inspire, always an interesting topic for me as it really unlocks the baseA box. 

I would be interested in a webinar too! 😀

 

Cheers,

Owen 

alberto_herni
9 - Comet

Great post @grossal

 

(as @MarqueeCrew & @AkimasaKajitani ) I also prefrer to keep it simple with basic Alteryx tools if possible, but I definitely see the huge potential that a python based macro can add to our Alteryx tool palettes.

 

So far I have built some of them, main issue I faced is when you output something back to Alteryx from the python tool. I always get the 'Run the workflow to generate valid metadata' when I drag the tool into the canvasand less experienced users usually come to me saying: hey this tool is not working for me... finally I found a trick or workaround to get rid of this annoying error.

 

You can check the trick here --> Python macros in Alteryx, how to get rid of the valid metadata error | by Alberto Hernández | Medium please let me know your ACE thoughts! 😉

 

Regards,

Alberto HM

MarqueeCrew
20 - Arcturus
20 - Arcturus

@grossal ,

 

Thanks for helping me with my latest CReW macro (https://community.alteryx.com/t5/Public-Community-Gallery/CReW-SHA256-Encryption/ta-p/978507 using a little 🐍 to perform the encryption.

 

Cheers,

 

Mark

IraWatt
17 - Castor
17 - Castor

Awesome blog @grossal ! ! 😄 In your comparison between the SDK and python macro by the SDK "Dedicates Packages" does this mean that all packages are contained within the SDK bypassing the need for pip when other users install the tool? I'm quite interested in the comparison between the SDK and python macro, I would assume the SDK is also faster, and I have heard more secure, would you agree with that?

grossal
15 - Aurora
15 - Aurora

Hi @MarqueeCrew,

 

I hear your feedback and totally agree that we should only use the Python tool if the baseA version is 1) extremely challenging and time consuming or 2) impossible.

 

I also agree that there are many open topics about it and often people forget about the downsides it brings with it. My 2023 Inspire presentations (if Alteryx approves them) will cover these topics and I'll also work on a blog series to help people better understand and manage their python macros.

 

 

Best

Alex

grossal
15 - Aurora
15 - Aurora

Hi @TheOC,

 

I have good news for you: I'll present the topic again at Inspire EMEA - I hope you will have a chance to visit my session :-)

 

 

Best

Alex

grossal
15 - Aurora
15 - Aurora

Hi @alberto_herni,

 

thank you for your post! I was originally very hyped when I saw it on LinkedIn a while ago. I think your solution is a good step in the right direction, but has potentials flaws (effort) when mass-producing Python Macros. I plan to write a series about the Python tool in 2023 and will tackle / show my own solution to it. 

 

Over all I'd say: Alteryx should tackle / fix this by themselves, so we don't have to find creative workarounds. 

 

 

Best

Alex

grossal
15 - Aurora
15 - Aurora

Hi @IraWatt,

 

sorry for the late response. 

 

You are right in your understanding: When building a tool with the SDK, it will be packaged in an yxi-installer and will automatically install all packages that you use inside the tool. This removes the pip-hacking.

 

Your other points are also correct: It is definitely faster (no 3-4 second overhead) and the new SDK brings AMP capabilities. I'd say security is up to the developer: You have more options for it, but it is up to you to implement them.

 

Best

Alex

IraWatt
17 - Castor
17 - Castor

Thanks @grossal, really interesting. The SDK's dedicates packages is pretty nice for tools on the public gallery, bypassing the need to ask people to use Admin rights. Though if !pip install also gets round it that's pretty awesome too!  

MartWClimber
9 - Comet

Thanks for the blog @grossal. I was wondering how do you deal with disappearing code?

 

for me when I use the Python tool (designer 2021.3) code randomly disappears when :

- opening/closing a workflow with a python tool 

- opening/closing the python tool in a workflow.

 

otherwise I think that the python tool can bring some great flexibility to Alteryx.

I've used the tool on muliple occasions (when the tools doesn't break :) ) and the amount of outputs in combination with some of the python libs it is great.

IraWatt
17 - Castor
17 - Castor

@MartWClimber I have also experienced this bug, managed to catch one today on screen record:

Python Tool Bug.gif