Engine Works

Under the hood of Alteryx: tips, tricks and how-tos.
georgewillv
Alteryx
Alteryx

snowpark-accelerated-badge-@2x.pngYou can pursue nearly any kind of analysis at scale with the low-code ease of Alteryx and the high performance of Snowflake. Define custom analytic building blocks in Alteryx that push processing directly to the Snowflake Data Cloud. Alteryx can leverage Snowflake’s user-defined functions capability (currently in customer preview) enabling sophisticated data preparation, blending, and modeling analytics within the data cloud.

 

Do you work with data? Do you have to deal with thousands of rows of data across hundreds of columns? If so, you are likely a huge fan of Alteryx’s In-DB tools. If you are a data scientist or a programming enthusiast, you are also likely a huge fan of Python. Wouldn’t it be amazing to apply the capabilities of the Python language to your data warehouse? I am very excited to announce that if you are a Snowflake user, Snowflake has introduced Python into their user-defined functions (UDFs). With Python UDFs, users can now readily harness Snowflake’s processing power In-Database (hehe) instead of moving the data for processing elsewhere.

 

What’s that? Do you prefer to use Java for all your programming needs? We’ve got you covered! There’s a similar tutorial that outlines this functionality with Java!

 

via GIPHY

 

A UDF is essentially anything you want it to be! The possibilities are seemingly infinite for what you’d like to accomplish with your function. As Alteryx is a Snowflake Partner, we are excited to show you how you can use these Python UDFs from inside your Designer Workflow. At the end of the article, you’ll be able to apply your own UDFs to your data. But first, let’s see these capabilities in Alteryx.

 

Snowflake Python UDFs in Action

 

Imagine you run a liquor store chain, and you have heard managers at those stores that customers would like more variety in their wine selection. You assure them you’ll do some research and purchase small shipments of a variety of wines for testing in the various locations. You want to be sure you purchase only highly rated wines and ones that have a unique flavor profile to each other.  The best way to do this would be to read through the descriptions of each wine in our dataset, but that involves a lot of manual work. Let’s see if we can use Named Entity Recognition (NER) to summarize some key terms from the description.

 

Note: this article will use a barebones variation of the NER tool found in Alteryx Intelligence Suite to demonstrate Python UDF functionality.

 

Our dataset will be a variation of the famous winemag dataset. The workflow we are using is shown below:

 

georgewillv_2-1658348414132.png

 

The first step is to connect to the data source. In order to load our fancy UDF into Alteryx, we will right-click on the canvas, go to insert, and click on macro.

 

MeganDibble_0-1658348664010.png

 

Once that is done, if you have no further processing steps you need to take, you are able to write the data into Alteryx and manipulate the data there. However, for our example, NER generates a JSON output with a lot of information concerning our analysis. The series of tools you see below is to filter out some of that information so that we’re only left with terms that aid us in finding a variety of wines.

 

MeganDibble_1-1658348697500.png

 

A snapshot of the results shows we were able to gain some insight into the various wine entries. So if we wanted to sell a juicy, dusty Cabernet Sauvignon or a Portuguese White Alvarinho and a hint of Bordeaux white grapes, we’re able to quickly put together a list of samples to purchase.

 

MeganDibble_2-1658348726892.png

 

Slithering into Python UDFs 🐍

 

Requirements

 

*Note this setup will be for a windows machine

 

  1. Click here to download the Article Resources.zip file and unzip it.
  2. Open a command prompt and create a virtual environment that has python=3.8.x. Then navigate to where you unzipped the files and run pip install snowflake-snowpark-python.
  3. After that command finishes, run pip install -e <path/to/unzipped_files>.
  4. Once those commands finish, you will need to modify the config.json file that houses the connection information to Snowflake. It will look like the following:   MeganDibble_3-1658348861149.png
  5.  
  6. Go back to the cmd prompt and create the macro. The command to do so is: python -m AlteryxSnowParkUploader --config <path/to/config.json> --function NER --name alteryx_ner --generate-macro.
    • This command will take some time to run as it is pushing the function to Snowflake.
  7. Congratulations! You’re now ready to take your UDF into Alteryx and harness Snowflake’s powers!

 

Conclusion

 

With the added functionality of Python UDFs with Snowflake and Alteryx, you can gain more customized insights into your data than ever before. If you’re a liquor chain owner, you’ll rapidly hear from your managers how their customers are loving the improved variety of wines on the shelves. These happier customers result in happier employees and thus greater work satisfaction and profit, all thanks to the collaboration between Alteryx and Snowflake.

 

Additional Resources:

 

  1. Resource .zip file download for this article
  2. Snowflake Java UDF Article
  3. Snowpark Python Documentation
  4. Snowflake Python UDF Documentation