The Problem


Have you ever found yourself suddenly responsible for understanding a library of Alteryx workflows, knowing they power your organization with analytics, but uncertain what each workflow actually does? Maybe you…

  • inherited a project from a colleague who won the jackpot in Vegas and moved to a tropical island?

  • took over management of a server and weren’t certain what each scheduled workflow did?

  • need to revisit a project that someone (ahem, you) worked on a few years ago… back before you saw the light and embraced the glories of awesome documentation?

You need to figure out what these workflows do, ideally without opening up each of them one by one and carefully reading through every tool. Or maybe you just want an easier way to document your workflows as you create them, to prevent the situations above!




The Solution: The Workflow Summary Tool


Today we’re releasing… (drumroll please…) the Workflow Summary Tool! This is a new AiDIN-powered Designer Desktop tool that leverages OpenAI’s ChatGPT to automatically provide concise summaries of a workflow’s purpose, inputs, outputs, and key logic steps. Alteryx AiDIN is the AI engine that infuses the power of generative AI & Machine Learning (ML) across the Alteryx Analytics Cloud Platform.


Just pull the Workflow Summary tool into a single workflow or point it at a whole directory of them, connect your OpenAI API key, and presto! The tool outputs a few-word topic, a headline, and a paragraph-length summary for each workflow! It gives you the choice of storing this summary in the Workflow Info Description field (accessible by Alteryx Server when a workflow is deployed there) and/or sending the summary downstream from the tool for export to a file or further analysis.


Download the Workflow Summary tool here and give it a spin!


Example: Summarizing a Single Workflow


Last December, I participated in the Alteryx “BaseA” Advent of Code challenge. Unfortunately, I tended to be in a rush and didn’t pay much attention to documentation as I created my workflows for the various days. Below is my Day 10 workflow, but when I opened it up, I had forgotten what it did! So I pulled in the Workflow Summary tool from the Laboratory toolbar (see installation and setup instructions here), connected it to my DCM connection for my OpenAI API key, and configured it to Summarize “Current Workflow”:




I chose both Output Options, so once I run the workflow, a Browse tool on the (optional) output anchor shows this helpful summary:




And then if I close (without saving!) and re-open the workflow, click on the canvas, and look under the Workflow Configuration > Meta Info tab, the same summary has been populated in the Description metadata field for the workflow. This is super cool, because if I deploy my workflow on Alteryx Server, this description will be available there to help my Server administrators and governance teams better understand their data pipelines!




Example: Summarizing a Whole Directory of Workflows


That was helpful! However, I have a whole folder of these poorly documented Advent of Code workflows, and I’d really rather not have to drop this tool into each of them individually. Instead, I can put the Workflow Summary tool on a new canvas, point it to that whole, messy directory, and get a nice, neat listing of what each of those workflows, macros, and analytic apps does:




And now I could do useful downstream analysis, like filtering for which workflows examined Elf calorie expenditure, which had to do with visibility through trees, and which were concerned with the signal strength of the elves’ communication devices (ah, those wacky Advent of Code challenges!). I could also open up each of those workflows and see that each one now contains a neatly filled-in Description in the workflow Meta Info, ready for deployment on Server or for handing off to my colleagues when I win the jackpot!




How does it work?


When ChatGPT first took the world by storm a few months ago, we saw an opportunity to summarize Alteryx workflows in plain English using the OpenAI APIs. We initially found that the files defining most workflows were too long for ChatGPT in their raw form. So we developed a set of strategies to convert each workflow file into text of a length that the ChatGPT model can accept. These include selectively extracting meaningful tool configuration options (for example, the text but not the color of a Comment tool), individually summarizing especially long tools (for example, an R or Python tool with a lot of code in it), and summarizing groups of tools (by container, or in the order the data flows through the workflow) and then summarizing the summaries.
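To make those strategies a little more concrete, here is a minimal Python sketch of the general idea. It is not the tool's actual code. The XML layout, the MEANINGFUL_FIELDS allow-list, the function names, and the character limits are all assumptions made up for illustration, and the summarize argument stands in for whatever call you would make to a language model.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List

# Hypothetical, simplified stand-in for a workflow file. Real Alteryx
# workflow XML is richer; this layout is an assumption for illustration.
SAMPLE_WORKFLOW = """
<Workflow>
  <Tool id="1" type="Comment">
    <Text>Parse the CPU instruction list</Text>
    <Color>#FFFF00</Color>
  </Tool>
  <Tool id="2" type="Formula">
    <Expression>[X] + [Delta]</Expression>
  </Tool>
  <Tool id="3" type="Python">
    <Code>import pandas as pd; df = pd.DataFrame({"a": [1, 2]}); print(df.describe())</Code>
  </Tool>
</Workflow>
"""

# Assumed allow-list of the child elements that carry meaning per tool type.
MEANINGFUL_FIELDS = {"Comment": ["Text"], "Formula": ["Expression"], "Python": ["Code"]}

Summarizer = Callable[[str], str]


def extract_tool_texts(xml_text: str, summarize: Summarizer, max_tool_chars: int = 200) -> List[str]:
    """Keep only meaningful settings per tool; pre-summarize very long tools."""
    root = ET.fromstring(xml_text.strip())
    texts = []
    for tool in root.iter("Tool"):
        tool_type = tool.get("type", "Unknown")
        parts = []
        for field in MEANINGFUL_FIELDS.get(tool_type, []):
            node = tool.find(field)
            if node is not None and node.text:
                parts.append(node.text.strip())
        body = " ".join(parts)
        if len(body) > max_tool_chars:
            body = summarize(body)  # e.g. a tool holding a lot of code
        texts.append(f"{tool_type} tool {tool.get('id')}: {body}")
    return texts


def chunk(texts: List[str], max_chars: int) -> List[List[str]]:
    """Greedily group consecutive texts so each group stays near max_chars."""
    groups, current, size = [], [], 0
    for text in texts:
        if current and size + len(text) > max_chars:
            groups.append(current)
            current, size = [], 0
        current.append(text)
        size += len(text)
    if current:
        groups.append(current)
    return groups


def summarize_hierarchically(texts: List[str], summarize: Summarizer, max_chars: int) -> str:
    """Summarize groups of texts, then summarize the summaries."""
    joined = "\n".join(texts)
    if len(texts) == 1 or len(joined) <= max_chars:
        return summarize(joined)
    groups = chunk(texts, max_chars)
    if len(groups) == len(texts):
        # Grouping did not shrink the list; pair items up to guarantee progress.
        groups = [texts[i:i + 2] for i in range(0, len(texts), 2)]
    summaries = [summarize("\n".join(group)) for group in groups]
    return summarize_hierarchically(summaries, summarize, max_chars)
```

With a real model behind summarize, calling extract_tool_texts and then summarize_hierarchically on the result would produce one overall summary. Passing the summarizer in as a plain function also makes the flow easy to try out with a simple stand-in, such as a function that truncates its input.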


We also spent some time determining effective prompts to get the ChatGPT models to usefully summarize our workflow text. (It sometimes wanted to default to “this is an Alteryx workflow that analyzes data” - which is almost always true and almost never helpful!) When we combined all these techniques, we found that we were able to come up with amazing summaries for even the longest and most complex workflows! And we are so excited about this capability that we want to share it with our users and invite you to participate in the real-time innovation by giving us feedback on how it works for you and what it helps you accomplish!
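As a rough illustration of the prompting side, the Python sketch below asks for the three outputs the tool produces (a topic, a headline, and a summary) and rejects a reply that falls back on the generic sentence mentioned above. The prompt wording, the JSON reply format, and the function names are our own assumptions for this example, not the prompts the tool actually uses.

```python
import json

REQUIRED_KEYS = ("topic", "headline", "summary")

# The unhelpful default answer described above, lowercased for comparison.
GENERIC_PHRASES = ["this is an alteryx workflow that analyzes data"]


def build_prompt(workflow_text: str) -> str:
    """Build a hypothetical prompt that asks for specific, structured output."""
    return (
        "Summarize the Alteryx workflow described below.\n"
        "Reply with a JSON object containing the keys topic (a few words), "
        "headline (one sentence), and summary (one paragraph).\n"
        "Name the actual inputs, outputs, and key logic steps. "
        "Do not just say that the workflow analyzes data.\n\n"
        + workflow_text
    )


def parse_reply(reply: str) -> dict:
    """Validate a model reply; raise ValueError if it is incomplete or generic."""
    data = json.loads(reply)
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"reply is missing keys: {missing}")
    summary = str(data["summary"]).lower()
    if any(phrase in summary for phrase in GENERIC_PHRASES):
        raise ValueError("summary is too generic")
    return {key: data[key] for key in REQUIRED_KEYS}
```

A caller could catch the ValueError and retry with a firmer prompt, which is one simple way to steer a model away from that default answer.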




We’re excited to combine the amazing workflow creation and auditable, visible analytics capabilities of Alteryx with the power of large language models to make documenting your workflows easier and faster! We hope the Workflow Summary Tool supports your efficiency, governance, and team collaboration! Please download it here, give it a try, and let us know what you think in the comments!

