Engine Works

Under the hood of Alteryx: tips, tricks and how-tos.
aihua
Alteryx

What is Orchestration?

 

Have you ever felt anxious about the security of your home while on vacation? To alleviate this concern, we installed a home security system that functions like this:

 

Home Security.png

 

Essentially, what orchestration does is much like the security system, integrating multiple workflows, tasks, or events into a unified end-to-end process. Each subsequent task can be determined by the output of the preceding task. Additionally, orchestration allows for tasks to be executed both sequentially and in parallel.
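
To make the idea concrete, here is a minimal Python sketch (not Alteryx code) of a tiny orchestrated process. The task names `extract`, `summarize`, `sort_rows`, and `alert` are hypothetical stand-ins for workflows. The sketch shows one task running first, two independent tasks running in parallel, and the final step being chosen from the preceding output.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical tasks; in practice each would be a workflow or job.
def extract():
    return [3, 1, 2]


def summarize(rows):
    return {"count": len(rows), "total": sum(rows)}


def sort_rows(rows):
    return sorted(rows)


def alert(message):
    return f"ALERT: {message}"


def run_pipeline():
    rows = extract()  # sequential: everything else depends on this step

    # Parallel: both tasks need only the extracted rows, not each other.
    with ThreadPoolExecutor(max_workers=2) as pool:
        summary_future = pool.submit(summarize, rows)
        sorted_future = pool.submit(sort_rows, rows)
        summary = summary_future.result()
        ordered = sorted_future.result()

    # Conditional: the next task is determined by the preceding output.
    if summary["count"] == 0:
        return alert("no rows extracted")
    return {"summary": summary, "rows": ordered}
```

Calling `run_pipeline()` returns the summary together with the sorted rows; if `extract` returned an empty list, the alert branch would run instead.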

Why Do We Need Workflow Orchestration?

 

As a data professional who frequently develops workflows, you may encounter business problems that require complex processes and logic to achieve an analytic goal. For example, a workflow could involve multiple tasks, such as:

 

Process.png

 

While it is possible to incorporate multiple processes in a single workflow, that workflow will take longer to run and be harder to debug. A more effective approach is to break it down into separate workflows, each handling specific tasks. This raises the main question: how can we ensure that workflows run automatically in succession, respond to conditions, and use outputs from preceding workflows? That is where orchestration comes into play. An example of an automated workflow pipeline is illustrated below.

 

Automated Pipeline.png

 

Orchestration empowers us to operationalize and automate processes, enhancing our ability to manage complex processes effectively and efficiently.

 

Historic Solutions

 

Orchestration has been a popular topic within the Alteryx Community. Many users have come up with innovative solutions, including chained analytic apps, the CReW Runner macros, the Run Command event, and custom macros that dynamically enable and disable containers to alter process flow. While these workarounds solve most of the problem, they often take significant time to develop and can be difficult to automate. Moreover, many of them lack pipeline visibility and flexibility.

 

New Solutions 

 

Let's review some new solutions for workflow orchestration.

 

Control Container

 

Released in version 2023.1, the Control Container (CC) is a game-changing feature for orchestration within Alteryx. While it looks like a regular container, it comes with an optional input anchor and an engine log output anchor.

 

Here’s how it works: when a data stream is connected to the input anchor, the CC executes the tools inside it only after receiving at least one record. This allows users to place multiple processes into separate CCs and connect them to achieve orchestration. Additionally, users can build conditional logic that controls whether data flows to a CC, so that it executes only when specific conditions are satisfied. The output anchor emits the engine messages of the contained tools as actual data, which can be useful for creating conditional logic. Let’s walk through an example to understand the power of CC:

 

 

A bank uses Alteryx to gain insight into customer retention rate regularly. Below are the steps it takes to achieve this task.

 

  1. Data stored in separate Excel files are combined, cleaned, and transformed into a single Excel file.

  2. The cleaned and combined dataset is then incorporated into the same workflow for subsequent calculations.

  3. Two Control Containers are added to the workflow, one for the data transformation and the other for the retention calculation. The output of the first CC is then connected to the input of the second CC. Since Alteryx typically cannot read from and write to the same Excel file in a single workflow, connecting the two CCs ensures sequential execution: the Alteryx engine processes the first CC, then executes the second CC once the first CC is completed, and the second CC receives all rows of data from the first CC.

  4. Given that real-world data transformation is more complex, a validation step is added to ensure that no records are missed during transformation. This validation process is put into another CC so that it will start after the dataset is created.

  5. To ensure accurate calculation, conditional logic is implemented via a Filter tool to verify whether the record counts before and after transformation match. If the validation is successful, data flows to the container that runs the calculation. Conversely, if the validation fails, data flows to the other container, which sends an email notification to alert the owner of the issue.

  6. The entire workflow is scheduled to run automatically on a regular basis. The bank receives updated reports on customer retention rate while ensuring data integrity.
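
The control flow in the steps above can be sketched in Python. This is only a toy model of the behavior described in this post, not an Alteryx API: the `ControlContainer` class, the `orchestrate` function, and the log messages are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ControlContainer:
    """Toy stand-in for a Control Container (not an Alteryx API)."""
    name: str
    body: Callable[[], List[str]]  # runs the contained tools, returns log messages
    ran: bool = False

    def run(self, incoming: list) -> List[str]:
        # Rule described above: skip everything unless at least one record arrives.
        if not incoming:
            return []
        self.ran = True
        return self.body()


def orchestrate(source_count: int, cleaned_count: int) -> List[str]:
    """Return the names of the containers that ran, in order."""
    transform = ControlContainer("transform", lambda: ["cleaned file written"])
    validate = ControlContainer("validate", lambda: ["record counts compared"])
    calculate = ControlContainer("calculate", lambda: ["retention rate computed"])
    notify = ControlContainer("notify", lambda: ["owner emailed"])

    # Chaining: each container's log output is the next container's input,
    # so validation cannot start before the transformation has finished.
    transform_log = transform.run(["start"])
    validate_log = validate.run(transform_log)

    # Filter-style branch: the record goes to exactly one side.
    record = {"source": source_count, "cleaned": cleaned_count}
    matched = [record] if validate_log and source_count == cleaned_count else []
    mismatched = [record] if validate_log and source_count != cleaned_count else []

    calculate.run(matched)   # runs only when the counts match
    notify.run(mismatched)   # runs only when they do not

    containers = [transform, validate, calculate, notify]
    return [c.name for c in containers if c.ran]
```

With matching counts, `orchestrate(100, 100)` runs the transform, validate, and calculate containers; with `orchestrate(100, 97)` the notify container runs in place of calculate, because the empty branch never delivers a record to the calculation.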

 

If you have dependent workflows that are too big to place into a single workflow, consider packaging them into macros and building a main workflow to handle the orchestration task.

For more information on how to use Control Containers, check these out:
Control Container Tool (alteryx.com)
Control Containers: Take Control of Your Workflow - Alteryx Community

 

Control Container + Server API Request

 

If you have both Designer and Server, there’s another way to chain workflows by utilizing Alteryx Server APIs. The idea is to trigger an API call to run job(s) on Alteryx Server once the dependent process successfully completes within a workflow. In a multi-node Server environment, you can even consider assigning jobs to different workers. This approach is particularly beneficial for managing multiple large processes that depend on another process. Not only does it create an effective workflow pipeline, but it also optimizes resource utilization.

 

It's worth noting that Alteryx has developed a macro pack to interact with Server v3 endpoints. Leveraging the available Alteryx tooling allows for fast and flexible integrations, particularly in orchestrating workflows.

 

Using the same example above, an API request is made to have Alteryx Server execute the retention calculation job when the validation step passes.
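
For readers who prefer to see the underlying calls, here is a rough Python sketch of the same idea using plain HTTP requests instead of the macro pack. Everything specific in it is an assumption to check against your own Server's API documentation: the `/oauth2/token` and `/v3/workflows/{workflow_id}/jobs` paths, the empty JSON body, and the shape of the responses. The function names and the `send` parameter are hypothetical and exist only so the logic can be exercised without a live Server.

```python
import json
import urllib.parse
import urllib.request


def build_token_request(base_url, client_id, client_secret):
    """OAuth2 client-credentials request; the /oauth2/token path is an assumption."""
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/oauth2/token",
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def build_run_job_request(base_url, token, workflow_id):
    """Request that queues one job; the /v3/workflows/{id}/jobs path is an assumption."""
    return urllib.request.Request(
        f"{base_url}/v3/workflows/{workflow_id}/jobs",
        data=json.dumps({}).encode("utf-8"),  # empty JSON body, assumed acceptable
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_if_valid(validation_passed, base_url, client_id, client_secret,
                 workflow_id, send=urllib.request.urlopen):
    """Queue the retention job only when validation passed; return the parsed response."""
    if not validation_passed:
        return None  # nothing is sent to Server
    with send(build_token_request(base_url, client_id, client_secret)) as resp:
        token = json.load(resp)["access_token"]
    with send(build_run_job_request(base_url, token, workflow_id)) as resp:
        return json.load(resp)
```

The request builders are kept separate from the sending step so that the condition (run the job only when validation passes) is easy to see and test on its own.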

 

 

For more information on the Alteryx Server v3 API macro pack, check this out:
Alteryx Server v3 API endpoints

 

Plans and Event Triggers in Alteryx Analytics Cloud Platform (AACP)

 

If you’re using Alteryx’s Analytics Cloud Platform, you should not miss out on the built-in solution for orchestration called “Plans.” A Plan on the Alteryx Analytics Cloud Platform enables you to execute workflows and other tasks by defining the sequence in which you would like them to be executed.

 

The user interface of Plans is very intuitive. During the design phase, all tasks are available on the left panel. These include a Designer Cloud workflow task, a Designer Desktop workflow task, and other tasks such as an ML task for creating production models, an HTTP task for making requests to external APIs, and a Slack task for sending Slack alerts. Each task comes with three execution anchors (on success, on failure, and always) that specify which task status triggers the next task.

 

Note that Cloud Execution for Desktop is required for the Designer Desktop workflow task. Cloud Execution for Desktop allows existing desktop workflows to be imported to and orchestrated in AACP, eliminating the need for laborious rebuilding of workflows in the cloud!

 

In line with Alteryx’s commitment to ease of use, orchestrating tasks is straightforward in Plans: simply drag and drop a task onto the canvas, configure it, and connect it to an execution anchor to set the condition for subsequent task execution. Additionally, scheduling a Plans pipeline is effortless with the scheduling feature located in the top right corner of the design interface.
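
Conceptually, the three execution anchors behave like labeled edges between tasks. The Python sketch below is a simplified model of that behavior, not the Plans API: `run_plan`, the edge tuples, and the anchor labels `on_success`, `on_failure`, and `always` are hypothetical names, and the plan is assumed to contain no cycles.

```python
from typing import Callable, Dict, List, Tuple

# Task name -> callable returning True on success, False on failure.
Tasks = Dict[str, Callable[[], bool]]
# (upstream task, anchor, downstream task);
# anchor is "on_success", "on_failure", or "always".
Edges = List[Tuple[str, str, str]]


def run_plan(tasks: Tasks, edges: Edges, start: str) -> List[str]:
    """Run tasks from `start`, following only the anchors that match each outcome."""
    executed = []
    queue = [start]
    while queue:
        name = queue.pop(0)
        try:
            succeeded = tasks[name]()
        except Exception:
            succeeded = False  # an exception counts as a failed task
        executed.append(name)
        outcome = "on_success" if succeeded else "on_failure"
        for upstream, anchor, downstream in edges:
            if upstream == name and anchor in (outcome, "always"):
                queue.append(downstream)
    return executed
```

For example, connecting a validation task to a calculation task through `on_success`, to a Slack alert through `on_failure`, and to an archive task through `always` means the archive task runs either way, while exactly one of the other two runs.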

 

Here is a quick demo of the same example using Plans.

 

 

Last but not least, another revolutionary technology within AACP worth bringing up in today’s topic is Event Triggers. Many of our customers face challenges with initiating workflows whenever new files arrive or new data lands in a database. Typically, they deploy a small workflow that periodically checks for updates to the source data before kicking off the main workflow, which is not a streamlined approach. Event Triggers offer an elegant solution: they automatically execute a workflow the moment a change occurs in the specified output file or database. This feature introduces a robust automation and orchestration mechanism, effectively streamlining and optimizing processes.
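
For contrast, the polling workaround that Event Triggers replace looks roughly like the Python sketch below. It is an illustration of the pattern only: `poll_for_change` and its parameters are hypothetical, and file modification time is used as an assumed stand-in for "new data has arrived".

```python
import os
import time


def poll_for_change(path, last_seen_mtime, on_change, checks=3,
                    interval_seconds=0.0, get_mtime=os.path.getmtime):
    """Check `path` a fixed number of times; call `on_change` when its mtime advances."""
    for _ in range(checks):
        current = get_mtime(path)
        if current > last_seen_mtime:
            on_change(path)  # kick off the main workflow
            last_seen_mtime = current
        time.sleep(interval_seconds)  # wait before the next check
    return last_seen_mtime
```

Every check costs a run even when nothing has changed, and a change is noticed only at the next check, which is the inefficiency an event-driven trigger avoids.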

 

For detailed information on how Plans work in AACP, check this out:
Introducing Alteryx Analytics Cloud Plans - Alteryx

 

For more information about Event Triggers, check this out:
Introducing Event Triggers: Unleashing Automation ... - Alteryx Community

 

Conclusion

 

In 2023, Alteryx continued to deliver numerous exciting new features and products for our valued customers. These advancements lead us into a new era, empowering us to create orchestrations that manage complex processes efficiently. I hope this post is helpful and inspires you to build robust, automated pipelines that extract even greater value from your analytic processes.

 

If you don’t have AACP yet, you can start a free trial and test out Plans and the Event Triggers functionality today.