Engine Works

Under the hood of Alteryx: tips, tricks and how-tos.
carlosteixeira
15 - Aurora

Scenario

 

Alteryx Server has a setting, configurable in Alteryx System Settings, that cancels jobs once they have been running longer than a set time limit.

 

carlosteixeira_0-1644606965770.png

 

Above, we can define, in seconds, the maximum amount of time a workflow is allowed to run. In the environment we were working on, this limit was 3600 seconds (1 hour).

 

Problem

 

In this scenario, any scheduled workflow that ran longer than the configured 3600 seconds would be automatically canceled by the system.

 

It turns out that this setting only applies to workflows that are scheduled in the Gallery. If a user runs a workflow manually through the Gallery, the setting has no effect.

 

A user could start a workflow manually and leave it running for hours.

 

In large environments with many users doing this, the execution queue becomes a problem. With many manually started workflows running at the same time in the Gallery, we saw a queue of more than 20 workflows waiting to be executed, including scheduled ones.

 

For us as environment administrators, this is almost impossible to manage because we don't know which workflows can safely be canceled.

 

carlosteixeira_1-1644607055377.png

Solution

 

When we realized that this was impacting our environment, the client asked us to study a possible solution. With the help of my friends @Thableaus and @marcusblackhill we managed to arrive at the solution below.

 

We created a workflow that monitors what is running on the Alteryx Server. If it identifies a job that has been running for more than a certain time, regardless of whether the job is scheduled or manual, it cancels the job via a Download tool call.

 

carlosteixeira_2-1644607087141.png

 

Below I explain the workflow, which is divided into 4 parts.

 

Part 1

 

In this first part, we queried MongoDB to identify which sessions were active in the database, and using Filter/Summarize/Join tools we narrowed the result down to a single active session, to be used as the credential at the end of the process (Part 4).

 

In this part, I also add the Gallery URL.

 

carlosteixeira_3-1644607169641.png

 

Part 2

 

In the second part of the flow, we also queried the MongoDB database directly to identify which workflows are running at that moment.

 

In the example below we can identify 1 workflow being executed.

 

Using filters and formulas, we were able to calculate how long this workflow has been running. The first filter keeps only workflows in “running” status, and a Formula tool then calculates how long each one has been executing. The second filter checks whether the workflow has been running longer than the stipulated time.

 

The third filter, which was also a request made by the client, handles the workflows that are exempt from the cancellation rule. In this filter, we list the names of the workflows that count as exceptions.
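The three filters and the formula described above can be sketched outside Alteryx as follows. This is only a minimal illustration in Python, not the attached workflow: the `Job` fields, the workflow names, and the `jobs_to_cancel` helper are assumptions made for the example and do not reflect the actual MongoDB schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Job:
    # Illustrative fields only; not the real MongoDB document layout.
    workflow_name: str
    status: str
    started_at: datetime


def jobs_to_cancel(jobs, now, max_runtime, exceptions):
    """Return jobs that are running, over the limit, and not exempt."""
    selected = []
    for job in jobs:
        if job.status != "running":          # filter 1: only running jobs
            continue
        elapsed = now - job.started_at       # formula: execution time so far
        if elapsed <= max_runtime:           # filter 2: over the limit?
            continue
        if job.workflow_name in exceptions:  # filter 3: exception list
            continue
        selected.append(job)
    return selected


now = datetime(2022, 2, 11, 12, 0, tzinfo=timezone.utc)
jobs = [
    Job("daily_sales", "running", now - timedelta(hours=2)),
    Job("quick_report", "running", now - timedelta(minutes=10)),
    Job("month_end_close", "running", now - timedelta(hours=3)),
    Job("old_export", "completed", now - timedelta(hours=5)),
]
overdue = jobs_to_cancel(jobs, now, timedelta(seconds=3600), {"month_end_close"})
print([job.workflow_name for job in overdue])
```

In this made-up data, only `daily_sales` is selected: `quick_report` is under the one-hour limit, `month_end_close` is on the exception list, and `old_export` is no longer running.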

 

carlosteixeira_4-1644607216833.png

 

Part 3

 

At this stage, again querying MongoDB, we identify the user responsible for the workflow (the owner) and create alerts that are sent to the administrators and to the user, informing them that the workflow was canceled because it was running longer than the stipulated time, per company guidelines.
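To make the alert step concrete, here is a small sketch of how such a notification could be assembled. It uses Python's standard `email.message.EmailMessage` purely for illustration; the function name, the addresses, and the message wording are assumptions, and the real workflow builds its alerts with Alteryx tools instead.

```python
from email.message import EmailMessage


def build_cancellation_alert(workflow_name, owner_email, admin_emails, limit_seconds):
    """Build (but do not send) an alert for the owner and the administrators."""
    msg = EmailMessage()
    msg["Subject"] = f"Workflow canceled: {workflow_name}"
    # The owner and every administrator receive the same notice.
    msg["To"] = ", ".join([owner_email, *admin_emails])
    msg.set_content(
        f"The workflow '{workflow_name}' was canceled because it ran longer "
        f"than the {limit_seconds}-second limit set by company guidelines."
    )
    return msg


alert = build_cancellation_alert(
    "daily_sales", "owner@example.com", ["admin@example.com"], 3600
)
print(alert["To"])
print(alert.get_content())
```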

 

carlosteixeira_5-1644607239522.png

Part 4

 

In this last step, we create the variables the system needs so that we can build an HTTP call and send the required credentials. The command is then executed by the system as if we were canceling the workflow manually.

 

So when a workflow is found to have been running longer than the time stipulated by the company, it is canceled.
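The idea of combining the Gallery URL, the job, and the session credential into a single HTTP call can be sketched as below. This is a hedged illustration only: the endpoint path, the header name, and the use of POST are hypothetical placeholders, not the actual Gallery interface, and the request is built but never sent. The real values live in the Download tool configuration of the attached workflow.

```python
import urllib.request


def build_cancel_request(gallery_url, job_id, session_id,
                         cancel_path="/hypothetical/cancel/{job_id}",
                         session_header="X-Hypothetical-Session"):
    """Build (but do not send) an HTTP request that would cancel one job.

    cancel_path and session_header are placeholders for illustration,
    not real Alteryx endpoint or header names.
    """
    url = gallery_url.rstrip("/") + cancel_path.format(job_id=job_id)
    return urllib.request.Request(
        url,
        method="POST",                        # assumed HTTP method
        headers={session_header: session_id}  # credential from Part 1
    )


req = build_cancel_request("https://gallery.example.com/", "job-123", "session-abc")
print(req.get_method(), req.full_url)
```

Keeping the path and header as parameters makes it obvious which pieces have to be replaced with the values used in your own environment.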

 

carlosteixeira_6-1644607558834.png

 

Implementation

 

We implemented the workflow as follows.

 

On the Alteryx Server machine, we scheduled the workflow from Alteryx Designer to run every 5 minutes. Note that the complete workflow runs very quickly, in less than a minute.

 

carlosteixeira_7-1644607598470.png

 

carlosteixeira_8-1644607611116.png

 

This way, the workflow does not consume one of the execution slots defined in Alteryx System Settings, which limit how many workflows can run simultaneously.

 

I'll leave the workflow attached in case any of you want to implement or study and even improve the process in your environment.

 

Tip: in the environment where this was implemented, I also included some alerts to give the administrators advance notice, for example an email when a workflow reaches 90% of the pre-set time. This way, the administrator gets an early idea of which workflow(s) will be canceled.
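As a rough sketch of the early-warning idea, the check below classifies a job by how much of the limit it has used. The function name, the return labels, and the 0.9 default are assumptions chosen to mirror the 90% example; in the workflow itself this would be another Filter/Formula step.

```python
def classify_runtime(elapsed_seconds, limit_seconds, warn_ratio=0.9):
    """Return "cancel", "warn", or "ok" for a running job."""
    if elapsed_seconds > limit_seconds:
        return "cancel"   # over the limit: cancel the job
    if elapsed_seconds >= warn_ratio * limit_seconds:
        return "warn"     # close to the limit: email the administrators
    return "ok"


for elapsed in (3000, 3300, 3700):
    print(elapsed, classify_runtime(elapsed, 3600))
```

With the 3600-second limit from this environment, a job at 3000 seconds is fine, one at 3300 seconds triggers the warning, and one at 3700 seconds is due for cancellation.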

 

Another tip, which I used in the environment in question: we disabled the job cancellation setting in Alteryx System Settings, leaving cancellation to this workflow.

 

Until a future version of Alteryx Server applies the cancellation rule regardless of how a workflow is started, this solution has served the customer well and can be implemented quickly.

 

The client's team also put some processes in place for better governance of the environment, such as a way to request that a workflow be added to the exception list so that it can run regardless of the stipulated time.

 

Workflows of great relevance to the business were included in this list of exceptions, and we also evaluated together with the user the best time for each of them to be scheduled or executed.

 

That's it. If you have any questions about this workflow or how to implement it, feel free to contact me; I'll be happy to help.

 

Cheers

 

Carlos A Teixeira - Alteryx ACE
