
Alteryx Server Discussions

Find answers, ask questions, and share expertise about Alteryx Server.

Log Management of Alteryx Workflows in a multi-application setting in an Enterprise

DeepakTyagi
8 - Asteroid

Alteryx Designer is a very popular client-side data wrangling tool for data scientists and engineers. It also has a server setup for collaboration and scheduling purposes in an enterprise setting. Once Alteryx is integrated into a mix of other applications (web, batch, etc.), an interesting problem arises: how to keep track of data flow, failures, and log management.

Consider a scenario where a Java or Python-based application triggers Alteryx workflows on a server, which in turn call a REST API to persist the data. As you can see, there is an event or transaction that starts from an application in Java or Python, progresses through Alteryx workflows, and ends up invoking a REST API. There are many considerations as you develop this architecture in relation to how data and process flow through these disparate applications. Questions like the following need to be addressed:

  1. How do we keep track of an event end-to-end?
  2. How do we correlate that event as it progresses through disparate systems?
  3. In case of a failure, how do we identify the data load and the failure boundary?
  4. How do we log events so that the DevOps team can perform root cause analysis?

Event Correlation: The first consideration is the ability to correlate an event as it flows through these applications. A generated unique ID can be used, e.g. java.util.UUID.randomUUID() in Java or uuid.uuid4() in Python. In the case of multiple process flows, this UUID can be prefixed with the name of the process/application, as below:

CorrelationId = "BalRepAlteryx" + UUID = BalRepAlteryxf56eaf7f-d8b4-4aeb-87a0-dcbe059339ae
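As a minimal sketch of this idea in Python (using the standard uuid module; the function name is illustrative, and the BalRepAlteryx prefix is just the example above), a prefixed correlation ID could be generated like this:

```python
import uuid

def make_correlation_id(process_name: str) -> str:
    """Prefix a random UUID with the process/application name."""
    return f"{process_name}{uuid.uuid4()}"

# e.g. make_correlation_id("BalRepAlteryx")
# -> "BalRepAlteryx" followed by a random 36-character UUID
```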

All log messages across the applications can use this format when logging to Splunk or any other log aggregator. The person doing the investigation can then bring up all the messages in chronological order using this correlation ID (UUID) to see a complete picture of what is going on across the applications.

DeepakTyagi_0-1631884155065.png

DeepakTyagi_1-1631884155158.png

Handshake: As processing moves between applications, a proper handshake should be recorded in the logs so that a failure is easier to trace and debug. The following are some of the attributes that should be logged on entry and exit:

  1. Correlation Id
  2. Date and Time
  3. Payload passed
  4. Custom Message
  5. Application/Function
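The attributes above can be captured with a small helper like the following Python sketch (the function and JSON field names are illustrative assumptions, not part of any Alteryx or Splunk API), which emits one JSON line per entry or exit:

```python
import json
from datetime import datetime, timezone

def log_handshake(correlation_id: str, application: str,
                  message: str, payload: dict) -> str:
    """Emit one JSON log line carrying the five handshake attributes."""
    record = {
        "correlationId": correlation_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "application": application,
        "message": message,
        "payload": payload,
    }
    line = json.dumps(record)
    print(line)  # in practice, write to the log file or aggregator sink
    return line
```

Calling this once on entry and once on exit of each function gives the log aggregator matching record pairs keyed by the correlation ID.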

Based on these attributes, some of the following can be answered:

  1. Total execution time of a function
  2. Process flow with payload
  3. How data was modified across applications
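For instance, the total execution time of a function can be derived by pairing the entry and exit timestamps logged under the same correlation ID. A sketch, assuming ISO-8601 timestamps as in the handshake attributes above (the sample timestamps are hypothetical):

```python
from datetime import datetime

def execution_time_seconds(entry_ts: str, exit_ts: str) -> float:
    """Seconds elapsed between an entry and an exit ISO-8601 timestamp."""
    return (datetime.fromisoformat(exit_ts)
            - datetime.fromisoformat(entry_ts)).total_seconds()

# execution_time_seconds("2021-09-17T12:00:00", "2021-09-17T12:00:05") -> 5.0
```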

Sample JSON data written to the logs would look like:

DeepakTyagi_2-1631884155086.png

DeepakTyagi_3-1631884155388.png

The DevOps team can create a consolidated view over the Splunk indices of the applications in scope and use it to follow the event progression end-to-end, identified by the correlation ID. In this way, Alteryx can be embedded into the overall fabric of existing enterprise applications.
